bug

randolf.geist's picture

Pending Statistics

This is just a quick heads-up to those that plan to use the Pending Statistics feature that has been introduced in Oracle 11.1.

It looks like that in all currently available versions that support this feature Pending Statistics have not been implemented consequently for all possible DBMS_STATS calls, so you have to be very careful which calls you use. Having enabled the pending statistics for a particular table you might start to manipulate the statistics under the impression that the modifications performed are not reflected in the actual dictionary statistics (by "dictionary statistics" in this case I don't mean the statistics of the data dictionary objects themselves but the actual statistics of database objects stored in the data dictionary) but only in the pending statistics area allowing you to test statistics modifications in an isolated environment using the OPTIMIZER_USE_PENDING_STATISTICS parameter on session level.

You therefore might be in for a surprise to find out that this holds true only for a limited set of DBMS_STATS calls, but not for all.

This effectively means that particular changes to the statistics will be effective immediately although pending statistics have been enabled.

In particular manipulations of the statistics using the SET_*_STATS procedures of DBMS_STATS seem to ignore the pending statistics settings and still update the dictionary statistics immediately without further notice.

This is rather unfortunate since this means that Pending Statistics can not be used in a straightforward way to test user-defined statistics which can be very helpful under certain circumstances but require extensive testing before using them on a live system.

But also other calls, like gathering statistics only for a particular set of columns show an unexpected behaviour: It looks like that both statistics get modified, the pending statistics area, but also the dictionary statistics.

Note that setting the GLOBAL preferences (DBMS_STATS.SET_GLOBAL_PREFS) for PUBLISH to FALSE seems to fix this particular issue - in that case only the pending statistics get updated, but the dictionary statistics are left unchanged. This fix does not apply to the SET_*_STATS procedures unfortunately, those seem to always update the dictionary statistics.

The worst thing however is that the statistics history that is automatically maintained since Oracle 10g does not reflect these (unintended) changes properly, so you can not easily recover from the potentially unwanted modifications by calling DBMS_STATS.RESTORE_TABLE_STATS.

Finally I was obviously able to activate the pending statistics using DBMS_STATS.RESTORE_TABLE_STATS - you can rather clearly see this behaviour when using the SET_GLOBAL_PREFS('PUBLISH', 'FALSE') variant of the following script.

The following is a small demonstration of the issues encountered - please note that it modifies the GLOBAL preferences for the PUBLISH setting if you intend to run this test by yourself.

set echo on timing on linesize 130 tab off trimspool on

alter session set nls_date_format = 'DD.MM.YYYY HH24:MI:SS';

drop table t purge;

create table t
as
select * from all_objects
where rownum <= 1000;

exec dbms_stats.set_global_prefs('PUBLISH', 'TRUE')

select dbms_stats.get_prefs('PUBLISH', null, 'T') from dual;

-- Our baseline, no histograms, basic column statistics for all columns
exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR ALL COLUMNS SIZE 1')

-- Verify the result
select
num_distinct
, density
, last_analyzed
, num_buckets
, user_stats
from
user_tab_col_statistics
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- Enable pending statistics for table T
-- You can try these different calls
--
-- The GATHER_*_STATS procedures seem to behave correctly
-- only when setting the GLOBAL PREFS to FALSE
--
-- "Correctly" means that the results are reflected in the
-- pending area only but not in the dictionary statistics
--
-- Note that the SET_*_STATS procedures seem to ignore the setting
-- always and publish directly to the dictionary
-- no matter what the PUBLISH setting is on any level (TABLE, GLOBAL)
--
-- exec dbms_stats.set_global_prefs('PUBLISH', 'FALSE')
exec dbms_stats.set_table_prefs(null, 'T', 'PUBLISH', 'FALSE')
-- exec dbms_stats.set_schema_prefs(user, 'PUBLISH', 'FALSE')

-- Verify the current setting, statistics will not be published
select dbms_stats.get_prefs('PUBLISH', null, 'T') from dual;

-- Wait for two seconds to make the LAST_ANALYZED column meaningful
exec dbms_lock.sleep(2)

-- This is supposed to go to the pending statistics area
exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR ALL COLUMNS SIZE 254')

-- Yes, it worked, the dictionary statistics are not modified
select
num_distinct
, density
, last_analyzed
, num_buckets
, user_stats
from
user_tab_col_statistics
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- The pending statistics area contains now the new statistics including histograms
select
num_distinct
, density
, last_analyzed
from
user_col_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histgrm_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histograms
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- Wait for two seconds to make the LAST_ANALYZED column meaningful
exec dbms_lock.sleep(2)

-- Let's gather statistics only for the OBJECT_NAME column
exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR COLUMNS OBJECT_NAME SIZE 1')

-- Oops, why do my dictionary statistics reflect that change (Note the LAST_ANALYZED column)
-- Except for you set the GLOBAL preferences for PUBLISH to FALSE
select
num_distinct
, density
, last_analyzed
, num_buckets
, user_stats
from
user_tab_col_statistics
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- I do have now the statistics updated in both, pending statistics and dictionary statistics
-- Except for you set the GLOBAL preferences for PUBLISH to FALSE
select
num_distinct
, density
, last_analyzed
from
user_col_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histgrm_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histograms
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- Wait for two seconds to make the LAST_ANALYZED column meaningful
exec dbms_lock.sleep(2)

-- Let's recreate the histogram only on the OBJECT_NAME column
exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR COLUMNS OBJECT_NAME SIZE 254')

-- Oops, I did it again...
-- Except for you set the GLOBAL preferences for PUBLISH to FALSE
select
num_distinct
, density
, last_analyzed
, num_buckets
, user_stats
from
user_tab_col_statistics
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
num_distinct
, density
, last_analyzed
from
user_col_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histgrm_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histograms
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- Wait for two seconds to make the LAST_ANALYZED column meaningful
exec dbms_lock.sleep(2)

-- Let's define a manually crafted NDV and DENSITY value
-- Again I expect this to go to the pending statistics area
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
begin
dbms_stats.get_column_stats(null, 't', 'object_name', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t',
colname=>'object_name',
distcnt=>distcnt*100,
nullcnt=>nullcnt,
srec=>srec,
avgclen=>avgclen,
density=>density/100
);
end;
/

-- Nope, no change here
select
num_distinct
, density
, last_analyzed
from
user_col_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- But I just changed it in the dictionary statistics
-- Even in case of setting the GLOBAL preference to FALSE
select
num_distinct
, density
, last_analyzed
, num_buckets
, user_stats
from
user_tab_col_statistics
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

-- And what is even worse: The statistics history does not reflect all these changes to the dictionary statistics
select table_name, stats_update_time from USER_TAB_STATS_HISTORY where table_name = 'T';

exec dbms_stats.restore_table_stats(null, 'T', systimestamp)

-- But which statistics have been restored now?
-- It looks like this actually restored the PENDING statistics
-- according to the LAST_ANALYZED information??
select
num_distinct
, density
, last_analyzed
, num_buckets
, user_stats
from
user_tab_col_statistics
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histgrm_pending_stats
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

select
count(*)
from
user_tab_histograms
where
table_name = 'T'
and column_name = 'OBJECT_NAME';

exec dbms_stats.set_global_prefs('PUBLISH', 'TRUE')

And this is what I get from running this on 11.1.0.7, 11.2.0.1 or 11.2.0.2:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> alter session set nls_date_format = 'DD.MM.YYYY HH24:MI:SS';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> drop table t purge;

Table dropped.

Elapsed: 00:00:00.20
SQL>
SQL> create table t
2 as
3 select * from all_objects
4 where rownum <= 1000;

Table created.

Elapsed: 00:00:00.35
SQL>
SQL> exec dbms_stats.set_global_prefs('PUBLISH', 'TRUE')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> select dbms_stats.get_prefs('PUBLISH', null, 'T') from dual;

DBMS_STATS.GET_PREFS('PUBLISH',NULL,'T')
----------------------------------------------------------------------------------------------------------------------------------
TRUE

Elapsed: 00:00:00.00
SQL>
SQL> -- Our baseline, no histograms, basic column statistics for all columns
SQL> exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR ALL COLUMNS SIZE 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> -- Verify the result
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 , num_buckets
6 , user_stats
7 from
8 user_tab_col_statistics
9 where
10 table_name = 'T'
11 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED NUM_BUCKETS USE
------------ ---------- ------------------- ----------- ---
995 .001005025 18.01.2011 18:58:24 1 NO

Elapsed: 00:00:00.01
SQL>
SQL> -- Enable pending statistics for table T
SQL> -- You can try these different calls
SQL> --
SQL> -- The GATHER_*_STATS procedures seem to behave correctly
SQL> -- only when setting the GLOBAL PREFS to FALSE
SQL> --
SQL> -- "Correctly" means that the results are reflected in the
SQL> -- pending area only but not in the dictionary statistics
SQL> --
SQL> -- Note that the SET_*_STATS procedures seem to ignore the setting
SQL> -- always and publish directly to the dictionary
SQL> -- no matter what the PUBLISH setting is on any level (TABLE, GLOBAL)
SQL> --
SQL> -- exec dbms_stats.set_global_prefs('PUBLISH', 'FALSE')
SQL> exec dbms_stats.set_table_prefs(null, 'T', 'PUBLISH', 'FALSE')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL> -- exec dbms_stats.set_schema_prefs(user, 'PUBLISH', 'FALSE')
SQL>
SQL> -- Verify the current setting, statistics will not be published
SQL> select dbms_stats.get_prefs('PUBLISH', null, 'T') from dual;

DBMS_STATS.GET_PREFS('PUBLISH',NULL,'T')
----------------------------------------------------------------------------------------------------------------------------------
FALSE

Elapsed: 00:00:00.01
SQL>
SQL> -- Wait for two seconds to make the LAST_ANALYZED column meaningful
SQL> exec dbms_lock.sleep(2)

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.01
SQL>
SQL> -- This is supposed to go to the pending statistics area
SQL> exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR ALL COLUMNS SIZE 254')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.99
SQL>
SQL> -- Yes, it worked, the dictionary statistics are not modified
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 , num_buckets
6 , user_stats
7 from
8 user_tab_col_statistics
9 where
10 table_name = 'T'
11 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED NUM_BUCKETS USE
------------ ---------- ------------------- ----------- ---
995 .001005025 18.01.2011 18:58:24 1 NO

Elapsed: 00:00:00.01
SQL>
SQL> -- The pending statistics area contains now the new statistics including histograms
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 from
6 user_col_pending_stats
7 where
8 table_name = 'T'
9 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED
------------ ---------- -------------------
995 .00101 18.01.2011 18:58:26

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histgrm_pending_stats
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
255

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histograms
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
2

Elapsed: 00:00:00.01
SQL>
SQL> -- Wait for two seconds to make the LAST_ANALYZED column meaningful
SQL> exec dbms_lock.sleep(2)

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.01
SQL>
SQL> -- Let's gather statistics only for the OBJECT_NAME column
SQL> exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR COLUMNS OBJECT_NAME SIZE 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14
SQL>
SQL> -- Oops, why do my dictionary statistics reflect that change (Note the LAST_ANALYZED column)
SQL> -- Except for you set the GLOBAL preferences for PUBLISH to FALSE
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 , num_buckets
6 , user_stats
7 from
8 user_tab_col_statistics
9 where
10 table_name = 'T'
11 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED NUM_BUCKETS USE
------------ ---------- ------------------- ----------- ---
995 .001005025 18.01.2011 18:58:29 1 NO

Elapsed: 00:00:00.01
SQL>
SQL> -- I do have now the statistics updated in both, pending statistics and dictionary statistics
SQL> -- Except for you set the GLOBAL preferences for PUBLISH to FALSE
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 from
6 user_col_pending_stats
7 where
8 table_name = 'T'
9 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED
------------ ---------- -------------------
995 .001005025 18.01.2011 18:58:29

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histgrm_pending_stats
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
0

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histograms
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
2

Elapsed: 00:00:00.01
SQL>
SQL> -- Wait for two seconds to make the LAST_ANALYZED column meaningful
SQL> exec dbms_lock.sleep(2)

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.01
SQL>
SQL> -- Let's recreate the histogram only on the OBJECT_NAME column
SQL> exec dbms_stats.gather_table_stats(null, 'T', estimate_percent => null, cascade => false, method_opt => 'FOR COLUMNS OBJECT_NAME SIZE 254')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.34
SQL>
SQL> -- Oops, I did it again...
SQL> -- Except for you set the GLOBAL preferences for PUBLISH to FALSE
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 , num_buckets
6 , user_stats
7 from
8 user_tab_col_statistics
9 where
10 table_name = 'T'
11 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED NUM_BUCKETS USE
------------ ---------- ------------------- ----------- ---
995 .00101 18.01.2011 18:58:32 254 NO

Elapsed: 00:00:00.01
SQL>
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 from
6 user_col_pending_stats
7 where
8 table_name = 'T'
9 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED
------------ ---------- -------------------
995 .00101 18.01.2011 18:58:32

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histgrm_pending_stats
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
255

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histograms
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
255

Elapsed: 00:00:00.01
SQL>
SQL> -- Wait for two seconds to make the LAST_ANALYZED column meaningful
SQL> exec dbms_lock.sleep(2)

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.01
SQL>
SQL> -- Let's define a manually crafted NDV and DENSITY value
SQL> -- Again I expect this to go to the pending statistics area
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 begin
9 dbms_stats.get_column_stats(null, 't', 'object_name', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
10 dbms_stats.set_column_stats(
11 ownname=>null,
12 tabname=>'t',
13 colname=>'object_name',
14 distcnt=>distcnt*100,
15 nullcnt=>nullcnt,
16 srec=>srec,
17 avgclen=>avgclen,
18 density=>density/100
19 );
20 end;
21 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.17
SQL>
SQL> -- Nope, no change here
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 from
6 user_col_pending_stats
7 where
8 table_name = 'T'
9 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED
------------ ---------- -------------------
995 .00101 18.01.2011 18:58:32

Elapsed: 00:00:00.01
SQL>
SQL> -- But I just changed it in the dictionary statistics
SQL> -- Even in case of setting the GLOBAL preference to FALSE
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 , num_buckets
6 , user_stats
7 from
8 user_tab_col_statistics
9 where
10 table_name = 'T'
11 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED NUM_BUCKETS USE
------------ ---------- ------------------- ----------- ---
99500 .0000101 18.01.2011 18:58:34 254 YES

Elapsed: 00:00:00.01
SQL>
SQL> -- And what is even worse: The statistics history does not reflect all these changes to the dictionary statistics
SQL> select table_name, stats_update_time from USER_TAB_STATS_HISTORY where table_name = 'T';

TABLE_NAME STATS_UPDATE_TIME
------------------------------ ---------------------------------------------------------------------------
T 18-JAN-11 06.58.24.391000 PM +01:00

Elapsed: 00:00:00.01
SQL>
SQL> exec dbms_stats.restore_table_stats(null, 'T', systimestamp)

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.29
SQL>
SQL> -- But which statistics have been restored now?
SQL> -- It looks like this actually restored the PENDING statistics
SQL> -- according to the LAST_ANALYZED information??
SQL> select
2 num_distinct
3 , density
4 , last_analyzed
5 , num_buckets
6 , user_stats
7 from
8 user_tab_col_statistics
9 where
10 table_name = 'T'
11 and column_name = 'OBJECT_NAME';

NUM_DISTINCT DENSITY LAST_ANALYZED NUM_BUCKETS USE
------------ ---------- ------------------- ----------- ---
995 .00101 18.01.2011 18:58:32 254 NO

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histgrm_pending_stats
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
255

Elapsed: 00:00:00.01
SQL>
SQL> select
2 count(*)
3 from
4 user_tab_histograms
5 where
6 table_name = 'T'
7 and column_name = 'OBJECT_NAME';

COUNT(*)
----------
255

Elapsed: 00:00:00.01
SQL>
SQL> exec dbms_stats.set_global_prefs('PUBLISH', 'TRUE')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>

It is also interesting to note that, although Oracle has added an array of new dictionary views that allow to access the pending statistics area, these views are inconsistent with the existing statistics-related views in terms of naming conventions, columns and behaviour.

For example the *_col_pending_stats view does not have a NUM_BUCKETS column, and the corresponding *_tab_histgrm_pending_stats view for histogram details shows 0 rows if basic column statistics have been defined but no histogram whereas the original *_tab_histograms returns two rows if basic column statistics have been collected representing the low and high value of the column.

Summary

The Pending Statistics feature clearly shows some unexpected behaviour. In particular you better don't rely on it preventing the dictionary statistics from being updated by DBMS_STATS calls with the PUBLISH attribute set to FALSE.

Some of this clearly looks like a bug but I couldn't find a corresponding bug entry yet in My Oracle Support.

Note that this post does not cover the actual usage of Pending Statistics by the optimizer - I haven't done any extensive testing in this regard, but some quick checks showed that it seems to work as expected, which means that the optimizer picks statistics for those tables that have Pending Statistics defined when setting OPTIMIZER_USE_PENDING_STATISTICS = TRUE, but still uses the statistics from the dictionary for those that don't have any pending statistics defined.

randolf.geist's picture

Hash Aggregation

Oracle introduced in Oracle 10g the hash aggregation as a new feature. It can be used for both GROUP BY and UNIQUE operations (HASH GROUP BY and HASH UNIQUE respectively) and is by default the preferred aggregation method if there is no particular reason that lets the cost based optimizer prefer the sort based aggregation (SORT GROUP BY and SORT UNIQUE), for example if the GROUP BY is followed by an ORDER BY on the same expression (using the SORT GROUP BY in such cases is not always beneficial by the way, see Guy Harrison's blog for an example).

Ever since its introduction from time to time I've heard complaints about performance degradations of aggregation operations that are based on the new hash aggregation algorithm compared to the previously used sort based aggregations.

Now there may be many potential reasons for such performance degradations (and possibly many of them might not have anything to do with the hash aggregation) but here is a surprising revelation that might explain why some of them were indeed caused by the switch to the new algorithm: The hash aggregation operation does not work very well together with the automatic PGA management (WORKAREA_SIZE_POLICY = AUTO, default since 9i). The fundamental defect is that it is not able to dynamically resize to a larger workarea size when using automatic PGA management and therefore remains more or less at its initial expected size based on the estimates at optimization time.

This effectively means that the efficiency of the hash aggregation operation when using automatic PGA management is heavily dependant on the cardinality estimates at optimization time - in case of estimates in the right ballpark, the memory used at execution time will correspond to the actual requirements at runtime, but in case of bad estimates, the operation potentially uses far less memory than available and unnecessarily spills to disk.

Let's start with a simple script to demonstrate the issue:

set echo on timing on

show parameter pga
show parameter processes

-- alter session set workarea_size_policy = manual sort_area_size = 40000000;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

create table t2
as
select distinct user_id from t1;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

column low_val new_value low_value
column high_val new_value high_value

select
max(user_id) + 1 as low_val
, max(user_id) + max(user_id) - min(user_id) + 1 as high_val
from
t2;

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

-- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

-- alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_hash(@inner t1)
no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner)
*/
max(cnt)
from
(
select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
group by t1.object_id, t3.user_id
)
;

-- alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

set pagesize 14

-- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
srec2 dbms_stats.statrec;
begin
dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
srec2.epc := 2;
novals := dbms_stats.numarray(
&low_value,
&high_value
);
srec2.bkvals := null;
dbms_stats.prepare_column_values(srec2,novals);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t1',
colname=>'user_id',
distcnt=>distcnt,
nullcnt=>nullcnt,
srec=>srec2,
avgclen=>avgclen,
density=>density
);
end;
/

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

-- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

-- alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_hash(@inner t1)
no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner)
*/
max(cnt)
from
(
select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
group by t1.object_id, t3.user_id
)
;

-- alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

I first start with showing the current PGA_AGGREGATE_TARGET (P_A_T) and PROCESSES setting.

I create then two tables as sample data that will be joined and already collect information about the minimum and maximum value of the join column.

Any cursors that are dependent on table T1 will then be invalidated to make sure that a reoptimization of the next query will take place (of course running the complete script invalidates such cursors anyway due to the drop and re-create of the tables).

I then run a simple join followed by a group by operation. I've added a cartesian join by the way, but its only purpose is to ensure that the generated row source is sufficiently large to give the group by something to do.

I have used hints to ensure that I always get the same execution plan even if I later on manipulate the statistics to see the effect of incorrect cardinality estimates.

After getting the actual runtime execution plan along with execution statistics I modify the column statistics of one of the tables' join column in such a way that the optimizer thinks that there is no overlap between the join column values and therefore computes a very low join cardinality.

After making sure again that a reoptimization will take place I run the same statement again with the same data volume and the same hints in place.

And this is what I get for all major versions that are currently out there (10.2.0.4, 10.2.0.5, 11.1.0.7, 11.2.0.1 and 11.2.0.2):

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;
drop table t1 purge
*
ERROR at line 1:
ORA-00942: table or view does not exist

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;
drop table t2 purge
*
ERROR at line 1:
ORA-00942: table or view does not exist

Elapsed: 00:00:00.00
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.07
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.04
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.71
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*)
as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id,
t3.user_id )

Plan hash value: 3134842094

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.72 | 139 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.72 | 139 | | | |
| 2 | VIEW | | 1 | 570K| 810K|00:00:00.43 | 139 | | | |
| 3 | HASH GROUP BY | | 1 | 570K| 810K|00:00:00.43 | 139 | 35M| 5521K| 38M (0)|
|* 4 | HASH JOIN | | 1 | 807K| 810K|00:00:00.02 | 139 | 1348K| 1348K| 1093K (0)|
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29907 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

29 rows selected.

Elapsed: 00:00:00.12
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.03
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:01.37
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner)
*/ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id =
t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.37 | 139 | 2715 | 2715 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.37 | 139 | 2715 | 2715 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.62 | 139 | 2715 | 2715 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.62 | 139 | 2715 | 2715 | 35M| 5521K| 7010K (1)| 23552 |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 0 | 0 | 1348K| 1348K| 1167K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29907 | 30000 |00:00:00.01 | 55 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

28 rows selected.

Elapsed: 00:00:00.03
SQL>

As it can be seen with an estimate in the right ballpark the HASH GROUP BY operation completes in memory (very close to the 20% PGA_AGGREGATE_TARGET maximum size of a single workarea from 10.2 on with automatic PGA management for PGA_AGGREGATE_TARGET < 500M, for more information see Joze Senegacnik's paper on the internals of automatic PGA management).

However repeating exactly the same operation with the fudged statistics it spills to disk and uses only 7M, although it could have used up to 40M (given that there is no concurrent workload).

The same does not happen when repeating this experiment with other operations that use a workarea - the most obvious one being a SORT GROUP BY, as can be seen from this output:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 15) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
15 29

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.54
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 94tqnspjuzk8x, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) no_use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */
count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by
t1.object_id, t3.user_id )

Plan hash value: 2068941295

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.53 | 103 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.53 | 103 | | | |
| 2 | VIEW | | 1 | 315K| 450K|00:00:00.37 | 103 | | | |
| 3 | SORT GROUP BY | | 1 | 315K| 450K|00:00:00.37 | 103 | 28M| 1913K| 25M (0)|
|* 4 | HASH JOIN | | 1 | 446K| 450K|00:00:00.01 | 103 | 1348K| 1348K| 1097K (0)|
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29787 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

29 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 15,
old 14: &high_value
new 14: 29

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.56
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 94tqnspjuzk8x, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) no_use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */
count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by
t1.object_id, t3.user_id )

Plan hash value: 2068941295

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.55 | 103 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.55 | 103 | | | |
| 2 | VIEW | | 1 | 1 | 450K|00:00:00.38 | 103 | | | |
| 3 | SORT GROUP BY | | 1 | 1 | 450K|00:00:00.38 | 103 | 28M| 1913K| 25M (0)|
|* 4 | HASH JOIN | | 1 | 1 | 450K|00:00:00.01 | 103 | 1348K| 1348K| 1053K (0)|
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29787 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

29 rows selected.

Elapsed: 00:00:00.03
SQL>

All I've done is to reduce the data set, because the SORT GROUP BY in this case required more memory for the same amount of data than the HASH GROUP BY, in order to prevent the operation to spill to disk with a 200M PGA_AGGREGATE_TARGET setting and change the USE_HASH_AGGREGATION hint to NO_USE_HASH_AGGREGATION.

As you can see, the operation completes both times in memory and uses the same amount of memory no matter what the estimates look like.

I've tested the serial execution of the most common workarea based operations like HASH JOIN, SORT ORDER BY, WINDOW SORT, SORT UNIQUE and all of them were able to dynamically resize the workarea in cases where the initial estimated size was too small.

If you carefully check then you'll notice that I haven't mentioned the HASH UNIQUE operation yet, and later on you'll see why.

A cunning feature of the automatic PGA management comes to help, however, which is a kind of "feedback loop" for the workareas based on statistics maintained by the automatic PGA memory management, and indeed when the HASH GROUP BY cursor based on the incorrect cardinality estimate gets shared (not invalidated) and re-executed, the next execution will look like this:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality_repeated_execution';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.76
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id,
t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.75 | 139 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.75 | 139 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.44 | 139 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.44 | 139 | 35M| 5521K| 38M (0)| |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 1348K| 1348K| 1412K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29836 | 30000 |00:00:00.01 | 55 | | | | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

28 rows selected.

Elapsed: 00:00:00.03
SQL>

So Oracle this time based the workarea memory requirements on the feedback from the previous execution and therefore allocated sufficient memory to complete the operation without spilling to disk.

Notice however that the feedback loop unfortunately does not work as desired when the workarea execution fails due to insufficient TEMP space. Ideally the feedback loop should allow the subsequent executions to grab more memory specifically in such cases, but at present it doesn't - the failure obviously prevents an update of the statistics and therefore subsequent executions continue to fail since they still use the same amount of memory.

You can test this by simply assigned a very small TEMP tablespace to the user executing the query so that the second execution fails due to insufficient TEMP space. If you repeat the execution in this case, it will fail again and keeps doing so.

What happens if I invalidate the cursor and repeat the execution?

SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.01
SQL> set echo on timing on
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality_repeated_exec_invalid';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;
1

Elapsed: 00:00:01.29
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner)
*/ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id =
t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.28 | 139 | 2715 | 2715 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.28 | 139 | 2715 | 2715 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.61 | 139 | 2715 | 2715 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.61 | 139 | 2715 | 2715 | 35M| 5521K| 7067K (1)| 23552 |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 0 | 0 | 1348K| 1348K| 1129K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29836 | 30000 |00:00:00.01 | 55 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

28 rows selected.

Elapsed: 00:00:00.03
SQL>

Back to square one - with the invalidation the statistics are gone, too, so the bad cardinality estimate again lead to the suboptimal execution of the HASH GROUP BY.

So you might think, why bother? Only the first execution of a cursor without workarea execution statistics will be affected by the problem, subsequent execution of the same cursor will benefit from the statistics from the previous executions.

The problem however is, that this is fine for applications that share cursors. Unfortunately applications that peform heavy duty aggregations like data warehouses typically do not share cursors, since they do not care about the optimization overhead and deliberately use literals to provide as much information to the optimizer as possible.

Also these heavy duty aggregations usually use Parallel Execution features, and as you'll see from the output of the same test case, if I run the HASH GROUP BY in parallel by simply setting table T1 to parallel degree 2, a similar problem occurs - so Parallel Execution is affected as well.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> alter table t1 parallel 2;

Table altered.

Elapsed: 00:00:00.03
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.42
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id,
t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 464898991

--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | O/1/M |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.43 | 87 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.43 | 87 | | | |
| 2 | PX COORDINATOR | | 1 | | 2 |00:00:00.43 | 87 | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | 0 | 1 | 0 |00:00:00.01 | 0 | | | |
| 4 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:00.81 | 0 | | | |
| 5 | VIEW | | 2 | 575K| 810K|00:00:00.48 | 0 | | | |
| 6 | HASH GROUP BY | | 2 | 575K| 810K|00:00:00.48 | 0 | 35M| 5521K| 2/0/0|
| 7 | PX RECEIVE | | 2 | 814K| 810K|00:00:00.01 | 0 | | | |
| 8 | PX SEND HASH | :TQ10001 | 0 | 814K| 0 |00:00:00.01 | 0 | | | |
|* 9 | HASH JOIN | | 2 | 814K| 810K|00:00:00.01 | 89 | 1348K| 1348K| 2/0/0|
| 10 | BUFFER SORT | | 2 | | 1458 |00:00:00.01 | 0 | 29696 | 29696 | 2/0/0|
| 11 | PX RECEIVE | | 2 | 729 | 1458 |00:00:00.01 | 0 | | | |
| 12 | PX SEND BROADCAST | :TQ10000 | 0 | 729 | 0 |00:00:00.01 | 0 | | | |
| 13 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | |
| 14 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
| 15 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | |
| 16 | PX BLOCK ITERATOR | | 2 | 30164 | 30000 |00:00:00.01 | 89 | | | |
|* 17 | TABLE ACCESS FULL | T1 | 18 | 30164 | 30000 |00:00:00.01 | 89 | | | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T1"."USER_ID"="T2"."USER_ID")
17 - access(:Z>=:Z AND :Z<=:Z)

38 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:01.17
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1) use_nl(@inner t3)
use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner) */ max(cnt) from ( select /*+
qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 464898991

------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | O/1/M | Max-Tmp |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 2 | PX COORDINATOR | | 1 | | 2 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 4 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:02.25 | 0 | 2460 | 2460 | | | | |
| 5 | VIEW | | 2 | 1 | 810K|00:00:01.53 | 0 | 2460 | 2460 | | | | |
| 6 | HASH GROUP BY | | 2 | 1 | 810K|00:00:01.53 | 0 | 2460 | 2460 | 35M| 5521K| | 11264 |
| 7 | PX RECEIVE | | 2 | 1 | 810K|00:00:00.01 | 0 | 0 | 0 | | | | |
| 8 | PX SEND HASH | :TQ10001 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
|* 9 | HASH JOIN | | 2 | 1 | 810K|00:00:00.01 | 89 | 0 | 0 | 1348K| 1348K| 2/0/0| |
| 10 | BUFFER SORT | | 2 | | 1458 |00:00:00.01 | 0 | 0 | 0 | 29696 | 29696 | 2/0/0| |
| 11 | PX RECEIVE | | 2 | 729 | 1458 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 12 | PX SEND BROADCAST | :TQ10000 | 0 | 729 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 13 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 14 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 15 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 16 | PX BLOCK ITERATOR | | 2 | 30164 | 30000 |00:00:00.01 | 89 | 0 | 0 | | | | |
|* 17 | TABLE ACCESS FULL | T1 | 18 | 30164 | 30000 |00:00:00.01 | 89 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T1"."USER_ID"="T2"."USER_ID")
17 - access(:Z>=:Z AND :Z<=:Z)

37 rows selected.

Elapsed: 00:00:00.04
SQL>

So, if your database uses automatic PGA management and

- uses the hash aggregation HASH GROUP BY (or HASH UNIQUE, more on that in a moment)
- and does not share cursors

every execution that is based on bad cardinality estimates potentially has a problem with the hash aggregation because it might not make efficient use of the available memory.

The same applies to applications that share cursors, however in that case only the first execution after re-optimization / invalidation is affected.

So you might want to carefully check the runtime execution statistics of your critial hash aggregations.

Mind you, things could be worse, and that is where the HASH UNIQUE operation comes into the picture.

When I realised the issue with the HASH GROUP BY operation I was quite certain that the HASH UNIQUE operation will be affected in a similar way, since internally Oracle seems to use the same mechanism for both operations (In the SMM trace files both are called HASH GROUP BY).

To my surprise I noticed that in 10g versions below 10.2.0.5 and 11g versions below 11.2.0.1 (which means in this case 10.2.0.4 and 11.1.0.7 respectively, I didn't test other versions) the HASH UNIQUE operation suffers from an even more dramatic problem: The cardinality estimate is not considered and the initial workarea size is always based on minimum assumptions.

In passing, according to the SMM trace files that can be activated using the undocumented "_smm_trace" parameter it looks like that many of the sort-based workareas like SORT ORDER BY or WINDOW SORT seem to suffer from the same defect, since they however are able to dynamically resize and make use of the workarea statistics feedback they effectively work as expected, they just start with a too low workarea size estimate every time they are executed for the first time after re-optimization / invalidation.

The combination of the two issues - inability to dynamically resize and ignoring the optimizer estimates - leads to a dire result: Every first execution of a HASH UNIQUE operation in those versions will only use the minimum amount of memory. The following test case shows the problem:

set echo on timing on

show parameter pga
show parameter processes

-- alter session set workarea_size_policy = manual sort_area_size = 40000000;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

create table t2
as
select distinct user_id from t1;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

column low_val new_value low_value
column high_val new_value high_value

select
max(user_id) + 1 as low_val
, max(user_id) + max(user_id) - min(user_id) + 1 as high_val
from
t2;

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_nl(@inner t1)
use_hash_aggregation(@inner)
*/
max(user_id)
from
(
select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
)
;

alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

set pagesize 14

-- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
srec2 dbms_stats.statrec;
begin
dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
srec2.epc := 2;
novals := dbms_stats.numarray(
&low_value,
&high_value
);
srec2.bkvals := null;
dbms_stats.prepare_column_values(srec2,novals);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t1',
colname=>'user_id',
distcnt=>distcnt,
nullcnt=>nullcnt,
srec=>srec2,
avgclen=>avgclen,
density=>density
);
end;
/

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_nl(@inner t1)
use_hash_aggregation(@inner)
*/
max(user_id)
from
(
select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
)
;

alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

Obviously it is very similar to the HASH GROUP BY test case, this time however I've used a DISTINCT clause and replaced the HASH JOIN with a NESTED LOOP which makes the generated trace files easier to read, since there is exactly one workarea involved in this execution.

And this is what I get from 11.1.0.7, this time using AMM, and therefore the PGA_AGGREGAT_TARGET has been set to 0 (You'll get the same result from 10.2.0.4 with a corresponding P_A_T setting):

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.00
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 use_hash_aggregation(@inner)
8 */
9 max(user_id)
10 from
11 (
12 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
13 from t1, t2, t2 t3
14 where t1.user_id = t2.user_id
15 )
16 ;

MAX(USER_ID)
------------
26

Elapsed: 00:00:01.79
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 7cnqz02uwcs1a, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
use_hash_aggregation(@inner) */ max(user_id) from ( select /*+
qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from t1,
t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 1541846686

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.78 | 40179 | 1470 | 1470 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.78 | 40179 | 1470 | 1470 | | | | |
| 2 | VIEW | | 1 | 572K| 810K|00:00:00.97 | 40179 | 1470 | 1470 | | | | |
| 3 | HASH UNIQUE | | 1 | 572K| 810K|00:00:00.97 | 40179 | 1470 | 1470 | 25M| 3296K| 6029K (1)| 13312 |
| 4 | NESTED LOOPS | | 1 | 810K| 810K|00:00:00.01 | 40179 | 0 | 0 | | | | |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
|* 8 | TABLE ACCESS FULL | T1 | 729 | 1111 | 810K|00:00:00.03 | 40095 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

30 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 use_hash_aggregation(@inner)
8 */
9 max(user_id)
10 from
11 (
12 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
13 from t1, t2, t2 t3
14 where t1.user_id = t2.user_id
15 )
16 ;

MAX(USER_ID)
------------
26

Elapsed: 00:00:01.26
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 7cnqz02uwcs1a, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
use_hash_aggregation(@inner) */ max(user_id) from ( select /*+
qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from t1,
t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 1541846686

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.26 | 40179 | 1470 | 1470 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.26 | 40179 | 1470 | 1470 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.98 | 40179 | 1470 | 1470 | | | | |
| 3 | HASH UNIQUE | | 1 | 1 | 810K|00:00:00.98 | 40179 | 1470 | 1470 | 25M| 3296K| 6093K (1)| 13312 |
| 4 | NESTED LOOPS | | 1 | 1 | 810K|00:00:00.01 | 40179 | 0 | 0 | | | | |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
|* 8 | TABLE ACCESS FULL | T1 | 729 | 1 | 810K|00:00:00.01 | 40095 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

30 rows selected.

Elapsed: 00:00:00.03
SQL>

As you can see no matter what the estimates looks like, on this system the first execution of a HASH UNIQUE will not get more than 6M - subsequent executions of the same cursor benefit from the statistics from the previous run as seen before.

Again, switching to a SORT UNIQUE and reducing the data set accordingly, the problem can not be reproduced:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 150
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.15
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.07
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 15) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
15 29

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(user_id)
11 from
12 (
13 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 )
17 ;

MAX(USER_ID)
------------
14

Elapsed: 00:00:00.73
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID ckb2sbz3y2z14, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
no_swap_join_inputs(@inner t1)
no_use_hash_aggregation(@inner) */ max(user_id) from ( select
/*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from
t1, t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 3828303002

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.71 | 12423 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.71 | 12423 | | | |
| 2 | VIEW | | 1 | 318K| 450K|00:00:00.62 | 12423 | | | |
| 3 | SORT UNIQUE | | 1 | 318K| 450K|00:00:00.62 | 12423 | 22M| 1744K| 20M (0)|
| 4 | NESTED LOOPS | | 1 | 450K| 450K|00:00:00.01 | 12423 | | | |
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 225 | 2000 | 450K|00:00:00.01 | 12375 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

31 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 15,
old 14: &high_value
new 14: 29

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(user_id)
11 from
12 (
13 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 )
17 ;

MAX(USER_ID)
------------
14

Elapsed: 00:00:00.74
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID ckb2sbz3y2z14, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
no_swap_join_inputs(@inner t1)
no_use_hash_aggregation(@inner) */ max(user_id) from ( select
/*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from
t1, t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 3828303002

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.73 | 12423 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.73 | 12423 | | | |
| 2 | VIEW | | 1 | 1 | 450K|00:00:00.61 | 12423 | | | |
| 3 | SORT UNIQUE | | 1 | 1 | 450K|00:00:00.61 | 12423 | 22M| 1744K| 20M (0)|
| 4 | NESTED LOOPS | | 1 | 1 | 450K|00:00:00.01 | 12423 | | | |
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 225 | 1 | 450K|00:00:00.01 | 12375 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

31 rows selected.

Elapsed: 00:00:00.01
SQL>
SQL> set doc off
SQL> doc
SQL>

In 10.2.0.5 and 11.2.0.1/11.2.0.2 this issue is apparently fixed and the cardinality estimates are used for an initial workarea size calculation - however this doesn't mean that the dynamic resize problem is fixed - it simply means that the HASH UNIQUE in those versions behaves exactly as the HASH GROUP BY and the memory usage for the first execution depends on the cardinality estimates.

Summary

If your database uses automatic PGA management then for the hash aggregation HASH GROUP BY / HASH UNIQUE operations every initial execution that is based on bad cardinality estimates potentially has a problem because it might not make efficient use of the available memory.

The same applies to applications that share cursors, however in that case only the initial execution after re-optimization / invalidation is affected, subsequent executions benefit from the workarea statistics feedback mechanism.

Furthermore in 10g versions below 10.2.0.5 and 11g versions below 11.2.0.1 the initial execution of a HASH UNIQUE operation ignores the cardinality estimates and will always be based on minimum assumptions.

So you might want to carefully check the runtime execution statistics of your critical hash aggregations.

Possible Workarounds

Different strategies are available as workarounds, depending on the situation:

- Upgrading to versions where the HASH UNIQUE operation at least considers the optimizer estimates might be beneficial

- Obviously good cardinality estimates are crucial, in that case and when using the correct Oracle versions you should be fine. Using the undocumented OPT_ESTIMATE hint (or the deprecated undocumented CARDINALITY hint) might help in cases where other options like manually crafted statistics are not able to help the optimizer to come up with reasonable cardinality estimates.

- Applications that are able to share cursors might not be too much affected due to the ability to use the workarea executions for subsequent executions of the same cursor

- The described problems disappear when switching to manual workarea size policy and allowing for sufficient memory of workarea. Interestingly the less obvious SORT_AREA_SIZE is used for the manual control of the HASH GROUP BY operation, and not the HASH_AREA_SIZE. Of course using manual PGA memory management system-wide is highly unrecommended, so this might only be a workaround in certain cases of large batch jobs where manual workarea sizes are used anyway. Also note, when switching to manual workareas be aware of a nasty bug that was introduced in the 10.2.0.3 patchset. For more information see MOS note "6053134.8: ALTER SESSION to set SORT_AREA_SIZE no honoured" and for example mine and Jonathen Lewis' post about the issue. According to the MOS note the bug has been fixed in 10.2.0.4.3, 10.2.0.5 and 11.1.0.7 / 11.2

- Obviously the hash aggregation can be avoided on statement level using the NO_USE_HASH_AGGREGATION hint or on session / system level using the _GBY_HASH_AGGREGATION_ENABLED parameter. Since the sort based aggregation can be less efficient (but note that it doesn't have to be, it depends on the individual grouping and data pattern) again this doesn't necessarily solve the problem since aggregate workareas might spill to disk which wouldn't be necessary when hash aggregation was used and worked as expected.

Final Note

Of course there are a lot of things that I haven't touched yet or only briefly, like Parallel Execution of the hash aggregations, workareas used for index sorts, and all the other details when the workarea spills to disk.

If you want to get an idea what kind of nasty things can happen in that case you might want to read Jonathan Lewis' post on analytic functions.

I also haven't tried yet to fiddle around with the undocumented _smm* related parameters if setting them to non-default values allows to work around the issue.

And of course then there are bugs like this one: Bug 6817844 - Multi pass sort with auto memory management even with plenty of PGA [ID 6817844.8], which is also mentioned in Jonathan's post.

As a final note, the odd TEMP tablespace I/O pattern issue that Jonathan describes in his post (Bug 9041800) is marked as fixed in the 11.2.0.2 patch set, but I haven't tested this yet.

Chris Antognini's picture

Workload System Statistics Bug in 11.2

Since the introduction of 11.2 I receive on a regular basis questions related to “strange” workload system statistics. The last email on that topic was sent to me yesterday. So, instead to send, again, a private reply, I decided to write this short post.

What’s wrong with 11.2 and workload system statistics?

Let’s have a look to the output of the following query:

SQL> SELECT pname, pval1
  2  FROM sys.aux_stats$
  3  WHERE sname = 'SYSSTATS_MAIN';

PNAME                  PVAL1
--------------- ------------
CPUSPEEDNW            1596.0
IOSEEKTIM                4.0
IOTFRSPEED            4096.0
SREADTIM             10900.3
MREADTIM              4525.8
CPUSPEED              1603.0
MBRC                     7.0
MAXTHR            17391616.0
SLAVETHR            413696.0

As you can see the SREADTIM and MREADTIM times are very high. In this case about three orders of magnitude of what you would expect from a regular system.

I’m not aware of the exact cause of this problem, but to me it seems that the statistics externalized in x$kcfio are broken. Anyway, in MOS there are several bugs related to it (9842771 and 9701256). Hence, it’s not a feature (e.g. a change in the unit of measure), it’s a bug. On my Linux test system I’m able to reproduce it on both 11.2.0.1 and 11.2.0.2. According to the bugs mentioned before, the problem is not limited to Linux.

Since with the DBMS_STATS package we are not able to gather correct statistics, the only advice I can give on that topic is that you have to manually set them to sensible values.

Update 2011-03-23

To fix the problem you can install the patch 9842771. It is available for 11.2.0.1 and 11.2.0.2. By the way, since the patch only provides a new version of the dbms_stats_internal package, the statistics externalized in x$kcfio are not broken… they just have another unit of measure.

Chris Antognini's picture

DBMS_AUTO_SQLTUNE: ORA-01748 and Documentation Bugs

As of 11.2.0.2 a new package, DBMS_AUTO_SQLTUNE, is available for accessing and configuring Automatic SQL Tuning. The package provides three features:

Execution of the Automatic SQL Tuning task (EXECUTE_AUTO_TUNING_TASK)
Generation of a report showing the output generated by the Automatic SQL Tuning tasks (REPORT_AUTO_TUNING_TASK)
Configuration of the Automatic SQL Tuning parameters (SET_AUTO_TUNING_TASK_PARAMETER)

In this post I would like to [...]

_connect_by_use_union_all

This is just a short note on the parameter introduced in the 11gR2 called _connect_by_use_union_all. I’ve noticed it for the first time in Doc ID 7210630.8, which gives a brief overview of the changes made to the way CBO generates plans for hierarchical queries. As usually happens, the change helps to one problem, but produces [...]

Chris Antognini's picture

Exadata Storage Server and the Query Optimizer – Part 4

When I started writing the series of posts about Exadata Storage Server and the query optimizer, I didn’t expect to write more than three posts (part 1, part 2, part 3). Of course, things change. Hence, here is part 4 to cover a couple of things that I learned in the next couple of months.
In [...]

randolf.geist's picture

Time Model Bug

Tasks that are performed via jobs in the database will be double accounted in the system time model that has been introduced with Oracle 10g.

So if you execute significant workload via DBMS_JOB or DBMS_SCHEDULER any system time model related statistic like DB Time, DB CPU etc. that gets recorded for that workload gets double accounted.

This bug is not particularly relevant since your top workloads will still be the same top workloads, because all other statistics (like Elapsed Time, CPU, Buffer Gets etc.) are not affected by the bug.

I mention it only here since the bug (see below for details) as of the time of writing can't yet be found on My Oracle Support in the bug database but I recently came across several AWR reports where the majority of workload was generated via job processes and therefore the time model statistics were effectively doubled.

It might help as a viable explanation if you sometimes wonder why an AWR or Statspack report only captures 50% or less of the recorded total DB Time or DB CPU and where this unaccounted time has gone. If a significant part of the workload during the reporting period has been performed by sessions controlled via DBMS_JOB or DBMS_SCHEDULER then probably most of the unaccounted time is actually not unaccounted but the time model statistics are wrong.

So if you have such an otherwise unexplainable unaccounted DB Time / DB CPU etc. you might want to check if significant workload during the reporting period was executed via the job system. Note that I don't say that this is the only possible explanation of such unaccounted time - there might be other reasons like uninstrumented waits, other bugs etc.

Of course all the percentages that are shown in the AWR / ADDM / Statspack reports that refer to "Percentage of DB Time" or "Percentage of DB CPU" will be too small in such cases.

If the majority of workload during the reporting period has been generated by jobs then you can safely assume that the time model statistics have to be divided by 2 (and the percentages have to be doubled). If you have a mixture of jobs and regular foreground sessions then it will be harder to derive the correct time model statistics.

Note that the "Active Session History" (ASH) is not affected by the bug - the ASH reports always were consistent in my tests regarding the DB Time (respectively the number of samples) and CPU time information.

The following simple test case can be used to reproduce the issue at will. Ideally you should have exclusive access to the test system since any other concurrent activity will affect the test results.

You might want to check the 1000000000 iterations of the simple PL/SQL loop on your particular CPU - on my test system this takes approx. 46 seconds to complete.

The first version assumes that a PERFSTAT user with an installed STATSPACK is present in the database since STATSPACK doesn't require an additional license. An AWR variant follows below.

alter session set nls_language = american nls_territory = america;

store set .settings replace

set echo on timing on define on

define iter="1000000000"

variable snap1 number

exec :snap1 := statspack.snap

declare
n_cnt binary_integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
end;
/

variable snap2 number

exec :snap2 := statspack.snap

/* Uncomment this if you want to test via DBMS_JOB
variable job_id number

begin
dbms_job.submit(:job_id, '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
');
end;
/

commit;
*/

/* Uncomment this if you want to test via DBMS_SCHEDULER */
begin
dbms_scheduler.create_job(
job_name => dbms_scheduler.generate_job_name
, job_type => 'PLSQL_BLOCK'
, job_action => '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
' , enabled => true);
end;
/

declare
pipe_status integer;
begin
pipe_status := dbms_pipe.receive_message('bg_job_complete');
end;
/

declare
pipe_id integer;
begin
pipe_id := dbms_pipe.remove_pipe('bg_job_complete');
end;
/

variable snap3 number

exec :snap3 := statspack.snap

rem set heading off pagesize 0 feedback off linesize 500 trimspool on termout off echo off verify off

prompt Enter PERFSTAT password

connect perfstat

column dbid new_value dbid noprint

select dbid from v$database;

column instance_number new_value inst_num noprint

select instance_number from v$instance;

column b_id new_value begin_snap noprint
column e_id new_value end_snap noprint

select :snap1 as b_id, :snap2 as e_id from dual;
define report_name=sp_foreground.txt

@?/rdbms/admin/sprepins

column dbid new_value dbid noprint

select dbid from v$database;

column instance_number new_value inst_num noprint

select instance_number from v$instance;

column b_id new_value begin_snap noprint
column e_id new_value end_snap noprint

select :snap2 as b_id, :snap3 as e_id from dual;
define report_name=sp_background.txt

@?/rdbms/admin/sprepins

undefine iter

@.settings

set termout on

Here is the same test case but with AWR reports (requires additional diagnostic license)

alter session set nls_language = american nls_territory = america;

store set .settings replace

set echo on timing on define on

define iter="1000000000"

column snap1 new_value awr_snap1 noprint

select dbms_workload_repository.create_snapshot as snap1 from dual;

declare
n_cnt binary_integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
end;
/

column snap2 new_value awr_snap2 noprint

select dbms_workload_repository.create_snapshot as snap2 from dual;

/* Uncomment this if you want to test via DBMS_JOB
variable job_id number

begin
dbms_job.submit(:job_id, '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
');
end;
/

commit;
*/

/* Uncomment this if you want to test via DBMS_SCHEDULER */
begin
dbms_scheduler.create_job(
job_name => dbms_scheduler.generate_job_name
, job_type => 'PLSQL_BLOCK'
, job_action => '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
' , enabled => true);
end;
/

declare
pipe_status integer;
begin
pipe_status := dbms_pipe.receive_message('bg_job_complete');
end;
/

declare
pipe_id integer;
begin
pipe_id := dbms_pipe.remove_pipe('bg_job_complete');
end;
/

column snap3 new_value awr_snap3 noprint

select dbms_workload_repository.create_snapshot as snap3 from dual;

set heading off pagesize 0 feedback off linesize 500 trimspool on termout off echo off verify off

spool awr_foreground.html

select
output
from
table(
sys.dbms_workload_repository.awr_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap1
, &awr_snap2
)
);

spool off

spool awr_background.html

select
output
from
table(
sys.dbms_workload_repository.awr_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap2
, &awr_snap3
)
);

spool off

spool awr_diff.html

select
output
from
table(
sys.dbms_workload_repository.awr_diff_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap1
, &awr_snap2
, (select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap2
, &awr_snap3
)
);

spool off

undefine awr_snap1
undefine awr_snap2
undefine awr_snap3

undefine iter

column snap1 clear
column snap2 clear
column snap3 clear

@.settings

set termout on

And here is a sample snippet from a generated Statspack report on a single CPU system with nothing else running on the system:

Normal foreground execution:

STATSPACK report for

Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
orcl112 1 05-Aug-10 08:21 11.2.0.1.0 NO

Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
XXXX Microsoft Windows IA ( 1 0 0 2.0

Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 13 05-Aug-10 08:34:17 25 1.2
End Snap: 14 05-Aug-10 08:35:05 25 1.2
Elapsed: 0.80 (mins) Av Act Sess: 1.1
DB time: 0.87 (mins) DB CPU: 0.80 (mins)

Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 104M Std Block Size: 8K
Shared Pool: 128M Log Buffer: 6,076K

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 1.1 2.4 0.09 3.99
DB CPU(s): 1.0 2.2 0.08 3.68

Execution via Job/Scheduler:

STATSPACK report for

Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
orcl112 1 05-Aug-10 08:21 11.2.0.1.0 NO

Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
XXXX Microsoft Windows IA ( 1 0 0 2.0

Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 14 05-Aug-10 08:35:05 25 1.2
End Snap: 15 05-Aug-10 08:35:53 24 1.3
Elapsed: 0.80 (mins) Av Act Sess: 1.9
DB time: 1.55 (mins) DB CPU: 1.54 (mins)

Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 104M Std Block Size: 8K
Shared Pool: 128M Log Buffer: 6,076K

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 1.9 92.8 0.79 7.74
DB CPU(s): 1.9 92.1 0.78 7.68

As you might have guessed my single CPU test system has not been added a second CPU when performing the same task via DBMS_SCHEDULER / DBMS_JOB yet the time model reports (almost) 2 DB Time / DB CPU seconds and active sessions per second in that case.

I have reproduced the bug on versions 10.2.0.4, 11.1.0.7 and 11.2.0.1 but very likely all versions supporting the time model are affected.

A (non-public) bug "9882245 - DOUBLE ACCOUNTING OF SYS MODEL TIMINGS FOR WORKLOAD RUN THROUGH JOBS" has been filed for it, but the fix is not available yet therefore as far as I know it is not yet part of any available patch set / PSU.

Note that there seems to a different issue with the DB CPU time model component: If you have a system that reports more CPUs than sockets (for example a Power5, Power6 or Power7 based IBM server that reports 16 sockets / 32 CPUs) then the DB CPU component gets reduced by approximately 50%, which means it is divided by 2.

This means in combination with above bug that you end up with a doubled DB Time component for tasks executed via jobs, but the DB CPU time model component is in the right ballpark since the doubled DB CPU time gets divided by 2.

I don't know if the bug fix also covers this issue, so you might want to keep this in mind when checking any time model based information.

Chris Antognini's picture

Optimizer Mode Mismatch Does Not Prevent Sharing of Child Cursor!?!?

The aim of this post is to describe a strange (buggy) situation that I observed recently. But before doing that, I shortly summarize what a parent cursor and a child cursor are as well as when they can be shared. By the way, I borrowed this description from the pages 20/21 of my book. Hence, [...]

Chris Antognini's picture

How Good Are the Values Returned by DBMS_COMPRESSION.GET_COMPRESSION_RATIO?

According to the documentation the GET_COMPRESSION_RATIO procedure of the DBMS_COMPRESSION package can be used to assess the impact of different compression options for a given table. In other words, it allows us to find out the expected compression ratio for a given set of data without having to really create a compressed table. The question [...]

Missing wait event

Oracle 11.2.0.1 for Windows is out for some time. I’ve installed it on my 32-bit Windows XP machine because I like Windows – just to check that it’s actually working fine. Playing with TPC-H test using Hammerora I’ve noticed an anomaly in the way Oracle reports IO waits for some queries. Here is a test [...]

Syndicate content