## Statistics poll

Posted April 16, 2012

Thanks to everyone for spreading the link and participating in the poll on statistics. As was pointed out in the comments, the poll isn’t very specific, I know. My point was to get an idea of how many people use the defaults (I considered Oracle version 11g) for stats collection, how many do a very simple and reasonable tweak to get more stable execution plans, and how many prefer custom statistics management. Well, I think the results will not change much, so here they are (326 votes at the time of writing):

## Extended Stats

Posted April 12, 2012

I’m very keen on the *11g extended stats* feature, but I’ve just discovered a critical weakness in one of the implementation details that could lead to some surprising instability in execution plans. It’s a combination of “column group” statistics and “out of range” predicates. Let’s start with some sample data. (Note: I was running this test on 11.2.0.3.)

```sql
create table t1
as
with generator as (
        select  --+ materialize
                rownum  id
        from    dual
        connect by level <= 10000
)
select
        mod(rownum,100)  col1,
        mod(rownum,10)   col2
from
        generator  v1,
        generator  v2
where
        rownum <= 50000
;

begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'T1',
                method_opt => 'for all columns size 1'
        );
end;
/
```
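The excerpt stops before the column-group step, but for context, this is a sketch of how a column group on (col1, col2) would normally be created and gathered; the extension name returned by the call is system-generated, so it will differ from database to database:

```sql
-- Create a "column group" extended statistic on (COL1, COL2).
-- The function returns the name of the hidden virtual column
-- (something like SYS_STU...) that Oracle creates to hold it.
select
        dbms_stats.create_extended_stats(
                ownname   => user,
                tabname   => 'T1',
                extension => '(COL1,COL2)'
        ) extension_name
from    dual;

-- Re-gather so the column group gets its own statistics.
begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'T1',
                method_opt => 'for all columns size 1'
        );
end;
/
```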

## METHOD_OPT

Posted April 9, 2012

Here is a poll prompted by today’s Twitter talks on the default METHOD_OPT value (which is ‘for all columns size auto’ since 10g) and the automatic statistics gathering job (an auto-task since 11g). The CBO development team suggests using the default job to gather statistics with default options. Do you use the job exactly this way with real production databases? I’m very interested to see some numbers. Please share the poll link http://poll.fm/3n7pn so that more people vote. Thanks.
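For reference, a sketch of how to check the default – and, if you prefer the “size 1” tweak, change it – assuming 11g, where the `dbms_stats` preference calls replace the older `get_param`/`set_param`:

```sql
-- What is the current default METHOD_OPT?
-- An out-of-the-box 11g install reports FOR ALL COLUMNS SIZE AUTO.
select dbms_stats.get_prefs('METHOD_OPT') method_opt from dual;

-- The common "stability" tweak: no automatic histograms, applied
-- as the new global default (the auto job picks it up too).
begin
        dbms_stats.set_global_prefs(
                pname  => 'METHOD_OPT',
                pvalue => 'FOR ALL COLUMNS SIZE 1'
        );
end;
/
```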

## Incremental Partition Statistics Review

Posted January 15, 2012

**Introduction**

Here is a summary of my findings from evaluating Incremental Partition Statistics, which were introduced in Oracle 11g.

The most important point to understand is that Incremental Partition Statistics are not "cost-free", so anyone who tells you that you can gather statistics on the lowest level (partition, or sub-partition in the case of composite partitioning) without any noticeable overhead in comparison to non-incremental statistics (on the lowest level) is not telling you the truth.

Although this might be obvious, I've already personally heard someone making such claims, so it's probably worth mentioning.

In principle you need to test on your individual system whether the overhead added to each statistics update on the lowest level outweighs the overhead of actually gathering statistics on the higher levels – in particular, of course, on the global level.
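As a sketch only – with SALES standing in for a hypothetical partitioned table – the incremental mechanism is switched on per table through a `dbms_stats` preference, after which partition-level gathers maintain synopses that let Oracle derive the global stats:

```sql
begin
        -- Enable incremental statistics for the table.
        dbms_stats.set_table_prefs(
                ownname => user,
                tabname => 'SALES',          -- hypothetical table
                pname   => 'INCREMENTAL',
                pvalue  => 'TRUE'
        );

        -- With INCREMENTAL = TRUE a gather like this maintains
        -- per-partition synopses and derives the global stats
        -- from them instead of rescanning the whole table.
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'SALES',
                granularity => 'AUTO'
        );
end;
/
```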

## NewDensity

Posted January 3, 2012

A recent comment on a note I wrote some time ago about *faking histograms* asked about the calculations of **selectivity** in the latest versions of Oracle. As I read the question, I realised that I had originally supplied a formula for calculating **cardinality**, rather than **selectivity**, so I thought I’d supply a proper example.

We’ll start with a script to create some data and stats – and I’m going to start with a script I wrote in Jan 2001 (which is why it happens to use the *analyze* command rather than **dbms_stats.gather_table_stats**, even though this example comes from an instance of 11.2.0.2).

## I Wish

Posted December 16, 2011

A couple of days ago I wrote about some things *I’d like to see in index statistics*, which means changing **dbms_stats.gather_index_stats()**. Here’s an idea for **dbms_stats.gather_table_stats()**. I’d like to see the number of chained and migrated rows counted separately in **dba_tables** when we run the packaged procedure. The **optimizer will use information about chained or migrated rows**, but the information is only collected if you use the **analyze** command (and even then the two figures are summed into the single value **chain_cnt**).

## I wish

Posted December 13, 2011

Here are a few thoughts on *dbms_stats* – in particular the procedure **gather_index_stats**.

The procedure counts the number of *used leaf blocks* and the number of distinct keys using a **count distinct** operation, which means you get an expensive aggregation operation when you gather stats on a large index. It would be a nice efficiency feature if Oracle changed the code to use the new **Approximate NDV** mechanism for these counts.

## System Stats

Posted September 12, 2011

A quick collation – and warning – for 11.2:

- MOS (Metalink): **Bug 9842771** – Wrong SREADTIM and MREADTIM statistics in AUX_STATS$ [ID 9842771.8] (needs MOS account)
- Comment from **Sokrates** in a **Charles Hooper blog**
- Blog item by **Christian Antognini**
- Blog item by **Randolf Geist**

Bottom line – be careful about what you do with system stats on 11.2.

**Footnote:** the MOS link is a search string producing a list of references. I set it up like that because one of the articles referencing the bug is called *“Things to consider before upgrade to 11.2.0.2″* and it’s worth reading.

**Addendum:** one of the people on the two-day course I’ve just run in Berlin sent me a link to a quick note on how to set your own values for the system stats if you hit this bug. It’s actually quite a reasonable thing to do whether or not you hit the bug, given the way that gathering the stats can produce unsuitable figures anyway: **setting system stats**. (I’ve also added their company blog to the links on the right; they have a number of interesting items and post fairly regularly.)

## blevel=1

Posted August 9, 2011

Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.

*Another problem with high insert/delete rates appears with very small indexes. *

*If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance. *

*The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.*
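A sketch of that strategy – the index name T1_IDX is hypothetical, and `set_index_stats` with its `indlevel` parameter is the documented way to overwrite the blevel:

```sql
begin
        -- Stop the automatic job from touching this table's stats.
        dbms_stats.lock_table_stats(user, 'T1');
end;
/

-- Your own scheduled stats job: force => true bypasses the lock.
begin
        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T1',
                cascade => true,
                force   => true
        );

        -- Overwrite the index blevel with 1, even if the gather
        -- just saw the index cross the boundary to 2.
        dbms_stats.set_index_stats(
                ownname  => user,
                indname  => 'T1_IDX',
                indlevel => 1,
                force    => true
        );
end;
/
```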

**Footnote:** the reason why a change from 1 to 2 is dramatic is because Oracle ignores the *blevel* in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. The cost of a nested loop access is *“cost of single access multiplied by number of times you do it”* – so the sudden appearance of a 2 in the formula gives an increment in cost of *“2 * number of times you visit the table”* if your small table is the second table in a nested loop – and suddenly a nested loop becomes much more expensive without a real change in the data size.

**Footnote 2**: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a *blevel* that is usually 1, it probably won’t take more than a fraction of a second to rebuild the index, and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the *blevel* is likely to be 2.

**Footnote 3**: For an example of the type of code that will adjust the blevel of an index see * this URL*. (Note, the example talks about copying stats from one place to another – but the principle is the same.)

## Virtual bug

Posted June 30, 2011

I’ve said in the past that one of the best new features, in my view, in 11g was the appearance of *proper virtual columns*; and I’ve also been very keen on the new **“approximate NDV”** that makes it viable to collect stats with the “auto_sample_size”.

Who’d have guessed that if you put them both together, then ran a parallel stats collection, **it would break**?

The bug number Karen quotes (10013177.8) doesn’t (appear to) mention extended stats – but since virtual columns, function-based indexes, and extended stats share a number of implementation details I’d guess that they might be affected as well.