statistics

Jonathan Lewis's picture

Stats time

I don’t really remember how long it’s been since Oracle created an automatic log of how long a call to the dbms_stats package took, though it was probably some time in the 10g time-line. It wasn’t until it had been around for several years, though before I wrote little script (possibly prompted by a comment from Martin Widlake) that I’ve used occasionally since to see what’s been going on in the past, how variable stats collection times have been, and what unexpected dbms_stats call an application may have been making. Here’s what it currently looks like:

Jonathan Lewis's picture

Index Sanity

By popular demand (well, one person emailed me to ask for it) I’m going to publish the source code for a little demo I’ve been giving since the beginning of the millennium – it concerns indexes and the potential side effects that you can get when you drop an index that you’re “not using”. I think I’ve mentioned the effect several times in the history of this blog, but I can’t find an explicit piece of demo code, so here it is – starting at the conclusion – as a cut and paste from an SQL*Plus session running against an 11g instance:

Jonathan Lewis's picture

Stats History

From time to time we see a complaint on OTN about the stats history tables being the largest objects in the SYSAUX tablespace and growing very quickly, with requests about how to work around the (perceived) threat. The quick answer is – if you need to save space then stop holding on to the history for so long, and then clean up the mess left by the history that you have captured; on top of that you could stop gathering so many histograms because you probably don’t need them, they often introduce instability to your execution plans, and they are often the largest single component of the history (unless you are using incremental stats on partitioned objects***)

Jonathan Lewis's picture

Column Groups

Patrick Jolliffe alerted the Oracle-L list to a problem that appears when you combine fixed length character columns (i.e. char() or nchar())  with column group statistics. The underlying cause of the problem is the “blank padding” semantics that Oracle uses by default to compare varchar2 with char, so I’ll start with a little demo of that. First some sample data:

Jonathan Lewis's picture

Quiz Night

I was setting up a few tests on a copy of 12.1.0.2 recently when I made a mistake creating the table – I forgot to put in a couple of CAST() calls in the select list, so I just patched things up with a couple of “modify column” commands. Since I was planning to smash the table in all sorts of ways and it had taken me several minutes to create the data set (10 million rows) I decided to create a clean copy of the data so that I could just drop the original table and copy back the clean version – and after I’d done this I noticed something a little odd.

Here’s the code (cut down to just 10,000 rows), with a little output:

Jonathan Lewis's picture

Column Groups

I think the “column group” variant of extended stats is a wonderful addition to the Oracle code base, but there’s a very important detail about using the feature that I hadn’t really noticed until a question came up on the OTN database forum recently about a very bad join cardinality estimate.

The point is this: if you have a multi-column equality join and the optimizer needs some help to get a better estimate of join cardinality then column group statistics may help if you create matching stats at both ends of the join. There is a variation on this directive that helps to explain why I hadn’t noticed it before – multi-column indexes (with exactly the correct columns) have the same effect and, most significantly, the combination of  one column group and a matching multi-column index will do the trick.

Jonathan Lewis's picture

Column Groups

I think column groups can be amazingly useful in helping the optimizer to generate good execution plans because of the way they supply better details about cardinality; unfortunately we’ve already seen a few cases (don’t forget to check the updates and comments) where the feature is disabled, and another example of this appeared on OTN very recently.

Modifying the example from OTN to make a more convincing demonstration of the issue, here’s some SQL to prepare a demonstration:

Jonathan Lewis's picture

Clustering_factor

I had a recent conversation at Oracle OpenWorld 2015 about a locking anomaly in a 3-node RAC system which was causing unexpected deadlocks. Coincidentally, this conversation came about shortly after I had been listening to Martin Widlake talking about using the procedure dbms_stats.set_table_prefs() to adjust the way that Oracle calculates the clustering_factor for indexes. The juxtaposition of these two topics made me realise that the advice I had given in “Cost Based Oracle – Fundamentals” 10 years ago was (probably) incomplete, and needed some verification. The sticking point was RAC.

Jonathan Lewis's picture

Histogram Limit

A surprising question came up on OTN a couple of days ago:

Why does a query for “column = 999999999999999999” run slower than a query for “column > 999999999999999998” (that’s 18 digit numbers, if you don’t want to count them). With the equality predicate the query is very slow, with the range-based predicate perfomance is good.

Jonathan Lewis's picture

Estimate_percent

Here’s a live one from OTN – here are a couple of extracts from the problem statement:

We’re experiencing an issue where it seems that the query plan changes from day to day for a particular procedure that runs once a night.
It’s resulting in a performance variance of 10 second completion time vs 20 minutes (nothing in between).
It started occurring about 2 months ago and now it’s becoming more prevalent where the bad query plan is coming up more often.
I noticed that the query plans vary for a simple query.
We do run gather statistics every night. (DBMS_STATS.GATHER_SCHEMA_STATS (ownname=>sys_context( ‘userenv’, ‘current_schema’ ), estimate_percent => 1);)

The query and two execution plans look like this:

To prevent automated spam submissions leave this field empty.
Syndicate content