Indexing

Jonathan Lewis's picture

Bitmap join indexes

Here’s another of my “draft” notes that needs some exapansion and, most importantly, proof.

I have a fact table with a “status id” column that shows a massive skew. Unfortunately I have a dimension table that holds the “status code”, so (in theory, at least) I have to do a join from the statuses table to the fact table to find rows of a given status. Unfortunately the join hides the skew:

select  f.* 
from    facts f, statuses s
where   s.code = 'C'
and     f.status_id = s.status_id
;

The optimizer knows that the status_id column on the facts table has a highly skewed distribution and will create a histogram on it, but it can’t know which status code corresponds to which status_id, so the histogram doesn’t help in calculating the join cardinality.

Will a bitmap join index help ? Answer – NO.

Jonathan Lewis's picture

32K Columns

Oracle 12c has increased the maximum length of character-based columns to 32K bytes – don’t get too excited, they’re stored out of lines (so similar in cost to LOBs) and need some modification to the parameter file and data dictionary (starting the database in upgrade mode) before you can use them.

Richard Foote has a pair of articles on indexing such columns:

Jonathan Lewis's picture

Reverse Key

Here’s one of those little details which I would have said just couldn’t be true – except it’s in the manuals, and the manuals happen to be right.

Jonathan Lewis's picture

12c In-memory

I wrote a note about the 12c “In-Memory” option some time ago on the OTN Database forum and thought I’d posted a link to it from the blog. If I have I can’t find it now so, to avoid losing it, here’s a copy of the comments I made:

Juan Loaiza’s presentation is probably available on the Oracle site by now, but in outline: the in-memory component duplicates data (specified tables – perhaps with a restriction to a subset of columns) in columnar format in a dedicated area of the SGA. The data is kept up to date in real time, but Oracle doesn’t use undo or redo to maintain this copy of the data because it’s never persisted to disc in this form, it’s recreated in-memory (by a background process) if the instance restarts. The optimizer can then decide whether it would be faster to use a columnar or row-based approach to address a query.

Jonathan Lewis's picture

FBI decode

It probably won’t surprise many people to hear me say that the decode() function can be a bit of a nuisance; and I’ll bet that quite a lot of people have had trouble occasionally trying to get function-based indexes that use this function to behave properly. So (to put it all together and support the general directives that case is probably a better choice than decode() and that the cast() operator is an important thing to learn) here’s an example of how function-based indexes don’t always allow you to work around bad design/code. (Note: this is a model of a problem I picked up at a client site, stripped to a minimum – you have to pretend that I’m not allowed to fix the problem by changing code).

First we create some data and indexes, and gather all relevant stats:

Jonathan Lewis's picture

Virtual Integrity

A recent question on the Oracle-L list server described a problem with data coming in from SQL Server and an oddity with referential integrity failing on Oracle because (for example) a child row was in lower case while the parent was in upper.

This raised a few comments on how you might handle referential integrity while allowed case to differ. No doubt it’s been done before – by Tom Kyte if no-one else – but the first thought that crossed my mind was to use virtual columns:

Jonathan Lewis's picture

Bitmap / Btree

Here’s a little note that came about after I tweeted an idle thought on Twitter yesterday

  • 12c allows you to have multiple indexes on the same columns on a table, although only one of them is allowed to be visible at any one time – you can do the same with any recent versions of Oracle “almost”, and without the invisibility requirements. (Thanks to Jason Bucata for suggesting the critical detail on this one.)
  • 12c allows you to have “partial” indexing on partitioned tables -  you can do the same with earlier versions of Oracle “almost” but only if the indexes are local indexes or globally partitioned.
  • 12c doesn’t officially allow you to create an index that is a bitmap in the past and a btree in the present (yet) – although you can almost do this in any recent versions of Oracle.

How to use (almost) the same column definition for two indexes on the same table:

Jonathan Lewis's picture

Clustering_factor

The clustering_factor is one of the most important numbers (if not the most important number) affecting the optimizer’s choice of execution plan – it’s the thing that has the most significant effect on the optimizer’s decision on whether to choose a table scan or an index, and on which index to choose.

Jonathan Lewis's picture

Wrong Index

One of the sad things about trying to keep on top of Oracle is that there are so many little things that could go wrong and take a long time to identify. In part this is why I try to accumulate test cases for all the oddities and anomalies I come across as I travel around the world – if I’ve spent the time recreating a problem I’ll probably remember it the next time I see the symptoms.

Here’s a little threat that comes into play when a couple of events occur simultaneously, in this case: automatically selected indexes being rebuilt combined with an unfortunate choice of index definitions. Here’s a demonstration (running 11.2.0.3, 1MB uniform extents, 8KB block size, freelist management – first the symptoms, script, followed by results:

Jonathan Lewis's picture

Index rebuild 12c

Just one of those little snippets about 12c that might help someone.

Further to an earlier post, online rebuild works in 12c even when the key is “too long”. The internal code has changed completely, and there is no sign of the problematic journal table that caused the problem in earlier versions.

To prevent automated spam submissions leave this field empty.
Syndicate content