SQL

tanelpoder's picture

COUNT STOPKEY operation (the where ROWNUM

I was running tests on some very large tables on an Exadata cluster and found an interesting bug.

Instead of having to query 4Billion row tables, I have reproduced this case with a cartesian join…

Check this. I’m generating 8 Billion rows using a 3-way cartesian join of set of 2000 rows. So, this results in 2000 * 2000 * 2000 rows, which is 8 billion rows.

SQL>  with sq as (select null from dual connect by level <= 2000)
      select count(*)
      from sq a, sq b, sq c;

COUNT(*)
----------
8000000000

Everything worked well as you see. All 8 billion rows were nicely counted. Let’s modify this query a bit, by adding a WHERE rownum <= 8 000 000 000 predicate, which shouldn’t modify the outcome of my query as 8 billion rows is exactly what I have:

SQL> with sq as (select null from dual connect by level <= 2000)
     select count(*)
     from sq a, sq b, sq c
     where rownum <= 8000000000;

COUNT(*)
----------
4294967295

Ding! (as I’m unable to come up with a better word to present this problem :)

Seems like we have a problem! The COUNT operation thinks that we have only a bit over 4 billion rows returned from the query plan branches below it.

We clearly have a(n easily reproducible ) bug here!

What happens here is that whenever you use ROWNUM < N operator in a query predicate like I have done, you will introduce an additional step to the query plan (COUNT STOPKEY).

Check this simple query:

SELECT * FROM t WHERE ROWNUM <= 1000

Plan hash value: 508354683

--------------------------------------------
| Id  | Operation          | Name | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT   |      |        |
|*  1 |  COUNT STOPKEY     |      |        |
|   2 |   TABLE ACCESS FULL| T    |     41 |
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=1000)

As the above plan explains, the “WHERE rownum <” predicate is applied in the execution plan line (row source) 1 – COUNT STOPKEY. You won’t see this line when you are just counting rows without any “ROWNUM <” predicate.

How this works is that the COUNT STOPKEY knows how many rows you want and will just keep calling its child function under it in the execution plan tree to get more and more rows, until the required amount of rows have been returned. And then the COUNT STOPKEY function just stops calling its child row sources and declares the end-of-data condition.

And here’s the catch – due to a bug, the variable used to keep track of number of processed rows by COUNT STOPKEY is only 4 bytes, even on 64bit platforms. So, it can hold values up to 4294967295 in it (the count returned above), which is 2 ^ 32 – 1. That wouldn’t be so much of a problem in practical real world applications, but what I don’t like is that the operation will silently return wrong data – it will just stop fetching more rows, even though we haven’t reached the ROWNUM <= 8 000 000 000 limit yet and there are rows to fetch from the child row-source, but COUNT STOPKEY declares end-of-data condition and returns ~4B as a result.

This is (obviously a bug) and now there’s a bug number for that as well (thanks to Greg Rahn for letting me know) – Bug# 10214991 (unfortunately it’s not public in MOS).

Now, there’s a reason why this bug has gone unnoticed for so long despite that 4+ billion-row tables have existed for long time (I worked first with such Oracle database in year 2000 – on Windows NT :).

  1. A real-life business query with a construct of WHERE ROWNUM <= X makes sense only when the data is ordered by some meaningful business attribute (a column). Otherwise you’ll get the query results in quite random order. Note that I’m talking about real, business queries here.
  2. The only right way to order data in SQL is via an ORDER BY clause. There is no other right way to do it, even though some hacks sometimes work (and will stop working after the next upgrade)
  3. Nobody (or not many people) have written queries like: give me the sum of 5 billion biggest purchases ordered by the selling price or give me the sum of last 5 billion purchases up to this date. If you replace the billion by just ten, hundred, or just a thousand, then hell yeah, such types of queries are being executed every day (or should I say minute, second).
    Whether using ROWNUM in your business queries (instead of actual date range or purchase price in $$$) is a good or bad design is a completely different question – the fact is that ROWNUM is used in such SQLs.

Now we’ll just need to wait until the data volumes (and reports) get so big that someone actually hits this problem in real life!

So, better patch your database before the end of this decade! :-)

Share

cary.millsap's picture

Virtual Seminar: "Systematic Oracle SQL Optimization in Real Life"

On November 18 and 19, I’ll be presenting along with Tanel Põder, Jonathan Lewis, and Kerry Osborne in a virtual (GoToWebinar) seminar called Systematic Oracle SQL Optimization in Real Life. Here are the essentials:

What: Systematic Oracle SQL Optimization in Real Life.
tanelpoder's picture

Read currently running SQL statement’s bind variable values using V$SQL_MONITOR.BIND_XML in Oracle 11.2

The title pretty much says it. In Oracle 11.2, if you have the Diag+Tuning Pack licenses and the SQL monitoring kicks in for your SQL statement, then instead of the old fashioned ERRORSTACK dump reading you can just query the V$SQL_MONITOR.BIND_XML to find the values and metadata of your SQL statement’s bind variables.

I’ve written an example here:

And a related comment – V$SQL_BIND_CAPTURE is not a reliable way for identifying the current bind variable values in use. Oracle’s bind capture mechanism does not capture every single bind variable into SGA (it would slow down apps which run lots of short statements with bind variables). The bind capture only selectively samples bind values, during the first execution of a new cursor and then every 15 minutes from there (controlled by _cursor_bind_capture_interval parameter), assuming that new executions of that same cursor are still being started (the capture happens only when execution starts, not later during the execution).

Share

tanelpoder's picture

Announcing the first E2SN Virtual Conference with Jonathan Lewis, Cary Millsap, Kerry Osborne and me – 18-19 Nov!

Yes, it’s official. I’m organizing a virtual conference with some of THE top speakers in the world. The topic is Systematic Oracle SQL Optimization (in real world)

The dates are 18-19 November, the conference lasts for 4 hours on both days, so you’ll be able to still get some work done as well (and immediately apply the knowledge acquired!).

Well, none of the speakers need introduction, but just in case you’ve lived in space for last 20 years, here are the links to their blogs :)

I can tell you, (at least the first 3) people in the above list ROCK!!!

And all of them are OakTable members too :)

This conference will have 4 x 1.5 hour sessions, each delivered by a separate speaker. We aim to systematically cover the path of:

  1. Finding out where is the performance problem (and which SQLs cause it)
  2. Finding out what is the problem SQL execution plan doing and which part of it is slow
  3. How to write and fix your code so that the optimizer wouldn’t hate your SQL
  4. How to fix the SQL execution plan performance problem when you can’t touch the application code!

And as this is the first (pilot) virtual conference, then the price is awesome, especially if you get the early bird rate by signing up before 1. November!

So, check out the abstracts, details, agenda and sign up here!

#000000;">#0000ff;">http://tech.e2sn.com/virtual-conferences

P.S. I expect this event to be awesome!

Share

_connect_by_use_union_all

This is just a short note on the parameter introduced in the 11gR2 called _connect_by_use_union_all. I’ve noticed it for the first time in Doc ID 7210630.8, which gives a brief overview of the changes made to the way CBO generates plans for hierarchical queries. As usually happens, the change helps to one problem, but produces [...]

tanelpoder's picture

Q: The most fundamental difference between HASH and NESTED LOOP joins?

So, what do you think is the most fundamental difference between NESTED LOOPS and HASH JOINS?

This is not a trick question. You’re welcome to write your opinion in the comments section – and I’ll follow up with an article about it (my opinion) later today…

Update: The answer article is here:

http://blog.tanelpoder.com/2010/10/06/a-the-most-fundamental-difference-between-hash-and-nested-loop-joins/

Share/Bookmark

tanelpoder's picture

Execution plan Quiz: Shouldn’t these row sources be the other way around ;-)

Here’s a little trick question. Check out the execution plan below.

What the hell, shouldn’t the INDEX/TABLE access be the other way around?!

Also, how come it’s TABLE ACCESS FULL (and not by INDEX ROWID) in there?

This question is with a little gotcha, but can you come up with a query which produced such plan? ;-)

----------------------------------------------
| Id  | Operation          | Name   | E-Rows |
----------------------------------------------
|   0 | SELECT STATEMENT   |        |        |
|*  1 |  INDEX RANGE SCAN  | PK_EMP |      1 |
|*  2 |   TABLE ACCESS FULL| EMP    |      1 |
----------------------------------------------

Share/Bookmark

ORA-01719 is partially relaxed

You most likely have seen this error before: ORA-01719: outer join operator (+) not allowed in operand of OR or IN Cause: An outer join appears in an or clause. Action: If A and B are predicates, to get the effect of (A(+) or B), try (select where (A(+) and not B)) union all (select [...]

CONNECT BY oddity

This week I’ve seen an issue with a CONNECT BY query: for some reason Oracle 10.2.0.4 decided to build a weird plan (the query is weird too, but that’s not my point here ). An explanation of why that happened looks interesting, so here it is. Set up: drop table t2 cascade constraints purge; drop [...]

tanelpoder's picture

Non-trivial performance problems

Gwen Shapira has written an article about a good example of a non-trivial performance problem.

I’m not talking about anything advanced here (such as bugs or problems arising at OS/Oracle touchpoint) but that sometimes the root cause of a problem (or at least the reason why you notice this problem now) is not something deeply technical or related to some specific SQL optimizer feature or a configuration issue. Instead of focusing on the first symptom you see immediately, it pays off to take a step back and see how the problem task/application/SQL is actually used by the users or client applications.

In other words, talk to the users, ask how exactly they experience the problem and then drill down from there.

Share/Bookmark

To prevent automated spam submissions leave this field empty.
Syndicate content