Oakies Blog Aggregator

Franck Pachot

We still need partitions and indexes, and I will continue to commute by train even when self-driving cars become reality…

When the Autonomous Data Warehouse Cloud Service was announced, Oracle came up with the surprising idea that we do not need to create indexes, partitions and materialized views for our analytic queries. Index creation was even blocked in ADW; it has recently been allowed again, but it is still not recommended. Automatic Indexing is for ATP, and the message for ADW is: you don’t need indexes for your analytic queries.

In my opinion, even with the best possible non-index access performance, we will still need index range scans, even when they are not the fastest access path, because the fastest response time is not the end-user’s first performance criterion.

Full Table Scan vs. Index access

Here is a diagram I used to explain the cost of Full Table Scan vs. Index Access, depending on the number of rows to return.

FTS vs. Index cost / number of rows

Full Table Scan depends only on the size of the table: it has the same cost whether it returns zero, one, half, or all of the rows. Index access is linearly proportional to the number of rows. The optimizer has a hard job determining the inflection point, and anyway, around the inflection point, neither method is really optimal. I always recommend providing structures that make one of those two access paths obviously the best.
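A minimal sketch of how you can compare the two sides of that inflection point yourself, assuming a hypothetical SALES table with an index SALES_DATE_IX on SALE_DATE (both names are illustrative, not from the original post): cost the same predicate once forced to the index and once forced to a full scan.

-- hypothetical table and index; compare the optimizer's view of both access paths
explain plan for
select /*+ index(s sales_date_ix) */ * from sales s
where  sale_date >= date '2018-12-01';

select * from table(dbms_xplan.display);

explain plan for
select /*+ full(s) */ * from sales s
where  sale_date >= date '2018-12-01';

select * from table(dbms_xplan.display);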

The idea of the Autonomous Data Warehouse is to lower the cost of the Full Table Scan, thanks to a combination of hardware and software optimizations. Of course, it cannot be lowered enough to compete with an index unique access retrieving one row, like the OLTP ‘select * where PK=’ pattern. But the inflection point can be moved down enough that Full Table Scan response time is acceptable for any analytic query. Then, with only one access method, the optimizer’s job is easier and less prone to errors. We don’t have to design indexes, and all of this can be branded as autonomous tuning.

Markus Winand — https://use-the-index-luke.com/

Before getting to the problem of having no indexes to range scan, I’ll quickly summarize the properties of Full Table Scans and Index access. There’s also an excellent resource that goes into the details: https://use-the-index-luke.com/ by Markus Winand.

There are two access paths to read table rows:

  • Full Table Scan, which is always available (with the exception of IOTs, Index Organized Tables, where there is no heap table), reads all the formatted blocks of a heap table, in no specific order.
  • Access by ROWID, where a sorted structure (such as an index) or a hash function provides the physical address from which to get the rows we need, without having to scan all the blocks.

Full Table Scan is efficient because it reads rows in the physically most efficient way: contiguous blocks, in large I/O calls, in parallel, asynchronously, bypassing all caches. However, it has to read a lot more than needed, in no determined order, so the rows have to be filtered out later, and often sorted or hashed later to be joined, ordered or deduplicated. Many features have improved Full Table Scan to keep it efficient even when we do not need a lot of rows: Storage Indexes, Zone Maps, SmartScan, Compression, Column Store, In-Memory, HCC,… Those features keep Full Table Scans efficient for analytic queries where we query only a subset of rows and columns.

Yes, Full Table Scan on ADW is fast. But from my experience, a ‘fast’ response time is not the most important criterion for the end-user. They want a predictable response time.

Users prefer predictability over fast response time

You may think that Full Table Scan is predictable because you know the size of the table. But the end-user doesn’t know whether the table is compressed or full of free space. They do not realize how much longer it can take when they ask for the result to be ordered. They do not understand why querying today’s orders is not faster than querying the whole week. They cannot predict the response time because it is not proportional to the result they expect.

And there’s more: when many people run the same Full Table Scan, even when they expect different data, they are in competition and all response times will be longer because of buffer busy waits. When you query at a time when there’s a lot of activity on the storage, the response time will also be longer. None of this can be predicted by the end-user. And improvements such as Storage Indexes, In-Memory, Compression,… are not predictable by nature, as they depend on the activity: they adapt to previous queries and to the concurrent workload.

Even when Index Access response time is higher, it stays more predictable because it processes only the required rows. The time will be the same when the table grows by 200%, or when 5 users are running similar queries. The cost is proportional to the number of rows returned: the end-user will easily accept that querying one week takes 7 times longer than querying one day. If it takes 5 minutes, they will manage it because they expect the longer duration. They don’t want the fastest response time, they just want a predictable one.

Autonomous Optimization as presented by Larry Ellison

The features of the Oracle Autonomous Database are presented with a car analogy, to push the self-driving idea. For OLTP, with the Autonomous Transaction Processing service, Oracle develops Automatic Indexing, and the analogy is: indexes are like new roads. The optimizer evaluates the execution plans (plans are like driving directions) with the goal of choosing the one with the fastest response time. There’s no decision about the most predictable one, the one least subject to contention with others, or the one whose response time is proportional to the result set. The CBO chooses the lowest cost, and the CBO cost is the estimated response time. With this goal, an optimized table scan looks the best.

But my goal, keeping the response time predictable by the end-user, is different. I’ll continue the analogy with the cars. Here is what Google Maps estimates for my morning commute to work:


Car: 35 min — 1 h 5 min

This is how ‘predictable’ the commute by car is at that time, the peak hour around Geneva: typically between 35 and 65 minutes. It is, on average, faster than the train. But the time is not linearly proportional to the kilometers. Most of the time is spent in the few ‘red’ kilometers. That’s my analogy with Full Table Scan: we can drive fast, very fast, but the time it takes to reach the destination is uncertain.

And here is what Google Maps estimates for the same commute by train:


Train: 1 h — 1 h 1 min

This is completely predictable, proportional to the distance, and with nearly no variation. Index Access is like my commute by train: most of the time not faster than driving, but ultimately preferable because of its stability and predictability. And with good design, Index Access can be as precise as Swiss trains ;)

You can guess that I take the train to work. I don’t care that it takes longer, because I know it and can manage it (like opening my laptop and writing a blog post) so that it is not wasted time. And I know exactly when I’ll arrive at work for a meeting, or at home to pick up the kids.

Index, partitions, materialized views

If predictability is your first criterion, then your optimization should not rely on caches, parallelism, compression, storage indexes, in-memory population, result cache,… Of course, all these features are welcome when they kick in at execution time, but they will not help you forecast a predictable response time. The predictable optimizations are those for which the cost is proportional to the expected result, where we can forecast performance:

  • Partitions clearly reduce the cost of Full Table Scan in a predictable way. Partition pruning has a business meaning: if the user queries a specific time range or geographical area, she knows that the retrieval time will be proportional to it. Partitions, when properly designed, also keep the cost independent of the table growth.
  • Materialized views give the end-user this predictability for aggregations. When a user queries the monthly amount of sales in a year, she expects a 12-row result and may not imagine that millions of rows have to be scanned for it. With a materialized summary, the response time will match the volume of the result.
  • Indexes, when properly designed, will range scan exactly the values required by the query, and they will not add extra time for sorting or deduplicating the rows. Index access gives the best user experience because the retrieval is proportional to the cost imagined by the user. And Top-N queries are an important part of analytic queries: only an index range scan can answer a Top-N query without reading all the rows (see the sketch below).
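As an illustration of that last point, here is a minimal sketch, assuming a hypothetical SALES table with an index on (CUSTOMER_ID, SALE_DATE): a Top-N query of this shape can typically be answered by a descending index range scan stopped after 10 rows, rather than reading and sorting every row for the customer.

-- hypothetical table, index and bind variable
create index sales_cust_date_ix on sales (customer_id, sale_date);

-- "last 10 sales for this customer": the index delivers the rows
-- already ordered, so no full sort is needed
select *
from  (
        select sale_date, amount
        from   sales
        where  customer_id = :cust
        order by sale_date desc
      )
where rownum <= 10;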

Index access is preferable when we query a limited number of rows; Full Table Scan is better when we need to read a large part of the table. The optimizer can choose the best one, but it is based on estimations. If you are in the area around the inflection point, you risk execution plan changes between two plans, neither of which is optimal. This is where Cardinality Feedback can be good or bad. You should never leave critical queries in this zone. In the diagram at the top of this post, there’s a dotted line for ‘covering index’, which pushes the inflection point to a cardinality where the index is still an alternative to FTS for a high number of rows. This is when the index contains all required columns and there’s no need to go to the table (and no dependency on the clustering factor, which is difficult for the end-user to predict).
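For completeness, a sketch of such a ‘covering’ index on the same hypothetical SALES table: every column referenced by the query is in the index, so there is no table access at all and no dependency on the clustering factor.

-- hypothetical covering index
create index sales_cust_date_amount_ix on sales (customer_id, sale_date, amount);

-- this query can be answered entirely from the index
select sale_date, sum(amount)
from   sales
where  customer_id = :cust
group by sale_date;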

In my opinion

At the price of a limited amount of human labor for index design, we can provide both speed and predictability to our users. Expensive hardware and software can compete with the former, but not the latter. And this is why I think we will always need indexes. The reason is the same as why I prefer to commute by train: predictability of the time.

By the way, there’s another reason after predictability and speed: green computing. Look at the resource (CPU and I/O) usage efficiency and guess which execution plan has the lower carbon footprint. Full Table Scan in Parallel Query, like all the trendy Big Data solutions where you scale by adding more nodes, is awesome and fast, but extremely expensive on resources. Indexes, when correctly designed, will read only what is needed to get the result, with predictable response time.

Jonathan Lewis

QC vs. PX

One last post before closing down for the Christmas break.

Here’s a little puzzle with a remarkably easy and obvious solution that Ivica Arsov presented at the UKOUG Tech2018 conference. It’s a brilliant little puzzle that makes a very important point, because it reminded me that most problems are easy and obvious only after you’ve seen them at least once. If you’ve done a load of testing and investigation into something it’s easy to forget that there may be many scenarios you haven’t even thought of testing – so when you see the next puzzle your mind follows all the things you’ve done previously and doesn’t think that you might be looking at something new.

In this case I had to wait until the end of the presentation to discover how “easy and obvious” the solution was. Here’s a query with its results, all I’m going to do is join a session (from v$session) with all its parallel execution slaves by looking for the matching qcsid in v$px_session:


break on server_group skip 1 duplicate
 
select
        px.sid, px.qcsid, 
        px.server_group, px.server_set, px.server#,
        ss.sql_id
from
        V$px_session px,
        v$session ss
where
        ss.username = 'TEST_USER'
and     ss.sid = px.sid
order by
        px.server_group nulls first, px.server_set, px.server#
;

     QCSID        SID SERVER_GROUP SERVER_SET    SERVER# SQL_ID
---------- ---------- ------------ ---------- ---------- -------------
       357        357                                    b4wg6286xn324

       357        125            1          1          1 bppfad1y1auhj
       357        246                       1          2 bppfad1y1auhj
       357        364                       1          3 bppfad1y1auhj

       357          7            2          1          1 5vdbyjy0c7dam
       357        133                       1          2 5vdbyjy0c7dam
       357        253                       1          3 5vdbyjy0c7dam

As you can see, session 357 is reported as a query coordinator session, with two parallel server groups of 3 slave processes each. Strangely, though, the co-ordinator and the two groups of parallel query slaves report different SQL_IDs; this is probably contrary to the experience that most of us have had. When a parallel query (or DML or DDL statement) is executing, the query co-ordinator and all its slave processes should report the same SQL_ID – so what’s happening here?

Little pause for thought …
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
… and no doubt some of you were immediately aware of the probable explanation. It’s very simple if you’ve come across the phenomenon before. Here’s the SQL that allowed me (from another session) to capture this result:


rem
rem     Script: px_qc_joke_2.sql
rem     Author: Jonathan Lewis
rem     Dated:  Dec 2018
rem

create table t1 nologging 
as
select ao.*
from 
        all_objects ao, 
        (
         select rownum id
         from   dual 
         connect by level <= 10 ) -- > comment to avoid wordpress format issue
;

create table t2 nologging as select * from t1;
create table t3 nologging as select * from t1;

begin
        dbms_stats.gather_table_stats(user,'t1', method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats(user,'t2', method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats(user,'t3', method_opt=>'for all columns size 1');
end;
/


prompt  =====================
prompt  Starting PL/SQL block
prompt  Prepare to query v$
prompt  =====================

declare
        cursor c1 is select /*+ parallel (t1 3) */ object_id from t1;
        cursor c2 is select /*+ parallel (t2 3) */ object_id from t2;
        m_id1 number;
        m_id2 number;
begin
        open c1;
        fetch c1 into  m_id1;

        open c2;
        fetch c2 into  m_id2;

        for r in (select owner from t3 where object_id = least(m_id1,m_id2)) loop
                dbms_output.put_line(r.owner);
                dbms_lock.sleep(3);
        end loop;

        dbms_output.put_line(m_id1 || ' - ' || m_id2); 

        close c2;
        close c1;
end;
/

I’ve opened two cursors on parallel queries inside an anonymous PL/SQL block. The driving session is the query co-ordinator for two different parallel queries at the same time because it’s keeping two cursors open, and it’s also executing the cursor that is the driving query block. If we check v$sql for the three SQL_IDs reported from v$px_session this is what we see:


SQL_ID
-------------
SQL_TEXT
--------------------------------------------------------------------------------
b4wg6286xn324
declare  cursor c1 is select /*+ parallel (t1 3) */ object_id from t1;  cursor c
2 is select /*+ parallel (t2 3) */ object_id from t2;  m_id1 number;  m_id2 numb
er; begin  open c1;  fetch c1 into  m_id1;   open c2;  fetch c2 into  m_id2;   f
or r in (select owner from t3 where object_id = least(m_id1,m_id2)) loop   dbms_
output.put_line(r.owner);   dbms_lock.sleep(1);  end loop;   dbms_output.put_lin
e(m_id1 || ' - ' || m_id2);   close c2;  close c1; end;

bppfad1y1auhj
SELECT /*+ parallel (t1 3) */ OBJECT_ID FROM T1

5vdbyjy0c7dam
SELECT /*+ parallel (t2 3) */ OBJECT_ID FROM T2


Apart from the warning that it’s easy to be misled by a problem because you keep thinking of all the cases you’ve seen before, there’s another important point behind this little quiz. It’s often said that when you run parallel queries you may actually use “2 * DOP” parallel query slaves – this is true (though for more complicated queries you may get multiple DFO trees at once, each with its “2 * DOP” slaves) – but it’s worth remembering that even with very simple queries a single session can have many cursors open at once, holding “2 * DOP” slaves for each one, and ruin every other session’s response time because every other session ends up running serial queries.
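As a side note, a quick sketch (using the same v$px_session view queried above) of how you might check how many PX slaves each query coordinator is currently holding:

-- count the slaves attached to each query co-ordinator,
-- excluding the co-ordinator's own row (sid = qcsid)
select qcsid, count(*) as px_slaves
from   v$px_session
where  sid <> qcsid
group by qcsid
order by px_slaves desc;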

connor_mc_d

Your AskTOM Top 10 for 2018

Here we go folks…here are the top 10 most viewed questions in AskTOM this year!

We’ll count them down from 10 to 1

10) Inserting values into a table with ‘&’

Viewed 80,000 times.

First asked in 2012, this actually is not a database issue but a client issue. Many client tools treat the ampersand as a substitution variable, and thus they intercept the statement before the command is sent to the database. Full details here
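For reference, the usual client-side workaround in SQL*Plus or SQLcl is simply to switch substitution off (the table and column names below are hypothetical):

set define off
insert into t ( msg ) values ('Fish & Chips');   -- '&' no longer treated as a substitution variable
set define on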

9) What is a SID, how to change it, how to find out what it is

Viewed 95,000 times.

A timeless classic. This question was asked in June 2000 – only 1 month after AskTOM went live. Given that most of our documentation contains the terms “instance” and “database”, and most people are aware of them, I’ll never really understand why we didn’t go with ORACLE_INSTANCE instead of ORACLE_SID. Full details here.
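If you just want to find out what it is from inside the database, a couple of simple sketches (the instance name normally matches the ORACLE_SID):

select instance_name from v$instance;

select sys_context('USERENV','INSTANCE_NAME') from dual;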

8) How to Update millions or records in a table

Viewed 100,000 times.

Asked back in 2002, I can understand the popularity of this one. Perhaps the most common performance tuning issue is data conversion or data loading. Sometimes you just need to forget the word UPDATE and re-load the data – it can be much faster. Full details here.
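A minimal sketch of that “re-load instead of UPDATE” idea, with hypothetical table and column names: the transformation is applied while copying, and the new table is then swapped in.

create table big_table_new nologging as
select id,
       upper(status) as status,   -- the "update" applied during the reload
       amount
from   big_table;

-- then: rename the tables, and re-create indexes, constraints and grants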

7) How to connect SQLPlus without tnsnames.ora

Viewed 110,000 times.

Is it that much of a big deal to have a tnsnames.ora file? I guess so. Nowadays the EZ connect syntax makes connecting to any database trivial. Full details here.
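For example, with EZConnect there is no tnsnames.ora entry at all; from SQL*Plus (user, password, host, port and service name below are placeholders):

-- connect using host:port/service directly
connect scott/tiger@//dbhost.example.com:1521/pdb1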

6) How To FULL DB EXPORT/IMPORT

Viewed 130,000 times.

Another understandable one for the top 10, because of the transition from old-style export/import to Data Pump around the timeframe when this question was first asked. There is also the classic chicken-and-egg issue to deal with, namely that you need a database in order to import, but won’t a full import then clash with all the existing objects? Full details here.

5) Converting CLOBS TO VARCHAR

Viewed 132,000 times.

A golden oldie from 2001! It seems an odd proposition – if you have a CLOB, there’s a good chance you have it because the data exceeds the allowable size for VARCHAR2. So converting back down seems a risky exercise. Full details here.
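The usual sketch, assuming a hypothetical DOCS table with a CLOB column BODY, is to take just the leading characters and accept the truncation risk the question implies:

select id,
       dbms_lob.substr(body, 4000, 1) as body_vc   -- first 4000 characters only
from   docs;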

4) ORA-12560: TNS:protocol adapter error

Viewed 135,000 times.

Ah yes, not our finest hour this one. Try to use a database on Windows that has not been started, and do you get an error saying “Database not started”? No. You get a cryptic TNS error. Full details here.

3) Format the Number for display

Viewed 143,000 times.

There is such a thing as “too much of a good thing”, and perhaps our incredible flexibility when it comes to formatting data, and the range of format masks, is tough for people to swallow. Full details here.
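A couple of typical format-mask examples, nothing more than a sketch:

select to_char(1234567.8, '999,999,999.00')    as fixed_mask,      -- explicit separators
       to_char(1234567.8, 'FM999G999G999D00')  as nls_aware_mask   -- G/D follow NLS settings
from   dual;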

2) IN (vs) EXISTS and NOT IN (vs) NOT EXISTS

Viewed 144,000 times.

Well, there’s a simple reason this gem keeps on coming back year after year after year. For years, books on performance tuning and blogs on performance tuning would claim that EXISTS was better than IN, or vice versa. Amazingly it has never been the case that one was guaranteed to be better than the other, and more alarmingly a straight swap from one to the other can even impact the results you get back. I think we’ll see this guy in the list every year…..forever. Full details here.
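A small sketch of the “results can differ” point, using classic EMP/DEPT-style tables (assumed here, not part of the original answer): if any EMP.DEPTNO is NULL, the NOT IN query returns no rows at all, while the NOT EXISTS query still returns the departments with no employees.

-- returns no rows if any e.deptno is NULL
select d.deptno
from   dept d
where  d.deptno not in (select e.deptno from emp e);

-- still returns departments with no employees
select d.deptno
from   dept d
where  not exists (select null from emp e where e.deptno = d.deptno);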

And finally we have arrived at the number 1 most viewed question this year. Which leads me to ask the question:

Why do so many of you need to know how many rows are in each table? Smile

1) Finding the number of rows in each table by a single sql

Viewed 510,000 times.

Wow. Half a million people need to know how many rows are in their database tables. I’ve no idea why, because the moment you count them, you’re already out of date. Full details here.
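Two common sketches for this: the quick one reads the optimizer statistics (only as fresh as the last stats gather, which rather proves the point), the second actually counts, one table at a time.

-- stats-based, approximate
select table_name, num_rows, last_analyzed
from   user_tables;

-- real counts, one query per table via DBMS_XMLGEN
select table_name,
       to_number(extractvalue(
           xmltype(dbms_xmlgen.getxml('select count(*) c from "' || table_name || '"')),
           '/ROWSET/ROW/C')) as row_count
from   user_tables;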

And there you have it … our top 10 for 2018.

Have a great festive season!

Jonathan Lewis

Transitive Closure

This is a follow-up to a note I wrote nearly 12 years ago, looking at the problems of transitive closure (or absence thereof) from the opposite direction. Transitive closure gives the optimizer one way of generating new predicates from the predicates you supply in your where clause (or, in some cases, your constraints); but it’s a mechanism with some limitations. Consider the following pairs of predicates:


    t1.col1 = t2.col2
and t2.col2 = t3.col3

    t1.col1 = t2.col2
and t2.col2 = 'X'

A person can see that the first pair of predicates allows us to infer that “t1.col1 = t3.col3” and the second pair of predicates allows us to infer that “t1.col1 = ‘X'”. The optimizer is coded to recognize only the second inference. This has an important side effect that can have a dramatic impact on performance in a way that’s far more likely to appear if your SQL is generated by code. Consider this sample data set (reproduced from the 2006 article):

rem
rem     Script:         transitive_loop.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2006
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem

create table t1 
as
select
        mod(rownum,100) col1,
        rpad('x',200)   v1
from
        all_objects
where   
        rownum <= 2000
;

create table t2
as
select
        mod(rownum,100) col2,
        rpad('x',200)   v2
from
        all_objects
where   
        rownum <= 2000
;

create table t3
as
select
        mod(rownum,100) col3,
        rpad('x',200)   v3
from
        all_objects
where   
        rownum <= 2000
;

-- gather stats if necessary

set autotrace traceonly explain

prompt  =========================
prompt  Baseline - two hash joins
prompt  =========================

select 
        t1.*, t2.*, t3.*
from
        t1, t2, t3
where
        t2.col2 = t1.col1
and     t3.col3 = t2.col2
;

prompt  ================================================
prompt  Force mismatch between predicates and join order
prompt  ================================================

select 
        /*+
                leading(t1 t3 t2)
        */
        t1.*, t2.*, t3.*
from
        t1, t2, t3
where
        t2.col2 = t1.col1
and     t3.col3 = t2.col2
;

The first query simply joins the tables in the from clause order on a column we know will have 20 rows for each distinct value, so the result sets will grow from 2,000 rows to 40,000 rows to 800,000 rows. Looking at the second query we would like to think that when we force Oracle to use the join order t1 -> t3 -> t2 it would be able to use the existing predicates to generate the predicate “t3.col3 = t1.col1” and therefore be able to do the same amount of work as the first query (and, perhaps, manage to produce the same final cardinality estimate).

Here are the two plans, taken from an instance of 12.2.0.1:


=========================
Baseline - two hash joins
=========================

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   800K|   466M|    48  (38)| 00:00:01 |
|*  1 |  HASH JOIN          |      |   800K|   466M|    48  (38)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T3   |  2000 |   398K|    10   (0)| 00:00:01 |
|*  3 |   HASH JOIN         |      | 40000 |    15M|    21   (5)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1   |  2000 |   398K|    10   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| T2   |  2000 |   398K|    10   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T3"."COL3"="T2"."COL2")
   3 - access("T2"."COL2"="T1"."COL1")

================================================
Force mismatch between predicates and join order
================================================

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   800K|   466M| 16926   (3)| 00:00:01 |
|*  1 |  HASH JOIN            |      |   800K|   466M| 16926   (3)| 00:00:01 |
|   2 |   TABLE ACCESS FULL   | T2   |  2000 |   398K|    10   (0)| 00:00:01 |
|   3 |   MERGE JOIN CARTESIAN|      |  4000K|  1556M| 16835   (2)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T1   |  2000 |   398K|    10   (0)| 00:00:01 |
|   5 |    BUFFER SORT        |      |  2000 |   398K| 16825   (2)| 00:00:01 |
|   6 |     TABLE ACCESS FULL | T3   |  2000 |   398K|     8   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."COL2"="T1"."COL1" AND "T3"."COL3"="T2"."COL2")

As you can see, there’s a dramatic difference between the two plans, and a huge difference in cost (though the predicted time for both is still no more than 1 second).

The first plan, where we leave Oracle to choose the join order, builds an in-memory hash table from t3, then joins t1 to t2 with a hash table and uses the result to join to t3 by probing the in-memory hash table.

The second plan, where we force Oracle to use a join order that (I am pretending) we believe to be better, results in Oracle doing a Cartesian merge join between t1 and t3 that explodes the intermediate result set up to 4 million rows (and the optimizer’s estimate is correct) before eliminating a huge amount of redundant data.

As far as performance is concerned, the first query took 0.81 seconds to generate its result set, the second query took 8.81 seconds. In both cases CPU time was close to 100% of the total time.

As a follow-up demo I added the extra predicate “t3.col3 = t1.col1” to the second query, allowing the optimizer to use a hash join with the join order t1 -> t3 -> t2, and this brought the run time back down (with a slight increase due to the extra predicate check on the second join).
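For reference, a sketch of that follow-up query: the manually added transitive predicate lets the forced join order t1 -> t3 -> t2 use hash joins again.

select
        /*+
                leading(t1 t3 t2)
        */
        t1.*, t2.*, t3.*
from
        t1, t2, t3
where
        t2.col2 = t1.col1
and     t3.col3 = t2.col2
and     t3.col3 = t1.col1       -- the predicate transitive closure did not generate
;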

Summary

The choice of columns in join predicates may stop Oracle from choosing the best join order because it is not able to use transitive closure to generate all the extra predicates that the human eye can see. If you are using programs to generate SQL rather than writing SQL by hand you are more likely to see this limitation resulting in some execution plans being less efficient than they could be.


pete

Virtual Patching or Good Security Design instead?

I got an email from someone recently who asked me about virtual patching for Oracle, as they were running an out of date version of Oracle and were thinking that virtual patching may be a good solution to make their database....[Read More]

Posted by Pete On 19/12/18 At 01:32 PM

Richard Foote

UKOUG “Lifetime Achievement Award” Speaker Award

I was recently very honoured and flattered to have received the “Lifetime Achievement Award” Speaker Award from the UKOUG. I have only managed to get to the excellent UKOUG Tech conferences on a couple of occasions, so it was both a thrill and a surprise to be so honoured. Unfortunately, I wasn’t able to make […]

Jonathan Lewis

NULL predicate

People ask me from time to time if I’m going to write another book on the Cost Based Optimizer – and I think the answer has to be no because the product keeps growing so fast it’s not possible to keep up and because there are always more and more little details that might have been around for years and finally show up when someone asks me a question about some little oddity I’ve never noticed before.

The difficulty with the “little oddities” is the amount of time you could spend trying to work out whether or not they matter and whether it’s worth writing about them. Here’s a little example to show what I mean – first the data set:


rem
rem     Script:         null_filter.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem             12.1.0.2
rem

create table t1
nologging
as
select  *
from    all_objects
where   rownum <= 50000 -- > comment to avoid wordpress format issue
;

insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
commit;

create index t1_i1 on t1(object_type, data_object_id, object_id, created);

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                cascade     => true,
                method_opt  => 'for all columns size 1'
        );
end;
/

It’s a simple data set with a single index. The only significant thing about the index is that the second column (data_object_id) is frequently null. This leads to a little quirk in the execution plans for a very similar pair of statements:


set serveroutput off
alter session set statistics_level = all;

select
        object_name, owner
from
        t1
where
        object_type = 'TABLE'
and     data_object_id = 20002
and     object_id = 20002
and     created > trunc(sysdate - 90)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

select
        object_name, owner
from
        t1
where
        object_type = 'TABLE'
and     data_object_id is null
and     object_id = 20002
and     created > trunc(sysdate - 90)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

How much difference would you expect in the execution plans for these two queries? There is, of course, the side effect of the “is null” predicate disabling the “implicit column group” that is the index distinct_keys value, but in this case I’ve got a range-based predicate on one of the columns so Oracle won’t be using the distinct_keys anyway.

Of course there’s the point that you can’t use the equality operator with null, you have to use “is null” – and that might make a difference, but how? Here are the two execution plans:


----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |      0 |00:00:00.01 |       3 |      1 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |      1 |      0 |00:00:00.01 |       3 |      1 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |      1 |      1 |      0 |00:00:00.01 |       3 |      1 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE' AND "DATA_OBJECT_ID"=20002 AND "OBJECT_ID"=20002 AND
              "CREATED">TRUNC(SYSDATE@!-90))

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |      0 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |      1 |      0 |00:00:00.01 |       3 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |      1 |      1 |      0 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE' AND "DATA_OBJECT_ID" IS NULL AND "OBJECT_ID"=20002 AND
              "CREATED">TRUNC(SYSDATE@!-90))
       filter(("OBJECT_ID"=20002 AND "CREATED">TRUNC(SYSDATE@!-90)))

The query with the predicate “data_object_id is null” repeats the object_id and created predicates as both access predicates and filter predicates. This seems a little surprising and a potential performance threat. In the first query the run-time engine will hit the correct index leaf block in exactly the right place very efficiently and then walk along it supplying every rowid to the parent operator until it hits the end of the range.

With the “is null” plan the run-time engine will be checking the actual value of object_id and created for every index entry on the way – how much extra CPU will this use and, more importantly, might Oracle start with the first index entry where object_type = ‘TABLE’ and data_object_id is null and walk through every index entry that has that null, checking for the correct object_id as it goes?

That last question is the reason for running the query with rowsource execution stats enabled. The first query did a single physical read while the second didn’t have to, but the more important detail is that both queries did the same number of buffer gets – and there is, by the way, a set of eight rows where the object_id and data_object_id are  20,002, but they were created several years ago so the index range scan returns no rows in both cases.

Based on that comparison, how do we show that Oracle has not walked all the way from the first index entry where object_type = ‘TABLE’ and data_object_id is null checking every entry on the way or, to put it another way, has Oracle really managed to prune down the index range scan to the minimum “wedge” indicated by the presence of the predicates “OBJECT_ID”=20002 AND “CREATED”>TRUNC(SYSDATE@!-90) as access predicates?

Let’s just count the number of leaf blocks that might be relevant, using the sys_op_lbid() function (last seen here) that Oracle uses internally to count the number of leaf blocks in an index. First we get the index object_id, then we scan it to see how many leaf blocks hold entries that match our object_type and data_object_id predicates but appear in the index before our target value of 20,002:


column object_id new_value m_index_id

select
        object_id
from
        user_objects
where
        object_type = 'INDEX'
and     object_name = 'T1_I1'
;

select  distinct sys_op_lbid(&m_index_id, 'L', rowid)
from    t1
where   object_type    = 'TABLE'
and     data_object_id is null
and     object_id      < 20002
;


SYS_OP_LBID(159271
------------------
AAAm4nAAFAAACGDAAA
AAAm4nAAFAAACF9AAA
AAAm4nAAFAAACGCAAA
AAAm4nAAFAAACF/AAA
AAAm4nAAFAAACF+AAA
AAAm4nAAFAAACGFAAA
AAAm4nAAFAAACGEAAA
AAAm4nAAFAAACGGAAA

8 rows selected.


This tells us that there are 8 leaf blocks in the index that we would have to range through before we found object_id 20,002 and we would have seen 8 buffer gets, not 3 in the rowsource execution stats, if Oracle had not actually been clever with its access predicates and narrowed down the wedge of the index it was probing.

Bottom line: for a multi-column index there seems to be a difference in execution plans between “column is null” and “column = constant” when the column is one of the earlier columns in the index – but even though the “is null” option results in some access predicates re-appearing as filter predicates in the index range scan the extra workload is probably not significant – Oracle still uses the minimum number of index leaf blocks in the index range scan.

 

Franck Pachot

Oracle VPD as a safeguard for DML

A new blog post on the Databases at CERN blog about using VPD Row-Level Security (DBMS_RLS) as a safeguard for the privileged users who need to bypass the application and run SQL directly: https://db-blog.web.cern.ch/blog/franck-pachot/2018-12-oracle-vpd-safeguard-dml

Of course, your data should be guarded behind a hard shell (see Bryn Llewellyn’s presentation https://community.oracle.com/docs/DOC-1018915), but there may be exceptional reasons to modify data directly with SQL, because some information was not originally supposed to change and the application has no GUI or API for it. If all security is implemented through the application, everything is possible when directly connected, and a mistake (like a where clause predicate lost in a copy-paste) can be critical. Flashback features are awesome for reacting to this kind of error, but VPD rules can be used as a proactive safeguard by allowing, by default, only a subset of the data to be touched.
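As a minimal sketch only (the schema, table and rule are hypothetical, not the ones from the CERN post): a DBMS_RLS policy that, by default, restricts direct UPDATE and DELETE statements against an ORDERS table to the current year.

create or replace function orders_dml_guard (
        p_schema in varchar2,
        p_object in varchar2
) return varchar2
as
begin
        -- predicate appended by VPD to UPDATE/DELETE statements
        return q'[order_date >= trunc(sysdate,'YYYY')]';
end;
/

begin
        dbms_rls.add_policy(
                object_schema   => 'APP',
                object_name     => 'ORDERS',
                policy_name     => 'ORDERS_DML_GUARD',
                function_schema => 'APP',
                policy_function => 'ORDERS_DML_GUARD',
                statement_types => 'UPDATE,DELETE',
                update_check    => true
        );
end;
/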


martin.bach

Little things worth knowing: the latest public-yum-ol7.repo configuration file enables an upgrade to UEK R5

For a little while now I have been using Ansible for all installation/configuration tasks I do in the lab. I can’t really be bothered to do these things by typing commands anymore: once you get the hang of Ansible, you can develop an urge to automate everything. As part of my playbook installing the Oracle database on Oracle Linux, I replace /etc/yum.repos.d/public-yum-ol7.repo with the current version from Oracle’s server to make sure I have the latest and greatest software available.

This step is not needed outside playground/lab environments. For “real” deployments you would use a repository file provisioned during build time, pointing to your local Spacewalk/Satellite/insert favourite tool server. I should briefly point out this is my lab system and I’m perfectly happy to point yum at Oracle’s public YUM servers. If this were a system outside my own lab I’d of course use proper channels to maintain software releases on my  servers. These however are outside the scope of this post.

The setup

I haven’t downloaded the latest Oracle Linux 7.6 ISO yet so I’m starting off on Oracle Linux 7.4 with Kernel UEK 4. I know, it’s a looooong upgrade. Here is my current configuration:

[root@server4 ~]# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="7.4"
ID="ol"
VERSION_ID="7.4"
PRETTY_NAME="Oracle Linux Server 7.4"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:7:4:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://bugzilla.oracle.com/"
ORACLE_BUGZILLA_PRODUCT="Oracle Linux 7"
ORACLE_BUGZILLA_PRODUCT_VERSION=7.4
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=7.4

[root@server4 ~]# rpm -q kernel-uek
kernel-uek-4.1.12-94.3.9.el7uek.x86_64
[root@server4 ~]#

Oracle’s Unbreakable Enterprise Kernel Release 4 is based on the upstream release 4.1.x. 

Upgrading!

With the latest version of the /etc/yum.repos.d/public-yum-ol7.repo in place I upgrade the freshly provisioned system – after it has been created using virt-install and my kickstart file – to “latest” using yum upgrade …

A little later, when the system was upgraded and my prompt returned, I rebooted the VM. When it came up, I was a little surprised to see a change in kernel version:

[root@server4 ~]# reboot
Connection to server4 closed by remote host.
Connection to server4 closed.
[martin@server1 ~]$
[martin@server1 ~]$ ssh root@server4
root@server4's password:
Last login: Sat Dec 15 12:16:38 2018 from 192.168.100.1
[root@server4 ~]# uname -r
4.14.35-1818.5.4.el7uek.x86_64
[root@server4 ~]#

So what happened? First of all, I have 2 UEK kernels on my system now. 

[root@server4 ~]# rpm -qi kernel-uek
Name : kernel-uek
Version : 4.1.12
Release : 94.3.9.el7uek
Architecture: x86_64
Install Date: Thu 13 Dec 2018 19:17:37 GMT
Group : System Environment/Kernel
Size : 160287394
License : GPLv2
Signature : RSA/SHA256, Sat 15 Jul 2017 05:06:17 BST, Key ID 72f97b74ec551f03
Source RPM : kernel-uek-4.1.12-94.3.9.el7uek.src.rpm
Build Date : Sat 15 Jul 2017 04:54:05 BST
Build Host : x86-ol7-builder-02.us.oracle.com
Relocations : (not relocatable)
Vendor : Oracle America
URL : http://www.kernel.org/
Summary : The Linux kernel
Description :
The kernel package contains the Linux kernel (vmlinuz), the core of any
Linux operating system. The kernel handles the basic functions
of the operating system: memory allocation, process allocation, device
input and output, etc.
Name : kernel-uek
Version : 4.14.35
Release : 1818.5.4.el7uek
Architecture: x86_64
Install Date: Sat 15 Dec 2018 12:24:50 GMT
Group : System Environment/Kernel
Size : 61690109
License : GPLv2
Signature : RSA/SHA256, Tue 04 Dec 2018 05:22:41 GMT, Key ID 72f97b74ec551f03
Source RPM : kernel-uek-4.14.35-1818.5.4.el7uek.src.rpm
Build Date : Tue 04 Dec 2018 05:12:59 GMT
Build Host : ca-build85.us.oracle.com
Relocations : (not relocatable)
Vendor : Oracle America
URL : http://www.kernel.org/
Summary : Oracle Unbreakable Enterprise Kernel Release 5
Description :
The kernel package contains the Linux kernel (vmlinuz), the core of any
Linux operating system. The kernel handles the basic functions
of the operating system: memory allocation, process allocation, device
input and output, etc.
[root@server4 ~]#

The first entry in the list – 4.1.12 – is UEK R4 whilst the second – 4.14.35 – is UEK R5. This can easily be explained looking at the enabled repositories:

[root@server4 ~]# yum repolist
Loaded plugins: ulninfo
repo id repo name status
ol7_UEKR5/x86_64 Latest Unbreakable Enterprise Kernel Release 5 for Oracle Linux 7Server (x86_64) 115
ol7_latest/x86_64 Oracle Linux 7Server Latest (x86_64) 11,708
repolist: 11,823

When comparing with another Oracle Linux 7.4 installation using the stock public-yum-ol7.repo I can see this instead:

[root@rac18pri1 ~]# yum repolist
Loaded plugins: ulninfo
repo id repo name status
ol7_UEKR4/x86_64 Latest Unbreakable Enterprise Kernel Release 4 for Oracle Linux 7Server (x86_64) 124
ol7_latest/x86_64 Oracle Linux 7Server Latest (x86_64) 11,708
repolist: 11,832

While I appreciate being upgraded to the latest kernel automatically, there might be people out there who don’t. The good news is that my system still had the ability to boot to UEK R4 should I need to:

This figure shows the interactive boot menu after the UEK R5 update with UEK R5 highlighted as the default kernel

If you have to go back to UEK R4, you might be able to change the default boot entry, reboot, and see what happens.

connor_mc_d

It’s all downhill past 30

Yes, it sounds like a lamentation of the aging process, and from my own experience, it is definitely true that the wheels started falling off for my knees, back and plenty of other body parts once I turned 30. But that is perhaps a legacy of too much alcohol, not eating well in my youth and failing to stretch rather than any particular chronological milestone Smile.

But this post is not about that. This one is about the magical 30 character limit on identifiers that we finally lifted in version 12c. For static SQL, this is obviously a no impact change – you either define your tables and columns with longer names or you don’t, and the static SQL you write reflects that position.

But dynamic SQL is a different proposition, in particular with this discovery I made recently about DBMS_SQL. DBMS_SQL has a number of package data types that reflect the maximum identifier length in the database, for example:


  type desc_rec is record (
        col_type            binary_integer := 0,
        col_max_len         binary_integer := 0,
        col_name            varchar2(32)   := '',
        col_name_len        binary_integer := 0,
        col_schema_name     varchar2(32)   := '',
        col_schema_name_len binary_integer := 0,
        col_precision       binary_integer := 0,
        col_scale           binary_integer := 0,
        col_charsetid       binary_integer := 0,
        col_charsetform     binary_integer := 0,
        col_null_ok         boolean        := TRUE);

For dynamic SQL, where we need to determine the column names dynamically for an arbitrary SQL statement, DBMS_SQL uses this data type in its API calls. The example below lists out some user names from the database and dynamically derives the column names (even though we obviously know them in advance for this simple example)


SQL> set serverout on format wrapped
SQL> declare
  2    c clob := 'select username, created from dba_users';
  3    l_cur     int := dbms_sql.open_cursor;
  4    l_val     varchar2(4000);
  5    l_status  int;
  6    l_desc    dbms_sql.desc_tab;
  7    l_cnt     int;
  8    l_longcol int := 0;
  9  begin
 10    dbms_sql.parse(l_cur,c,dbms_sql.native );
 11    dbms_sql.describe_columns(l_cur,l_cnt,l_desc);
 12    for i in 1 .. l_cnt loop
 13      dbms_sql.define_column(l_cur,i,l_val,4000);
 14      l_longcol := greatest(l_longcol,length(l_desc(i).col_name));
 15    end loop;
 16    l_status := dbms_sql.execute(l_cur);
 17    while dbms_sql.fetch_rows(l_cur) > 0
 18    loop
 19      for i in 1 .. l_cnt
 20      loop
 21        dbms_sql.column_value(l_cur, i,l_val);
 22        dbms_output.put_line(
 23          rpad(l_desc(i).col_name,l_longcol)||': '||l_val );
 24      end loop;
 25      dbms_output.new_line;
 26    end loop;
 27  exception
 28    when others then
 29      dbms_output.put_line(dbms_utility.format_error_backtrace);
 30      raise;
 31  end;
 32  /
USERNAME: SYS
CREATED : 08-MAR-17

USERNAME: SYSTEM
CREATED : 08-MAR-17

USERNAME: XS$NULL
CREATED : 08-MAR-17

USERNAME: OJVMSYS
CREATED : 08-MAR-17

USERNAME: SYSADMIN
CREATED : 19-JUL-18

It would seem a trivial change to increase the size limit as the version increases from 11g to 12c, but DBMS_SQL is caught between a rock and a hard place here. If you do change the limit, then you risk issues with backward compatibility for customers. If you do not change the limit, then you run into problems like the demo below:


SQL> create table t ( using_my_new_cool_long_column_names_in_12c )
  2  as select 1 from dual;

Table created.

SQL>
SQL>
SQL> set serverout on format wrapped
SQL> declare
  2    c clob := 'select * from t';
  3    l_cur     int := dbms_sql.open_cursor;
  4    l_val     varchar2(4000);
  5    l_status  int;
  6    l_desc    dbms_sql.desc_tab;
  7    l_cnt     int;
  8    l_longcol int := 0;
  9  begin
 10    dbms_sql.parse(l_cur,c,dbms_sql.native );
 11    dbms_sql.describe_columns(l_cur,l_cnt,l_desc);
 12    for i in 1 .. l_cnt loop
 13      dbms_sql.define_column(l_cur,i,l_val,4000);
 14      l_longcol := greatest(l_longcol,length(l_desc(i).col_name));
 15    end loop;
 16    l_status := dbms_sql.execute(l_cur);
 17    while dbms_sql.fetch_rows(l_cur) > 0
 18    loop
 19      for i in 1 .. l_cnt
 20      loop
 21        dbms_sql.column_value(l_cur, i,l_val);
 22        dbms_output.put_line(
 23          rpad(l_desc(i).col_name,l_longcol)||': '||l_val );
 24      end loop;
 25      dbms_output.new_line;
 26    end loop;
 27  exception
 28    when others then
 29      dbms_output.put_line(dbms_utility.format_error_backtrace);
 30      raise;
 31  end;
 32  /
ORA-06512: at "SYS.DBMS_SQL", line 2084
ORA-06512: at line 11

declare
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at line 30
ORA-06512: at "SYS.DBMS_SQL", line 2084
ORA-06512: at line 11

The solution here is a simple one. The existing definitions in DBMS_SQL were left unchanged to preserve that backward compatibility, and additional structures were added to handle longer column names. We simply need to alter our code to use the new “desc_tab2” data type and its accompanying “describe_columns2” call.


SQL> declare
  2    c clob := 'select * from t';
  3    l_cur     int := dbms_sql.open_cursor;
  4    l_val     varchar2(4000);
  5    l_status  int;
  6    l_desc    dbms_sql.desc_tab2;
  7    l_cnt     int;
  8    l_longcol int := 0;
  9  begin
 10    dbms_sql.parse(l_cur,c,dbms_sql.native );
 11    dbms_sql.describe_columns2(l_cur,l_cnt,l_desc);
 12    for i in 1 .. l_cnt loop
 13      dbms_sql.define_column(l_cur,i,l_val,4000);
 14      l_longcol := greatest(l_longcol,length(l_desc(i).col_name));
 15    end loop;
 16    l_status := dbms_sql.execute(l_cur);
 17    while dbms_sql.fetch_rows(l_cur) > 0
 18    loop
 19      for i in 1 .. l_cnt
 20      loop
 21        dbms_sql.column_value(l_cur, i,l_val);
 22        dbms_output.put_line(
 23          rpad(l_desc(i).col_name,l_longcol)||': '||l_val );
 24      end loop;
 25      dbms_output.new_line;
 26    end loop;
 27  exception
 28    when others then
 29      dbms_output.put_line(dbms_utility.format_error_backtrace);
 30      raise;
 31  end;
 32  /
USING_MY_NEW_COOL_LONG_COLUMN_NAMES_IN_12C: 1


PL/SQL procedure successfully completed.

In fact, a column name could always exceed 32 characters, because an un-aliased column expression can yield column names of almost arbitrary length, so these additions were made several releases ago; but 12c has brought the issue to the fore because even “normal” column names can now exceed the previous limit.

Now if I only I could add a “2” to the end of my aging body to get to handle years up to 128 just like the column names in 12c Smile
