Oakies Blog Aggregator

connor_mc_d's picture

Work Life Travel balance

I thought about writing a post on juggling work commitments, travel with my job and time at home with children and family. And then I came across this post from community friend Robin Moffatt.

https://rmoff.net/2019/02/08/travelling-for-work-with-kids-at-home/

And … well … it’s exactly the thoughts, sentiments and motivation that I wanted to put in writing. Pretty much exactly. (Except the part about a spreadsheet – I’m a DBA, we don’t use spreadsheets…ever!). Robin will probably be pissed with me saying this (because he writes some great tech stuff), but I reckon this post is his best work. It captures so well the importance of … the important stuff. This quote in particular is spot on:

In ten years who will give two monkeys if you presented at SuperKool Konference 2008? Conferences come and go. Kids just go

So true.

Hence I don’t need to write a post, because Robin has captured it perfectly. The only augmentation I have is due to the difference in ages of our children. Robin’s are 5 and 7, and mine are now 12 and 14. I’ve never been a huge fan of getting children onto the tech gadget bandwagon too soon, but now that mine are both in secondary school, I’ve changed that tune a little and got them a phone each. Obviously that comes with risks and the drawbacks that all kids tend to exhibit once they have a phone (I’m looking at you, Instagram and Snapchat!). But (for me) the positives of being able to reach out to the boys at a moment’s notice when I’m away outweigh that. Now I can immediately share with them the experiences of overseas adventures, and hopefully pique their interest to one day travel and explore the world when they’re older. Or even if it’s just a “Hey, I’ve just landed in transit in Dubai” it’s a reminder to them that their Dad is thinking about them when he’s away. Similarly, if they’re concerned about something or even stuck on their Maths homework, a quick WhatsApp or Telegram and we’re hooked up at least academically albeit not geographically. My next plan is to get them to occasionally come on these work trips with me; such experiences, I think, will be of more benefit than the smattering of school days they might miss.

So kudos to Robin. If you travel for work, or even if you just work long hours – that post is well worth your time.

Richard Foote's picture

“Oracle Indexing Internals and Best Practices” Seminar – Berlin 8-9 May: DOAG Website

Just a short note to say that DOAG now have a registration page for my upcoming “Oracle Indexing Internals and Best Practices” seminar running in Berlin, Germany on 8-9 May 2019. For all the details regarding this acclaimed educational experience and how to book your place, please visit: https://www.doag.org/de/eventdetails?tx_doagevents_single[id]=577320 Please mention you heard this seminar […]

Richard Foote's picture

Unique Indexes Force Hints To Be “Ignored” Part I (What’s Really Happening)

As I was compiling my new “Oracle Diagnostics and Performance Tuning” seminar, I realised there were quite a number of indexing performance issues I haven’t discussed here previously. The following is a classic example of what difference a Unique Index can have over a Non-Unique index, while also covering the classic myth that Oracle sometimes […]

connor_mc_d's picture

LISTAGG hits prime time

It’s a simple requirement. We want to transform this:


SQL> select deptno, ename
  2  from   emp
  3  order by 1,2;

    DEPTNO ENAME
---------- ----------
        10 CLARK
        10 KING
        10 MILLER
        20 ADAMS
        20 FORD
        20 JONES
        20 SCOTT
        20 SMITH
        30 ALLEN
        30 BLAKE
        30 JAMES
        30 MARTIN
        30 TURNER
        30 WARD

into this:


    DEPTNO MEMBERS
---------- -------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

Dinosaurs like myself remember the terrible trouble we used to go to in order to solve this seemingly simple problem. We could use the MODEL clause,


SQL> select deptno , rtrim(ename,',') enames
  2  from ( select deptno,ename,rn
  3         from emp
  4         model
  5         partition by (deptno)
  6         dimension by (
  7            row_number() over
  8             (partition by deptno order by ename) rn
  9               )
 10         measures (cast(ename as varchar2(40)) ename)
 11         rules
 12         ( ename[any]
 13             order by rn desc = ename[cv()]||','||ename[cv()+1])
 14         )
 15   where rn = 1
 16   order by deptno;

    DEPTNO ENAMES
---------- ----------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

or we could use hierarchical trickery,


SQL> select deptno,
  2         substr(max(sys_connect_by_path(ename, ',')), 2) members
  3  from (select deptno, ename,
  4               row_number ()
  5                   over (partition by deptno order by empno) rn
  6        from emp)
  7  start with rn = 1
  8  connect by prior rn = rn - 1
  9  and prior deptno = deptno
 10  group by deptno
 11  /

    DEPTNO MEMBERS
---------- ---------------------------------------------------------
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        10 CLARK,KING,MILLER

or we could build our own aggregation routine from the ground up,


SQL> create or replace type string_agg_type as object
  2  (
  3     total varchar2(4000),
  4  
  5     static function
  6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
  7          return number,
  8  
  9     member function
 10          ODCIAggregateIterate(self IN OUT string_agg_type ,
 11                               value IN varchar2 )
 12          return number,
 13  
 14     member function
 15          ODCIAggregateTerminate(self IN string_agg_type,
 16                                 returnValue OUT  varchar2,
 17                                 flags IN number)
 18          return number,
 19  
 20     member function
 21          ODCIAggregateMerge(self IN OUT string_agg_type,
 22                             ctx2 IN string_agg_type)
 23          return number
 24  );
 25  /

Or, if we had some sort of personality disorder, we could resort to manipulating some XML via XMLDB.


SQL> select deptno,
  2     xmltransform
  3     ( sys_xmlagg
  4        ( sys_xmlgen(ename)
  5        ),
  6       xmltype
  7       (
  8         '<?xml version="1.0"?>
  9          <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 10            <xsl:template match="/">
 11              <xsl:for-each select="/ROWSET/ENAME">
 12                <xsl:value-of select="text()"/>,</xsl:for-each>
 13            </xsl:template>
 14          </xsl:stylesheet>'
 15       )
 16    ).getstringval() members
 17  from emp
 18  group by deptno;

    DEPTNO MEMBERS
---------- --------------------------------------------------------
        10 CLARK,MILLER,KING,
        20 SMITH,FORD,ADAMS,SCOTT,JONES,
        30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD,

Thankfully all of these problems were solved once we made it to 11g with the simple LISTAGG function:


SQL> select deptno,
  2         listagg( ename, ',')
  3              within group (order by empno) members
  4  from   emp
  5  group  by deptno;

    DEPTNO MEMBERS
---------- -----------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

I’ve always liked LISTAGG because the function reads like the business requirement we are trying to meet:

“AGG-regate a LIST of ENAMEs in sequence of EMPNO within the grouping of DEPTNO”

But there has been one problem with LISTAGG since its arrival in 11g, and that is what to do with duplicate data. Duplicates can either just look messy, for example, when I swap out ENAME for JOB:


SQL> select deptno,
 2          listagg(job,',') within group ( order by job) as jobs
 3   from   scott.emp
 4   group by deptno
 5   order by 1;
 
   DEPTNO JOBS
--------- --------------------------------------------------
       10 CLERK,MANAGER,PRESIDENT
       20 ANALYST,ANALYST,CLERK,CLERK,MANAGER
       30 CLERK,MANAGER,SALESMAN,SALESMAN,SALESMAN,SALESMAN

Or they can be even worse, in the situation where the number of duplicates results in data exceeding database limits. In the example below, there are hundreds of objects of the same object type within a schema, and hence the aggregation blows the length limit for a VARCHAR2.


SQL> select owner,
 2          listagg(object_type,',') within group 
 3              ( order by object_id ) as types
 4   from   all_objects
 5   group by owner
 6   order by 1;
ERROR:
ORA-01499: result of string concatenation is too long

With 19c, our (technical) prayers have been answered with the long awaited arrival of the DISTINCT extension to the syntax.


SQL> select deptno,
  2        listagg(distinct job,',') within group ( order by job ) as jobs
  3  from   scott.emp
  4  group by deptno
  5  order by 1;

    DEPTNO JOBS
---------- ------------------------------------------------------------
        10 CLERK,MANAGER,PRESIDENT
        20 ANALYST,CLERK,MANAGER
        30 CLERK,MANAGER,SALESMAN

3 rows selected.

SQL> select owner,
  2        listagg(distinct object_type,',') within group ( order by object_type ) as types
  3  from   all_objects
  4  group by owner
  5  order by 1;

OWNER
------------------------------
TYPES
------------------------------------------------------------------------------------------------------------
------------
APPQOSSYS
SYNONYM,TABLE

AUDSYS
INDEX PARTITION,LIBRARY,PACKAGE,PACKAGE BODY,TABLE,TABLE PARTITION,VIEW

CTXSYS
FUNCTION,INDEX,INDEXTYPE,LIBRARY,OPERATOR,PACKAGE,PACKAGE BODY,PROCEDURE,SEQUENCE,TABLE,TYPE,TYPE BODY,VIEW

etc

davidkurtz's picture

Regular Expression Functions are Considered To Be Non-Deterministic from Oracle 12.2

You cannot put non-deterministic functions into virtual columns or function-based indexes.  Regular expressions have always been sensitive to National Language Support (NLS) settings because what constitutes a character differs between languages.  However, from Oracle 12.2, regular expressions are considered to be non-deterministic because Oracle has fixed bug 20804063.

Background 

A developer came to me with the following query, complaining it was slow.

SELECT …
FROM …
AND LOWER(REGEXP_REPLACE(ASCIISTR(my_column), '[^a-zA-Z0-9]')) ='2732018'

Obviously, functions on columns in predicates prevent the use of indexes, so I suggested creating and indexing a virtual column (slightly more elegant and almost the same as a function-based index but without needing the function in the code to match the index).
This works fine in Oracle 12.1, but they came back saying it didn’t in 12.2.  Note the error messages below.

DROP TABLE t PURGE
/
CREATE TABLE t AS
SELECT TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM dual
CONNECT BY LEVEL < 1000
/

CREATE INDEX t1 on t (REGEXP_REPLACE(a,'[^a-zA-Z0-9]'))
/
ORA-01743: only pure functions can be indexed

DROP INDEX t1
/
ALTER TABLE t
ADD b AS (REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
ORA-54002: only pure functions can be specified in a virtual column expression

I eventually came across a question and answer by Connor McDonald on AskTom, “Adding constraint with REGEXP_REPLACE fails”, which explains that Oracle has fixed bug 20804063 in 12.2.  This fix affects virtual columns, function-based indexes (because they also create virtual columns), and constraints.
There are a number of workarounds, but none are completely satisfactory.

Workaround 1: Deterministic Function 

As Connor suggests, you can work around this unwanted behaviour by creating your own PL/SQL function and telling the database it is deterministic. This is not simply lying to the database: not all regular expressions are actually NLS-sensitive, it depends on what you are doing, but the validation appears to be applied regardless. If the function is NLS-sensitive, that might have unwanted consequences, including incorrect query results. You would have to decide whether you can live with the risk, depending on what is actually in the column and the function in question.

CREATE OR REPLACE FUNCTION my_regexp_replace(p1 VARCHAR2, p2 VARCHAR2)
RETURN varchar2
DETERMINISTIC IS
BEGIN
RETURN REGEXP_REPLACE(p1, p2);
END;
/
show error

CREATE INDEX t1 ON t (my_regexp_replace(a,'[^a-zA-Z0-9]'))
/
DROP INDEX t1
/
ALTER TABLE t
ADD b AS (my_REGEXP_REPLACE(a,'[^a-zA-Z0-9]')) VIRTUAL
/
CREATE INDEX t1 ON t (b)
/

Workaround 2: Refresh on Commit Materialized View 

Another possible workaround is a materialized view that refreshes on commit, with the expression used to populate an additional column that is then indexed.

CREATE TABLE t AS
SELECT rownum n
, TO_CHAR(sysdate-level,'YYYY/DDD') a
FROM dual
CONNECT BY LEVEL < 1000
/
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (n)
/
CREATE MATERIALIZED VIEW LOG ON t
/

CREATE MATERIALIZED VIEW T1_MV
REFRESH ON COMMIT
FAST
WITH PRIMARY KEY
ENABLE QUERY REWRITE AS
SELECT t.*
,REGEXP_REPLACE(a,'[^a-zA-Z0-9]') b
FROM t
/

CREATE INDEX t1_mv_b ON t1_mv(b);

INSERT INTO t
SELECT ROWNUM+1000 n
, TO_CHAR(sysdate-level-1000,'YYYY/DDD') a
FROM dual
CONNECT BY LEVEL < 1000
/
commit
/

set autotrace on
SELECT * FROM t
WHERE REGEXP_REPLACE(a,'[^a-zA-Z0-9]') like '201720%';

And Oracle does indeed rewrite the query to use the materialized view, and then uses the index on the materialized view.

Plan hash value: 3543552962

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                                        | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                 |         |     1 |    21 |     3   (0)| 00:00:01 |
|   1 |  MAT_VIEW REWRITE ACCESS BY INDEX ROWID BATCHED  | T1_MV   |     1 |    21 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                               | T1_MV_B |     2 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

However, this solution has the same weakness of being dependent on NLS settings – it is just that there is no validation to stop you! There is the further threat that the refresh on commit could become a performance problem if there is intense DML on the underlying table.

Workaround 3: Use a Non-NLS Dependent Function 

Ideally, it would be better to use a non-NLS-dependent function instead of a REGEXP% function. However, this may be easier said than done: regular expressions solve problems that are hard to solve with basic string-handling functions, and might otherwise require a PL/SQL function to hold procedural code. In this relatively simple example, though, it is possible to remove the unwanted characters with the TRANSLATE and REPLACE functions, and there is no NLS restriction on these.

SELECT a
, REGEXP_REPLACE(a,'[^0-9]')
, REPLACE(TRANSLATE(a,'/',' '),' ','')
FROM t
WHERE rownum <= 10
/

A        REGEXP_REPLACE(A,'[^0-9]')       REPLACE(TRANSLATE(A,'/',' '),' '
-------- -------------------------------- --------------------------------
2019/031 2019031                          2019031
2019/030 2019030                          2019030
2019/029 2019029                          2019029
2019/028 2019028                          2019028
2019/027 2019027                          2019027
2019/026 2019026                          2019026
2019/025 2019025                          2019025
2019/024 2019024                          2019024
2019/023 2019023                          2019023
2019/022 2019022                          2019022

And you can put this into a function-based index or virtual column thus

DROP INDEX t1
/
ALTER TABLE t ADD b AS (REPLACE(TRANSLATE(a,'/',' '),' ','')) VIRTUAL
/
CREATE INDEX t1 ON t (b)
/
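
As a quick sanity check (a hypothetical example, not part of the original post), a query written against the same expression should now be able to use the virtual column and its index, just as with a function-based index:

SET AUTOTRACE ON EXPLAIN

-- The predicate repeats the virtual column's defining expression, so the
-- optimizer can map it onto column B and consider index T1.
SELECT *
FROM   t
WHERE  REPLACE(TRANSLATE(a,'/',' '),' ','') = '2019031'
/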
Uwe Hesse's picture

Recover dropped tables with Virtual Access Restore in #Exasol

The technique to recover only certain objects from an ordinary backup is called Virtual Access Restore: you create a database from the backup that contains only the minimum elements needed to access the objects you request, and this database is removed afterwards.

Let’s see an example. This is my initial setup:

[Image: EXAoperation Database page]

One database in a 2+1 cluster. Yes, it’s tiny, because it lives on my notebook in VirtualBox. See here for how you can get that too.

It uses the data volume v0000, and I have already taken a backup into the archive volume v0002.

[Image: EXAoperation volumes]

I have a schema named RETAIL there with the table SALES:

[Image: RETAIL.SALES]

By mistake, that table gets dropped:

[Image: drop table]

And I’m on AUTOCOMMIT, otherwise this could simply be rolled back in Exasol (see the sketch below). Virtual Access Restore to the rescue!
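
For completeness, that rollback route would look something like this (a hypothetical sketch, assuming the session had AUTOCOMMIT off and had not yet committed):

SET AUTOCOMMIT OFF;       -- EXAplus setting (assumption for this sketch)

DROP TABLE retail.sales;  -- DDL is transactional in Exasol ...
ROLLBACK;                 -- ... so the uncommitted drop can be undone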

First I need another data volume:

[Image: second data volume]

Notice the size of the new volume: it is smaller than both the size of the backup and the size of the “production database”! I did that to prove that space is not much of a concern here.

Then I add a second database to the cluster that uses that volume. The connection port (8564) must be different from the port used by the first database, and the total DB RAM must not exceed the licensed size, which is limited to 4 GB RAM in my case:

[Image: second database]

I did not start that database because for the restore procedure it has to be down anyway. Clicking on the DB Name and then on the Backups button gets me here:

[Image: Foreign database backups]

No backups are shown yet because I didn’t take any with exa_db2. Clicking on Show foreign database backups:

[Image: Backup choice]

The Expiration date must be empty for a Virtual Access Restore, so I just remove it and click Apply. Then I select the Restore Type as Virtual Access and click Restore:

[Image: Virtual Access Restore]

This will automatically start the second database:

[Image: Two databases in one cluster]

I connect to exa_db2 with EXAplus, where the Schema Browser gives me the DDL for the table SALES:

[Image: EXAplus Schema Browser get DDL]

I take that to exa_db1 and run it there, which gives me the table back but empty. Next I create a connection from exa_db1 to exa_db2 and import the table:

create connection exa_db2 
to '192.168.43.11..13:8564' 
user 'sys' identified by 'exasol';

import into retail.sales 
from exa at exa_db2 
table retail.sales;

This took about two minutes:

[Image: Import]

The second database and then the second data volume can now be dropped. Problem solved!

 

Jonathan Lewis's picture

Descending Problem

I’ve written in the past about oddities with descending indexes ( here, here, and here, for example) but I’ve just come across a case where I may have to introduce a descending index that really shouldn’t need to exist. As so often happens it’s at the boundary where two Oracle features collide. I have a table that handles data for a large number of customers, who record a reasonable number of transactions per year, and I have a query that displays the most recent transactions for a customer. Conveniently the table is partitioned by hash on the customer ID, and I have an index that starts with the customer_id and transaction_date columns. So here’s my query or, to be a little more accurate, the client’s query – simplified and camouflaged:


select  /*+ gather_plan_statistics */
        *
from    (
             select
                    v1.*,
                    rownum rn
             from   (
                             select   /*
                                         no_eliminate_oby
                                         index_rs_desc(t1 (customer_id, transaction_date))
                                      */
                                      t1.*
                             from     t1
                             where    customer_id = 50
                             and      transaction_date >= to_date('1900-01-01','yyyy-mm-dd')
                             order by transaction_date DESC
                ) v1
                where  rownum <= 10 -- > comment to avoid WordPress format issue
         )
where    rn >= 1
;

You’ll notice some hinting – the /*+ gather_plan_statistics */ will allow me to report the rowsource execution stats when I pull the plan from memory, and the hints in the inline view (which I’ve commented out in the above) will force a particular execution plan – walking through the index on (customer_id, transaction_date) in descending order.

If I create t1 as a simple (non-partitioned) heap table I get the following plan unhinted (I’ve had to edit a “less than or equal to” symbol to avoid a WordPress format issue):

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |      1 |        |    14 (100)|     10 |00:00:00.01 |      14 |
|*  1 |  VIEW                           |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  2 |   COUNT STOPKEY                 |       |      1 |        |            |     10 |00:00:00.01 |      14 |
|   3 |    VIEW                         |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|   4 |     TABLE ACCESS BY INDEX ROWID | T1    |      1 |    340 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  5 |      INDEX RANGE SCAN DESCENDING| T1_I1 |      1 |     10 |     3   (0)|     10 |00:00:00.01 |       4 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   5 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Notice the descending range scan of the index – just as I wanted it – the minimal number of buffer visits, and only 10 rows (and rowids) examined from the table. But what happens if I recreate t1 as a hash-partitioned table with local index – here’s the new plan, again without hinting the SQL:


----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |       |      1 |        |   207 (100)|     10 |00:00:00.01 |     138 |       |       |          |
|*  1 |  VIEW                                          |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  2 |   COUNT STOPKEY                                |       |      1 |        |            |     10 |00:00:00.01 |     138 |       |       |          |
|   3 |    VIEW                                        |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  4 |     SORT ORDER BY STOPKEY                      |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |  2048 |  2048 | 2048  (0)|
|   5 |      PARTITION HASH SINGLE                     |       |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|   6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1    |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|*  7 |        INDEX RANGE SCAN                        | T1_I1 |      1 |    340 |     4   (0)|    340 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   4 - filter(ROWNUM .LE. 10)
   7 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE">=TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TRANSACTION_DATE" IS NOT NULL)

Even though the optimizer has recognised that it will be visiting a single partition through a local index, it has not chosen a descending index range scan, though it has used the appropriate index; so it has fetched all the relevant rows from the table in the wrong order, then sorted them, discarding all but the top 10. We’ve done 138 buffer visits (which would turn into disk I/Os, and far more of them, in the production system).

Does this mean that the optimizer can’t use the descending index when the table is partitioned – or has the costing somehow gone wrong? Here’s the plan with the hints in place, to see what happens when we demand a descending range scan:


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |      1 |        |   207 (100)|     10 |00:00:00.01 |       8 |
|*  1 |  VIEW                                 |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |       8 |
|*  2 |   COUNT STOPKEY                       |       |      1 |        |            |     10 |00:00:00.01 |       8 |
|   3 |    VIEW                               |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |       8 |
|   4 |     PARTITION HASH SINGLE             |       |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID| T1    |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|*  6 |       INDEX RANGE SCAN DESCENDING     | T1_I1 |      1 |    340 |     4   (0)|     16 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   6 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The optimizer is happy to oblige with the descending range scan – we can see that we’ve visited only 8 buffers, and fetched only 10 rows from the table. The cost, however, hasn’t made any allowance for the limited range scan. Check back to the plan for the simple (non-partitioned) table and you’ll see that the optimizer did allow for the reduced range scan. So the problem here is a costing one – we have to hint the index range scan if we want Oracle to limit the work it does.

You might notice, by the way, that the number of rowids returned in the index range scan descending operation is 16 rather than 10 – a little variation that didn’t show up when the table wasn’t partitioned. I don’t know why this happened, but when I changed the requirement to 20 rows the range scan returned 31 rowids, when I changed it to 34 rows the range scan returned 46 rowids, and a request for 47 rows returned 61 index rowids – you can see the pattern, the number of rowids returned by the index range scan seems to be 1 + 15*N.

Footnote:

If you want to avoid hinting the code (or adding an SQL patch) you need only re-create the index with the transaction_date column declared as descending (“desc”), at which point the optimizer automatically chooses the correct strategy and the run-time engine returns exactly 10 rowids and doesn’t need to do any sorting. But who wants to create a descending index when they don’t really need it!
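
For reference, the footnote’s alternative would be an index definition along these lines (a sketch based on the test script below, not taken from the original post):

drop index t1_i1;

create index t1_i1 on t1(customer_id, transaction_date desc)
local
nologging
;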

If you want to reproduce the experiments, here’s the script to create my test data.


rem
rem     Script:         pt_ind_desc_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:        
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1 (
        customer_id,
        transaction_date,
        small_vc,
        padding 
)
partition by hash(customer_id) partitions 4
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        mod(rownum,128)                         customer_id,
        (trunc(sysdate) - 1e6) + rownum         transaction_date,
        lpad(rownum,10,'0')                     v1,
        lpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(customer_id, transaction_date) 
local 
nologging
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

I’ve run this test on 12.1.0.2, 12.2.0.1, and 18.3.0.0 – the behaviour is the same in all three versions.

Update (1st Feb 2019)

As the client reminded me after reading the post, it’s worth pointing out that for more complex SQL you still have to worry about the errors in the cardinality and cost calculations that could easily push the optimizer into the wrong join order and/or join method – whether you choose to hint the ascending index or create a descending index.  Getting the plan you want for this type of “pagination” query can be a messy process.

Richard Foote's picture

Announcement: “Oracle Indexing Internals and Best Practices Seminar” – Berlin 8-9 May 2019 !!

I’m very excited to announce I’ll be running my highly acclaimed “Oracle Indexing Internals and Best Practices” seminar in conjunction with the German Oracle User Group (DOAG) in Berlin, Germany on 8-9 May 2019. The venue will be the DOAG Offices in Berlin (DOAG Office, 6th Floor, Tempelhofer Weg 64, 12347 Berlin). Because of venue […]

Franck Pachot's picture

Where to check Data Guard gap?

At work, we had a discussion with well-known colleagues, Luca Canali and Ludovico Caldara, about where we check that Data Guard recovery works as expected, without a gap. Several views can be queried, depending on the context. Here are a few comments about them.

v$database

This is my preferred one, because it relies on the actual state of the database, whatever the recovery process is:

SQL> select scn_to_timestamp(current_scn) 
from v$database;
SCN_TO_TIMESTAMP(CURRENT_SCN)
----------------------------------------------------------
22-JAN-19 03.08.32.000000000 PM

This reads the current System Change Number (DICUR_SCN from X$KCCDI) and maps it to a timestamp (using the SMON_SCN_TIME mapping table).

However, because this relies on a function that is available only when the database is open, it is easy only with Active Data Guard. When the database is in mount state, you will get ORA-00904: “SCN_TO_TIMESTAMP”: invalid identifier.

Note that I’ve also seen cases where, when there was a gap, SMON_SCN_TIME was not up to date and I got ORA-08181: specified number is not a valid system change number. So this is not suitable for automatic monitoring.

Without Active Data Guard, you need to do the SCN-to-timestamp conversion on the primary (see the sketch after the next listing). Or you can read the SCN from the datafile headers, but this is not the latest applied SCN, only the latest checkpointed one:

SQL> alter session set nls_date_format='dd-mm-yy hh24:mi:ss';
Session altered.
SQL> select max(checkpoint_time) from v$datafile_header;
MAX(CHECKPOINT_TI
-----------------
26-01-19 17:45:04
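
For completeness, the “convert on the primary” route mentioned above could look like this minimal sketch (using a SQL*Plus substitution variable for the value read on the standby):

-- On the (mounted) standby: read the current SCN
select current_scn from v$database;

-- On the (open) primary: map that SCN to a timestamp
select scn_to_timestamp(&standby_scn) from dual;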

Reading from V$DATABASE does not rely on Data Guard, so it is also available when the MRP is not started, and also for a Standard Edition non-managed standby.

gv$recovery_progress

This is Luca’s favored one (see his adg.sql script among many other interesting ones in https://github.com/LucaCanali/Oracle_DBA_scripts):

SQL> select inst_id, max(timestamp) 
from gv$recovery_progress group by inst_id;
   INST_ID MAX(TIMESTAMP)
---------- --------------------
         1 22-JAN-2019 15:08:51

Where does this information come from? If you look at the execution plan, you will see that it reads X$KSULOP, which is the X$ behind V$SESSION_LONGOPS. You can get the same timestamp from it:

SQL> alter session set nls_date_format='dd-mm-yy hh24:mi:ss';
Session altered.
SQL> select inst_id,opname,timestamp from gv$session_longops 
where opname='Media Recovery' and target_desc='Last Applied Redo';
   INST_ID OPNAME               TIMESTAMP
---------- -------------------- -----------------
         1 Media Recovery       26-01-19 18:56:39
         1 Media Recovery       26-01-19 19:40:35

As this information comes from what the MRP (Managed Recovery Process) logs, this view is available only when the recovery is running (APPLY-ON).

v$managed_standby

Talking about what is logged by the MRP, Ludo goes to the MRP status in v$managed_standby to watch the sequence# and block# increase. This is very interesting because we can compare the apply position with what has been received by RFS, and so see the remaining work to do:

19:59:46 SQL> select inst_id,process,status,client_process,thread#,sequence#,block#,blocks from gv$managed_standby;
I PROCESS   STATUS       CLIENT_P THREAD# SEQUENCE# BLOCK# BLOCKS
- --------- ------------ -------- ------- --------- ------ ------
1 ARCH      CLOSING      ARCH           1        34  32768    481
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 ARCH      CLOSING      ARCH           1        41  28672   2046
1 ARCH      CLOSING      ARCH           1        37  30720   1804
1 ARCH      CLOSING      ARCH           1        40  26624   1608
1 RFS       IDLE         Archival       1         0      0      0
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 RFS       IDLE         LGWR           1        42    121      1
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 MRP0      APPLYING_LOG N/A            1        42    121  40960
11 rows selected.
19:59:50 SQL> /
I PROCESS   STATUS       CLIENT_P THREAD# SEQUENCE# BLOCK# BLOCKS
- --------- ------------ -------- ------- --------- ------ ------
1 ARCH      CLOSING      ARCH           1        34  32768    481
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 DGRD      ALLOCATED    N/A            0         0      0      0
1 ARCH      CLOSING      ARCH           1        41  28672   2046
1 ARCH      CLOSING      ARCH           1        37  30720   1804
1 ARCH      CLOSING      ARCH           1        40  26624   1608
1 RFS       IDLE         Archival       1         0      0      0
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 RFS       IDLE         LGWR           1        42    124      1
1 RFS       IDLE         UNKNOWN        0         0      0      0
1 MRP0      APPLYING_LOG N/A            1        42    124  40960
11 rows selected.

v$archived_log

V$ARCHIVED_LOG has an ‘APPLIED’ flag, but it is not really helpful here as it does not consider real-time apply. In the screenshot I took, the changes up to 20:35:46 in sequence# 55 had been applied, but V$ARCHIVED_LOG still showed sequence# 52 as not applied.
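
The kind of query behind such a check would be along these lines (a hypothetical sketch, not the author’s exact query):

-- APPLIED lags behind the real-time apply position
select sequence#, first_time, next_time, applied
from   v$archived_log
order  by sequence#;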

This view is about archived logs. But before being archived, the redo stream is received into the standby redo logs.

v$standby_log

With real-time apply (RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE) the redo is applied as soon as it is received into the standby redo logs. The gap should be small and is visible in v$managed_standby (the number of blocks between the RFS fed by LGWR and the MRP apply). I’ve seen some monitoring queries on V$STANDBY_LOG. The idea is to read the actual state of the transport, in the same way that I read V$DATABASE for the actual state of the apply, without relying on what is logged by the processes:

SQL> select max(last_time) 
from gv$standby_log;
MAX(LAST_TIME)
--------------------
22-JAN-2019 15:08:55

However, this query reads X$KCCSL, which is not very efficient because it reads the standby redo log files. This can take a long time when they are large and full.

So… be careful with this one.

v$dataguard_stats

The DG Broker ‘show database’ displays the gap information. This comes from V$DATAGUARD_STATS, as Data Guard checks the state at regular intervals and stores the latest values here:

SQL> select name||' '||value ||' '|| unit
||' computed at '||time_computed
from v$dataguard_stats;
NAME||' '||VALUE||' '||UNIT||' COMPUTED AT '||TIME_COMPUTED
------------------------------------------------------
transport lag +00 00:00:00 day(2) to second(0) interval computed at 01/27/2019 22:08:33
apply lag +00 00:00:01 day(2) to second(0) interval computed at 01/27/2019 22:08:33
apply finish time +00 00:00:06.493 day(2) to second(3) interval computed at 01/27/2019 22:08:33

You must always check when the value was calculated (TIME_COMPUTED) and may add that age to the lag to estimate the gap from the current time, as DGMGRL does (a SQL sketch follows the broker output below):

Role:               PHYSICAL STANDBY
Intended State:     APPLY-ON
Transport Lag:      0 seconds (computed 1 second ago)
Apply Lag:          1 second (computed 1 second ago)
Average Apply Rate: 0 Byte/s
Real Time Query:    OFF
Instance(s):
  CDB2
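
On the SQL side, a hypothetical way to factor TIME_COMPUTED into a monitoring query (assuming the MM/DD/YYYY HH24:MI:SS format shown above):

-- Age, in seconds, of the lag values reported by v$dataguard_stats
select name, value, time_computed,
       round((sysdate - to_date(time_computed,'MM/DD/YYYY HH24:MI:SS')) * 24*60*60) seconds_since_computed
from   v$dataguard_stats
where  name in ('transport lag','apply lag');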
