Describe Upgrade

Jonathan Lewis's picture

Here’s an odd little change between Oracle versions that could have a stunning impact on the application performance if the thing that generates your client code happens to use an unlucky selection of constructs.  It’s possible to demonstrate the effect remarkably easily – you just have to describe a table, doing it lots of times to make it easy to see what’s happening.

rem
rem     Script:         describe_18c.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Apr 2019
rem 

create table t1 as
select  *
from    all_objects
where   rownum = 1
;

set heading off
set feedback off
set pagesize 0
set linesize 156
set trimspool on
set termout off
set serveroutput on

execute snap_rowcache.start_snap
execute snap_libcache.start_snap

start start_10000
-- start_1 contains "describe t1"

set termout on
set serveroutput on

spool temp
execute snap_rowcache.end_snap
execute snap_libcache.end_snap

spool off

The start_10000 script is my mechanism for running a simple piece of code many times, and as the comment following it says, all I’m doing is repeating “describe t1”. The calls to “snap” something are examples of procedures I use to find the changes recorded in various dynamic performance views over short periods of time (there’s an example of the code for v$mystat here) In this case, as the names suggest, the snapshots record the changes in v$rowcache (the dictionary cache) and v$librarycache (the library cache). I’ve put a simple variant of the code at the end of the blog note so you don’t have to do all the preparation if you want to run a quick test for yourself.

Here are the results I get when running the test in Oracle 18.3.0.0

---------------------------------
Dictionary Cache - 05-Apr 19:00:00
Interval:-      27 seconds
---------------------------------
Parameter                 Sub# Usage Fixed    Gets  Misses   Scans  Misses    Comp    Mods Flushes
---------                ----- ----- -----    ----  ------   -----  --------------    ---- -------
dc_objects                         0     0 260,051       0       0       0       0       0       0
dc_users                           0     0 260,000       0       0       0       0       0       0
---------------------------------
Library Cache - 05-Apr 19:00:00
Interval:-      27 seconds
---------------------------------
Type      Description              Gets        Hits Ratio        Pins        Hits Ratio   Invalid    Reload
-----     -----                    ----        ---- -----        ----        ---- -----   -------    ------
NAMESPACE TABLE/PROCEDURE        10,003      10,003   1.0     280,028     280,028   1.0         0         0

Before showing you corresponding figures from 12.2.0.1 I’ll just point out that in version 18.3.0.0 of Oracle the structure of view all_objects gives me a table of 26 columns. Think about that and the 10,000 describes while looking at the number above, then look at the corresponding 12.2.0.1 results:

---------------------------------
Dictionary Cache - 05-Apr 19:00:00 
Interval:-      28 seconds
---------------------------------
Parameter                 Usage Fixed    Gets  Misses   Scans  Misses    Comp    Mods Flushes
---------                 ----- -----    ----  ------   -----  --------------    ---- -------
dc_users                      0     0       2       0       0       0       0       0       0
dc_objects                    0     0       9       0       0       0       0       0       0

---------------------------------
Library Cache - 05-Apr 19:04:17
Interval:-      28 seconds
---------------------------------
Type      Description              Gets        Hits Ratio        Pins        Hits Ratio   Invalid    Reload
-----     -----                    ----        ---- -----        ----        ---- -----   -------    ------
NAMESPACE TABLE/PROCEDURE        10,005      10,005   1.0      20,018      20,018   1.0         0         0

The internal mechanism of the “describe” call has changed between 12.2.0.1 to 18.3.0.0.

For each describe in 18.3, for each column in the table you see a “get” on dc_users and dc_objects in v$rowcache and you see one “get” on the TABLE/PROCEDURE namespace in v$librarycache, and (2 + number of columns) “pins”. In 12.2.0.1 there are no gets on the dictionary cache and only 1 get and two pins in the library cache for each describe.

As a couple of extra checks I modified the test to query from a 12c client to and 18c server, then from an 18c client to a 12c server. The big numbers appeared in the former test (i.e. when the server is 18c) and the small number for the latter (when the server is 12c). I also tried a couple of other variations on the theme:

  • If the table t1 doesn’t exist when I run the test then there are no gets on the row cache, and I see 2 gets and pins (with hits) on the library cache per describe.
  • If I run the test using “decribe other_schema.t1 the pins (and hits) on the library cache go up by 1 per describe
  • If I execute “alter session set current_schema = other_schema” so that “describe t1” is actually describing a table in another schema the pins (and hits) on the library cache go up by 1 per describe
  • If I run the test in the SYS schema, 18c behaves like 12c !! But SYS is often a little wierd compared to other schemas

Enabling the 10051 trace – I can see that both versions report an OPI call type = 119: “V8 Describe Any” for each call to “describe” (which, presumably, corresponds to the OCIDescribeAny() low-level function call). And that’s really where this blog started, and why lots of people might need to be aware (at least in the short term) of this change in behaviour across versions .

Welcome to the Real World.

My demonstration is clearly silly – no-one does hundreds of describes per second in a real application, surely. Well, not deliberately, and not necessarily with the intent to do a full describe, but sometimes n-tier development environments end up generating code that does things you might not expect. One such example is the way that JDBC can handle a requirement of the form:

insert into tableX( id, ......) values (my_sequence.nextval, ...) returning id into ?;

In the course of handling this requirement one of the coding strategies available to JDBC ends up executing the type 119 “V8 Describe Any” call. Imagine the effect this has when you have a couple of dozen concurrent sessions totalling a couple of million single row inserts per hour. The competition for library cache pins and row cache gets is massive – and the memory structures involved are all protected by mutexes. So when a a client of mine recently upgraded their system from 11.2.0.4 to 18.3.0.0 they saw “library cache: mutex X” waits change from a few hundred seconds per hour to tens of thousands of seconds, and “row cache mutex” leaping up  from nowhere in the “Top timed events” to reporting further even more thousands of seconds of wait time per hour.

The actual impact of this issue will depend very strongly on how much use you (or your ORM) makes of this construct. The problem may be particularly bad for my client because of the very large number of concurrent executions of a very small number of distinct statements that all address the same table. For low concurrency, or for a wide range of different tables and statements, you may not see so much contention.

If you are seeing contention for “row cache mutex” and “library cache: mutex X”, then a quick corroborative test (if you are licensed for the performance and dianostic packs) is to check the top_level_call# and top_level_call_name from v$active_session_history:

select
        top_level_call#, top_level_call_name, count(*)
from    v$active_session_history
group by
        top_level_call#, top_level_call_name
order by
        count(*)

If (119, ‘V8 Describe Any’) shows up as a significant fraction of the total then you’re probably looking at this issue.

Java is not my strong point – but here’s a trivial piece of standalone Java that you can use to demonstrate the issue if you’re familiar with running Java on the server. There are a few notes inline to explain necessary preparatory steps and code changes:


/*
        To create a class file, you need to execute
        javac temptest2.java

        This generates file temptest2.class
        If this is successful then execute
        java temptest {number of iterations} {commit frequency}

        e.g.
        java temptest2 10000 10

        To be able to compile, you need a CLASSPATH environment variable
        e.g. export CLASSPATH=$CLASSPATH:$ORACLE_HOME/jdbc/lib/ojdbc8.jar

        For java to see the created class the CLASSPATH must also include 
        the holding directory
        e.g. export CLASSPATH=$CLASSPATH:/mnt/working

        Example combined setting:
        export CLASSPATH=$CLASSPATH:$ORACLE_HOME/jdbc/lib/ojdbc8.jar:/mnt/working

        A schema will need to be created to hold two objects,
        And the connection string in the code below will need to be modified -
        and the listener has to be started and the database registered to it.

        Database objects:
        -----------------
        create sequence s1;

        create table test(
                id number, 
                code varchar2(32), 
                descr varchar2(32), 
                insert_user varchar2(32),
                insert_date date
        );

*/


import java.sql.*;
import oracle.jdbc.OracleDriver;
import java.util.Date;

public class temptest2
{
  public static void main (String arr[]) throws Exception
  {
    DriverManager.registerDriver(new oracle.jdbc.OracleDriver());
    Connection con = DriverManager.getConnection
          ("jdbc:oracle:thin:@localhost:1521:or18","test_user","test");

    Integer iters = new Integer(arr[0]);
    Integer commitCnt = new Integer(arr[1]);

    con.setAutoCommit(false);
    doInserts( con, iters.intValue(), commitCnt.intValue() );

    con.commit();
    con.close();
  }

  static void doInserts(Connection con, int count, int commitCount )
  throws Exception
  {

    int  rowcnt = 0;
    int  committed = 0;
    long start = new Date().getTime();

    for (int i = 0; i < count; i++ ) {
      PreparedStatement ps = con.prepareStatement(
           "insert into test (id, code, descr, insert_user, insert_date) " +
                     "values (s1.nextval,?,?, user, sysdate)",
           new String[]{"id"}
      );
      ps.setString(1,"PS - code" + i);
      ps.setString(2,"PS - desc" + i);
      ps.executeUpdate();

      ResultSet rs = ps.getGeneratedKeys();
      int x = rs.next() ? rs.getInt(1) : 0;
      System.out.println(x);
                
      rowcnt++;
      ps.close();

      if ( rowcnt == commitCount )
      {
        con.commit();
        rowcnt = 0;
        committed++;
      }
    }
    long end = new Date().getTime();
    con.commit();
    System.out.println
    ("pstatement " + count + " times in " + (end - start) + " milli seconds committed = "+committed);
  }
}

/*
 *
 * Sample from trace file after setting events 10046 and 10051:
 *
 * OPI CALL: type=119 argc= 7 cursor=  0 name=V8 Describe Any
 * OPI CALL: type=94 argc=38 cursor=  0 name=V8 Bundled Exec
 * PARSE #140693461998224:c=0,e=15,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3884345238,tim=1368184246
 * EXEC #140693461998224:c=0,e=135,p=0,cr=0,cu=4,mis=0,r=1,dep=0,og=1,plh=3884345238,tim=1368184411
 * OPI CALL: type=105 argc= 2 cursor=  0 name=Cursor close all
 * CLOSE #140693461998224:c=0,e=15,dep=0,type=3,tim=1368185231
 * OPI CALL: type=14 argc= 0 cursor=  0 name=COMMIT
 * XCTEND rlbk=0, rd_only=0, tim=1368185291
 * OPI CALL: type=119 argc= 7 cursor=  0 name=V8 Describe Any
 * OPI CALL: type=94 argc=38 cursor=  0 name=V8 Bundled Exec
 * PARSE #140693461998224:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3884345238,tim=1368187929
 * EXEC #140693461998224:c=0,e=162,p=0,cr=0,cu=4,mis=0,r=1,dep=0,og=1,plh=3884345238,tim=1368188141
 * OPI CALL: type=105 argc= 2 cursor=  0 name=Cursor close all
 * CLOSE #140693461998224:c=0,e=6,dep=0,type=3,tim=1368189336
 * OPI CALL: type=14 argc= 0 cursor=  0 name=COMMIT
 * XCTEND rlbk=0, rd_only=0, tim=1368189373
 *
*/


You’ll notice that I’ve prepared, executed and closed a statement inside a loop. The problem wouldn’t happen if I prepared the statement before the loop and closed it after the loop, doing nothing but the execute inside the loop; but the code is simply modelling the “single row processing” effect that typically appears through ORMs.

You’ll have to decide for yourself how to take snapshots of the dynamic performance views while this code is running, and how to emable tracing – but anyone who want to fiddle with the code is probably better at coding Java than I am – so it’s left as an exercise to the reader (I used a logon trigger for the traces, and snap_rowcache and snap_libcache from another session).

There is a certain cruel irony to this issue.  For years I have been telling people that

    insert into my_table(id, ...) values(my_sequence.nextval,...) returning id into :bind1;

is more efficient than:

    select my_sequence.nextval into :bind1 from dual;
    insert into my_table(id,.....) values(:bind1, ...);

If, at present, you’re using  Hibernate as your ORM it generates code that does the (inefficient, bad practice) latter and you won’t see the “describe” problem.

Footnote

If you want a very simple SQL*Plus script to see the effect – and have privileges to query v$rowcache and v$librarycache – here’s a hundred describes with a little wrapper to show the effect:

em
rem     The schema running this script must not be SYS
rem     but must be granted select on v_$rowcache and
rem     v_$librarycache. For the results to be clearly
rem     visible the test needs to be run while virtually
rem     nothing else is running on the instance.
rem
rem     In 18.3.0.0 every describe seems to access 
rem     dc_users, dc_objects, and pin the library cache
rem     once for every column in the table described
rem     (plus a fixed "overhead" of 2 pins and one get)
rem
rem     When run by SYS the counts fall back to the
rem     12.2  numbers -i.e. only two pins and one get
rem     on the libraray cache with no accesses to the 
rem     dictionary cache
rem
rem     The excess gets and pins disappear in 19.2, 
rem     thought the pin count on the library cache 
rem     goes up to 4 per describe.
rem

drop table t1 purge;
create table t1(
        n1 number,
        n2 number,
        n3 number,
        n4 number,
        n5 number,
        n6 number,
        n7 number
)
;


-- variant create table t1 as select * from all_objects where rownum = 1;


set serveroutput off
set linesize 167
set trimspool on


spool temp_desc
select namespace, gets, pins from v$librarycache where namespace = 'TABLE/PROCEDURE';
select parameter, gets from v$rowcache where parameter in ('dc_users','dc_objects') and subordinate# is null;
spool off

set termout off

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1
describe t1

set termout on
set serveroutput on

spool temp_desc append

select namespace, gets, pins from v$librarycache where namespace = 'TABLE/PROCEDURE';
select parameter, gets from v$rowcache where parameter in ('dc_users','dc_objects') and subordinate# is null;

spool off



set doc off
doc

Sample output from 18.3
=======================
NAMESPACE                                                              GETS       PINS
---------------------------------------------------------------- ---------- ----------
TABLE/PROCEDURE                                                       27449      71108


PARAMETER                              GETS
-------------------------------- ----------
dc_users                              17341
dc_objects                           115830


NAMESPACE                                                              GETS       PINS
---------------------------------------------------------------- ---------- ----------
TABLE/PROCEDURE                                                       27555      72017


PARAMETER                              GETS
-------------------------------- ----------
dc_users                              18041
dc_objects                           116533


Note change in rowcache gets - one per column per describe on each parameter.
Note change in library cache pins - (one per column + 2) per describe.

Sample output from 12.2
=======================
NAMESPACE                  GETS       PINS
-------------------- ---------- ----------
TABLE/PROCEDURE           13393      20318


PARAMETER                              GETS
-------------------------------- ----------
dc_users                               4889
dc_objects                            31413


NAMESPACE                  GETS       PINS
-------------------- ---------- ----------
TABLE/PROCEDURE           13504      20539


PARAMETER                              GETS
-------------------------------- ----------
dc_users                               4889
dc_objects                            31416


No change in v$rowcache
Only the same single get and the "+2" pins per describe in v$librarycache

#


The good news is that I sent this text to a colleague who has access to Oracle 19, and the problem goes away (almost completley) – there are just two extra pins on the library cache in Oracle 19 compared to Oracle 12, and no gets on the rowcache dc_users and dc_objects. This suggests that it’s a known issue (even though there’s no visible bug report, and the problem is still present in 18.5) so it may be possible to get a backport of the version 19 code for vesion 18 fairly quickly. If not the best temporary workaround is probably to bypass the ORM and manually code for a function call that processes an anonymous PL/SQL block – but I haven’t tested that idea yet.

There is a JDBC cursor cache available – and if this were enabled than the prepared statement that was closed by the code would be kept open by the JDBC cache (and, of course, still be present in Oracle’s v$open_cursor) and Oracle wouldn’t receive any further parse or “describe” calls. Unfortunately it seems that there’s a cursor leak (still) in the JDBC caching algorithm that will lead to sessions hitting Oracle error “ORA-01000: maximum open cursors exceeded.”

Acknowledgements.

I’d particularly like to thank Lasse Jenssen who spent a couple of hours with me (when he could have been attending some interesting sessions) at the OUG Ireland conference a few days ago, working through various tests and strategies to pin down the problem and attempt to circumvent it. (Any criticism of the Java code above should, nevertheless be aimed at me).

Update

This problem is now visible on MoS as: Bug 29628952 : INCREASED ROW CACHE GETS ON DC_OBJECTS AND DC_USERS FOR DESCRIBE IN 18C AND LATER.

It’s also visible as Bug 29628647 : INCREASED GETS FOR DC_OBJECTS AND DC_USERS FOR DESCRIBE, unfortunately this latter bug has been associated with version 19.2 – where the problem doesn’t exist so the analyst has reported back (quite corretly) with “I see no problem.”

 

To prevent automated spam submissions leave this field empty.