Oakies Blog Aggregator

Jonathan Lewis's picture

Descending bug

Following on from Monday’s posting about reading execution plans and related information, I noticed a question on the ODC database forum asking about the difference between “in ({list of values})” and a list of “column = {constant}” predicates connected by OR. The answer to the question is that there’s essentially no difference as you would be able to see from the predicate section of an execution plan:


SELECT  c1, c2, c3, c4, c5, c6, c7, c8..  
FROM    TAB1  
WHERE   STS IN ( 'A', 'B')  
AND     cnt < '4'  
AND     dt < sysdate  
and     rownum <=1;  
  
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 33399   (1)| 00:03:14 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   114K|   201M| 33399   (1)| 00:03:14 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   114K|       |   723   (1)| 00:00:05 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"

Note how the predicate section tells you that the original “sts in ( ‘A’, ‘B’ )” has been transformed into “sts = ‘A’ or sts = ‘B'”.

A further point I made about IN-lists in Monday’s post was that as one step in the transformation Oracle would sort the list and eliminate duplicates, and it suddenly occurred to me to wonder whether Oracle would sort the list in descending order if the only relevant index were defined to start with a descending column. Naturally I had to try it so here’s a suitable script to prepare some data:

rem
rem     Script:         descending_bug_06.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0        Crashes
rem             12.2.0.1        Crashes
rem             12.1.0.2        Crashes
rem             11.2.0.4        Bad Plan
rem

create table t1
nologging
pctfree 95 pctused 5
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        lpad(rownum,10,'0')             v1,
        case mod(rownum,1000)
                when 0 then 'A'
                when 3 then 'B'
                when 6 then 'C'
                       else 'D'
        end                             sts,
        case mod(rownum,1000)
                when 0 then '1'
                when 3 then '2'
                when 6 then '3'
                       else '4'
        end                             cnt,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 -- > comment to avoid WordPress format issue
;

create index t1_i1a on t1(sts) nologging;
create index t1_i1d on t1(sts desc) nologging;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size skewonly'
        );
end;
/

There is one oddity in this script – if you’ve got every column in an index declared as DESC you’ve made a mistake and none of the columns should be declared as DESC. The feature is relevant only if you want a mixture of ascending and descending column in a single index.

An important detail of the script is that I’ve gathered stats AFTER creating the objects – it’s important to do this, even in 18.3, because (a) creating the “descending” index will result in a hidden virtual column being created to represent the descending column and I want make sure I have stats on that column and (b) the “stats on creation” code doesn’t generate histograms and I want a (frequency) histogram on columns sts and the hidden, virtual, descending version of the column.

After generating the data and checking that I have the correct histograms for sts and sys_nc00006$ (the relevant hidden column) I can then run the following test:

set serveroutput off
alter session set statistics_level = all;

alter index t1_i1d invisible;

select  sts, count(*)
from    t1
where   sts in ('B','C')
group by
        sts
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter index t1_i1d   visible;
alter index t1_i1a invisible;

select  sts, count(*)
from    t1
where   sts in ('B','C')
group by
        sts
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;
set serveroutput on


The code makes one index invisible then runs a query that should use an inlist iterator; then it switches indexes making the invisible one visible and vice versa and repeats the query. I’ve enabled rowsource execution statistics and pulled the execution plans from memory to make sure I don’t get fooled by any odd glitches that might exist within “explain plan”. Here are the results from 11.2.0.4 – normal index, then descending index – with a little cosmetic cleaning:


S   COUNT(*)
- ----------
B        100
C        100

SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |      2 |00:00:00.01 |       5 |
|   1 |  SORT GROUP BY NOSORT|        |      1 |      2 |      2 |00:00:00.01 |       5 |
|   2 |   INLIST ITERATOR    |        |      1 |        |    200 |00:00:00.01 |       5 |
|*  3 |    INDEX RANGE SCAN  | T1_I1A |      2 |    178 |    200 |00:00:00.01 |       5 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("STS"='B' OR "STS"='C'))


Index altered.
Index altered.


S   COUNT(*)
- ----------
C        100
B        100


SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |      2 |00:00:00.02 |     198 |
|   1 |  SORT GROUP BY NOSORT|        |      1 |      2 |      2 |00:00:00.02 |     198 |
|*  2 |   INDEX FULL SCAN    | T1_I1D |      1 |   1000 |    200 |00:00:00.02 |     198 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='B' OR
              SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='C'))


As expected we see counts of 100 for ‘B’s and ‘C’s, and we also see that the “sort group by nosort” operation with the descending index has produced the results in reverse order. The problem though is that the optimizer has decided to use an “index full scan” on the descending index, and the estimate of the rows returned is terribly wrong (and seems to be the common “5% guess”, used once for each target value), and the number of buffer visits is huge compared to the result from the normal index – Oracle really did walk every leaf block in the index to get this result. The predicate section also looks rather silly – why hasn’t the optimizer produced predicates more like: “sys_nc00006$ = sys_op_descend(‘B’)” ?

In passing you’ll notice that the estimated rows in the plan using the normal index is a little low. This is the result of Oracle using a small sample (ca. 5,500 rows) in 11g to gather histogram stats. 12c will do better for a frequency histogram with the fast algorithm it uses for a 100% (auto) sample size.

So 11g doesn’t do very well but we’ve got 12.1.0.2, 12.2.0.1, and (in the last couple of weeks) 18.3 to play with. Here’s the result from 12.1.0.2 and 12.2.0.1 for the query that should use the descending index:


select  sts, count(*)
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qernsRowP], [1], [], [], [], [], [], [], [], [], [], []


SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |      0 |00:00:00.01 |       0 |
|   1 |  SORT GROUP BY NOSORT|        |      1 |      2 |      0 |00:00:00.01 |       0 |
|   2 |   INLIST ITERATOR    |        |      1 |        |    101 |00:00:00.03 |       5 |
|*  3 |    INDEX RANGE SCAN  | T1_I1D |      2 |    200 |    101 |00:00:00.01 |       5 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("T1"."SYS_NC00006$"=SYS_OP_DESCEND('B') OR
              "T1"."SYS_NC00006$"=SYS_OP_DESCEND('C')))
       filter((SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='B' OR
              SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='C'))

The query crashed! The plan, however, did look appropriate – the optimizer picked an inlist iterator, picked an index range scan, got the correct estimate of rows (index entries), and did better with the predicate section (though having used a sensible predicate for the access predciate it then used the bizarre version as the filter predicate). Judging from the A-rows column the query seems to have crashed at roughly the point where the optimizer was switching from the range scan for the first iteration into the range scan for the second iteration.

And then there’s Oracle 18.3 – which does the same as the 12c versions :(

To make sure that my silly “single column so it shouldn’t be declared descending” index was the sole cause of the problem I repeated the tests using a two-column index on (sts, cnt).

Conclusion:

Descending indexes or (to be more accurate) indexes with descending columns can still produce problems even in the very latest version of Oracle.

Footnote

Oracle MoS has the wonderful “ORA-600/ORA-7445/ORA-700 Error Look-up Tool (Doc ID 153788.1)” (which doesn’t yet allow you to choose 18.3 as a version) so I used this to do a look up for ORA-00600 errors with first paremeter qernsRowP in 12.2.0.1 and got the following suggestion from doc ID 285913.1: “set event:10119 to disable no-sort fetch and then reparse the failing SQL.” The example suggested setting the event to level 12, and this solved the problem for all three failing versions – but the suggestion came with a warning: “Setting this event at system level may impact the performance of database.” The execution plan (taken, in this case, from 18.2) may explain the warning:

 

S   COUNT(*)
- ----------
B	 100
C	 100

SQL_ID	f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	    |	   1 |	      |      2 |00:00:00.01 |	    4 |       |       | 	 |
|   1 |  HASH GROUP BY	   |	    |	   1 |	    2 |      2 |00:00:00.01 |	    4 |  1558K|  1558K|  659K (0)|
|   2 |   INLIST ITERATOR  |	    |	   1 |	      |    200 |00:00:00.01 |	    4 |       |       | 	 |
|*  3 |    INDEX RANGE SCAN| T1_I1D |	   2 |	  200 |    200 |00:00:00.01 |	    4 |       |       | 	 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("T1"."SYS_NC00006$"=SYS_OP_DESCEND('B') OR "T1"."SYS_NC00006$"=SYS_OP_DESCEND('C')))
       filter((SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='B' OR SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='C'))


The plan has changed from using a “sort group by nosort” – which effectively means just keeping a running count as you go – to a real “hash group by” which means you have to do the hashing arithmetic for every value (though maybe there’s a deterministic trick that means Oracle won’t do the arithmetic if the next value to be hashed is the same as the previous value) and the actual memory used (659K) does seem a little extreme for counting two distinct values.

 

connor_mc_d's picture

Take care with regular expressions

In an Office Hours session a couple of months back, I covered an important change that comes to regular expressions once you upgrade to 12c Release 2. You can see the video covering the issue here:

but for the TL;DR brigade reading this post: Regular expressions are not deterministic when you take NLS settings into account and thus cannot be used in constraints and/or function-based indexes.

This is just a post to quickly revisit the topic for anyone thinking of upgrading from an earlier release to 12c Release 2. An AskTOM question came in asking what would happen to such constraints during the upgrade process.

The sad answer is … nothing. For example, if you successfully (and by strict definition, incorrectly) created a constraint with a regular expression in 11g, then after the upgrade, that constraint will still be present in your new 12c Release 2 system. It will continue to work as it did in 11g, and even if you disable/enable it, or put it through a validate command to exercise the data, it will work as it did before.

To be honest, I do not like this, because it can become what I call a “sleeper” problem. If, 6 months after you upgrade, you happen to drop and recreate that constraint you’ll be most distressed to find that it cannot be added, and you will have most probably long forgotten that it was caused by an event that occurred 6 months ago, namely the upgrade. And perhaps worse, you now have an index or constraint that could potentially be corrupted by innocent tinkering with session NLS settings.

So before you upgrade, definitely take a cursory glance through your constraint definitions and take remedial action if needed.

davidkurtz's picture

Parallel Execution of PL/SQL

A recent experience with text searching showed up some limitations on parallel execution of PL/SQL functions, and how to work around them.

Text Searching can be CPU Intensive?

It started with a requirement to find names that roughly matched a search string. The roughness of the match was determined by calculating the Levenshtein distance between each customer name and the search string. Informally, the Levenshtein distance is a mathematical function that calculates the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. Oracle has implemented it as a function in a delivered package UTL_MATCH.EDIT_DISTANCE().  This function isn't directly relevant to the rest of the discussion about parallelism, except that it is an example of CPU intensive processing that doesn't alter the database where parallel execution may be a legitimate tactic for improving performance.
The examples in this article use the SH schema in the Oracle sample schemas (available on Github).

The following query searches case-insensitively for customer names within a Levenshtein distance of 3 from 'Stephen'. It finds 'Steven' and 'Staphany'.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">set autotrace on timi on pages 99 lines 200
with x as (
select c.cust_first_name
, utl_match.edit_distance(upper(cust_first_name),'STEPHEN') edit_distance
from customers c
)
select * from x
where edit_distance <= 3
/

CUST_FIRST_NAME EDIT_DISTANCE
-------------------- -------------
Staphany 3
Steven 2
Steven 2

However, to do so Oracle had to full scan the table and execute the function for every row in the table.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">Plan hash value: 2008213504
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2775 | 19425 | 429 (4)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| CUSTOMERS | 2775 | 19425 | 429 (4)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("UTL_MATCH"."EDIT_DISTANCE"(UPPER("CUST_FIRST_NAME"),'STEPHEN')<=3)

Oracle implemented Levenshtein as a C function that is then called from PL/SQL, so it just consumes CPU and doesn't do anything else to the database. You can start to see why the user complained that the query with UTL_MATCH was slow.
However, the first question is how much time do we spend on the full scan and how much time do we spend executing this function?

Follow the Time with Instrumentation 

For test purposes, I am going to build my own packaged functions with session instrumentation. Then I can use Active Session History (ASH) to work out where the time went.
NB: ASH requires a licence for the Diagnostics Pack

  • One function levenshtein() calls to UTL_MATCH.EDIT_DISTANCE().
  • The other dolittle() is a control function that does nothing except instrumentation. It is used to measure the intrusion effect of the instrumentation.
#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">CREATE OR REPLACE package dmk AS
function levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
function dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
END;
/

CREATE OR REPLACE package body dmk AS
function levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER IS
l_distance INTEGER;
l_module VARCHAR2(64);
l_action VARCHAR2(64);
BEGIN
dbms_application_info.read_module(l_module,l_action); /*save current module/action*/
dbms_application_info.set_action('levenshtein()'); /*set action for function*/
l_distance := utl_match.edit_distance(UPPER(p1),UPPER(p2));
dbms_application_info.set_action(l_action); /*restore previous action*/
RETURN l_distance;
END levenshtein;

function dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER IS
l_distance INTEGER;
l_module VARCHAR2(64);
l_action VARCHAR2(64);
BEGIN
dbms_application_info.read_module(l_module,l_action);
dbms_application_info.set_action('dolittle()');
l_distance := 1;
dbms_application_info.set_action(l_action);
RETURN l_distance;
END dolittle;
END dmk;
/

Now, I will run a test query that executes UTL_MATCH for each of the 55500 rows in the CUSTOMERS table and executes that all 10 times inside a PL/SQL block.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">set timi on autotrace on lines 500 serveroutput on
DECLARE
l_counter INTEGER := 0;
BEGIN
dbms_application_info.set_module('DMK LEVENSHTEIN TEST',null);
FOR j IN 1..10 LOOP
FOR i IN (
select sum(dmk.levenshtein(cust_first_name,'STEPHEN')) a
, sum(dmk.dolittle(cust_first_name,'STEPHEN')) b
from customers c
) LOOP
l_counter := l_counter+i.b;
END LOOP;
END LOOP;
dbms_application_info.set_module(null,null);
dbms_output.put_line('Executions: '||l_counter);
END;
/
Executions: 555000

PL/SQL procedure successfully completed.

Elapsed: 00:00:25.08

We can see that the Levenshtein function was executed 555000 times.  Now I will query the ASH for this test and group it by ACTION.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">set timi on autotrace off lines 500
column module format a25
column action format a25
select module, action, sql_id, sum(1) ash_Secs
from v$active_Session_History
where module = 'DMK LEVENSHTEIN TEST'
and sample_time >= SYSDATE-1/1440
group by module, action, sql_id
/

MODULE ACTION SQL_ID ASH_SECS
------------------------- ------------------------- ------------- ----------
DMK LEVENSHTEIN TEST dolittle() 7gp0w6qdvxrd2 2
DMK LEVENSHTEIN TEST levenshtein() 7gp0w6qdvxrd2 3
DMK LEVENSHTEIN TEST 7gp0w6qdvxrd2 20

The runtime of the Levenshtein function took 3 seconds, and the function that does nothing except instrumentation is 2 seconds, so the overhead of UTL_MATCH is only about 1 second, and there are 20 seconds in the SQL. In this test, the overhead of Levenshtein is low, but it would still be worth doing the full scan in parallel.

I Can't Get No Parallelism 

But there is a problem! I specified parallelism with a hint, but I don't get a parallel plan.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">select /*+PARALLEL(C 4)*/ avg(utl_match.edit_distance(UPPER(cust_first_name),'STEPHEN'))
from customers c
/

Plan hash value: 1978308596
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 115 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | | |
| 2 | SORT AGGREGATE | | 1 | 7 | | |
| 3 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 379K| 115 (0)| 00:00:01 |
---------------------------------------------------------------------------------

However, if I use a simple SQL function, then I do get parallelism.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">select /*+PARALLEL(C 4)*/ max(cust_first_name)
from customers c
/

Plan hash value: 1221513835
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 115 (0)| 00:00:01 | | | |
| 1 | SORT AGGREGATE | | 1 | 7 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 7 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 7 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWC | |
| 6 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------

Note
-----
- Degree of Parallelism is 4 because of table property

So, UTL_MATCH.EDIT_FUNCTION must be preventing parallelism in some way. There is a closed Oracle bug 169587070 that details exactly this problem with UTL_MATCH.

PARALLEL_ENABLE and PRAGMA RESTRICT_REFERENCES 

This is also documented default behaviour in PL/SQL. Database VLDB and Partitioning Guide: About Parallel Execution of Functions, Functions in Parallel Queries:
"A user-written function may be executed in parallel in any of the following cases: 

  • If it has been declared with the PARALLEL_ENABLE keyword
  • If it is declared in a package or type and has a PRAGMA RESTRICT_REFERENCES clause that indicates all of WNDS, RNPS, and WNPS 
  • If it is declared with CREATE FUNCTION and the system can analyze the body of the PL/SQL code and determine that the code neither writes to the database nor reads or modifies package variables" 

Oracle only wraps the bodies of their delivered package, but not the package specifications. So, you can see for yourself that the delivered UTL_MATCH package does not contain these specifications.

Workarounds 

There are various workarounds for this.

  • I could add the PARALLEL_ENABLE declarations to the functions in the package specification of the delivered UTL_MATCH package. Although it does work, I would certainly not be happy to alter any delivered Oracle package in any serious database, without approval from Oracle support.
  • Or, I could add the RESTRICT_REFERENCES pragma to the package specification. Again, although this works, it involves altering a delivered package.
  • However, I can wrap the delivered package in my own packaged function with either PARALLEL_ENABLE (my personal preferrance).
#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">CREATE OR REPLACE package dmk AS
FUNCTION levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER PARALLEL_ENABLE;
FUNCTION dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER PARALLEL_ENABLE;
END;
/
  • Or you can use RESTRICT_REFERENCES in the package specification, but you must include the TRUST pragma to over-ride the lack of a pragma definition in the called packaged function.
#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">CREATE OR REPLACE package dmk AS
FUNCTION levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
PRAGMA RESTRICT_REFERENCES (levenshtein,WNDS,RNDS,WNPS,RNPS,TRUST);
FUNCTION dolittle(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER;
PRAGMA RESTRICT_REFERENCES (do_little,WNDS,RNDS,WNPS,RNPS,TRUST);
END;
/

Now, I get parallel execution of the unaltered delivered UTL_MATCH.EDIT_DISTANCE() function.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">select /*+PARALLEL(C 4) FULL(C)*/ sum(dmk.levenshtein(cust_first_name,'STEPHEN')) a
, sum(dmk.dolittle(cust_first_name,'STEPHEN')) b
from customers c
/

Plan hash value: 1221513835
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 115 (0)| 00:00:01 | | | |
| 1 | SORT AGGREGATE | | 1 | 7 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 7 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 7 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWC | |
| 6 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 379K| 115 (0)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------

Note
-----
- Degree of Parallelism is 4 because of table property

Conclusion: Parallel PL/SQL 

I can now repeat the earlier test, but with a parallel hint, the runtime goes down from 25 to 9 seconds, although about the same amount of database time is recorded by ASH. So parallelism can improve the performance for the end user, but will not reduce the total CPU overhead. If anything is likely to increase overall CPU time.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">set timi on lines 500
DECLARE
l_counter INTEGER := 0;
BEGIN
dbms_application_info.set_module('DMK LEVENSHTEIN TEST',null);
FOR j IN 1..10 LOOP
FOR i IN (
select /*+PARALLEL(C 4)*/
sum(dmk.levenshtein(cust_first_name,'STEPHEN')) a
, sum(dmk.dolittle(cust_first_name,'STEPHEN')) b
from customers c
) LOOP
l_counter := l_counter+i.b;
END LOOP;
END LOOP;
dbms_application_info.set_module(null,null);
dbms_output.put_line('Executions: '||l_counter);
END;
/
Executions: 555000

PL/SQL procedure successfully completed.

Elapsed: 00:00:09.34

set timi on autotrace off lines 500
column module format a32
column action format a32
select module, action, sql_id, sum(1) ash_Secs
from v$active_Session_History
where module = 'DMK LEVENSHTEIN TEST'
and sample_time >= SYSDATE-1/1440
group by module, action, sql_id
/

MODULE ACTION SQL_ID ASH_SECS
-------------------------------- -------------------------------- ------------- ----------
DMK LEVENSHTEIN TEST 3391gqpzu5g2k 21
DMK LEVENSHTEIN TEST levenshtein() 3391gqpzu5g2k 5
DMK LEVENSHTEIN TEST dolittle() 3391gqpzu5g2k 4

NB: I have not been able to get parallelism to work in a PL/SQL function defined in a WITH clause because you cannot specify PARALLEL_ENABLE, and a pragma can only be specified in a package specification.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">WITH
FUNCTION levenshtein(p1 VARCHAR2, p2 VARCHAR2) RETURN INTEGER PARALLEL_ENABLE IS
BEGIN
RETURN utl_match.edit_distance(UPPER(p1),UPPER(p2));
END;
select /*+PARALLEL(C 4)*/
sum(levenshtein(cust_first_name,'STEPHEN')) a
from customers c
/
ORA-06553: PLS-712: illegal option for subprogram LEVENSHTEIN
connor_mc_d's picture

Gooey GUIDs

Do a quick Google search and you’ll find plenty of blog posts about why GUIDs are superior to integers for a unique identifier, and of course, an equal number of posts about why integers are superior to GUIDs. In the Oracle world, most people have been using sequence numbers since they were pretty much the only option available to us in earlier versions. But developers coming from other platforms often prefer GUIDs simply due to their familiarity with them.

I’m pretty much ambivalent when it comes to which one to use.  In fact, a good example is the AskTOM database -which had exclusively sequence-based primary keys on inception, but as the database has evolved and developers of different backgrounds and ages have worked on it, there is now of mix of strategies. The older tables have sequence based primary keys, and many of the newer tables have GUIDs as primary keys. Don’t get me wrong – I’m not advocating that you should have a mix – for a given database schema I’d recommend picking one regime and sticking with it. But my point is, that even with the mix of approaches in the AskTOM schema, I’ve never encountered any problems or issues with either.

However, there is one use case where I would strongly recommend using some caution on the use of GUIDs, and that is in the arena of systems that load data in bulk (eg data warehouses).

GUIDs are not cheap to generate. A quick look at the history on the structure and generation of unique IDs at https://en.wikipedia.org/wiki/Universally_unique_identifier all come down to a common component – the need to generate a good random number, and “good” can be a struggle for computers because you need algorithms that ensure sufficient randomness and distribution of the generated numbers.  That takes CPU cycles and whilst that is something that you will never notice when using a GUID for the 100 customers in your customer table, it definitely will be noticeable if you are going to attach a GUID to every one of your 10 million daily sales transactions, or telephone records, or similar.

Here’s a simple example where I’ll load 100 million rows into a table.  First I’ll try the conventional (when it comes to Oracle) approach of using a sequence number to uniquely identify each row.



SQL> create sequence seq cache 50000;

Sequence created.

SQL>
SQL> create table t1
  2  ( pk int,
  3    data1 int,
  4    data2 varchar2(10),
  5    data3 int,
  6    data4 varchar2(10)
  7  )
  8  tablespace demo
  9  /

Table created.


SQL>
SQL> set timing on
SQL> insert /*+ APPEND */ into t1
  2  select seq.nextval, int_val, char_val, int_val, char_val
  3  from
  4   ( select level int_val, to_char(level) char_val from dual connect by level <= 10000 ),
  5   ( select 1 from dual connect by level <= 10000 );

100000000 rows created.

Elapsed: 00:03:31.42
SQL>
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
SQL>

3 minutes 30 seconds for 100 million rows is pretty good performance for a laptop, although obviously the table structure here is very simple.

And now I’ll repeat the exercise with the same table structure, but using a raw column to hold the output of a call to SYS_GUID().



SQL>
SQL>
SQL> create table t2
  2  ( pk raw(20),
  3    data1 int,
  4    data2 varchar2(10),
  5    data3 int,
  6    data4 varchar2(10)
  7  )
  8  tablespace demo
  9  /

Table created.

SQL>
SQL> set timing on
SQL> insert /*+ APPEND */ into t2
  2  select sys_guid(), int_val, char_val, int_val, char_val
  3  from
  4   ( select level int_val, to_char(level) char_val from dual connect by level <= 10000 ),
  5   ( select 1 from dual connect by level <= 10000 );

100000000 rows created.

Elapsed: 00:30:56.78
SQL> commit;

Commit complete.

Elapsed: 00:00:00.03

That’s right – we’ve blown out to 30 minutes. As you can see, there can be a large cost when the row volumes (and hence number of calls to generate a GUID) get large. We can even take the INSERT out of the equation here, and simply do a raw stress test to see how many GUIDs we can call from the SQL engine using the following test harness.



SQL> create table t ( sz int, dur interval day to second );

Table created.

SQL>
SQL>
SQL> declare
  2    ts_start timestamp;
  3    ts_end   timestamp;
  4    iter int;
  5    dummy raw(32);
  6  begin
  7   for i in 1 .. 8 loop
  8    iter := power(10,i);
  9
 10    ts_start := systimestamp;
 11    if iter <= 10000 then
 12       select max(x) into dummy from
 13       (
 14       select sys_guid() x from
 15       ( select 1 from dual connect by level <= iter )
 16       );
 17    else
 18       select max(x) into dummy from
 19       (
 20       select sys_guid() x from
 21       ( select 1 from dual connect by level <= iter/10000 ),
 22       ( select 1 from dual connect by level <= 10000 )
 23       );
 24    end if;
 25
 26    ts_end := systimestamp;
 27    insert into t values (iter, ts_end - ts_start );
 28    commit;
 29
 30   end loop;
 31  end;
 32  /

PL/SQL procedure successfully completed.

SQL>
SQL> select * from t order by 1;

        SZ DUR
---------- ----------------------------------------------------------
        10 +00 00:00:00.000000
       100 +00 00:00:00.000000
      1000 +00 00:00:00.015000
     10000 +00 00:00:00.172000
    100000 +00 00:00:01.607000
   1000000 +00 00:00:16.083000
  10000000 +00 00:02:49.713000
 100000000 +00 00:26:46.570000
 
 

I’m not trying to scare you off GUIDs – but like any functionality or feature you’re using to build applications, make sure you test it for business requirements you need to satisfy and make an informed decision on how best to use (or not use) it.

Franck Pachot's picture

A tribute to Natural Join

By Franck Pachot

.
I know that lot of people are against the ANSI join syntax in Oracle. And this goes beyond the limits when talking about NATURAL JOIN. But I like them and use them quite often.

Why is Natural Join bad?

Natural join is bad because it relies on column names, and, at the time of writing the query, you don’t know which columns will be added or removed later. Here is an example on the SCOTT schema, joining on DEPTNO which has the same name in DEPT and EMP:

SQL> select * from EMP natural join DEPT where DNAME='SALES';
 
DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM DNAME LOC
---------- ---------- ---------- --------- ---------- --------- ---------- ---------- -------------- -------------
30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 SALES CHICAGO
30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 SALES CHICAGO
30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 SALES CHICAGO
30 7900 JAMES CLERK 7698 03-DEC-81 950 SALES CHICAGO
30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 SALES CHICAGO
30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 SALES CHICAGO

The DEPT table has a ‘LOC’column for the location of the department. But the data model may evolve and you may add a location for each employee. And we may also call it LOC:

SQL> alter table EMP add (LOC varchar2(10));
Table altered.

But now our Natural Join adds this column to the join predicate and the result is wrong because it shows only rows which have same department location as employee location:

SQL> select * from EMP natural join DEPT where DNAME='SALES';
 
no rows selected

Projection

In my opinion, the problem is not the Natural Join. Column names have a meaning for their tables. But the tables have different roles in our queries. As soon as a table or view participates to our query, we should redefine the column names. If we don’t, the result is completely wrong as:

SQL> select * from EMP join DEPT using(DEPTNO) where DNAME='SALES';
 
DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM LOC DNAME LOC
---------- ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -------------- -------------
30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 SALES CHICAGO
30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 SALES CHICAGO
30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 SALES CHICAGO
30 7900 JAMES CLERK 7698 03-DEC-81 950 SALES CHICAGO
30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 SALES CHICAGO
30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 SALES CHICAGO

Look: the result has two columns with the same name. This is completely wrong for a relational database and I don’t even understand why this parses without raising an error.

The projection is the most important relational operation, often overlooked as if it was just a rename for aesthetic purpose. You need to name the columns of your result set. They are the metadata for the interface between SQL and the host language. ‘select *’ is a shortcut when running an interactive query, to get a glance at the result rows. But a SQL query result is not complete without proper column names. And in most cases, at least when you query more than one table, the name of the query result columns should be different than the name of the underlying table columns. A department may have a location. And an employee may have a location. But the location of the employee department is something completely different than the employee location.

Then, as you need to name each column anyway, why not doing it as soon as possible? Do it for each table involved in the query, so that you are sure that all column names are correct within the query. As soon as you introduce an new table in the FROM clause, you should actually name the columns according to their role in the query. Let’s take an example with an airline data model. Each airport is linked to a city. This can be a CITY column in the AIRPORTS table. But as soon as you join FLIGHTS with AIRPORTS, this table has a different role. You join on destination airport or source airport. Then you alias the AIRPORTS table in the FROM clause, such as DST_AIRPORTS or SRC_AIRPORTS. Within the query, you can reference the columns with the table alias, such as DST_AIRPORTS.CITY or SRC_AIRPORTS.CITY but this cannot be exposed as-is in the query result. You must name them in the SELECT clause with something like SELECT DST_AIRPORTS.CITY as DST_ARP_CITY , SRC_AIRPORTS.CITY as SRC_ARP_CITY.

Then, as I’ll need to rename them anyway, I prefer to do it as soon as I join to a new table in the FROM clause. Instead of joining to AIRPORTS DST_AIRPORTS I can join to (SELECT IATA DST_ARP_IATA, CITY DST_ARP_CITY FROM AIRPORTS) and all column names will relate to the role without table aliases and without further renaming. And when I do that correctly, I can use natural join without risk.

Projection in the FROM clause

Let’s take an example. Here is a query in DEPT where I explicitly mention that LOC is the department location. This is implicit when the column name belongs to the DEPT table. But it will not be implicit anymore once I join this table to another table. Here is the view ready to be included in any query:


SQL> select DEPTNO,DNAME DEPT_DNAME,LOC DEPT_LOC from DEPT where DNAME='SALES';
 
DEPTNO DEPT_DNAME DEPT_LOC
---------- -------------- -------------
30 SALES CHICAGO

Now, I can join this to the EMP table. I prefix all columns from EMP with “EMP_” and all columns from DEPT with “EMP_DEPT_” because they belong to DEPT when in the role of employee department:

SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_EMPNO
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_EMPNO
---------- ---------- -------------- ------------- ---------- -------------
7521 WARD SALES CHICAGO 7698
7844 TURNER SALES CHICAGO 7698
7499 ALLEN SALES CHICAGO 7698
7900 JAMES SALES CHICAGO 7698
7698 BLAKE SALES CHICAGO 7839
7654 MARTIN SALES CHICAGO 7698

As you can see, when the names are clearly indicating the column with its role in the join, and how they are correlated with the other tables, there is no need to mention any join predicate. I used Natural Join because the join is on EMP_DEPTNO and I’m sure that it will always be the one and only one column with the same name. By query design.

And the column names in the result are correct, explicitly mentioning what is an Employee attribute or an Employee department attribute. That can be easy to parse and put in an object graph in the host language. You can see there that the MGR column of EMP was named EMP_MGR_EMPNO because this is actually what it is: the EMPNO of the employee manager. It is a foreign key to the EMP table.

And then, adding more information about the manager is easy: join with EMP again but with the proper projection of columns: EMPNO will be EMP_MGR_EMPNO when in the role of the employee manager, ENAME will be EMP_MGR_ENAME, DEPTNO will be EMP_MGR_DEPTNO, and so on:


SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
natural join
(select DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_DEPTNO EMP_MGR_ENAME
---------- ---------- -------------- ------------- ---------- -------------- -------------
7900 JAMES SALES CHICAGO 30 BLAKE
7499 ALLEN SALES CHICAGO 30 BLAKE
7654 MARTIN SALES CHICAGO 30 BLAKE
7844 TURNER SALES CHICAGO 30 BLAKE
7521 WARD SALES CHICAGO 30 BLAKE
7698 BLAKE SALES CHICAGO 10 KING

No need to review the whole query when adding a new table. No need to solve the new ‘column ambiguously defined’. We don’t even need to alias the tables here.

Want to add the department name of the manager? That’s easy: join to DEPT with the right column projection (all prefixed by EMP_MGR_DEPT as the new columns are all about the employee manager’s department):

SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME,EMP_MGR_DEPT_DNAME
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
natural join
(select DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP)
natural join
(select DEPTNO EMP_MGR_DEPTNO,DNAME EMP_MGR_DEPT_DNAME,LOC EMP_MGR_DEPT_LOC from DEPT)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_DEPTNO EMP_MGR_EN EMP_MGR_DEPT_D
---------- ---------- -------------- ------------- ---------- -------------- ---------- --------------
7698 BLAKE SALES CHICAGO 10 KING ACCOUNTING
7900 JAMES SALES CHICAGO 30 BLAKE SALES
7499 ALLEN SALES CHICAGO 30 BLAKE SALES
7654 MARTIN SALES CHICAGO 30 BLAKE SALES
7844 TURNER SALES CHICAGO 30 BLAKE SALES
7521 WARD SALES CHICAGO 30 BLAKE SALES

This can be even easier when you generate SQL queries. When adding a new table to join to, you just prefix all columns with their role. Check foreign keys so that the naming is consistent with the referenced tables. Then when parsing the result, the naming convention can help to break on the object hierarchy.

Additional notes

I mentioned that aliasing the subquery is not mandatory because I do not have to prefix the column names. However, when looking at the predicates section of the execution plan, the columns may be prefixed with an internal alias:

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("from$_subquery$_006"."EMP_MGR_DEPTNO"="from$_subquery$_009"."EMP_MGR_DEPTNO")
2 - access("from$_subquery$_001"."EMP_MGR_EMPNO"="from$_subquery$_006"."EMP_MGR_EMPNO" AND "from$_subquery$_001"."EMP_DEPTNO"="from$_subquery$_003"."EMP_DEPTNO")

Then it is a good idea to add prefixes, such as EMP, EMP_DEPT, EMP_MGR EMP_MGR_DEPTNO in the query above so that the predicates become:

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP_MGR"."EMP_MGR_DEPTNO"="EMP_MGR_DEPT"."EMP_MGR_DEPTNO")
2 - access("EMP"."EMP_MGR_EMPNO"="EMP_MGR"."EMP_MGR_EMPNO" AND "EMP"."EMP_DEPTNO"="EMP_DEPT"."EMP_DEPTNO")
5 - filter("DNAME"='SALES')

I also like to add a QB_NAME hint so that I can reference easily those subqueries if I have to add some hints there. Finally, this is what I can generate for this query:


SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME,EMP_MGR_DEPT_DNAME
from
(select /*+qb_name(EMP)*/ DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP) EMP
natural join
(select /*+qb_name(EMP_DEPT)*/ DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT) EMP_DEPT
natural join
(select /*+qb_name(EMP_MGR)*/ DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP) EMP_MGR
natural join
(select /*+qb_name(EMP_MGR_DEPT)*/ DEPTNO EMP_MGR_DEPTNO,DNAME EMP_MGR_DEPT_DNAME,LOC EMP_MGR_DEPT_LOC from DEPT) EMP_MGR_DEPT
where EMP_DEPT_DNAME='SALES';

So what?

My goal here is not to recommend to always use natural joins. This depends on the context (ad-hoc queries, embedded ones in existing code with naming standards,…) and whether con control exactly the columns names. There are also a few bugs with ANSI joins, and natural join is not widely used, so maybe not tested a lot. But when I hear that Natural Join is bad, I want to explain the why/how/when. And one of the good sides of it is that it forces us to do the projection/rename as soon as possible and this makes the query easier to read/maintain/evolve. Of course, using natural join in that way requires that all tables are added to the FROM clause through a subquery which carefully names all columns in the SELECT clause so that the correlation with the other tables is clearly defined.

 

Cet article A tribute to Natural Join est apparu en premier sur Blog dbi services.

Jonathan Lewis's picture

Masterclass – 1

A recent thread on the Oracle developer community database forum raised a fairly typical question with a little twist. The basic question is “why is this (very simple) query slow on one system when it’s much faster on another?” The little twist was that the original posting told use that “Streams Replication” was in place to replicate the data between the two systems.

To make life easy for remote trouble-shooters the poster had supplied (for each system) the output from SQL Monitor when running the query, the autotrace output (which shows the predicate section that SQL Monitor doesn’t report), and the session statistics for the query run, plus some statistics about the single table in the query, the index used in the plan, and the column on which that index was based.

Here, with a little cosmetic editing (and a query that has clearly been camouflaged by the OP), is the information supplied for the faster database, where the query took about 30 seconds to complete.


SELECT c1, c2, c3, c4, c5, c6, c7, c8..  
FROM TAB1  
WHERE STS IN ( 'A', 'B')  
AND cnt < '4'  
AND dt < sysdate  
and rownum <=1;  
  
Sql_monitor and stats from DB1  
******************************  
  
Global Information  
------------------------------  
 STS              :  DONE (ALL ROWS)             
 Instance ID         :  1                           
 Execution Started   :  08/17/2018 08:31:22         
 First Refresh Time  :  08/17/2018 08:31:22         
 Last Refresh Time   :  08/17/2018 08:31:53         
 Duration            :  31s                         
 Program             :  sqlplus.exe                 
 Fetch Calls         :  1                           
  
Global Stats  
===============================================================================  
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read  | Read  |  
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes |  
===============================================================================  
|      33 |    3.00 |       30 |        0.08 |     1 |   102K | 38571 | 301MB |  
===============================================================================  
  
SQL Plan Monitoring Details (Plan Hash Value=715774357)  
======================================================================================================================================================================================  
| Id |            Operation            |          Name           |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Activity |       Activity Detail        |  
|    |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes |   (%)    |         (# samples)          |  
======================================================================================================================================================================================  
|  0 | SELECT STATEMENT                |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  1 |   COUNT STOPKEY                 |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  2 |    INLIST ITERATOR              |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  3 |     TABLE ACCESS BY INDEX ROWID | TAB1                    |    114K | 33399 |        32 |     +0 |     2 |        1 | 38377 | 300MB |    96.77 | Cpu (1)                      |  
|    |                                 |                         |         |       |           |        |       |          |       |       |          | db file sequential read (16) |  
|    |                                 |                         |         |       |           |        |       |          |       |       |          | read by other session (13)   |  
|  4 |      INDEX RANGE SCAN           | TAB1_STS_IDX            |    115K |   723 |        30 |     +2 |     2 |     118K |   194 |   2MB |     3.23 | read by other session (1)    |  
======================================================================================================================================================================================  
  
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 33399   (1)| 00:03:14 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   114K|   201M| 33399   (1)| 00:03:14 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   114K|       |   723   (1)| 00:00:05 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"

And here’s the equivalent information from the slower database where the query took more than 9 times as long (4 minutes 42 seconds) to complete.


Global Information  
------------------------------  
 STS              :  DONE (ALL ROWS)           
 Instance ID         :  1                         
 Execution Started   :  08/17/2018 08:21:47       
 First Refresh Time  :  08/17/2018 08:21:47       
 Last Refresh Time   :  08/17/2018 08:26:29       
 Duration            :  282s                      
 Module/Action       :  SQL*Plus/-                
 Program             :  sqlplus.exe               
 Fetch Calls         :  1                         
  
Global Stats  
================================================================  
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  |  
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |  
================================================================  
|     287 |    8.76 |      278 |     1 |   110K | 110K | 858MB |  
================================================================  
  
SQL Plan Monitoring Details (Plan Hash Value=715774357)  
======================================================================================================================================================================================  
| Id |            Operation            |          Name           |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |        Activity Detail        |  
|    |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |          (# samples)          |  
======================================================================================================================================================================================  
|  0 | SELECT STATEMENT                |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  1 |   COUNT STOPKEY                 |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  2 |    INLIST ITERATOR              |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  3 |     TABLE ACCESS BY INDEX ROWID | TAB1                    |    142K | 40211 |       282 |     +1 |     2 |        1 | 109K | 854MB |   100.00 | db file sequential read (277) |  
|  4 |      INDEX RANGE SCAN           | TAB1_STS_IDX            |    142K |   892 |       280 |     +3 |     2 |     118K |  491 |   4MB |          |                               |  
======================================================================================================================================================================================  
  
Execution Plan (autotrace) 
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 40211   (1)| 00:08:03 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   141K|   249M| 40211   (1)| 00:08:03 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   141K|       |   892   (1)| 00:00:11 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"

There are all sorts of noteworthy details in these two sets of information – some of the “how to see what’s in front of you” type, some of the “be careful, Oracle can deceive you” type. So I’m going to walk though the output picking up a number of background thoughts before commenting on the answer to the basic question.

We’ll start with the object statistics, then we’ll look at the SQL Monitor plan to see if we can determine where the extra time was spent, then we’ll try to work out what else the plan might be telling us about the code and data, then we’ll summarise my observations to make a claim about the difference in behaviour.

Object statistics

The table has 79M rows with average length of 1,847 bytes, using 22M blocks. With an 8KB block size and that average row size we would expect to see about 3 rows per block, and that’s fairly consistent with the value of rows / blocks.  We don’t know what the sample size was for this stats collection, but it might have been a “small” sample size rather than the the 100% you would get from using auto_sample_size, so that might also explain some discrepancy between the two different views on the figures.

We note that the secondary system reports a chain_cnt in excess of 500,000 rows. The only (unhacked) way that this figure could be set would be through a call to analyze statistics, and once the figure is there it won’t go away unless you use the analyze command again to delete statistics.  We don’t know the history of how and when the figure got there so it doesn’t guarantee that there are any chained or migrated rows, nor does the zero in the table stats on the primary system guarantee that it doesn’t have any chained or migrated rows – all it tells us is that at some time someone used the wrong command to gather stats and there were some (less than 1%) migrated or chained rows in the table at the time. (The optimizer will use this figure in its arithmetic if it is set, by the way, so it may affect some of the cost calculations – but not by a huge amount.)

The column sts reports 5 distinct values, no nulls, and a density of 6.2e-9 which is roughly half of 1/79M: so we have a frequency histogram on the column (in the absence of a histogram the density would be 1/5, and it’s reasonable to assume that the number of buckets was either the default or set to something larger than 5).  We were told that the system was running 11.2.0.4 – so we have to be a little suspicious about the accuracy of this histogram since it will have been sampled with a very small sample if the stats collection had used auto_sample_size. (12c will use a specially optimized 100% sample for frequency and top-N histograms when using auto_sample_size)

The index on sts has a clustering_factor of around 22M which is similar to the number of blocks in the table – and that’s not too surprising if there are are only a very small number of distinct values in the column – especially when the presence of the histogram suggest that there’s a skew in the data distribution. (There’s more to come on that point.) The number of leaf blocks is about 500,000 (being lazy about arithmetic) – just as a side note this suggests the index is running in a fairly inefficient state (and probably hasn’t been created with the compress keyword).

Doing a rough estimate of the index arithmetic :  the avg_col_len for sts is 2, so the space required for each index entry will be 13 bytes (2 for the column, 7 for the rowid content, 2 for the row header, 2 for the row directory entry).  Take off the block overhead, and assume the index is running at a “typical” 70% space usage per leaf block and you might expect 5,600 bytes used per leaf block for current index data and that works out to about 430 index entries per leaf block.  With 79M rows in the table that should lead to 79M/430 leaf blocks – i.e. roughly 184,000 leaf blocks, not 493,000 leaf blocks.  However it’s not unusual to see an index with extremely repetitive values operating at something like 50% utilisation, which would bring our estimate to about 310 rows per leaf block and 255,000 leaf blocks – which is still off by a factor of nearly 2 compared to what we’ve actually got. Again, of course, we have to be a little bit cautious about these statistics – we don’t know the sample size, and Oracle uses a surprisingly small number of blocks to sample the stats for an index.

Where’s the time.

The SQL Monitor gives us a very clear report of where most of the time went – almost all of it was spent in I/O waits, and almost all of the wait time was in the “table access by index rowid” opration in both cases; but the primary system did 38,377 read requests while the secondary did 109,000 read requests in that line of the plan. It is significant, though, that quite a lot (40%) of the ASH samples for that operation on the primary system were for “read by other session” rather than “db file sequential read”:  in other words some other session(s) were doing a lot of work to pull the data we wanted into the buffer cache at the same time. Apart from the fact that a wait for “read by other session” often means we spend less time waiting than if we’d had to do the read ourselves, the presence of this wait suggests that other sessions may be “pre-caching” data for us so that we end up having to read far fewer blocks than would otherwise be the case.

It’s important to note at the same time that the difference in Buffer Gets for the two systems was small – 102K vs. 110K – and the “Rows (actual)” was the same in both cases – 118K entries returned by the index range scan.  Both systems did similar amounts of “logical” work, to process similar amounts of data; the difference was the fraction of the work that required a buffer get to turn into a disc read or a “wait for other read”.

We might want to pick up a few more numbers to corroborate the view that the only significant difference was in the volume of data cached and not some more esoteric reason.  Some of the session statistics should help.


DB1:  table fetch by rowid                          117,519
DB2:  table fetch by rowid                          117,521

DB1:  undo change vector size                         4,432
DB2:  undo change vector size                         4,432

DB1:  redo size                                       5,536
DB2:  redo size                                       5,440

DB1:  session logical reads                         102,500
DB2:  session logical reads                         110,326

DB1:  no work - consistent read gets                102,368
DB2:  no work - consistent read gets                110,071

DB1:  table fetch continued row                       2,423
DB2:  table fetch continued row                       3,660

The number of rows fetched by rowid is virtually identical and we have done (virtually) no work that generates undo or redo – such as delayed block cleanout; there are no statistics shown for “%undo record applied” so we probably haven’t done very much work to get a read consistent view of the data though we can’t be sure that the OP simply failed to copy that stat into list supplied (but then the similarity of “session logical reads” to “no work – consistent read gets” confirms the hypothesis that we didn’t do any (significant) work on visiting undo blocks.

We do see a few percent increase in the number of buffer gets (“session logical reads”) – but this may reflect the fact that the actual pattern of data in one table is slightly different from the pattern in the other – thanks to ASSM the process id of the process that inserts a row into a table can affect (within a small range, usually) the block into which the row is inserted; but at 102,000 / 110,000 buffer gets to visit 117,500 rows in the table we can see that there must be some table blocks that hold two (or more) rows that are identified as consecutive in the index – leading to some row visits being achieved through a buffer pin and without a fresh buffer get. You’ll note that this argument is consistent with the small variation in clustering_factor (not that we entirely trust those figures) for the two indexes – the system with the lower clustering_factor for the index has done fewer buffer gets to acquire the same number of rows from the table – by definition that means (assuming default setup) that there are more cases where “the next table row” is in the same block as the current row.

The final figure I’ve shown is the “table fetch continued rows”: according to the table stats (which we don’t necessarily trust completely) 500K out of 79M rows are chained/migrated which is roughly 0.6%. We know that we’re visiting about 117K table rows so might expect (on average) roughly the same percentage migrated/chained viz: 0.6% of 117K = 743, so there’s a little anomaly there (or an error in our assumption about “average” behaviour.  It’s worth noting, though, that a “continued fetch” would have to do an extra buffer visit (and maybe an extra physical read).  You might wonder, of course, how there could be any chained or migrated rows when the average row length is 1,847 bytes but in a follow-up post the OP did say there were 3 BLOB columns in the table, which can cause havoc with interpreting stats for all sorts of reasons. We don’t have any information about the table structure – particularly whether the columns in the query appear before or after the BLOB columns in the table definition – and we don’t know what processing takes place (for example, maybe the 3rd BLOB is only updated after the sts column has been changed to a value other than A or B which would help to explain why we shouldn’t be using the 0.6% calculation above as a table-wide average), so we’re not in a position to say why any of the continued fetches appear but there are several guesses we could make and they’re all easy to check.

Plan observations

If we examine row estimates we see that it 114K for the faster plan and 141K for the slower plan (with a closely corresponding variation in cost). The difference in estimates simply tells us that the histogram gathering was probably a small sample size and subject to a lot of variation. The scale of the estimates tells us that the A and B rows are probably rare – call it 125K out of 79M rows, about 0.16% of the total rows in the table, so it would not be surprising to see consecutive samples for the histogram producing significant variations in estimates.

The other interesting thing we can note in the SQL Monitor plan is that the Starts column for the index range scan / table access operations in both plans shows the value 2: this means that there are no “A” rows that match the other predicates:  Oracle has run the “A” iteration to completion then started the “B” iteration and found a row on the second iteration. Is this a coincidence, or should it always happen, or is it only fairly likely to happen; is it possible to find times when there are no suitable “B” rows but plenty of suitable “A” rows. The final predicate in the query is “rownum <= 1” – so the query is picking one row with no explicit strategy for choosing a specific row when there are multiple appropriate rows, does this mean that we could optimizer the query by rewriting it as a “union all” that searched for B rows first and A rows second ? We just don’t know enough about the processing.

In passing, we can’t get Oracle to search the B rows first by changing the order of the in-list.  If you have a predicate like “where sts in ({list of literal values})” the optimizer will sort the list to eliminate duplicates before rewriting the predicate as a list of disjuncts, and then (if the path uses an iterator) iterate through the list in the resulting order.

In the absence of information about the way in which the data is processed we can only say that we need to avoid visiting the table so frequently. To do this we will need to add one or both of the columns from the other predicates to the index – this might double the size of the index, but eliminate 95% of the potential I/O.  For example if we discover that A and B rows are initially created “into the future” and this query is looking for a row whose “time has come” so that it can be processed and changed to an X row (say) then there may only ever be a tiny number of rows where the “sts = A and the dt < sysdate” and an index on (sts, dt) would be a perfect solution (especially if it were compressed on at least the first column).

The OP has declared a reluctance to add an index to the table – but there are two points to go with this indexing strategy. Since we know there’s a frequency histogram and the A and B rows appear to be rare values – what benefit is there in having an index that covers the other values (unless 2 of the remaining 3 are also rare).  How about creating a function-based index that represents only the rare values and modifying this code to use that index – e.g.

create index t1_id on t1 (
        case sts when 'A' then sts when 'B' then sts end,
        case sts when 'A' then dt  when 'B' then dt  end
) compress 1
;

select  *
from    t1
where   case sts when 'A' then sts when 'B' then sts end in ('A','B')
and     case sts when 'A' then dt  when 'B' then dt  end < sysdate
and     cnt < '4'
and     rownum <= 1
/


You might be able to replace a huge index (79M rows worth) with this small one (120K rows worth) unless there’s too much other code in the system that has to be adjusted or the sts column is actually the target of a referential integrity constraint; at worst you could add this index knowing that it’s generally not going to consume much in the way of extra space or processing resources and is going to save you a lot of work for this query.

Summary

The execution plan from SQL Monitor points very strongly to the fast system benefiting from having a lot of the relevant data cached and constantly being reloaded into the cache by other sessions while the slow system has to acquire almost all of its data by real phyiscal reads. Most of the reads address the table so engineering an index that is low-cost and (fairly) high precision is the most significant strategy for reducing the workload and time on the slow system.

The fact that all the potential A rows fail to match the full predicate set suggests that there MAY be some aspect of the processing that means it would be more efficient to check for B rows before checking for A rows.

Given the massive skew in the data distribution a function-based index that hides all the non-popular values (or even all the values that are not of interest to this query) may be the most cost-effective way of adding a very effective index to the table with minimal resource requirements.

And finally

It’s taken me more than 4 hours to write this note after spending about 10 minutes reading through the information supplied by the OP and identifying and cross-checking details. A small fraction of the 4+ hours was spent creating a little model to check something I had to say about in-lists, the rest of it was trying to write up a coherent description covering all the details.

That’s it for today, but I may have missed a couple of points that I noticed as I read the OP’s posting; and I will want to do a little cosmetic work on this article and check grammar and spelling over the next couple of days.

Shortly after I posted this blog note the owner of the question reported the following as the distribution of values for the sts column:

 STS   COUNT(*)
---- ----------
   A          6
   E        126
   D        866
   C   80212368
   B     117631

Two things stand out about these figures – first it’s an ideal example of a case where it would be nice avoid having index entries for the 80 million ‘C’ rows. Depending on the coding and testing costs, the supportability of the application and the possible side effects this could be done with a function-based index, or by introducing a virtual column that hides the ‘C’s behing a NULL, or by changing the code to use NULL instead of ‘C’.

Secondly – I made a comment about rewriting the code to query the B’s before the A’s. But we saw that Oracle worked through about 117,000 rows before returning a result: so the fitures above tell us that it must have worked through almost all the B’s and the handful of A’s was just a tiny little blip before it got to the B iteration – so there’s no point in making that change.

My suggestion for the function-based index above could be modified in two ways, of course – add two more “when”s to each “case” to capture the D and E rows, or take the opposite viewpoint and create an index on expressions like: “case sts when ‘C’ then to_char(null) else sts end”. The benefit of the latter approach is that you don’t have to modify the index definition (and rebuild the index) if a new legal value for sts appears.

Franck Pachot's picture

Installing ZFS on OEL7 UEK4

Here are the commands I use to setup ZFS on Oracle Linux

#This was tested on OEL 7.2 in Oracle Cloud Compute Classic and OEL 7.5 in Oracle Cloud infrastructure
set -x
cat /etc/oracle-release
yum -y install kernel-uek-devel-$(uname -r)
yum install -y yum-utils
yum-config-manager --enable ol7_developer_EPEL
yum install -y dkms
rpm -Uvh http://download.zfsonlinux.org/epel/zfs-release.el7_5.noarch.rpm
sudo yum install -y zfs
dkms status
/sbin/modprobe zfs
systemctl -a | grep zfs
echo "ZFS Pool creation on the last disk added:"
zpool create -f myzpool -m /mnt/myzpool /dev/$(lsblk | awk 'END{print $1}')
zpool status
zpool list

The full blog is on:

Installing ZFS on OEL7 UEK4 for Docker storage - Blog dbi services

khailey's picture

MySQL 5.6 vs 5.7

Whitelist preview of Performance Insights has just started on RDS MySQL and it gave me a chance to visually compare load profiles of MySQL 5.6 and 5.7.

I first tried to use sysbench and ran into some curious anomolies.

The test I ran was

sysbench \
           --test=oltp \
           --oltp-table-size=10000000 \
           --oltp-test-mode=complex \
           --num-threads=10 \
           --max-time=0 \
           --max-requests=0 \
           --mysql-host=$host \
           --mysql-user=$user \
           --mysql-password=$password \
           --mysql-db=sysbench run

Load chart on MySQL 5.6 and 5.7

 

MySQL 5.6

I first ran on MySQL 5.6 and in Performance Insights the load looked like:

sysbench_56

 

Which looks like a small load. The DB instance and load driving machine are both in the same AWS region (N.Virginia) i.e. not much latency. If there is a lot of latency between the machine driving load and the database then it can often be hard to drive a high load. In this case latency was low so it should be easy to drive a high load.

MySQL 5.7

I then ran the same load on 5.7 and it looked like:

sysbench_57

which is radially different. Where as on 5.6 the host was mainly idle since load was well below the # of vCPUs on 5.7 the load is above the # of vCPUs indicating a bottleneck.

SQL on 5.6 vs 5.7

 

MySQL 5.6

Let’s look at the SQL showing up in Performance Insights. For MySQL 5.6 the list was quite limited and there was no “commit”.

sysbench_56_sql

MySQL 5.7

On 5.7 most of the load was on commit and there were quite a few other SQL showing up:

sysbench_57_sql

Waits on 5.6 vs 5.7

Looking at the waits on 5.6 and 5.7 they are quite different as well.

MySQL 5.6

MySQL 5.6 spends most of it’s time on CPU and none on write to stable storage.

sysbench_56_waits

MySQL 5.7

on MySQL 5.7 most of the time is spent waiting on writes to stable storage and little, relatively , on CPU

 

sysbench_57_waits

Analysis

At first I started to wonder if there was something fundamentally different between 5.6 and 5.7.

The main difference I was seeing was on COMMIT and writes to stable storage so rather than trying to go debug sysbench , I decided to run a basic workload that should help me see if there was a basic big difference between 5.6 and 5.7.

Update workload on MySQL 5.6 and 5.7

I ran a workload of 4 sessions, each updating a single row table 10,000 times. In one case autocommit was on and the other it was off. With autocommit on, I’d expect a lot of waits on writes to stable storage and with it off I’d expect much less.

It turns out the load looks almost exactly the same on 5.6 as 5.7. On the left of each chart below there is a little spike a few seconds wide. That little spike is 40,000 updates by 4 users in a few seconds with hardly any wait as autocommit is off. Then the load jobs sleeps for 10 seconds, then the large load in both is the 40,000 updates with autocommit on and we wait on writes to stable storage as expected.

MySQL 5.6

mysql_66_update

MySQL 5.7

mysql_57_update

Conclusion

Looks like something is seriously wrong with sysbench on MySQL 5.6. I’ll have to dig into this.

In the mean time it looks like MySQL 5.6 and 5.7 do respond quite similarly. If anything MySQL 5.7 was a bit faster. Here are the actual timings for 40,000 updates by 4 concurrent users:

5.6
autocommit=0
real   0m5.285s
autocommit=1
real   3m32.095s
 
5.7
autocommit=0
real   0m5.186s
autocommit=1
real   2m57.236s

 

 

Uwe Hesse's picture

Automatic Table Reorganization in #Exasol

Exasol Logohttps://uhesse.files.wordpress.com/2018/05/exasol_logo.png?w=150 150w" sizes="(max-width: 161px) 100vw, 161px" />

One Exasol key feature is the low maintenance effort it has compared to many other database systems. Automatic Table Reorganization is an example for this approach:

SQL_EXA> open schema adam;
EXA: open schema adam;

Rows affected: 0

SQL_EXA> create or replace table t1 as
         select rownum as id from (select 1 from dual connect by level<=1e6); 
EXA: create or replace table t1 as... Rows affected: 1000000 
SQL_EXA> select table_name,table_row_count,delete_percentage from exa_user_tables;
EXA: select table_name,table_row_count,delete_percentage from exa_user_tabl...

TABLE_NAME  TABLE_ROW_COUNT       DELETE
----------- --------------------- ------
T1                        1000000    0.0

1 row in resultset.

Yes we have CREATE OR REPLACE TABLE in Exasol </p />
</p></div>
    <div class=»

connor_mc_d's picture

Let’s talk about Techiquette

I’m typing this at about 38,000 feet in the air. I’m travelling home from the Oracle Latin America tour – it was a great event, but I’ll save the details for another blog post. In reality, I should be sleeping, but as luck would have it one of the lights on our plane is malfunctioning. So whilst the rest of the plane is sleeping happily in restful darkness, my row and the rows around it are bathed in blazing artificial light Sad smile

image

Ah…the joys of travel. Anyway, I digress.

Before I boarded the flight, I did what most people do in the airport. I signed my life away to a deluge of advertising emails as one does in order to get 30 precious minutes of free wi-fi. I checked some emails, and had a glance through twitter when I came upon this one.

image

As you might expect, there’s plenty of social media activity about this tweet, much of it about the contradiction between the language and what you might expect to come out of the Office of the President of the United States. I’m not going to enter into that debate, or get into political discussion. Let’s face it – Donald Trump strikes me as more a Sybase man (Hey don’t flame me Sybase readers…these are just jokes ok?)

But the tweet made me think about the kind of language we sometimes see on AskTOM and on technical forums in general.

So let’s talk about etiquette on technical forums, or “Techiquette” as I’ve coined it.

We’re actually pretty lucky in technical circles with our discussion forums because most of the conversation resolves around topics that can be backed up with cold hard facts. In those instances, even robust discussion stays civil because it’s pretty hard to disagree with solid evidence when its presented.  A hypothetical example might be something like:

Forum Participant #1: “I need to have a table with 500 columns”
Forum Participant #2: “You cannot – Oracle is limited to 255 columns”
Forum Participant #3: “I disagree – here is an example DDL demonstrating 800 columns”

Participant #2 can’t really dispute the response, because the example is right there in black and white.

But here is where I think conversations can go off the rails and slide downhill into a abuse and insult. It is when the content is both technical and emotional, for example:

Forum Participant #1: “I love using RMAN – it made our backup processes much easier”
Forum Participant #2: “RMAN sucks – it is too complex, anyone using it obviously has no clue about anything”
Forum Participant #1: “Dumb ass”

and away we go…The conversation goes from friendly to flame war.

The problem is – I don’t want my forums to be devoid of emotion, even when the topic is technical. Ever been to a presentation where the presenter does not show any joy or enthusiasm for the material? It’s like spending 45 minutes having root canal work on your teeth. We want people to be passionate about the things that interest them.

So how do we avoid forums descending into abuse whilst still encouraging emotion, and hence potentially emotionally charged content?

Here is my ground rules:

  • Expressing an affirmative or agreeing position? Positive emotions 100% allowed and encouraged.
  • Expressing a negative or disagreeing position? Express it as dispassionately as possible. Stick solely to the facts.

Simple mathematics tells us that if we can stick to the above, the average “positivity index” in a tech conversation will always be greater than or equal to zero.

I’m not claiming to be an angel here – I don’t always manage to follow my own guidelines as well as I should. But I’m always striving to improve, and hopefully if we can all follow them to the best of our ability, then in the main, we’ll have much more civil and interesting community conversations as a result.

To prevent automated spam submissions leave this field empty.