Oakies Blog Aggregator

Jonathan Lewis's picture

ASSM argh!

Here’s a problem with ASSM that used to exist in older versions of Oracle, had disappeared by 11.2.0.4, then re-appeared in 12.1.0.2 – disappearing again by 12.2.0.1. It showed up on MoS a few days ago under the heading: “Insert is running long with more waits on db file sequential read”.

The obvious response to this heading is to question the number of indexes on the table – because big tables with lots of indexes tend to give you lots of random I/O as Oracle maintains the indexes – but this table had no indexes. The owner of the problem supplied several bits of information in the initial post, with further material in response to follow-up questions, including the tkprof summary of the 10046/level 12 trace of the insert and two extracts from the trace file showing some of the “db file sequential read” waits – the first extract made me wonder if there might be some issue involving 16KB blocks, but the second one dispelled that illusion.

There are several buggy things that can appear with ASSM and large-scale DML operations, and sometimes the problems can appear long after the original operation did the dirty deed, so I thought I’d create a simple model based on the information supplied to date – and discovered what the problem (probably) was. Here’s how it starts – I’ve created a tablespace using ASSM, and in this tablespace I’ve created a table which has 48 columns with a row length of 290 bytes (roughly matching the OP’s table), and I’ve hacked out a simple PL/SQL block that loops around inserting arrays of 100 rows at a time into the table for a total of 1M rows before committing.


rem
rem     Script:         assm_cleanout.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2018
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2        Lots of blocks left "not full"
rem             11.2.0.4
rem

rem
rem     using OMF, so no file-name needed
rem     Ran this bit as SYS, and altered test user to have unlimited quota
rem

/*
create
        tablespace test_8k_assm
        datafile size 1G
        extent management local
        autoallocate
        segment space management auto
;
*/

rem
rem     Create the table, pre-allocate some space.
rem     This means we should get consistent 8M extents and not initial little ones
rem

create table t1 (
        v001 varchar2(5), v002 varchar2(5), v003 varchar2(5), v004 varchar2(5), v005 varchar2(5),
        v006 varchar2(5), v007 varchar2(5), v008 varchar2(5), v009 varchar2(5), v010 varchar2(5),
        v011 varchar2(5), v012 varchar2(5), v013 varchar2(5), v014 varchar2(5), v015 varchar2(5),
        v016 varchar2(5), v017 varchar2(5), v018 varchar2(5), v019 varchar2(5), v020 varchar2(5),
        v021 varchar2(5), v022 varchar2(5), v023 varchar2(5), v024 varchar2(5), v025 varchar2(5),
        v026 varchar2(5), v027 varchar2(5), v028 varchar2(5), v029 varchar2(5), v030 varchar2(5),
        v031 varchar2(5), v032 varchar2(5), v033 varchar2(5), v034 varchar2(5), v035 varchar2(5),
        v036 varchar2(5), v037 varchar2(5), v038 varchar2(5), v039 varchar2(5), v040 varchar2(5),
        v041 varchar2(5), v042 varchar2(5), v043 varchar2(5), v044 varchar2(5), v045 varchar2(5),
        v046 varchar2(5), v047 varchar2(5), v048 varchar2(5)
)
segment creation immediate
tablespace test_8k_assm
storage(initial 8M)
;

alter table t1 allocate extent (size 8M);
alter table t1 allocate extent (size 8M);

rem
rem     Simple anonymous pl/sql block
rem     Large insert, handled with array inserts
rem     Can modify loop count and array size very easily
rem

declare
        type tab_array is table of t1%rowtype;
        junk_array tab_array;
begin

        select
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx'
        bulk collect into
                junk_array
        from
                all_objects
        where
                rownum  <= 100 -- > comment to avoid WordPress format issue
        ;

        for i in 1..10000 loop
                forall j in 1..junk_array.count
                        insert into t1 values junk_array(j) ;
        end loop;

end;
/

commit;
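Before looking at dbms_space it’s worth confirming the row distribution. The standard dbms_rowid package can do the counting – here’s a quick sketch (not part of the original test) that reports how many blocks hold each row count; with this data it should report 41,666 blocks holding 24 rows plus one block holding 16:

rem
rem     Sketch (not part of the original test): count blocks by rows-per-block
rem     using the standard dbms_rowid package.
rem

select
        rows_per_block, count(*) blocks
from    (
        select
                dbms_rowid.rowid_relative_fno(rowid)    file_no,
                dbms_rowid.rowid_block_number(rowid)    block_no,
                count(*)                                rows_per_block
        from    t1
        group by
                dbms_rowid.rowid_relative_fno(rowid),
                dbms_rowid.rowid_block_number(rowid)
        )
group by
        rows_per_block
order by
        rows_per_block
;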

The number of rows per block after this insert is 24, with 1,038 bytes of free space left (808 reserved by pctfree = 10, plus the bit that was too small to take a 25th row without breaching the pctfree limit). This means we should see 1M/24 = 41,666 full blocks and one block with some free space. So we query the table using the dbms_space package:


declare
        m_unformatted_blocks    number;
        m_unformatted_bytes     number;
        m_fs1_blocks            number;
        m_fs1_bytes             number;
        m_fs2_blocks            number;
        m_fs2_bytes             number;

        m_fs3_blocks            number;
        m_fs3_bytes             number;
        m_fs4_blocks            number;
        m_fs4_bytes             number;
        m_full_blocks           number;
        m_full_bytes            number;

begin
        dbms_space.SPACE_USAGE(
                segment_owner           => 'TEST_USER',
                segment_name            => 'T1',
                segment_type            => 'TABLE',
                unformatted_blocks      => m_unformatted_blocks,
                unformatted_bytes       => m_unformatted_bytes,
                fs1_blocks              => m_fs1_blocks ,
                fs1_bytes               => m_fs1_bytes,
                fs2_blocks              => m_fs2_blocks,
                fs2_bytes               => m_fs2_bytes,
                fs3_blocks              => m_fs3_blocks,
                fs3_bytes               => m_fs3_bytes,
                fs4_blocks              => m_fs4_blocks,
                fs4_bytes               => m_fs4_bytes,
                full_blocks             => m_full_blocks,
                full_bytes              => m_full_bytes
        );

        dbms_output.new_line;
        dbms_output.put_line('Unformatted                   : ' || to_char(m_unformatted_blocks,'999,999,990') || ' / ' || to_char(m_unformatted_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 1 (  0 -  25% free) : ' || to_char(m_fs1_blocks,'999,999,990') || ' / ' || to_char(m_fs1_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 2 ( 25 -  50% free) : ' || to_char(m_fs2_blocks,'999,999,990') || ' / ' || to_char(m_fs2_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 3 ( 50 -  75% free) : ' || to_char(m_fs3_blocks,'999,999,990') || ' / ' || to_char(m_fs3_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 4 ( 75 - 100% free) : ' || to_char(m_fs4_blocks,'999,999,990') || ' / ' || to_char(m_fs4_bytes,'999,999,999,990'));
        dbms_output.put_line('Full                          : ' || to_char(m_full_blocks,'999,999,990') || ' / ' || to_char(m_full_bytes,'999,999,999,990'));

end;
/

The results aren’t what we expect:


Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :       35,001 /      286,728,192
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :        6,665 /       54,599,680

We have one block marked as 25 – 50% free (that’s the one block with 16 rows in it, which means about 40% space currently free) but our 41,666 full blocks are actually reported as 6,665 full blocks and 35,001 blocks with some space available. That’s going to hurt eventually if some process wants to insert more rows and finds that it has to fail its way through 35,001 blocks before finding a block which has enough free space.

So what happens when I repeat the PL/SQL block (and commit)? Here are the results from calls to dbms_space after the next two cycles:


Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :       70,002 /      573,456,384
Freespace 2 ( 25 -  50% free) :            2 /           16,384
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          220 /        1,802,240
Full                          :       13,330 /      109,199,360

Unformatted                   :          256 /        2,097,152
Freespace 1 (  0 -  25% free) :      105,003 /      860,184,576
Freespace 2 ( 25 -  50% free) :            3 /           24,576
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          117 /          958,464
Full                          :       19,995 /      163,799,040

Every time we execute the PL/SQL block we leave a trail of 35,001 more blocks which are flagged as “not quite full”.

Looking at the session stats while running the insert loop I can tell that Oracle isn’t checking to see whether or not it should be using those blocks. (A quick way of proving this is to flush the buffer cache before each execution of the PL/SQL block and note that Oracle doesn’t read back the 105,000 blocks before inserting any data.) So somehow, sometime, someone might get a nasty surprise – and here’s one way that it might happen:

Since I know my data fits 24 rows per block I’m going to modify my PL/SQL block to select one row into the array then loop round the insert 25 times – so I know I’m inserting a little more than one block’s worth of data. Starting from the state with 105,003 blocks marked as “Freespace 1”, this is what I saw – first, the free space report after inserting 25 rows:


Unformatted                   :          240 /        1,966,080
Freespace 1 (  0 -  25% free) :        1,074 /        8,798,208
Freespace 2 ( 25 -  50% free) :            0 /                0
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          133 /        1,089,536
Full                          :      123,927 /    1,015,209,984

Then a few wait events and session statistics for the insert:


---------------------------------------------------------
SID:    39:TEST_USER - jonathan
Session Events - 09-Jan 16:57:18
Interval:-      6 seconds
---------------------------------------------------------
Event                                             Waits   Time_outs        Csec    Avg Csec    Max Csec
-----                                             -----   ---------        ----    --------    --------
db file sequential read                          15,308           0         128        .008           3
db file scattered read                           20,086           0         271        .014           4

---------------------------------
Session stats - 09-Jan 16:57:18
Interval:-  6 seconds
---------------------------------
Name                                                                     Value
----                                                                     -----
session logical reads                                                  269,537
physical read total IO requests                                         35,401
db block gets                                                          229,522
consistent gets                                                         40,015
physical reads                                                         124,687
physical reads cache                                                   124,687
db block changes                                                       208,489
physical reads cache prefetch                                           89,293
redo entries                                                           207,892
redo size                                                           16,262,724
undo change vector size                                                  1,720
deferred (CURRENT) block cleanout applications                         103,932
table scan blocks gotten                                                20,797
HSC Heap Segment Block Changes                                              25

The session has read and updated almost all of the level 1 bitmap blocks. I don’t know exactly what triggered this re-read, but it seems to be related to the number of rows inserted (or, perhaps, the amount of space used rather than the row count) as an insert crosses the pctfree boundary and fails over to the next block. I’ve only done a couple of little tests to try to get a better idea of why an insert sometimes sweeps through the bitmap blocks – so I know that inserting 2 or 3 rows at a time will also trigger the cleanout – but there are probably several little details involved that still need to be identified.

You might note a couple of details in the stats:

  • Because I had flushed the buffer cache before the insert Oracle did its “cache warmup” tablescanning trick – if this had not happened I would probably have done a single block read for every single bitmap block I touched.
  • There are 103,932 block cleanout applications – but 208,000 db block changes and redo entries. Roughly half the latter are for data block cleanouts (OP code 4.1) and half are the state changes on the level 1 bitmap blocks (OP code 13.22). You’ll notice that neither change produces any undo.
  • I’ve also included the HSC Heap Segment Block Changes statistics to show you that not all changes to Heap Segment Blocks show up where you might expect them.

And finally:

If you re-run the tests on 11.2.0.4 and 12.2.0.1 you get the following results after the initial script run – the problem doesn’t exist:


11.2.0.4
========
Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :       41,666 /      341,327,872

12.2.0.1
========
Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :       41,666 /      341,327,872

So how does this help the OP?

  • First, there may be a huge mess still waiting to be cleaned up in the table – though at 34M blocks I’m not too keen to suggest running the dbms_space routine to find out what it looks like. Maybe, though, that’s necessary.
  • Secondly – an upgrade to 12.2 will probably avoid the problem in future.
  • Thirdly – if the number of rows per block is very close to uniform, write a little code to loop through inserting (say) twice the expected number of rows per block as single-row inserts and then rolling back; the inserts will probably trigger a massive, though perhaps not complete, cleanout – so rinse and repeat until the cleanout is complete. Try to find a time when you don’t mind the extra load to get this job done.
  • Finally – on the big job that does the bulk insert – repeat the dummy insert/rollback at the end of the job to clean up the mess made by the job.
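As a sketch of the third suggestion – hedged: the table and column names here match my test case (24 rows per block), not the OP’s table, and you would adjust the loop count to twice your own rows-per-block figure:

rem
rem     Sketch: single-row inserts to push through the "not quite full"
rem     blocks and trigger the bitmap cleanout, then roll back.  48 = 2 x 24.
rem

begin
        for i in 1..48 loop
                insert into t1(v001) values ('xxxxx');
        end loop;
        rollback;
end;
/

Run the dbms_space report between executions and repeat until the Freespace 1 figure stops falling.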

Addenda

Prompted by comment #2 below, I should add that if the problem has been fixed in 12.2 then possibly there’s a bug report and patch for it already. If there isn’t then the OP could raise an SR (referencing this blog note), and request a bug fix or back-port from 12.2.

And within 24 hours of publication, comment #4 (from Yury Pudovchenko) tells us that the bug is fixed by the Oct 2017 Bundle Patch.

 

 

Uwe Hesse's picture

How to cancel SQL statements and disconnect sessions in #PostgreSQL


In PostgreSQL, you can cancel problem statements or terminate offending sessions remotely with PG_CANCEL_BACKEND and PG_TERMINATE_BACKEND. This article shows how you can do it with working examples.

edb=# select current_database,current_user;
 current_database | current_user 
------------------+--------------
 edb              | enterprisedb
(1 row)

I’m connected as superuser. The demo is done with EDB Postgres Advanced Server 10.1.5, but the technique shown should work the same with other Postgres distributions and older versions. First I create a demo user:

edb=# create role adam password 'adam' login;
CREATE ROLE
edb=# grant connect on database edb to adam;
GRANT

Now opening another session with that new user:

-bash-4.2$ psql -U adam
Password for user adam: 
psql.bin (10.1.5)
Type "help" for help.

edb=> begin
edb$> loop
edb$> null;
edb$> end loop;
edb$> end;

That session burns CPU now in an endless loop. Back to the superuser session:

edb=# select pid,usename,query
edb-# from pg_catalog.pg_stat_activity where datname='edb';
  pid  |   usename    |                         query                         
-------+--------------+-------------------------------------------------------
 14346 | adam         | begin                                                +
       |              | loop                                                 +
       |              | null;                                                +
       |              | end loop;                                            +
       |              | end;
  5517 | enterprisedb | select pid,usename,query                             +
       |              | from pg_catalog.pg_stat_activity where datname='edb';
(2 rows)

This cancels the SQL statement of one session:

edb=# select pg_cancel_backend(14346);
 pg_cancel_backend 
-------------------
 t
(1 row)

The session that was doing the endless loop gets this output:

ERROR:  canceling statement due to user request
CONTEXT:  edb-spl function inline_code_block line 3 at NULL

This disconnects a single session:

edb=# select pg_terminate_backend(14346);
 pg_terminate_backend 
----------------------
 t
(1 row)

The disconnected session gets this output after trying to do anything:

edb=> \d
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

This way you cancel all SQL statements of a certain user:

edb=# select pg_cancel_backend(pid) from pg_stat_activity where usename='adam';
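In practice you may want to cancel only statements that have been running for some time. pg_stat_activity also exposes query_start and state, so a more selective version might look like this (a sketch, with an arbitrary five-minute threshold):

-- Cancel only adam's statements that have been active for more than 5 minutes
edb=# select pg_cancel_backend(pid)
edb-# from pg_stat_activity
edb-# where usename='adam'
edb-# and state='active'
edb-# and now() - query_start > interval '5 minutes';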

This way you disconnect all sessions of a certain user:

edb=# select pg_terminate_backend(pid) from pg_stat_activity where usename='adam';

In the unlikely event that your end users are connected with psql, it will try to reconnect them. That’s the background of the “… Attempting reset: Succeeded.” line in the output above, which means the session got reconnected. If you want to prevent that particular user from (re-)connecting, you need to do this additionally:

edb=# revoke connect on database edb from adam;
REVOKE ROLE

Remember to also revoke from public if you haven’t done that already, otherwise the above revoke has no effect:

edb=# revoke connect on database edb from public;
REVOKE

Now this disconnects all sessions of one user and the above prevents new connects by that user:

edb=# select pg_terminate_backend(pid) from pg_stat_activity where usename='adam';
 pg_terminate_backend 
----------------------
 t
(1 row)

The terminated session then gets this output:

edb=> \d
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Keep in mind that the revoke impacts the whole user, not just one particular session. In other words, no session with that user can be established subsequently until you say:

edb=# grant connect on database edb to adam;
GRANT ROLE

You can terminate all sessions connected to one particular database – except your own session – like this:

edb=# select pg_terminate_backend(pid)
edb-# from pg_stat_activity
edb-# where datname='edb' and pid<>pg_backend_pid();

Once again, this does not prevent new sessions from connecting. So either you REVOKE CONNECT on the user layer as shown above, or you do it on the database layer.

This is how normal users are prevented from connecting to the database:

edb=# alter database edb with connection limit 0;
ALTER DATABASE

Superusers can still connect. The above is the equivalent of ALTER SYSTEM ENABLE RESTRICTED SESSION in Oracle. This is what normal users get now when trying to connect:

-bash-4.2$ psql -U adam
Password for user adam: 
psql.bin: FATAL:  too many connections for database "edb"

The default, an unlimited number of sessions allowed to connect, can be restored like this:

edb=# alter database edb with connection limit -1;
ALTER DATABASE

This also prevents superusers from connecting to a database, but you must not be connected to that database yourself:

edb=# alter database edb with allow_connections false;
ERROR:  cannot disallow connections for current database

So either you connect to another existing database in the cluster or you create another database temporarily:

edb=# create database dummy;
CREATE DATABASE
edb=# \c dummy
You are now connected to database "dummy" as user "enterprisedb".
dummy=# alter database edb with allow_connections false;
ALTER DATABASE

Now both normal users and superusers get this output when trying to connect:

psql.bin: FATAL:  database "edb" is not currently accepting connections

The default, which allows connections to that database, can be restored with this command:

dummy=# alter database edb with allow_connections true;
ALTER DATABASE
dummy=# \c edb
You are now connected to database "edb" as user "enterprisedb".
edb=# drop database dummy;
DROP DATABASE

What I like especially about the shown functionality is the option to remotely cancel a particular (ongoing) statement without having to terminate the session that runs the statement. I’m not aware of a supported way to do that in Oracle. It can be done if Resource Manager has been configured appropriately beforehand, but that requires quite some effort and doesn’t work just out of the box.
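For completeness, the Resource Manager route mentioned above works by switching long-running calls into the special CANCEL_SQL consumer group, which cancels the current statement. Roughly sketched below – the plan and group names are illustrative, and a real plan needs more thought than this:

-- Sketch: cancel any call in LOW_GROUP that exceeds 300 seconds
begin
  dbms_resource_manager.create_pending_area;
  dbms_resource_manager.create_plan(
    plan    => 'DEMO_PLAN',
    comment => 'cancel long-running calls');
  dbms_resource_manager.create_plan_directive(
    plan             => 'DEMO_PLAN',
    group_or_subplan => 'LOW_GROUP',
    comment          => 'switch long calls to CANCEL_SQL',
    switch_group     => 'CANCEL_SQL',
    switch_time      => 300);
  dbms_resource_manager.create_plan_directive(
    plan             => 'DEMO_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'mandatory catch-all directive');
  dbms_resource_manager.validate_pending_area;
  dbms_resource_manager.submit_pending_area;
end;
/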

connor_mc_d's picture

Licensed for Advanced Compression? Don’t forget the network

We often think of Advanced Compression being exclusively about compressing data “at rest”, ie, on some sort of storage device.  And don’t get me wrong, if we consider just that part of Advanced Compression, that still covers a myriad of opportunities that could yield benefits for your databases and database applications:

  • Heat maps
  • Automatic Data Optimization
  • XML, JSON and LOB compression (including de-duplication)
  • Compression on backups
  • Compression on Data Pump files
  • Additional compression options on indexes and tables
  • Compressed Flashback Data Archive storage
  • Storage snapshot compression

However, if you are licensed for the option, there are other things that you can also take advantage of when it comes to compression of data on the network.

In 12c, Advanced Network Compression is part of the Advanced Compression option.  Obviously, the modern data centre typically co-locates application servers and database servers within a high-speed, high-bandwidth network, but once data needs to be accessed over less optimal networks, for example, users running ad-hoc queries to their desktops, then compressing data across the wire can yield benefits.

Trust me, when you live in Perth, Western Australia, any kind of improvement to accessing data across the network is a good thing!

Here’s an example of the option in action, starting with my source table – which I’ve deliberately created with lots of potential for compression:


SQL> create table comp_test
  2  as select rownum x, rpad('x',1000,'x') y from dual
  3  connect by level <= 100000;

Table created.

Every row is almost a replica of the others, and column Y is 1000 repeated characters. So it is an artificial but perfect candidate to see the compression in action. To ensure that the network transmission of the data is an impactful factor here, I’ve located this table on a “database server” (my spare laptop) accessed via a slow wireless network from my other machine. We can see the impact of the network by starting with the default settings.

Default settings


C:\temp>sqlplus scott/tiger@pdb122

SQL*Plus: Release 12.2.0.1.0 Production on Wed Dec 27 13:48:05 2017

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Last Successful login time: Wed Dec 27 2017 13:48:26 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production


SQL> set timing on
SQL> set arraysize 1000
SQL> set feedback only
SQL> select * from scott.comp_test;

100000 rows selected.

Elapsed: 00:00:24.68

So a touch more than 1000 bytes * 100,000 rows yields the following session-level statistics:


SQL> @netstat

NAME                                                              VALUE
------------------------------------------------------------ ----------
bytes sent via SQL*Net to client                              102138497
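The @netstat script itself isn’t listed here; assuming it simply reports the current session’s SQL*Net statistics, a minimal version might be:

-- netstat.sql (sketch): SQL*Net byte counts for the current session
select sn.name, ms.value
from   v$mystat ms
join   v$statname sn on sn.statistic# = ms.statistic#
where  sn.name like 'bytes%client';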

Now I’ll activate compression by adding the following parameters to sqlnet.ora on client and server.


SQLNET.COMPRESSION=on
SQLNET.COMPRESSION_LEVELS=(low,high)
SQLNET.COMPRESSION_THRESHOLD=1024

and repeat the test in a fresh connection. To see if we can potentially use compression, we can check V$SESSION_CONNECT_INFO:


SQL> select sid, network_service_banner from V$SESSION_CONNECT_INFO 
  2  where sid = sys_context('USERENV','SID')
  3  @pr
==============================
SID                           : 324
NETWORK_SERVICE_BANNER        : Windows NT TCP/IP NT Protocol Adapter for 64-bit Windows: Version 12.2.0.1.0 - Production
==============================
SID                           : 324
NETWORK_SERVICE_BANNER        : Oracle Advanced Network Compression Service for 64-bit Windows: Version 12.2.0.1.0 - Production
==============================
SID                           : 324
NETWORK_SERVICE_BANNER        : NTS Authentication service adapter for 64-bit Windows: Version 2.0.0.0.0 - Production
==============================
SID                           : 324
NETWORK_SERVICE_BANNER        : Encryption service for 64-bit Windows: Version 12.2.0.1.0 - Production
==============================
SID                           : 324
NETWORK_SERVICE_BANNER        : Crypto-checksumming service for 64-bit Windows: Version 12.2.0.1.0 - Production

or if you enable a SQL*Net trace, you’ll see entries like this in the log files:


nlstddp_dump_ptable:   SQLNET.COMPRESSION_LEVELS = HIGH
nlstddp_dump_ptable:   SQLNET.COMPRESSION = ON

Compression enabled


SQL> set timing on
SQL> set arraysize 1000
SQL> set feedback only
SQL> select * from scott.comp_test;

100000 rows selected.

Elapsed: 00:00:01.93

You can see the huge difference in elapsed time. Unfortunately, as far as I can ascertain, the session network statistics are derived after decompression, because the bytes transferred still reflect the original sizes:


SQL> @netstat

NAME                                                              VALUE
------------------------------------------------------------ ----------
bytes sent via SQL*Net to client                              102114310

We can also take advantage of the larger SDU sizes available in 12c. Now my sqlnet.ora looks like this:


DEFAULT_SDU_SIZE=1048576
SQLNET.COMPRESSION=on
SQLNET.COMPRESSION_LEVELS=(low,high)
SQLNET.COMPRESSION_THRESHOLD=1024

and then the test can be repeated.

Compression on, larger SDU


SQL> set timing on
SQL> set arraysize 1000
SQL> set feedback only
SQL> select * from scott.comp_test;

100000 rows selected.

Elapsed: 00:00:00.98

(* – elapsed times are averaged across a number of executions)

There is a whitepaper on Advanced Compression with more details here, so if you have the license, make sure you take a look to see if there are some benefits for you to maximize the return on your investment.

Franck Pachot's picture

Keep your orapw password file secure

This is a small demo I did when I found a database password file (orapw) lying around in /tmp with -rw-rw-rw- permissions, to show why this is a bad idea. People think that the orapw file only contains hashes to validate a given password, and forget that it can be used to connect to a remote database without a password.

I can easily imagine why the orapw file was there in /tmp. To build a standby database, you need to copy the password file to the standby server. If you don’t have direct access to the oracle user, but only sudo access for ‘security reasons’, you can’t scp easily. So you copy the file to /tmp, make it readable by all users, and then you can scp with your own user.

In this demo I don’t even have access to the host. I only have access to connect to a PDB as the SCOTT user, created with utlsampl.sql, with one additional privilege: read access on $ORACLE_HOME/dbs:

SQL> connect sys/oracle@//192.168.56.122/PDB1 as sysdba

Connected.

SQL> create or replace directory DBS as '/u01/app/oracle/product/12.2.0/dbhome_1/dbs';

Directory DBS created.

SQL> grant read on directory DBS to SCOTT;

Grant succeeded.

People tend to grant many privileges, and think that read access on a directory which is supposed to contain only configuration files is harmless. Let’s see what you can do from another server.

Get the orapw file from a remote connection

I connect with SCOTT which can read from ORACLE_HOME/dbs:

SQL> connect scott/tiger@//192.168.56.122/PDB1

Connected.

SQL> show user

USER is "SCOTT"

SQL> select * from all_directories;

OWNER DIRECTORY_NAME DIRECTORY_PATH                              ORIGIN_CON_ID
----- -------------- ------------------------------------------- -------------
SYS   DBS            /u01/app/oracle/product/12.2.0/dbhome_1/dbs             4

I create a table to read this file (other possibilities: utl_file, external tables, …):

SQL> create table DEMO ( b blob );

Table DEMO created.

SQL> insert into demo values ( bfilename('DBS','orapwCDB1') );

1 row inserted.

I’m on another server with the same version of Oracle Database software installed.

I use sqlcl to retrieve the server file to my client:

sqlcl -s scott/tiger@//192.168.56.120/PDB1 <<EOF | xxd -p -r > $ORACLE_HOME/dbs/orapwCDB1
set pages 0 lin 17000 long 1000000000 longc 16384
select * from DEMO;
exit
EOF

This technique (documented by Laurent Schneider) uses sqlcl to display the BLOB as hexadecimal code and xxd (installed with vim-common) to revert it to binary.

So, on my server I have a copy of the database password file for the database I want to steal:

[oracle@VM122 ~]$ strings /u01/app/oracle/product/12.2.0/dbhome_1/dbs/orapwCDB1

ORACLE Remote Password file

X)l)|

SYSDG

+933k\

SYSBACKUP

f ts6 $9

SYSKM

Pull

A nice feature of 12c is the ability to pull backups from a service: with this, it is the destination that connects to the source (I have diagrams to explain it here). It is an easy alternative to RMAN DUPLICATE (see MOS Doc ID 2283978.1, “Creating a Physical Standby database using RMAN restore from service”). And one difference is that you don’t have to provide the password:

I prepare a small init.ora and directory for the datafiles

echo "db_name=CDB1" > $ORACLE_HOME/dbs/initCDB1.ora

mkdir -p /u01/oradata/CDB1

I’m still on my server with the copy of the remote orapw file and network access to the source database, and I just restore it, without the need for a password:

RMAN> connect target /

connected to target database (not started)
I start a local instance:

RMAN> startup nomount force

Oracle instance started

Total System Global Area 859832320 bytes

Fixed Size 8798552 bytes

Variable Size 784338600 bytes

Database Buffers 58720256 bytes

Redo Buffers 7974912 bytes
I restore the controlfile:

RMAN> restore controlfile from service '//192.168.56.122/CDB1';

Starting restore at 05-JAN-18

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=262 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: using network backup set from service //192.168.56.122/CDB1

channel ORA_DISK_1: restoring control file

channel ORA_DISK_1: restore complete, elapsed time: 00:00:02

output file name=/u01/oradata/CDB1/control01.ctl

output file name=/u01/fast_recovery_area/CDB1/control02.ctl

Finished restore at 05-JAN-18
That’s the interesting part: the restore has to connect, at least as SYSOPER, to the source database, and yet I didn’t provide any password.

I mount this controlfile locally:

RMAN> alter database mount;

Statement processed

released channel: ORA_DISK_1
And now it is easy to pull the whole database (the CDB with all its PDBs) to my local server:

RMAN> restore database from service '//192.168.56.122/CDB1';

Starting restore at 05-JAN-18

Starting implicit crosscheck backup at 05-JAN-18

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=262 device type=DISK

Crosschecked 6 objects

Finished implicit crosscheck backup at 05-JAN-18

Starting implicit crosscheck copy at 05-JAN-18

using channel ORA_DISK_1

Finished implicit crosscheck copy at 05-JAN-18

searching for all files in the recovery area

cataloging files...

cataloging done

List of Cataloged Files

=======================

File Name: /u01/fast_recovery_area/CDB1/autobackup/2018_01_04/o1_mf_s_964524009_f4vzyt59_.bkp

File Name: /u01/fast_recovery_area/CDB1/archivelog/2018_01_04/o1_mf_1_15_f4w5vv19_.arc

File Name: /u01/fast_recovery_area/CDB1/archivelog/2018_01_04/o1_mf_1_16_f4wmm0t8_.arc

File Name: /u01/fast_recovery_area/CDB1/archivelog/2018_01_04/o1_mf_1_14_f4vzjdl1_.arc

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: using network backup set from service //192.168.56.122/CDB1

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00001 to /u01/oradata/CDB1/system01.dbf

channel ORA_DISK_1: restore complete, elapsed time: 00:00:16

channel ORA_DISK_1: starting datafile backup set restore

...
So what?

This is not an issue and is totally expected. In a Data Guard configuration, the primary and standby databases have to communicate with each other and therefore need passwordless authentication. This is done with the password file, and this is the reason why you need to copy it rather than just create another one with the same passwords.

So, the file contains more than just a hash of the password (which is required to validate a password); it probably also includes a key (randomly generated when you create the password file) used for passwordless authentication.

So be careful, and do not give read access to the orapw files. You must secure them in the same way as an ssh key or an encryption wallet, and this includes:

  • Do not leave a copy of the orapw file in a shared location
  • Be careful with grants on directories, even in READ
  • Do not grant CREATE ANY DIRECTORY except for a PDB with PATH_PREFIX lockdown
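As a quick, hypothetical companion check (my own sketch, not part of the original post), a few lines of Python can flag a password file whose permissions expose it to group or other users:

```python
import os
import stat
import tempfile

def orapw_is_private(path: str) -> bool:
    """Return True when only the file owner has any access bits set."""
    mode = os.stat(path).st_mode
    # Any group/other permission bit means the password file is over-exposed.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Demonstration on a temporary stand-in for $ORACLE_HOME/dbs/orapwCDB1.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
os.chmod(path, 0o600)
assert orapw_is_private(path)        # owner-only: OK
os.chmod(path, 0o644)
assert not orapw_is_private(path)    # group/other readable: flagged
os.unlink(path)
```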
 

The article Keep your orapw password file secure first appeared on Blog dbi services.

Jonathan Lewis's picture

Defaults

Following on from a Twitter reference and an update to an old posting about a side effect of  constraints on the work done inserting data, I decided to have a closer look at the more general picture of default values and inserts. Here’s a script that I’ve tested against 11.2.0.4, 12.1.0.2, and 12.2.0.1 (original install, no patches applied in all cases):


rem
rem     Script:         defaults_cost.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2017
rem

create table t1 (
        column1  varchar2(10),
        column2  varchar2(10),
        column3  varchar2(10),
        column4  varchar2(10),
        column32 varchar2(32)   default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        column33 varchar2(33)   default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        virtual1      generated always as (
                column4 || column3 || column2 || column1
        ) virtual
)
segment creation immediate
;

execute dbms_output.put_line(dbms_stats.create_extended_stats(user,'t1','(column1 || column2 || column3 || column4)'))
execute dbms_output.put_line(dbms_stats.create_extended_stats(user,'t1','(column1,column2,column3)'))

create or replace function plsqlfunction_with_30char_name(
        i_in varchar2
)
return varchar
deterministic
is
begin
        return initcap(i_in);
end;
/

create index t1_i1 on t1(substr(plsqlfunction_with_30char_name(column1),1,10));

When you create a function-based index you get a hidden virtual column supporting the index expression; when you create extended stats (of either type) you get a hidden virtual column holding the extension definition; and when you create any type of virtual column, including a “real” virtual column, you get a data dictionary entry holding the column name and the expression definition. All these options use the “data_default” column from user_tab_cols to display the defining information – as we can see when we run the following query:


select  column_name, data_default
from    user_tab_cols
where   table_name = 'T1'
order by
         column_id
;

COLUMN_NAME                      DATA_DEFAULT
-------------------------------- --------------------------------------------------------------------------------
COLUMN1
COLUMN2
COLUMN3
COLUMN4
COLUMN32                         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
COLUMN33                         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
VIRTUAL1                         "COLUMN4"||"COLUMN3"||"COLUMN2"||"COLUMN1"
SYS_STUDAN97VB_XDKRTR_NPFAB80P   "COLUMN1"||"COLUMN2"||"COLUMN3"||"COLUMN4"
SYS_STUTCBJ8OFFSY1D9ZXRYZ0W3W#   SYS_OP_COMBINED_HASH("COLUMN1","COLUMN2","COLUMN3")
SYS_NC00010$                     SUBSTR("TEST_USER"."PLSQLFUNCTION_WITH_30CHAR_NAME"("COLUMN1"),1,10)

Apart from the special cases I’ve just listed, you’ll also see the “default values” I specified for column32 and column33. Notice that I supplied a 30 character string as the default for column32 and a 31 character string as the default for column33 – a convenience which means that, once you include the two single quotes in the character count, the space used in data_default (which is a long column) matches the number in the column’s name.

Having set my data up I’m going to emulate a bad application that uses lots of literal string SQL and leaves Oracle to fill in the default values (and, of course, derive the various virtual values it might need).


alter session set events '10046 trace name context forever, level 4';

begin
        for i in 1..10 loop
                execute immediate '
                        insert into t1 (column1, column2, column3, column4)
                        values( ' || i || ', ' || i || ', ' || i || ', ' || i || ')'
                ;
                commit;
        end loop;
end;
/

alter session set events '10046 trace name context off';

This code generates and executes 10 literal-string insert statements that populate column1 through to column4 only. But you’ll notice the call to enable SQL tracing – and here’s the interesting bit of the output from applying tkprof to the trace file:


  SQL ID: 47r1y8yn34jmj Plan Hash: 2191121161

select default$
from
 col$ where rowid=:1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       50      0.00       0.00          0          0          0           0
Execute     50      0.00       0.00          0          0          0           0
Fetch       50      0.00       0.00          0        100          0          50
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      150      0.00       0.00          0        100          0          50

The summary is the same for all three versions of Oracle I tested*** – we’ve queried for a default value from col$ 5 times for each row we insert! (Technically that’s 5 times for each insert statement we’ve had to (hard-)parse; this anomaly wouldn’t appear if we had been using a bind-variable method and reusing the insert statement.)

*** There is one difference in 12.2 – the number of parse calls reported for this statement was 1 rather than 50 but, judging by the various cursor cache hit stats, that may be due to a change in accounting rather than a change in workload.

Check the table definition: there are two “real defaults” and 4 expressions due to the various virtual columns – so why 5 calls per insert and not 6? The answer lies in the length of the actual value involved – if the text that appears in the (long) data_default column is 32 characters or shorter it will be stored in the dictionary cache (rowcache), but only one of our 6 values is that short, so Oracle looks up the other five on each insert (hard parse).
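Counting characters in the data_default values listed earlier makes the 5-out-of-6 figure visible; the following sketch (my own illustration, using the values from the query output above) confirms that only the column32 default fits within 32 characters:

```python
# The six data_default values reported by user_tab_cols for table T1.
defaults = {
    "COLUMN32": "'" + "x" * 30 + "'",          # 32 chars incl. quotes - cached
    "COLUMN33": "'" + "x" * 31 + "'",          # 33 chars - too long
    "VIRTUAL1": '"COLUMN4"||"COLUMN3"||"COLUMN2"||"COLUMN1"',
    "SYS_STUDAN97VB_XDKRTR_NPFAB80P":
        '"COLUMN1"||"COLUMN2"||"COLUMN3"||"COLUMN4"',
    "SYS_STUTCBJ8OFFSY1D9ZXRYZ0W3W#":
        'SYS_OP_COMBINED_HASH("COLUMN1","COLUMN2","COLUMN3")',
    "SYS_NC00010$":
        'SUBSTR("TEST_USER"."PLSQLFUNCTION_WITH_30CHAR_NAME"("COLUMN1"),1,10)',
}

cached = [name for name, text in defaults.items() if len(text) <= 32]
looked_up = [name for name, text in defaults.items() if len(text) > 32]
assert cached == ["COLUMN32"]   # one definition fits the rowcache ...
assert len(looked_up) == 5      # ... five are queried from col$ per hard parse
```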

This is a little strange on two counts: first, given the enormous memories that have been available for years and the extreme usefulness of virtual columns and extended stats it seems something of an oversight to limit the memory available to the cache that holds such critical definitions; secondly, we don’t need to evaluate the virtual columns (unless they are indexed) or extended stats on inserts so why is Oracle querying the definitions anyway ? [Possible answer: perhaps Oracle is using generic code that allows for column check constraints – which might exist on users’ virtual columns – and hasn’t catered for bypassing system-generated virtual columns.]

A key point to remember before you start worrying too much about the impact of the execution count for this query against col$ is that it’s only relevant to “hard” parses – so you’re only likely to notice it if you have a large number of different “literal string” inserts that should be using bind variables; and that means you’ve probably got an important coding defect to address before you worry too much about the extra impact caused by this particular call. Nevertheless there are a couple of bug reports on MoS that have been raised against this query and, after writing this note, I did a quick Google search for the critical SQL_ID and found (among others) this production example from Franck Pachot.

 

Chris Antognini's picture

How Well a Query Optimizer Handles Subqueries?

At the beginning of December, at the UKOUG Tech17 conference in Birmingham (GB), I presented a comparison of the query optimizers of MySQL 8.0.3 and PostgreSQL 10.1. One of the things I talked about is their ability to handle subqueries. I summarized my findings with the following sentence:

Simple sub-queries that are not correctly optimized were observed.

It goes without saying that such a sentence leaves a lot of questions open. After all, it is just a summary. The aim of this post is to show you which subqueries I tested, and to compare my expectations with the execution plans generated by the query optimizers. In addition, since I’m not limited in time and scope as during a 50-minute presentation, I also discuss how the Oracle Database 12.2 query optimizer handles the same queries.

To check how well a query optimizer handles subqueries, it’s in my opinion sufficient to challenge it with queries that should be obvious (at least for a human being). The type of queries where the response time ratio between a good and a bad execution plan is of several orders of magnitude. If a query optimizer isn’t able to correctly handle such queries, with more complex ones it can only be worse…

For the tests I use two very simple tables:

CREATE TABLE small (u INTEGER NOT NULL, nu INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
CREATE UNIQUE INDEX small_u ON small (u);
CREATE INDEX small_nu ON small (nu);
CREATE INDEX small_n ON small (n);
CREATE INDEX small_nn ON small (nn);
CREATE TABLE large (u INTEGER NOT NULL, nu INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL);
CREATE UNIQUE INDEX large_u ON large (u);
CREATE INDEX large_nu ON large (nu);
CREATE INDEX large_n ON large (n);
CREATE INDEX large_nn ON large (nn);

The table “small” contains 10 rows; its unique key contains the integer values between 1 and 10. The table “large” contains 1 million rows; its unique key contains the integer values between 1 and 1 million. Note that for both tables the columns “nu” (not unique), “n” (nullable), and “nn” (not null) contain the same value as the unique key column; the only exception is that the column “n” contains NULL instead of the value 7. Basically, only the constraints applied to the columns differ. In addition, the column “p” contains a string of 128 characters and is only present so that the tables aren’t unrealistically small (and hence unrepresentative).
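As a stand-in for the population scripts (which aren’t shown here), the semantics of the four key columns of “small” can be sketched like this:

```python
# Rows of the "small" table: u is the unique key 1..10, and nu, n, nn
# repeat it -- except that n is NULL (None) where u = 7.
small = [
    {"u": u, "nu": u, "n": (None if u == 7 else u), "nn": u, "p": "x" * 128}
    for u in range(1, 11)
]

assert len(small) == 10
assert [r["u"] for r in small] == list(range(1, 11))
assert small[6]["n"] is None                    # the row with u = 7
assert all(r["nn"] is not None for r in small)  # nn honours its NOT NULL
```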

I considered six types of subqueries:

  • Type A – Scalar subqueries with equality predicate
  • Type B – Scalar subqueries with inequality predicate
  • Type C – Uncorrelated subqueries with either IN or EXISTS
  • Type D – Uncorrelated subqueries with either NOT IN or NOT EXISTS
  • Type E – Correlated subqueries with either IN or EXISTS
  • Type F – Correlated subqueries with either NOT IN or NOT EXISTS

A few notes:

  • For each type I considered two sub-types; the difference between them is given by the position of the tables “small” and “large”
  • I didn’t consider subqueries outside the WHERE clause
  • “IN” could be replaced by either “=ANY” or “=SOME”
  • “NOT IN” could be replaced by “!=ALL”

Type A – Scalar subqueries with equality predicate

Subtype A1 – Table “large” in the subquery

A10: SELECT * FROM small WHERE u = (SELECT nu FROM large WHERE u = 6)
A11: SELECT * FROM small WHERE n = (SELECT n FROM large WHERE u = 6)
A12: SELECT * FROM small WHERE n = (SELECT nn FROM large WHERE u = 6)
A13: SELECT * FROM small WHERE nn = (SELECT n FROM large WHERE u = 6)
A14: SELECT * FROM small WHERE nn = (SELECT nn FROM large WHERE u = 6)

The execution plan I expect for these queries carries out the following operations:

  • Access the table “large” through an index scan that returns at most one value. The operation is executed one single time.
  • Access the table “small” through either a table scan or an index scan and return the rows matching the value returned by the previous operation. The operation is executed one single time.
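For illustration only (this is not how any of the three engines executes the plan), the expected work for A10 can be mimicked with plain data structures — one probe of the unique index on “large”, then a single pass over “small”:

```python
# Model the unique index on large.u as a dict mapping u -> nu: a probe
# touches exactly one entry, like an INDEX UNIQUE SCAN on LARGE_U.
large_u_index = {u: u for u in range(1, 1_000_001)}   # nu repeats u

small = [{"u": u, "nu": u} for u in range(1, 11)]

def run_a10():
    # Step 1 (executed once): SELECT nu FROM large WHERE u = 6
    subquery_value = large_u_index[6]
    # Step 2 (executed once): scan "small", keep rows with u = that value.
    return [row for row in small if row["u"] == subquery_value]

assert run_a10() == [{"u": 6, "nu": 6}]
```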

MySQL selects the following execution plans. All of them fulfill the expectations.

A10

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | PRIMARY     | small | NULL       | const | small_u       | small_u | 4       | const |    1 |   100.00 | NULL  |
|  2 | SUBQUERY    | large | NULL       | const | large_u       | large_u | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

A11/A12

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | small | NULL       | ref   | small_n       | small_n | 5       | const |    1 |   100.00 | Using where |
|  2 | SUBQUERY    | large | NULL       | const | large_u       | large_u | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+

A13/A14

+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | small | NULL       | ref   | small_nn      | small_nn | 4       | const |    1 |   100.00 | Using where |
|  2 | SUBQUERY    | large | NULL       | const | large_u       | large_u  | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

A10

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     1 |   141 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID  | SMALL   |     1 |   141 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN           | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

   2 - access("U"= (SELECT "NU" FROM "LARGE" "LARGE" WHERE "U"=6))
   4 - access("U"=6)

A11

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |   141 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     1 |   141 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SMALL_N |     1 |       |     1   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

   2 - access("N"= (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6))
   4 - access("U"=6)

A12

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |   141 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     1 |   141 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SMALL_N |     1 |       |     1   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

   2 - access("N"= (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6))
   4 - access("U"=6)

A13

------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |   141 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     1 |   141 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SMALL_NN |     1 |       |     1   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | LARGE    |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | LARGE_U  |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

   2 - access("NN"= (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6))
   4 - access("U"=6)

A14

------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |   141 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     1 |   141 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SMALL_NN |     1 |       |     1   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | LARGE    |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | LARGE_U  |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

   2 - access("NN"= (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6))
   4 - access("U"=6)

PostgreSQL selects the following execution plans. All of them fulfill the expectations.

A10

 Seq Scan on small  (cost=8.44..9.57 rows=1 width=148)
   Filter: (u = $0)
   InitPlan 1 (returns $0)
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = 6)

A11/A12

 Seq Scan on small  (cost=8.44..9.57 rows=1 width=148)
   Filter: (n = $0)
   InitPlan 1 (returns $0)
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = 6)

A13/A14

 Seq Scan on small  (cost=8.44..9.57 rows=1 width=148)
   Filter: (nn = $0)
   InitPlan 1 (returns $0)
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = 6)
Subtype A2 – Table “small” in the subquery

A20: SELECT * FROM large WHERE u = (SELECT nu FROM small WHERE u = 6)
A21: SELECT * FROM large WHERE n = (SELECT n FROM small WHERE u = 6)
A22: SELECT * FROM large WHERE n = (SELECT nn FROM small WHERE u = 6)
A23: SELECT * FROM large WHERE nn = (SELECT n FROM small WHERE u = 6)
A24: SELECT * FROM large WHERE nn = (SELECT nn FROM small WHERE u = 6)

The execution plan I expect for these queries carries out the following operations:

  • Access the table “small” through either a table scan or an index scan that returns at most one value. The operation is executed one single time.
  • Access the table “large” through an index scan and return the rows matching the value returned by the previous operation. The operation is executed one single time.

MySQL selects the following execution plans. All of them fulfill the expectations.

A20

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | PRIMARY     | large | NULL       | const | large_u       | large_u | 4       | const |    1 |   100.00 | NULL  |
|  2 | SUBQUERY    | small | NULL       | const | small_u       | small_u | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

A21/A22

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ref   | large_n       | large_n | 5       | const |    1 |   100.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | const | small_u       | small_u | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+

A23/A24

+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ref   | large_nn      | large_nn | 4       | const |    1 |   100.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | const | small_u       | small_u  | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

A20

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     1 |   149 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID  | LARGE   |     1 |   149 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN           | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

   2 - access("U"= (SELECT "NU" FROM "SMALL" "SMALL" WHERE "U"=6))
   4 - access("U"=6)

A21

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |   149 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| LARGE   |     1 |   149 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | LARGE_N |     1 |       |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

   2 - access("N"= (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6))
   4 - access("U"=6)

A22

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |   149 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| LARGE   |     1 |   149 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | LARGE_N |     1 |       |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

   2 - access("N"= (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6))
   4 - access("U"=6)

A23

------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |   149 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |     1 |   149 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | LARGE_NN |     1 |       |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | SMALL    |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | SMALL_U  |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

   2 - access("NN"= (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6))
   4 - access("U"=6)

A24

------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |   149 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |     1 |   149 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | LARGE_NN |     1 |       |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID      | SMALL    |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN               | SMALL_U  |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

   2 - access("NN"= (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6))
   4 - access("U"=6)

PostgreSQL selects the following execution plans. All of them fulfill the expectations.

A20

 Index Scan using large_u on large  (cost=1.55..9.57 rows=1 width=148)
   Index Cond: (u = $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = 6)

A21/A22

 Index Scan using large_n on large  (cost=1.55..9.57 rows=1 width=148)
   Index Cond: (n = $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = 6)

A23/A24

 Index Scan using large_nn on large  (cost=1.55..9.57 rows=1 width=148)
   Index Cond: (nn = $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = 6)

Type B – Scalar subqueries with inequality predicate

Subtype B1 – Table “large” in the subquery

B10: SELECT * FROM small WHERE u != (SELECT nu FROM large WHERE u = 6)
B11: SELECT * FROM small WHERE n != (SELECT n FROM large WHERE u = 6)
B12: SELECT * FROM small WHERE n != (SELECT nn FROM large WHERE u = 6)
B13: SELECT * FROM small WHERE nn != (SELECT n FROM large WHERE u = 6)
B14: SELECT * FROM small WHERE nn != (SELECT nn FROM large WHERE u = 6)

The execution plan I expect for these queries carries out the following operations:

  • Access the table “large” through an index scan that returns at most one value. The operation is executed one single time.
  • Access the table “small” through a table scan and discard the rows matching the value returned by the previous operation. The operation is executed one single time.
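
The two steps above can be sketched in plain Python. This is a minimal illustration of the plan shape, not any engine's implementation; the index structure, the data and the returned value (42) are made-up assumptions.

```python
# Sketch of the expected B1x plan shape: the scalar subquery is evaluated
# one single time (an index lookup on large.u), then the scan of "small"
# discards the rows matching the returned value. Toy data for illustration.

large_u_index = {6: {"nu": 42}}           # unique "index" on large.u -> row

def scalar_subquery(u):
    """Executed one single time; returns at most one value (or none)."""
    row = large_u_index.get(u)
    return row["nu"] if row is not None else None

def b10(small_rows):
    v = scalar_subquery(6)                # the InitPlan / SUBQUERY step
    if v is None:                         # SQL: "u != NULL" is never true
        return []
    return [r for r in small_rows if r["u"] != v]   # table scan of "small"

small = [{"u": u} for u in range(1, 11)]
print(len(b10(small)))                    # 42 matches no row, so all 10 survive
```

The point of the sketch is that the index lookup happens once, before the scan starts, regardless of how many rows "small" contains.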

MySQL selects the following execution plans. All of them fulfill the expectations.

B10

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | small | NULL       | range | small_u       | small_u | 4       | NULL  |    9 |   100.00 | Using where |
|  2 | SUBQUERY    | large | NULL       | const | large_u       | large_u | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+

B11/B12

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | small | NULL       | range | small_n       | small_n | 5       | NULL  |    8 |   100.00 | Using where |
|  2 | SUBQUERY    | large | NULL       | const | large_u       | large_u | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+

B13/B14

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | PRIMARY     | small | NULL       | ALL   | small_nn      | NULL    | NULL    | NULL  |   10 |    90.00 | Using where |
|  2 | SUBQUERY    | large | NULL       | const | large_u       | large_u | 4       | const |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

B10

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |     6   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | SMALL   |     9 |  1269 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("U"<> (SELECT "NU" FROM "LARGE" "LARGE" WHERE "U"=6))
   3 - access("U"=6)

B11

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |     6   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | SMALL   |     9 |  1269 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("N"<> (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6))
   3 - access("U"=6)

B12

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |     6   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | SMALL   |     9 |  1269 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("N"<> (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6))
   3 - access("U"=6)

B13

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |     6   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | SMALL   |     9 |  1269 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("NN"<> (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6))
   3 - access("U"=6)

B14

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |     6   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | SMALL   |     9 |  1269 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("NN"<> (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6))
   3 - access("U"=6)

PostgreSQL selects the following execution plans. All of them fulfill the expectations.

B10

 Seq Scan on small  (cost=8.44..9.57 rows=9 width=148)
   Filter: (u <> $0)
   InitPlan 1 (returns $0)
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = 6)

B11/B12

 Seq Scan on small  (cost=8.44..9.57 rows=8 width=148)
   Filter: (n <> $0)
   InitPlan 1 (returns $0)
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = 6)

B13/B14

 Seq Scan on small  (cost=8.44..9.57 rows=9 width=148)
   Filter: (nn <> $0)
   InitPlan 1 (returns $0)
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = 6)
Subtype B2 – Table “small” in the subquery


B20: SELECT * FROM large WHERE u != (SELECT nu FROM small WHERE u = 6)
B21: SELECT * FROM large WHERE n != (SELECT n FROM small WHERE u = 6)
B22: SELECT * FROM large WHERE n != (SELECT nn FROM small WHERE u = 6)
B23: SELECT * FROM large WHERE nn != (SELECT n FROM small WHERE u = 6)
B24: SELECT * FROM large WHERE nn != (SELECT nn FROM small WHERE u = 6)

The execution plan I expect for these queries carries out the following operations:

  • Access the table “small” through either a table scan or an index scan that returns at most one value. The operation is executed one single time.
  • Access the table “large” through a table scan and discard the rows matching the value returned by the previous operation. The operation is executed one single time.
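
For the B2x shape the same logic can be checked end to end with a toy copy of the two tables in SQLite. The schemas, sizes and values below are assumptions for illustration only; the article's tables are far larger and carry more columns.

```python
import sqlite3

# Toy re-creation of the two tables to run a B21-shaped query end to end.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE small (u INTEGER UNIQUE, n INTEGER, nn INTEGER NOT NULL);
    CREATE TABLE large (u INTEGER UNIQUE, n INTEGER, nn INTEGER NOT NULL);
""")
con.executemany("INSERT INTO small VALUES (?,?,?)",
                [(i, i, i) for i in range(1, 11)])
con.executemany("INSERT INTO large VALUES (?,?,?)",
                [(i, i % 10, i % 10) for i in range(1, 101)])

# B21-shaped: the scalar subquery yields one value (n = 6), evaluated once;
# the outer scan of "large" then discards the rows matching it.
cnt = con.execute(
    "SELECT count(*) FROM large WHERE n != (SELECT n FROM small WHERE u = 6)"
).fetchone()[0]
print(cnt)   # 10 of the 100 rows have n = 6, so 90 rows survive
```
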

MySQL selects the following execution plans. All of them fulfill the expectations.

B20

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | range | large_u       | large_u | 4       | NULL  | 494590 |   100.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | const | small_u       | small_u | 4       | const |      1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+

B21/B22

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ALL   | large_n       | NULL    | NULL    | NULL  | 989170 |    50.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | const | small_u       | small_u | 4       | const |      1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+

B23/B24

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ALL   | large_nn      | NULL    | NULL    | NULL  | 989170 |    50.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | const | small_u       | small_u | 4       | const |      1 |   100.00 | NULL        |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

B20

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  5791   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | LARGE   |   999K|   142M|  5790   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("U"<> (SELECT "NU" FROM "SMALL" "SMALL" WHERE "U"=6))
   3 - access("U"=6)

B21

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  5791   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | LARGE   |   999K|   142M|  5790   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("N"<> (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6))
   3 - access("U"=6)

B22

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  5791   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | LARGE   |   999K|   142M|  5790   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("N"<> (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6))
   3 - access("U"=6)

B23

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  5791   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | LARGE   |   999K|   142M|  5790   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("NN"<> (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6))
   3 - access("U"=6)

B24

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  5791   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL           | LARGE   |   999K|   142M|  5790   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter("NN"<> (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6))
   3 - access("U"=6)

PostgreSQL selects the following execution plans. All of them fulfill the expectations.

B20

 Seq Scan on large  (cost=1.12..34724.12 rows=999999 width=148)
   Filter: (u <> $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = 6)

B21/B22

 Seq Scan on large  (cost=1.12..34724.12 rows=999999 width=148)
   Filter: (n <> $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = 6)

B23/B24

 Seq Scan on large  (cost=1.12..34724.12 rows=999999 width=148)
   Filter: (nn <> $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = 6)

Type C – Uncorrelated subqueries with either IN or EXISTS

Subtype C1 – Table “large” in the subquery

C10: SELECT * FROM small WHERE n IN (SELECT n FROM large)
C11: SELECT * FROM small WHERE n IN (SELECT nn FROM large)
C12: SELECT * FROM small WHERE nn IN (SELECT n FROM large)
C13: SELECT * FROM small WHERE nn IN (SELECT nn FROM large)
C14: SELECT * FROM small WHERE EXISTS (SELECT * FROM large)

The execution plan I expect for the queries C10-C13 carries out a semi-join between two data sets:

  • Access the table “small” through a table scan. The operation is executed one single time.
  • For every row of the table “small,” access the table “large” through an index scan.

However, for the query C14, I expect the following operations:

  • Access the table “large” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
  • If the table “large” contains at least one row, access the table “small” through a table scan. The operation is executed one single time.
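
Both expected shapes can be sketched in a few lines of Python. The "index" is modelled as a set and the data is made up; this only illustrates the access pattern, not any engine's internals.

```python
# Sketch of the two expected C1x shapes: a per-row index probe for the
# semi-join (C10-C13) and an early-exit existence check for C14.

large_n_index = set(range(0, 100, 2))     # "index" on large.n: even values

def c10(small_rows):
    # For every row of "small", probe the index on "large"; a row is kept
    # at the first match (semi-join / FirstMatch semantics).
    return [r for r in small_rows if r["n"] in large_n_index]

def c14(small_rows, large_rows):
    # EXISTS only needs to know whether "large" contains at least one row,
    # so the check stops after fetching the first one.
    has_row = next(iter(large_rows), None) is not None
    return list(small_rows) if has_row else []

small = [{"n": n} for n in range(1, 11)]
print(len(c10(small)), len(c14(small, [{"n": 0}])))   # 5 10
```
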

MySQL selects the following execution plans. All of them fulfill the expectations.

C10

+----+-------------+-------+------------+------+---------------+---------+---------+---------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref           | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_n       | NULL    | NULL    | NULL          |   10 |   100.00 | Using where                    |
|  1 | SIMPLE      | large | NULL       | ref  | large_n       | large_n | 5       | chris.small.n |    1 |   100.00 | Using index; FirstMatch(small) |
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+------+----------+--------------------------------+

C11

+----+-------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key      | key_len | ref           | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_n       | NULL     | NULL    | NULL          |   10 |   100.00 | Using where                    |
|  1 | SIMPLE      | large | NULL       | ref  | large_nn      | large_nn | 4       | chris.small.n |    1 |   100.00 | Using index; FirstMatch(small) |
+----+-------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+--------------------------------+

C12

+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref            | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_nn      | NULL    | NULL    | NULL           |   10 |   100.00 | NULL                           |
|  1 | SIMPLE      | large | NULL       | ref  | large_n       | large_n | 5       | chris.small.nn |    1 |   100.00 | Using index; FirstMatch(small) |
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+--------------------------------+

C13

+----+-------------+-------+------------+------+---------------+----------+---------+----------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key      | key_len | ref            | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+---------------+----------+---------+----------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_nn      | NULL     | NULL    | NULL           |   10 |   100.00 | NULL                           |
|  1 | SIMPLE      | large | NULL       | ref  | large_nn      | large_nn | 4       | chris.small.nn |    1 |   100.00 | Using index; FirstMatch(small) |
+----+-------------+-------+------------+------+---------------+----------+---------+----------------+------+----------+--------------------------------+

C14

+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | small | NULL       | ALL   | NULL          | NULL     | NULL    | NULL |     10 |   100.00 | NULL        |
|  2 | SUBQUERY    | large | NULL       | index | NULL          | large_nu | 4       | NULL | 989170 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+

Oracle Database selects the following execution plans. Even though the execution plans of the queries C10 and C11 don’t completely fulfill the expectations (the table “small” isn’t accessed through a table scan), the difference wouldn’t be noticeable at runtime. Therefore, I consider that all of them fulfill the expectations.

C10

------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |     9 |  1314 |    12   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |         |     9 |  1314 |    12   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |  1269 |     2   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN                   | LARGE_N |  1000K|  4882K|     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

   3 - filter("N" IS NOT NULL)
   4 - access("N"="N")

C11

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |     9 |  1314 |    12   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |          |     9 |  1314 |    12   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     9 |  1269 |     2   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN                   | SMALL_N  |     9 |       |     1   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN                   | LARGE_NN |  1000K|  4882K|     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   3 - filter("N" IS NOT NULL)
   4 - access("N"="NN")

C12

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |    10 |  1460 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI |         |    10 |  1460 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | LARGE_N |  1000K|  4882K|     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

   3 - access("NN"="N")

C13

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |    10 |  1460 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI |          |    10 |  1460 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | LARGE_NN |  1000K|  4882K|     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

   3 - access("NN"="NN")

C14

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |    10 |  1410 |     5   (0)| 00:00:01 |
|*  1 |  FILTER               |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL   | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|   3 |   INDEX FAST FULL SCAN| LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM "LARGE" "LARGE"))

PostgreSQL selects the following execution plans. Only the execution plan of the query C14 fulfills the expectations. That said, in this specific case, the other execution plans perform well. I also checked that the execution plans change when the data changes. So, even though they surprised me, everything looks good.

C10

 Merge Semi Join  (cost=11.52..13.58 rows=9 width=148)
   Merge Cond: (small.n = large.n)
   ->  Index Scan using small_n on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_n on large  (cost=0.42..81855.69 rows=1000000 width=4)

C11

 Merge Semi Join  (cost=11.52..13.58 rows=9 width=148)
   Merge Cond: (small.n = large.nn)
   ->  Index Scan using small_n on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_nn on large  (cost=0.42..81855.69 rows=1000000 width=4)

C12

 Merge Semi Join  (cost=0.56..13.59 rows=10 width=148)
   Merge Cond: (small.nn = large.n)
   ->  Index Scan using small_nn on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_n on large  (cost=0.42..81855.69 rows=1000000 width=4)

C13

 Merge Semi Join  (cost=0.56..13.59 rows=10 width=148)
   Merge Cond: (small.nn = large.nn)
   ->  Index Scan using small_nn on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_nn on large  (cost=0.42..81855.69 rows=1000000 width=4)

C14

 Result  (cost=0.03..1.13 rows=10 width=148)
   One-Time Filter: $0
   InitPlan 1 (returns $0)
     ->  Seq Scan on large  (cost=0.00..32223.00 rows=1000000 width=0)
   ->  Seq Scan on small  (cost=0.03..1.13 rows=10 width=148)
Subtype C2 – Table “small” in the subquery


C20: SELECT * FROM large WHERE n IN (SELECT n FROM small)
C21: SELECT * FROM large WHERE n IN (SELECT nn FROM small)
C22: SELECT * FROM large WHERE nn IN (SELECT n FROM small)
C23: SELECT * FROM large WHERE nn IN (SELECT nn FROM small)
C24: SELECT * FROM large WHERE EXISTS (SELECT * FROM small)

The execution plan I expect for the queries C20-C23 carries out a semi-join between two data sets:

  • Access the table “small” through a table scan. The operation is executed one single time.
  • For every row of the table “small,” access the table “large” through an index scan.

However, for the query C24, I expect the following operations:

  • Access the table “small” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
  • If the table “small” contains at least one row, access the table “large” through a table scan. The operation is executed one single time.
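
The semantics of the C2x queries (not their plans) can be verified with the same toy SQLite copy of the tables; sizes and values are again assumptions for illustration.

```python
import sqlite3

# Toy versions of the two tables to check what the C2x queries return.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE small (u INTEGER UNIQUE, n INTEGER, nn INTEGER NOT NULL);
    CREATE TABLE large (u INTEGER UNIQUE, n INTEGER, nn INTEGER NOT NULL);
""")
con.executemany("INSERT INTO small VALUES (?,?,?)",
                [(i, i, i) for i in range(1, 11)])
con.executemany("INSERT INTO large VALUES (?,?,?)",
                [(i, i % 20, i % 20) for i in range(1, 101)])

# C20-shaped: semi-join, keeps rows of "large" whose n matches any small.n
c20 = con.execute(
    "SELECT count(*) FROM large WHERE n IN (SELECT n FROM small)"
).fetchone()[0]
# C24-shaped: every row of "large" qualifies as long as "small" is not empty
c24 = con.execute(
    "SELECT count(*) FROM large WHERE EXISTS (SELECT * FROM small)"
).fetchone()[0]
print(c20, c24)   # 50 100
```

Each cycle of 20 values of large.n contains small.n's values 1-10 exactly once, hence 50 matching rows; the EXISTS predicate is true once for the whole query, so all 100 rows come back.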

MySQL selects the following execution plans. All of them fulfill the expectations.

C20

+----+-------------+-------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref           | rows | filtered | Extra                               |
+----+-------------+-------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------------------------------+
|  1 | SIMPLE      | small | NULL       | index | small_n       | small_n | 5       | NULL          |   10 |   100.00 | Using where; Using index; LooseScan |
|  1 | SIMPLE      | large | NULL       | ref   | large_n       | large_n | 5       | chris.small.n |    1 |   100.00 | NULL                                |
+----+-------------+-------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------------------------------+

C21

+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref            | rows | filtered | Extra                  |
+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+
|  1 | SIMPLE      | small | NULL       | index | small_nn      | small_nn | 4       | NULL           |   10 |   100.00 | Using index; LooseScan |
|  1 | SIMPLE      | large | NULL       | ref   | large_n       | large_n  | 5       | chris.small.nn |    1 |   100.00 | NULL                   |
+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+

C22

+----+-------------+-------+------------+-------+---------------+----------+---------+---------------+------+----------+-------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref           | rows | filtered | Extra                               |
+----+-------------+-------+------------+-------+---------------+----------+---------+---------------+------+----------+-------------------------------------+
|  1 | SIMPLE      | small | NULL       | index | small_n       | small_n  | 5       | NULL          |   10 |   100.00 | Using where; Using index; LooseScan |
|  1 | SIMPLE      | large | NULL       | ref   | large_nn      | large_nn | 4       | chris.small.n |    1 |   100.00 | NULL                                |
+----+-------------+-------+------------+-------+---------------+----------+---------+---------------+------+----------+-------------------------------------+

C23

+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref            | rows | filtered | Extra                  |
+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+
|  1 | SIMPLE      | small | NULL       | index | small_nn      | small_nn | 4       | NULL           |   10 |   100.00 | Using index; LooseScan |
|  1 | SIMPLE      | large | NULL       | ref   | large_nn      | large_nn | 4       | chris.small.nn |    1 |   100.00 | NULL                   |
+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+

C24

+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ALL   | NULL          | NULL     | NULL    | NULL | 989170 |   100.00 | NULL        |
|  2 | SUBQUERY    | small | NULL       | index | NULL          | small_nu | 4       | NULL |     10 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

C20

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1368 |    18   (6)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |     9 |  1368 |    18   (6)| 00:00:01 |
|   2 |   NESTED LOOPS               |         |     9 |  1368 |    18   (6)| 00:00:01 |
|   3 |    SORT UNIQUE               |         |     9 |    27 |     1   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN          | SMALL_N |     9 |    27 |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | LARGE_N |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |   149 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   4 - filter("N" IS NOT NULL)
   5 - access("N"="N")

C21

-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |    10 |  1520 |    18   (6)| 00:00:01 |
|   1 |  NESTED LOOPS                |          |    10 |  1520 |    18   (6)| 00:00:01 |
|   2 |   NESTED LOOPS               |          |    10 |  1520 |    18   (6)| 00:00:01 |
|   3 |    SORT UNIQUE               |          |    10 |    30 |     1   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN          | SMALL_NN |    10 |    30 |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | LARGE_N  |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| LARGE    |     1 |   149 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

   5 - access("N"="NN")

C22

-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |     9 |  1368 |    13   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                |          |     9 |  1368 |    13   (8)| 00:00:01 |
|   2 |   NESTED LOOPS               |          |     9 |  1368 |    13   (8)| 00:00:01 |
|   3 |    SORT UNIQUE               |          |     9 |    27 |     1   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN          | SMALL_N  |     9 |    27 |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| LARGE    |     1 |   149 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

   4 - filter("N" IS NOT NULL)
   5 - access("NN"="N")

C23

-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |    10 |  1520 |    13   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                |          |    10 |  1520 |    13   (8)| 00:00:01 |
|   2 |   NESTED LOOPS               |          |    10 |  1520 |    13   (8)| 00:00:01 |
|   3 |    SORT UNIQUE               |          |    10 |    30 |     1   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN          | SMALL_NN |    10 |    30 |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| LARGE    |     1 |   149 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

   5 - access("NN"="NN")

C24

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |  1000K|   142M|  5789   (1)| 00:00:01 |
|*  1 |  FILTER            |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL| LARGE    |  1000K|   142M|  5788   (1)| 00:00:01 |
|   3 |   INDEX FULL SCAN  | SMALL_NN |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM "SMALL" "SMALL"))

PostgreSQL selects the following execution plans. All of them fulfill the expectations.

C20

 Merge Semi Join  (cost=1.74..2.59 rows=9 width=148)
   Merge Cond: (large.n = small.n)
   ->  Index Scan using large_n on large  (cost=0.42..81855.69 rows=1000000 width=148)
   ->  Sort  (cost=1.27..1.29 rows=10 width=4)
         Sort Key: small.n
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

C21

 Merge Semi Join  (cost=1.69..2.60 rows=10 width=148)
   Merge Cond: (large.n = small.nn)
   ->  Index Scan using large_n on large  (cost=0.42..81855.69 rows=1000000 width=148)
   ->  Sort  (cost=1.27..1.29 rows=10 width=4)
         Sort Key: small.nn
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

C22

 Merge Semi Join  (cost=1.74..2.59 rows=9 width=148)
   Merge Cond: (large.nn = small.n)
   ->  Index Scan using large_nn on large  (cost=0.42..81855.69 rows=1000000 width=148)
   ->  Sort  (cost=1.27..1.29 rows=10 width=4)
         Sort Key: small.n
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

C23

 Merge Semi Join  (cost=1.69..2.60 rows=10 width=148)
   Merge Cond: (large.nn = small.nn)
   ->  Index Scan using large_nn on large  (cost=0.42..81855.69 rows=1000000 width=148)
   ->  Sort  (cost=1.27..1.29 rows=10 width=4)
         Sort Key: small.nn
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

C24

 Result  (cost=0.11..32223.11 rows=1000000 width=148)
   One-Time Filter: $0
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=0)
   ->  Seq Scan on large  (cost=0.11..32223.11 rows=1000000 width=148)

Type D – Uncorrelated subqueries with either NOT IN or NOT EXISTS

Subtype D1 – Table “large” in the subquery

D10: SELECT * FROM small WHERE n NOT IN (SELECT n FROM large)
D11: SELECT * FROM small WHERE n NOT IN (SELECT nn FROM large)
D12: SELECT * FROM small WHERE nn NOT IN (SELECT n FROM large)
D13: SELECT * FROM small WHERE nn NOT IN (SELECT nn FROM large)
D14: SELECT * FROM small WHERE NOT EXISTS (SELECT * FROM large)

The execution plan I expect for the queries D10-D13 carries out an anti-join between two data sets:

  • Access the table “small” through a table scan. The operation is executed one single time.
  • For every row of the table “small,” access the index associated with the referenced column of the table “large” to check whether rows that fulfill the WHERE clause exist.

However, for the query D14, I expect the following operations:

  • Access the table “large” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
  • If the table “large” contains no row, access the table “small” through a table scan. The operation is executed one single time.
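The reason nullable columns get in the optimizer's way for NOT IN is SQL's three-valued logic: a single NULL produced by the subquery makes every comparison UNKNOWN, so NOT IN can never be true, while NOT EXISTS is unaffected. A minimal sketch with SQLite from Python (hypothetical two-row tables, not the test schema used in this article) shows the difference that forces engines into null-aware anti-joins:

```python
import sqlite3

# Tiny stand-ins for the "small" and "large" tables, just to
# illustrate the NULL semantics behind the D10-D13 plans.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE small (n INTEGER);
    CREATE TABLE large (n INTEGER);
    INSERT INTO small VALUES (1), (2);
    INSERT INTO large VALUES (2), (NULL);
""")

# NOT IN: the NULL in the subquery makes every comparison UNKNOWN,
# so no row of "small" can qualify -- the result is empty.
not_in = con.execute(
    "SELECT n FROM small WHERE n NOT IN (SELECT n FROM large)"
).fetchall()

# NOT EXISTS: the NULL in "large" simply never matches, so the
# anti-join still returns the unmatched row.
not_exists = con.execute(
    "SELECT n FROM small s WHERE NOT EXISTS "
    "(SELECT 1 FROM large l WHERE l.n = s.n)"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(1,)]
```

This is why the two predicates need different plan shapes (e.g. Oracle's null-aware HASH JOIN ANTI NA) even though they look interchangeable at first glance.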

MySQL selects the following execution plans. All of them fulfill the expectations.

D10

+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+-------------------------------------------------+
| id | select_type        | table | partitions | type           | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                           |
+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+-------------------------------------------------+
|  1 | PRIMARY            | small | NULL       | ALL            | NULL          | NULL    | NULL    | NULL |   10 |   100.00 | Using where                                     |
|  2 | DEPENDENT SUBQUERY | large | NULL       | index_subquery | large_n       | large_n | 5       | func |    2 |   100.00 | Using where; Using index; Full scan on NULL key |
+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+-------------------------------------------------+

D11

+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------------------------------------------+
| id | select_type        | table | partitions | type           | possible_keys | key      | key_len | ref  | rows | filtered | Extra                                           |
+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------------------------------------------+
|  1 | PRIMARY            | small | NULL       | ALL            | NULL          | NULL     | NULL    | NULL |   10 |   100.00 | Using where                                     |
|  2 | DEPENDENT SUBQUERY | large | NULL       | index_subquery | large_nn      | large_nn | 4       | func |    1 |   100.00 | Using where; Using index; Full scan on NULL key |
+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------------------------------------------+

D12

+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type        | table | partitions | type           | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY            | small | NULL       | ALL            | NULL          | NULL    | NULL    | NULL |   10 |   100.00 | Using where              |
|  2 | DEPENDENT SUBQUERY | large | NULL       | index_subquery | large_n       | large_n | 5       | func |    2 |   100.00 | Using where; Using index |
+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+--------------------------+

D13

+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type        | table | partitions | type           | possible_keys | key      | key_len | ref  | rows | filtered | Extra       |
+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL            | NULL          | NULL     | NULL    | NULL |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | index_subquery | large_nn      | large_nn | 4       | func |    1 |   100.00 | Using index |
+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------+

D14

+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+------------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref  | rows   | filtered | Extra            |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+------------------+
|  1 | PRIMARY     | NULL  | NULL       | NULL  | NULL          | NULL     | NULL    | NULL |   NULL |     NULL | Impossible WHERE |
|  2 | SUBQUERY    | large | NULL       | index | NULL          | large_nu | 4       | NULL | 989170 |   100.00 | Using index      |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+------------------+

Oracle Database selects the following execution plans. The execution plans of the queries D13 and D14 fulfill the expectations. The others, because of the table scan on “large”, are suboptimal. Note that for the query D12 the query optimizer is restricted by the fact that the “large_n” index doesn’t contain NULL values. I consider this a physical database design issue, not a query optimizer issue. However, the execution plans of the queries D10 and D11 are suboptimal because of a query optimizer limitation with nullable values.

D10

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |   146 |  5793   (1)| 00:00:01 |
|*  1 |  HASH JOIN ANTI NA |       |     1 |   146 |  5793   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL |    10 |  1410 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| LARGE |  1000K|  4882K|  5787   (1)| 00:00:01 |
----------------------------------------------------------------------------

   1 - access("N"="N")

D11

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |   146 |   616   (2)| 00:00:01 |
|*  1 |  HASH JOIN ANTI SNA   |          |     1 |   146 |   616   (2)| 00:00:01 |
|   2 |   TABLE ACCESS FULL   | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|   3 |   INDEX FAST FULL SCAN| LARGE_NN |  1000K|  4882K|   610   (1)| 00:00:01 |
----------------------------------------------------------------------------------

   1 - access("N"="NN")

D12

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |   146 |  5793   (1)| 00:00:01 |
|*  1 |  HASH JOIN ANTI NA |       |     1 |   146 |  5793   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL |    10 |  1410 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| LARGE |  1000K|  4882K|  5787   (1)| 00:00:01 |
----------------------------------------------------------------------------

   1 - access("NN"="N")

D13

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |   146 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI |          |     1 |   146 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | LARGE_NN |   900K|  4394K|     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

   3 - access("NN"="NN")

D14

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |    10 |  1410 |     5   (0)| 00:00:01 |
|*  1 |  FILTER               |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL   | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|   3 |   INDEX FAST FULL SCAN| LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE"))

PostgreSQL selects the following execution plans. Only the execution plan of the query D14 fulfills the expectations. The others, because of the table scan on “large” and its materialization, are really bad.

D10-D13

 Seq Scan on small  (cost=0.00..218151.12 rows=5 width=148)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Materialize  (cost=0.00..41130.00 rows=1000000 width=4)
           ->  Seq Scan on large  (cost=0.00..32223.00 rows=1000000 width=4)

D14

 Result  (cost=0.03..1.13 rows=10 width=148)
   One-Time Filter: (NOT $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on large  (cost=0.00..32223.00 rows=1000000 width=0)
   ->  Seq Scan on small  (cost=0.03..1.13 rows=10 width=148)

Subtype D2 – Table “small” in the subquery

D20: SELECT * FROM large WHERE n NOT IN (SELECT n FROM small)
D21: SELECT * FROM large WHERE n NOT IN (SELECT nn FROM small)
D22: SELECT * FROM large WHERE nn NOT IN (SELECT n FROM small)
D23: SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small)
D24: SELECT * FROM large WHERE NOT EXISTS (SELECT * FROM small)

The execution plan I expect for the queries D20-D23 carries out an anti-join between two data sets:

  • Access the table “small” through a table scan and put the resulting data into a memory structure. This operation is executed one single time.
  • Access the table “large” through a table scan and, for every row, check the memory structure created by the previous operation to find out whether rows that fulfill the WHERE clause exist. This operation is executed one single time.

However, for the query D24, I expect the following operations:

  • Access the table “small” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
  • If the table “small” contains no row, access the table “large” through a table scan. The operation is executed one single time.
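The two steps I expect for D20-D23 amount to a hash anti-join: hash the small input once, then scan the large input once, probing per row. A minimal Python sketch of the strategy (illustrative only; the function and data are made up, and real NOT IN plans additionally need null-aware handling):

```python
def hash_anti_join(large_rows, small_keys, key):
    """Return rows of `large_rows` whose join key has no match in
    `small_keys`.

    Mirrors the expected D20-D23 strategy: build a hash structure
    from the small input in one pass, then scan the large input in
    one pass, probing the structure for every row.
    """
    lookup = set(small_keys)           # one pass over "small"
    return [row for row in large_rows  # one pass over "large"
            if row[key] not in lookup]

# Hypothetical sample data standing in for the two tables.
large = [{"id": 1, "n": 10}, {"id": 2, "n": 20}, {"id": 3, "n": 30}]
small = [10, 30]

print(hash_anti_join(large, small, "n"))  # [{'id': 2, 'n': 20}]
```

This is essentially what PostgreSQL's “Filter: (NOT (hashed SubPlan 1))” and Oracle's HASH JOIN RIGHT ANTI express in their plans.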

MySQL selects the following execution plans. All of them fulfill the expectations (what isn’t visible in the execution plans is that for the queries D20-D23 the result set generated by the subquery is materialized).

D20/D22

+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ALL   | NULL          | NULL    | NULL    | NULL | 989170 |   100.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | index | small_n       | small_n | 5       | NULL |     10 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+

D21/D23

+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | large | NULL       | ALL   | NULL          | NULL     | NULL    | NULL | 989170 |   100.00 | Using where |
|  2 | SUBQUERY    | small | NULL       | index | small_nn      | small_nn | 4       | NULL |     10 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+

D24

+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------+
| id | select_type | table | partitions | type  | possible_keys | key      | key_len | ref  | rows | filtered | Extra            |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------+
|  1 | PRIMARY     | NULL  | NULL       | NULL  | NULL          | NULL     | NULL    | NULL | NULL |     NULL | Impossible WHERE |
|  2 | SUBQUERY    | small | NULL       | index | NULL          | small_nu | 4       | NULL |   10 |   100.00 | Using index      |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

D20

---------------------------------------------------------------------------------
| Id  | Operation               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |       |   999K|   144M|  5795   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI NA|       |   999K|   144M|  5795   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL     | SMALL |    10 |    30 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL     | LARGE |  1000K|   142M|  5788   (1)| 00:00:01 |
---------------------------------------------------------------------------------

   1 - access("N"="N")

D21

-------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |   999K|   144M|  5793   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI SNA|          |   999K|   144M|  5793   (1)| 00:00:01 |
|   2 |   INDEX FULL SCAN        | SMALL_NN |    10 |    30 |     1   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL      | LARGE    |  1000K|   142M|  5788   (1)| 00:00:01 |
-------------------------------------------------------------------------------------

   1 - access("N"="NN")

D22

---------------------------------------------------------------------------------
| Id  | Operation               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |       |   999K|   144M|  5795   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI NA|       |   999K|   144M|  5795   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL     | SMALL |    10 |    30 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL     | LARGE |  1000K|   142M|  5788   (1)| 00:00:01 |
---------------------------------------------------------------------------------

   1 - access("NN"="N")

D23

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |   999K|   144M|  5793   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI|          |   999K|   144M|  5793   (1)| 00:00:01 |
|   2 |   INDEX FULL SCAN    | SMALL_NN |    10 |    30 |     1   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL  | LARGE    |  1000K|   142M|  5788   (1)| 00:00:01 |
---------------------------------------------------------------------------------

   1 - access("NN"="NN")

D24

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |  1000K|   142M|  5789   (1)| 00:00:01 |
|*  1 |  FILTER            |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL| LARGE    |  1000K|   142M|  5788   (1)| 00:00:01 |
|   3 |   INDEX FULL SCAN  | SMALL_NN |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL"))

PostgreSQL selects the following execution plans. All of them fulfill the expectations.

D20-D23

 Seq Scan on large  (cost=1.12..34724.12 rows=500000 width=148)
   Filter: (NOT (hashed SubPlan 1))
   SubPlan 1
     ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

D24

 Result  (cost=0.11..32223.11 rows=1000000 width=148)
   One-Time Filter: (NOT $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=0)
   ->  Seq Scan on large  (cost=0.11..32223.11 rows=1000000 width=148)

Type E – Correlated subqueries with either IN or EXISTS

Subtype E1 – Table “large” in the subquery

E10: SELECT * FROM small WHERE n IN (SELECT n FROM large WHERE large.u = small.u)
E11: SELECT * FROM small WHERE n IN (SELECT nn FROM large WHERE large.u = small.u)
E12: SELECT * FROM small WHERE nn IN (SELECT n FROM large WHERE large.u = small.u)
E13: SELECT * FROM small WHERE nn IN (SELECT nn FROM large WHERE large.u = small.u)
E14: SELECT * FROM small WHERE n IN (SELECT n FROM large WHERE large.nu = small.u)
E15: SELECT * FROM small WHERE n IN (SELECT nn FROM large WHERE large.nu = small.u)
E16: SELECT * FROM small WHERE nn IN (SELECT n FROM large WHERE large.nu = small.u)
E17: SELECT * FROM small WHERE nn IN (SELECT nn FROM large WHERE large.nu = small.u)
E18: SELECT * FROM small WHERE EXISTS (SELECT * FROM large WHERE large.u = small.u)
E19: SELECT * FROM small WHERE EXISTS (SELECT * FROM large WHERE large.nu = small.u)

The execution plan I expect for these queries is a join between two data sets:

  • Access the table “small” through a table scan. The operation is executed one single time.
  • For every row of the table “small,” access the table “large” through an index scan and check whether at least one row fulfills the WHERE clause.

Note that a semi-join is necessary only for the queries E14-E17 and E19. For the others, because the WHERE clause in the subquery references a unique column of the table “large”, a “regular” join can take place.
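The distinction can be sketched in Python: a semi-join is just a join that stops probing after the first match (MySQL's FirstMatch), and when the probed index is unique at most one match exists anyway, so the semi-join degenerates into a regular join. The helper and data below are hypothetical:

```python
def exists_filter(small_rows, large_index, key):
    """Keep the rows of `small_rows` for which at least one matching
    row of "large" exists -- a semi-join with FirstMatch-style early
    exit.

    `large_index` maps a key value to the list of matching rows,
    playing the role of the large_u / large_nu indexes in the plans
    above. If the index is unique (at most one row per key), no
    duplicates can arise, so a plain join gives the same result.
    """
    return [row for row in small_rows if large_index.get(row[key])]

# Hypothetical data: "u" is unique in "large".
small = [{"u": 1}, {"u": 2}, {"u": 5}]
large_u_index = {1: [{"u": 1}], 2: [{"u": 2}], 3: [{"u": 3}]}

print(exists_filter(small, large_u_index, "u"))  # [{'u': 1}, {'u': 2}]
```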

MySQL selects the following execution plans. All of them except the one of the query E18 fulfill the expectations: for E18 the query optimizer does not recognize that no semi-join is necessary.

E10

+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys   | key     | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_u,small_n | NULL    | NULL    | NULL          |   10 |   100.00 | NULL        |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_n | large_u | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+

E11

+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_u,small_n  | NULL    | NULL    | NULL          |   10 |   100.00 | NULL        |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_nn | large_u | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+

E12

+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_u,small_nn | NULL    | NULL    | NULL          |   10 |   100.00 | NULL        |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_n  | large_u | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+

E13

+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_u,small_nn | NULL    | NULL    | NULL          |   10 |   100.00 | NULL        |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_nn | large_u | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+

E14

+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys    | key      | key_len | ref           | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_u,small_n  | NULL     | NULL    | NULL          |   10 |   100.00 | NULL                           |
|  1 | SIMPLE      | large | NULL       | ref  | large_nu,large_n | large_nu | 4       | chris.small.u |    1 |     5.00 | Using where; FirstMatch(small) |
+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+

E15

+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys     | key      | key_len | ref           | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_u,small_n   | NULL     | NULL    | NULL          |   10 |   100.00 | NULL                           |
|  1 | SIMPLE      | large | NULL       | ref  | large_nu,large_nn | large_nu | 4       | chris.small.u |    1 |     5.00 | Using where; FirstMatch(small) |
+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+

E16

+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys    | key      | key_len | ref           | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_u,small_nn | NULL     | NULL    | NULL          |   10 |   100.00 | NULL                           |
|  1 | SIMPLE      | large | NULL       | ref  | large_nu,large_n | large_nu | 4       | chris.small.u |    1 |     5.00 | Using where; FirstMatch(small) |
+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+

E17

+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys     | key      | key_len | ref           | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+
|  1 | SIMPLE      | small | NULL       | ALL  | small_u,small_nn  | NULL     | NULL    | NULL          |   10 |   100.00 | NULL                           |
|  1 | SIMPLE      | large | NULL       | ref  | large_nu,large_nn | large_nu | 4       | chris.small.u |    1 |     5.00 | Using where; FirstMatch(small) |
+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+

E18

+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL    | NULL          | NULL    | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | eq_ref | large_u       | large_u | 4       | chris.small.u |    1 |   100.00 | Using index |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+

E19

+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys | key      | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL  | NULL          | NULL     | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | ref  | large_nu      | large_nu | 4       | chris.small.u |    1 |   100.00 | Using index |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+

Oracle Database selects the following execution plans. Even though the execution plans of queries E10/E11/E14/E15 don’t completely fulfill the expectations (the table “small” isn’t accessed through a table scan), the difference wouldn’t be noticeable at runtime. Therefore, I consider that all of them fulfill the expectations.

E10

-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |     9 |  1359 |    20   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                         |         |     9 |  1359 |    20   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                        |         |     9 |  1359 |    20   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |  1269 |     2   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN                  | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID         | LARGE   |     1 |    10 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   4 - filter("N" IS NOT NULL)
   5 - access("LARGE"."U"="SMALL"."U")
   6 - filter("N"="N")

E11

-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |     9 |  1359 |    20   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                         |         |     9 |  1359 |    20   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                        |         |     9 |  1359 |    20   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |  1269 |     2   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN                  | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID         | LARGE   |     1 |    10 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   4 - filter("N" IS NOT NULL)
   5 - access("LARGE"."U"="SMALL"."U")
   6 - filter("N"="NN")

E12

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    10 |  1510 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |    10 |  1510 |    23   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |         |    10 |  1510 |    23   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   4 - access("LARGE"."U"="SMALL"."U")
   5 - filter("NN"="N")

E13

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    10 |  1510 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |    10 |  1510 |    23   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |         |    10 |  1510 |    23   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   4 - access("LARGE"."U"="SMALL"."U")
   5 - filter("NN"="NN")

E14

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |     9 |  1359 |    29   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |          |     9 |  1359 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     9 |  1269 |     2   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN                   | SMALL_N  |     9 |       |     1   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |  1000K|  9765K|     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | LARGE_NU |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   3 - filter("N" IS NOT NULL)
   4 - filter("N"="N")
   5 - access("LARGE"."NU"="SMALL"."U")

E15

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |     9 |  1359 |    21   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |          |     9 |  1359 |    21   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     9 |  1269 |     2   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN                   | SMALL_N  |     9 |       |     1   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |  1000K|  9765K|     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   3 - filter("N" IS NOT NULL)
   4 - filter("LARGE"."NU"="SMALL"."U")
   5 - access("N"="NN")

E16

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |    10 |  1510 |    33   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |          |    10 |  1510 |    33   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                  | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |  1000K|  9765K|     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | LARGE_NU |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   3 - filter("NN"="N")
   4 - access("LARGE"."NU"="SMALL"."U")

E17

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |    10 |  1510 |    33   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |          |    10 |  1510 |    33   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                  | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |  1000K|  9765K|     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   3 - filter("LARGE"."NU"="SMALL"."U")
   4 - access("NN"="NN")

E18

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |    10 |  1460 |    13   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |         |    10 |  1460 |    13   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| LARGE_U |     1 |     5 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

   3 - access("LARGE"."U"="SMALL"."U")

E19

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |    10 |  1460 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI |          |    10 |  1460 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | LARGE_NU |  1000K|  4882K|     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

   3 - access("LARGE"."NU"="SMALL"."U")

PostgreSQL selects the following execution plans. Except for the execution plan of query E19, all of them fulfill the expectations. Note that, in this specific case, the execution plan selected for query E19 performs well nevertheless. I checked that when the data changes, the execution plan changes as well. So, even though it surprised me, it is a good choice.

E10-E13

 Seq Scan on small  (cost=0.00..45.47 rows=5 width=148)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = small.u)

E14-E17

 Seq Scan on small  (cost=0.00..45.47 rows=5 width=148)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Index Scan using large_nu on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (nu = small.u)

E18

 Merge Join  (cost=1.69..2.60 rows=10 width=148)
   Merge Cond: (large.u = small.u)
   ->  Index Only Scan using large_u on large  (cost=0.42..81855.69 rows=1000000 width=4)
   ->  Sort  (cost=1.27..1.29 rows=10 width=148)
         Sort Key: small.u
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=148)

E19

 Merge Semi Join  (cost=0.56..13.59 rows=10 width=148)
   Merge Cond: (small.u = large.nu)
   ->  Index Scan using small_u on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_nu on large  (cost=0.42..81855.69 rows=1000000 width=4)
Subtype E2 – Table “small” in the subquery

E20: SELECT * FROM large WHERE n IN (SELECT n FROM small WHERE small.u = large.u)
E21: SELECT * FROM large WHERE n IN (SELECT nn FROM small WHERE small.u = large.u)
E22: SELECT * FROM large WHERE nn IN (SELECT n FROM small WHERE small.u = large.u)
E23: SELECT * FROM large WHERE nn IN (SELECT nn FROM small WHERE small.u = large.u)
E24: SELECT * FROM large WHERE n IN (SELECT n FROM small WHERE small.nu = large.u)
E25: SELECT * FROM large WHERE n IN (SELECT nn FROM small WHERE small.nu = large.u)
E26: SELECT * FROM large WHERE nn IN (SELECT n FROM small WHERE small.nu = large.u)
E27: SELECT * FROM large WHERE nn IN (SELECT nn FROM small WHERE small.nu = large.u)
E28: SELECT * FROM large WHERE EXISTS (SELECT * FROM small WHERE small.u = large.u)
E29: SELECT * FROM large WHERE EXISTS (SELECT * FROM small WHERE small.nu = large.u)

The execution plan I expect for these queries is a join between two data sets:

  • Access the table “small” through a table scan. The operation is executed only once.
  • For every row of the table “small”, access the table “large” through an index scan and select the rows that fulfill the WHERE clause.

Note that a semi-join is necessary only for queries E24-E27/E29. For the others, since the WHERE clause in the subquery references a unique value in the table “small”, a “regular” join can take place.
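The distinction can be checked with a minimal sqlite3 sketch. This is a hypothetical three-row cut-down of the “small”/“large” schema, not the article’s test data: with a unique correlation column the IN-subquery is row-for-row equivalent to a plain join, while a non-unique correlation lets the plain join duplicate outer rows, which is exactly what the semi-join’s deduplication prevents.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE small (u INTEGER UNIQUE, nu INTEGER, n INTEGER);
    CREATE TABLE large (u INTEGER UNIQUE, n INTEGER);
    -- two small rows share nu = 1 and n = 1, so the non-unique
    -- correlation below can match one large row twice
    INSERT INTO small VALUES (1, 1, 1), (2, 1, 2), (3, 1, 1);
    INSERT INTO large VALUES (1, 1), (2, 2), (3, 3);
""")

# Unique correlation (small.u, as in E20): at most one subquery row can
# match each outer row, so a plain join returns exactly the IN result.
in_unique = con.execute("""
    SELECT large.u FROM large
    WHERE n IN (SELECT n FROM small WHERE small.u = large.u)
    ORDER BY 1""").fetchall()
join_unique = con.execute("""
    SELECT large.u FROM large
    JOIN small ON small.u = large.u AND small.n = large.n
    ORDER BY 1""").fetchall()
assert in_unique == join_unique == [(1,), (2,)]

# Non-unique correlation (small.nu, as in E24): the plain join can
# duplicate outer rows, so the optimizer must deduplicate (semi-join).
in_nonunique = con.execute("""
    SELECT large.u FROM large
    WHERE n IN (SELECT n FROM small WHERE small.nu = large.u)""").fetchall()
join_nonunique = con.execute("""
    SELECT large.u FROM large
    JOIN small ON small.nu = large.u AND small.n = large.n""").fetchall()
assert in_nonunique == [(1,)]      # IN semantics: each large row at most once
assert len(join_nonunique) == 2    # plain join duplicates large.u = 1
```

This is why the optimizers are free to pick a regular NESTED LOOPS for E20-E23 but need NESTED LOOPS SEMI (or MySQL’s Start/End temporary strategy) for the non-unique variants.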

MySQL selects the following execution plans. The ones for queries E20-E27 fulfill the expectations. The execution plans of the others (E28/E29) are really bad, because the table “large” is accessed through a full table scan.

E20

+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys   | key     | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | index  | small_u,small_n | small_n | 5       | NULL          |   10 |   100.00 | Using index |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_n | large_u | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+

E21

+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys    | key      | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | index  | small_u,small_nn | small_nn | 4       | NULL          |   10 |   100.00 | Using index |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_n  | large_u  | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+

E22

+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | index  | small_u,small_n  | small_n | 5       | NULL          |   10 |   100.00 | Using index |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_nn | large_u | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+

E23

+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys    | key      | key_len | ref           | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | small | NULL       | index  | small_u,small_nn | small_nn | 4       | NULL          |   10 |   100.00 | Using index |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_nn | large_u  | 4       | chris.small.u |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+

E24

+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+
| id | select_type | table | partitions | type   | possible_keys    | key     | key_len | ref            | rows | filtered | Extra                      |
+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_nu,small_n | NULL    | NULL    | NULL           |   10 |   100.00 | Start temporary            |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_n  | large_u | 4       | chris.small.nu |    1 |     5.00 | Using where; End temporary |
+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+

E25

+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+
| id | select_type | table | partitions | type   | possible_keys     | key     | key_len | ref            | rows | filtered | Extra                      |
+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_nu,small_nn | NULL    | NULL    | NULL           |   10 |   100.00 | Start temporary            |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_n   | large_u | 4       | chris.small.nu |    1 |     5.00 | Using where; End temporary |
+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+

E26

+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+
| id | select_type | table | partitions | type   | possible_keys    | key     | key_len | ref            | rows | filtered | Extra                      |
+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_nu,small_n | NULL    | NULL    | NULL           |   10 |   100.00 | Start temporary            |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_nn | large_u | 4       | chris.small.nu |    1 |     5.00 | Using where; End temporary |
+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+

E27

+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+
| id | select_type | table | partitions | type   | possible_keys     | key     | key_len | ref            | rows | filtered | Extra                      |
+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+
|  1 | SIMPLE      | small | NULL       | ALL    | small_nu,small_nn | NULL    | NULL    | NULL           |   10 |   100.00 | Start temporary            |
|  1 | SIMPLE      | large | NULL       | eq_ref | large_u,large_nn  | large_u | 4       | chris.small.nu |    1 |     5.00 | Using where; End temporary |
+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+

E28

+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys | key     | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL    | NULL          | NULL    | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | eq_ref | small_u       | small_u | 4       | chris.large.u |      1 |   100.00 | Using index |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+

E29

+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys | key      | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL  | NULL          | NULL     | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | ref  | small_nu      | small_nu | 4       | chris.large.u |      1 |   100.00 | Using index |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+

Oracle Database selects the following execution plans. Even though the execution plans of queries E20-E27 don’t completely fulfill the expectations (the table “small” isn’t accessed through a table scan), the difference wouldn’t be noticeable at runtime. Therefore, I consider that all of them fulfill the expectations.

E20

-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |     9 |  1395 |    20   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                         |         |     9 |  1395 |    20   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                        |         |     9 |  1395 |    20   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |    54 |     2   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN                  | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID         | LARGE   |     1 |   149 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   4 - filter("N" IS NOT NULL)
   5 - access("SMALL"."U"="LARGE"."U")
   6 - filter("N"="N")

E21

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    10 |  1550 |    22   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                  |    10 |  1550 |    22   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |                  |    10 |  1550 |    22   (0)| 00:00:01 |
|   3 |    VIEW                      | index$_join$_002 |    10 |    60 |     2   (0)| 00:00:01 |
|*  4 |     HASH JOIN                |                  |       |       |            |          |
|   5 |      INDEX FAST FULL SCAN    | SMALL_NN         |    10 |    60 |     1   (0)| 00:00:01 |
|   6 |      INDEX FAST FULL SCAN    | SMALL_U          |    10 |    60 |     1   (0)| 00:00:01 |
|*  7 |    INDEX UNIQUE SCAN         | LARGE_U          |     1 |       |     1   (0)| 00:00:01 |
|*  8 |   TABLE ACCESS BY INDEX ROWID| LARGE            |     1 |   149 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   4 - access(ROWID=ROWID)
   7 - access("SMALL"."U"="LARGE"."U")
   8 - filter("N"="NN")

E22

-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |     9 |  1395 |    20   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                         |         |     9 |  1395 |    20   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                        |         |     9 |  1395 |    20   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |    54 |     2   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN                  | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID         | LARGE   |     1 |   149 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("N" IS NOT NULL)
   5 - access("SMALL"."U"="LARGE"."U")
   6 - filter("NN"="N")

E23

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    10 |  1550 |    22   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                  |    10 |  1550 |    22   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |                  |    10 |  1550 |    22   (0)| 00:00:01 |
|   3 |    VIEW                      | index$_join$_002 |    10 |    60 |     2   (0)| 00:00:01 |
|*  4 |     HASH JOIN                |                  |       |       |            |          |
|   5 |      INDEX FAST FULL SCAN    | SMALL_NN         |    10 |    60 |     1   (0)| 00:00:01 |
|   6 |      INDEX FAST FULL SCAN    | SMALL_U          |    10 |    60 |     1   (0)| 00:00:01 |
|*  7 |    INDEX UNIQUE SCAN         | LARGE_U          |     1 |       |     1   (0)| 00:00:01 |
|*  8 |   TABLE ACCESS BY INDEX ROWID| LARGE            |     1 |   149 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   4 - access(ROWID=ROWID)
   7 - access("SMALL"."U"="LARGE"."U")
   8 - filter("NN"="NN")

E24

--------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |         |     9 |  1395 |    13   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                          |         |     9 |  1395 |    13   (8)| 00:00:01 |
|   2 |   NESTED LOOPS                         |         |     9 |  1395 |    13   (8)| 00:00:01 |
|   3 |    SORT UNIQUE                         |         |     9 |    54 |     2   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |    54 |     2   (0)| 00:00:01 |
|*  5 |      INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN                   | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  7 |   TABLE ACCESS BY INDEX ROWID          | LARGE   |     1 |   149 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

   5 - filter("N" IS NOT NULL)
   6 - access("SMALL"."NU"="LARGE"."U")
   7 - filter("N"="N")

E25

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    10 |  1550 |    13   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                |                  |    10 |  1550 |    13   (8)| 00:00:01 |
|   2 |   NESTED LOOPS               |                  |    10 |  1550 |    13   (8)| 00:00:01 |
|   3 |    SORT UNIQUE               |                  |    10 |    60 |     2   (0)| 00:00:01 |
|   4 |     VIEW                     | index$_join$_002 |    10 |    60 |     2   (0)| 00:00:01 |
|*  5 |      HASH JOIN               |                  |       |       |            |          |
|   6 |       INDEX FAST FULL SCAN   | SMALL_NN         |    10 |    60 |     1   (0)| 00:00:01 |
|   7 |       INDEX FAST FULL SCAN   | SMALL_NU         |    10 |    60 |     1   (0)| 00:00:01 |
|*  8 |    INDEX UNIQUE SCAN         | LARGE_U          |     1 |       |     1   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS BY INDEX ROWID| LARGE            |     1 |   149 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   5 - access(ROWID=ROWID)
   8 - access("SMALL"."NU"="LARGE"."U")
   9 - filter("N"="NN")

E26

--------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |         |     9 |  1395 |    13   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                          |         |     9 |  1395 |    13   (8)| 00:00:01 |
|   2 |   NESTED LOOPS                         |         |     9 |  1395 |    13   (8)| 00:00:01 |
|   3 |    SORT UNIQUE                         |         |     9 |    54 |     2   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| SMALL   |     9 |    54 |     2   (0)| 00:00:01 |
|*  5 |      INDEX FULL SCAN                   | SMALL_N |     9 |       |     1   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN                   | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|*  7 |   TABLE ACCESS BY INDEX ROWID          | LARGE   |     1 |   149 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

   5 - filter("N" IS NOT NULL)
   6 - access("SMALL"."NU"="LARGE"."U")
   7 - filter("NN"="N")

E27

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    10 |  1550 |    13   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                |                  |    10 |  1550 |    13   (8)| 00:00:01 |
|   2 |   NESTED LOOPS               |                  |    10 |  1550 |    13   (8)| 00:00:01 |
|   3 |    SORT UNIQUE               |                  |    10 |    60 |     2   (0)| 00:00:01 |
|   4 |     VIEW                     | index$_join$_002 |    10 |    60 |     2   (0)| 00:00:01 |
|*  5 |      HASH JOIN               |                  |       |       |            |          |
|   6 |       INDEX FAST FULL SCAN   | SMALL_NN         |    10 |    60 |     1   (0)| 00:00:01 |
|   7 |       INDEX FAST FULL SCAN   | SMALL_NU         |    10 |    60 |     1   (0)| 00:00:01 |
|*  8 |    INDEX UNIQUE SCAN         | LARGE_U          |     1 |       |     1   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS BY INDEX ROWID| LARGE            |     1 |   149 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   5 - access(ROWID=ROWID)
   8 - access("SMALL"."NU"="LARGE"."U")
   9 - filter("NN"="NN")

E28

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    10 |  1520 |    12   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |    10 |  1520 |    12   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |         |    10 |  1520 |    12   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN           | SMALL_U |    10 |    30 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |   149 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   4 - access("SMALL"."U"="LARGE"."U")

E29

-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |    10 |  1520 |     8  (13)| 00:00:01 |
|   1 |  NESTED LOOPS                |          |    10 |  1520 |     8  (13)| 00:00:01 |
|   2 |   NESTED LOOPS               |          |    10 |  1520 |     8  (13)| 00:00:01 |
|   3 |    SORT UNIQUE               |          |    10 |    30 |     1   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN          | SMALL_NU |    10 |    30 |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN         | LARGE_U  |     1 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| LARGE    |     1 |   149 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

   5 - access("SMALL"."NU"="LARGE"."U")

PostgreSQL selects the following execution plans. Only the execution plans of the queries E28/E29 fulfill the expectations. The execution plans of the others are really bad because of the way the table “large” is accessed.

E20-E23

 Seq Scan on large  (cost=0.00..598473.00 rows=500000 width=148)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = large.u)

E24-E27

 Seq Scan on large  (cost=0.00..598473.00 rows=500000 width=148)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (nu = large.u)

E28

 Merge Join  (cost=1.69..2.60 rows=10 width=148)
   Merge Cond: (large.u = small.u)
   ->  Index Scan using large_u on large  (cost=0.42..81855.69 rows=1000000 width=148)
   ->  Sort  (cost=1.27..1.29 rows=10 width=4)
         Sort Key: small.u
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

E29

 Merge Semi Join  (cost=1.69..2.60 rows=10 width=148)
   Merge Cond: (large.u = small.nu)
   ->  Index Scan using large_u on large  (cost=0.42..81855.69 rows=1000000 width=148)
   ->  Sort  (cost=1.27..1.29 rows=10 width=4)
         Sort Key: small.nu
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

Type F – Correlated subqueries with either NOT IN or NOT EXISTS

Subtype F1 – Table “large” in the subquery

F10: SELECT * FROM small WHERE n NOT IN (SELECT n FROM large WHERE large.u = small.u)
F11: SELECT * FROM small WHERE n NOT IN (SELECT nn FROM large WHERE large.u = small.u)
F12: SELECT * FROM small WHERE nn NOT IN (SELECT n FROM large WHERE large.u = small.u)
F13: SELECT * FROM small WHERE nn NOT IN (SELECT nn FROM large WHERE large.u = small.u)
F14: SELECT * FROM small WHERE n NOT IN (SELECT n FROM large WHERE large.nu = small.u)
F15: SELECT * FROM small WHERE n NOT IN (SELECT nn FROM large WHERE large.nu = small.u)
F16: SELECT * FROM small WHERE nn NOT IN (SELECT n FROM large WHERE large.nu = small.u)
F17: SELECT * FROM small WHERE nn NOT IN (SELECT nn FROM large WHERE large.nu = small.u)
F18: SELECT * FROM small WHERE NOT EXISTS (SELECT * FROM large WHERE large.u = small.u)
F19: SELECT * FROM small WHERE NOT EXISTS (SELECT * FROM large WHERE large.nu = small.u)

The execution plan I expect for these queries is an anti-join between two data sets:

  • Access the table “small” through a table scan. The operation is executed one single time.
  • For every row of the table “small,” access the table “large” through an index scan and check whether rows that fulfill the WHERE clause exist.
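The two steps above amount to a nested-loops anti-join driven by “small.” As a minimal illustration (not how any of the three engines actually implements it; the plain dict standing in for the LARGE_U index is an assumption of this sketch):

```python
# Sketch of the expected nested-loops anti-join for the F1x queries.
# A dict stands in for the unique index LARGE_U; each driving row costs
# one index probe instead of a scan of "large".

def nested_loops_anti_join(small_rows, large_index, key):
    """Return the rows of `small` that have no matching row in `large`.

    small_rows:  list of dicts (the driving table, scanned once)
    large_index: dict mapping indexed column value -> row of `large`
    key:         join column name
    """
    result = []
    for row in small_rows:                 # full scan of "small", executed once
        if row[key] not in large_index:    # one index probe per driving row
            result.append(row)             # anti-join: keep only the non-matches
    return result

small = [{"u": 1}, {"u": 2}, {"u": 3}]
large_u_index = {1: {"u": 1}, 3: {"u": 3}}   # unique index on large.u
print(nested_loops_anti_join(small, large_u_index, "u"))  # [{'u': 2}]
```

Because “small” has only a handful of rows, the total work is a handful of index probes, which is why this shape is the sensible expectation here.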

MySQL selects the following execution plans. All of them fulfill the expectations.

F10

+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys   | key     | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL    | NULL            | NULL    | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | eq_ref | large_u,large_n | large_u | 4       | chris.small.u |    1 |   100.00 | Using where |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+

F11

+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL    | NULL             | NULL    | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | eq_ref | large_u,large_nn | large_u | 4       | chris.small.u |    1 |   100.00 | Using where |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+

F12

+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys   | key     | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL    | NULL            | NULL    | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | eq_ref | large_u,large_n | large_u | 4       | chris.small.u |    1 |    19.00 | Using where |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+

F13

+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL    | NULL             | NULL    | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | eq_ref | large_u,large_nn | large_u | 4       | chris.small.u |    1 |    10.00 | Using where |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+

F14

+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys    | key      | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL  | NULL             | NULL     | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | ref  | large_nu,large_n | large_nu | 4       | chris.small.u |    1 |   100.00 | Using where |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+

F15

+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys     | key      | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL  | NULL              | NULL     | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | ref  | large_nu,large_nn | large_nu | 4       | chris.small.u |    1 |   100.00 | Using where |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+

F16

+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys    | key      | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL  | NULL             | NULL     | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | ref  | large_nu,large_n | large_nu | 4       | chris.small.u |    1 |    19.00 | Using where |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+

F17

+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys     | key      | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL  | NULL              | NULL     | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | ref  | large_nu,large_nn | large_nu | 4       | chris.small.u |    1 |    10.00 | Using where |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+

F18

+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL    | NULL          | NULL    | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | eq_ref | large_u       | large_u | 4       | chris.small.u |    1 |   100.00 | Using index |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+

F19

+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys | key      | key_len | ref           | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+
|  1 | PRIMARY            | small | NULL       | ALL  | NULL          | NULL     | NULL    | NULL          |   10 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | large | NULL       | ref  | large_nu      | large_nu | 4       | chris.small.u |    1 |   100.00 | Using index |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+

Oracle Database selects the following execution plans. All of them fulfill the expectations.

F10/F12

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |    18   (0)| 00:00:01 |
|*  1 |  FILTER                      |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."U"=:B1
              AND LNNVL("N"<>:B2)))
   3 - filter(LNNVL("N"<>:B1))
   4 - access("LARGE"."U"=:B1)

F11

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     9 |  1269 |    18   (0)| 00:00:01 |
|*  1 |  FILTER                      |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| LARGE   |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."U"=:B1
              AND LNNVL("NN"<>:B2)))
   3 - filter(LNNVL("NN"<>:B1))
   4 - access("LARGE"."U"=:B1)

F13

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     1 |   151 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI           |         |     1 |   151 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL          | SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| LARGE   |  1000K|  9765K|     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | LARGE_U |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   3 - filter("NN"="NN")
   4 - access("LARGE"."U"="SMALL"."U")

F14/F16

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |     9 |  1269 |    23   (0)| 00:00:01 |
|*  1 |  FILTER                              |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL                  | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |     1 |    10 |     4   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | LARGE_NU |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."NU"=:B1 AND
              LNNVL("N"<>:B2)))
   3 - filter(LNNVL("N"<>:B1))
   4 - access("LARGE"."NU"=:B1)

F15

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |     9 |  1269 |    23   (0)| 00:00:01 |
|*  1 |  FILTER                              |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL                  | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |     1 |    10 |     4   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | LARGE_NU |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."NU"=:B1 AND
              LNNVL("NN"<>:B2)))
   3 - filter(LNNVL("NN"<>:B1))
   4 - access("LARGE"."NU"=:B1)

F17

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |     1 |   151 |    33   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI                   |          |     1 |   151 |    33   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                  | SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| LARGE    |  1000K|  9765K|     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | LARGE_NN |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   3 - filter("LARGE"."NU"="SMALL"."U")
   4 - access("NN"="NN")

F18

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |   146 |    13   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI |         |     1 |   146 |    13   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL   |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| LARGE_U |   900K|  4394K|     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

   3 - access("LARGE"."U"="SMALL"."U")

F19

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |   146 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI |          |     1 |   146 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL    |    10 |  1410 |     3   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | LARGE_NU |   900K|  4394K|     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

   3 - access("LARGE"."NU"="SMALL"."U")

PostgreSQL selects the following execution plans. Except for the queries F18/F19, all of them fulfill the expectations. However, in this specific case, the execution plans selected for the queries F18/F19 perform well. I checked that when the data changes, the execution plans change as well. So, even though I was surprised by them, they are good.

F10-F13

 Seq Scan on small  (cost=0.00..45.47 rows=5 width=148)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Index Scan using large_u on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (u = small.u)

F14-F17

 Seq Scan on small  (cost=0.00..45.47 rows=5 width=148)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Index Scan using large_nu on large  (cost=0.42..8.44 rows=1 width=4)
           Index Cond: (nu = small.u)

F18

 Merge Anti Join  (cost=0.56..13.59 rows=1 width=148)
   Merge Cond: (small.u = large.u)
   ->  Index Scan using small_u on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_u on large  (cost=0.42..81855.69 rows=1000000 width=4)

F19

 Merge Anti Join  (cost=0.56..13.59 rows=1 width=148)
   Merge Cond: (small.u = large.nu)
   ->  Index Scan using small_u on small  (cost=0.14..12.29 rows=10 width=148)
   ->  Index Only Scan using large_nu on large  (cost=0.42..81855.69 rows=1000000 width=4)

Subtype F2 – Table “small” in the subquery

F20: SELECT * FROM large WHERE n NOT IN (SELECT n FROM small WHERE small.u = large.u)
F21: SELECT * FROM large WHERE n NOT IN (SELECT nn FROM small WHERE small.u = large.u)
F22: SELECT * FROM large WHERE nn NOT IN (SELECT n FROM small WHERE small.u = large.u)
F23: SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small WHERE small.u = large.u)
F24: SELECT * FROM large WHERE n NOT IN (SELECT n FROM small WHERE small.nu = large.u)
F25: SELECT * FROM large WHERE n NOT IN (SELECT nn FROM small WHERE small.nu = large.u)
F26: SELECT * FROM large WHERE nn NOT IN (SELECT n FROM small WHERE small.nu = large.u)
F27: SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small WHERE small.nu = large.u)
F28: SELECT * FROM large WHERE NOT EXISTS (SELECT * FROM small WHERE small.u = large.u)
F29: SELECT * FROM large WHERE NOT EXISTS (SELECT * FROM small WHERE small.nu = large.u)

The execution plan I expect for these queries is an anti-join between two data sets:

  • Access the table “small” through a table scan and put the resulting data into a memory structure. This operation is executed one single time.
  • Access the table “large” through a table scan and, for every row, check the memory structure created by the previous operation to find out whether rows that fulfill the WHERE clause exist. This operation is executed one single time.
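The two steps above amount to a hash anti-join with “small” as the build input. A minimal sketch of that shape (illustration only; a real engine would hash the join key together with the NOT IN column and handle NULL semantics):

```python
# Sketch of the expected hash anti-join for the F2x queries: build an
# in-memory hash set from "small" once, then scan "large" once and probe
# the set for every row. Each table is read exactly one time.

def hash_anti_join(large_rows, small_rows, key):
    # Build phase: one scan of "small", materialized as a hash set.
    build = {row[key] for row in small_rows}
    # Probe phase: one scan of "large", O(1) set lookup per row;
    # keep only the rows with no match (the anti-join semantics).
    return [row for row in large_rows if row[key] not in build]

small = [{"u": 2}]
large = [{"u": 1}, {"u": 2}, {"u": 3}]
print(hash_anti_join(large, small, "u"))  # [{'u': 1}, {'u': 3}]
```

The point of this shape is that the cost is one pass over each table, rather than one lookup against “small” per row of “large” as in the MySQL plans below.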

MySQL selects the following execution plans. None of them fulfill the expectations, because the table “small” is accessed once for every row of the table “large.” They are really bad.

F20

+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys   | key     | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL    | NULL            | NULL    | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | eq_ref | small_u,small_n | small_u | 4       | chris.large.u |      1 |   100.00 | Using where |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+

F21

+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL    | NULL             | NULL    | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | eq_ref | small_u,small_nn | small_u | 4       | chris.large.u |      1 |   100.00 | Using where |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+

F22

+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys   | key     | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL    | NULL            | NULL    | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | eq_ref | small_u,small_n | small_u | 4       | chris.large.u |      1 |    19.00 | Using where |
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+

F23

+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys    | key     | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL    | NULL             | NULL    | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | eq_ref | small_u,small_nn | small_u | 4       | chris.large.u |      1 |    10.00 | Using where |
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+

F24

+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys    | key      | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL  | NULL             | NULL     | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | ref  | small_nu,small_n | small_nu | 4       | chris.large.u |      1 |   100.00 | Using where |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+

F25

+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys     | key      | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL  | NULL              | NULL     | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | ref  | small_nu,small_nn | small_nu | 4       | chris.large.u |      1 |   100.00 | Using where |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+

F26

+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys    | key      | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL  | NULL             | NULL     | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | ref  | small_nu,small_n | small_nu | 4       | chris.large.u |      1 |    19.00 | Using where |
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+

F27

+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys     | key      | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL  | NULL              | NULL     | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | ref  | small_nu,small_nn | small_nu | 4       | chris.large.u |      1 |    10.00 | Using where |
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+

F28

+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type   | possible_keys | key     | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL    | NULL          | NULL    | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | eq_ref | small_u       | small_u | 4       | chris.large.u |      1 |   100.00 | Using index |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+

F29

+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys | key      | key_len | ref           | rows   | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+
|  1 | PRIMARY            | large | NULL       | ALL  | NULL          | NULL     | NULL    | NULL          | 989170 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | small | NULL       | ref  | small_nu      | small_nu | 4       | chris.large.u |      1 |   100.00 | Using index |
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+

Oracle Database selects the following execution plans. Only the plans of queries F23 and F27-F29 fulfill the expectations; the others are bad because the anti-join optimization is missing.

F20/F22

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  1000K  (1)| 00:00:40 |
|*  1 |  FILTER                      |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | LARGE   |  1000K|   142M|  5789   (1)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."U"=:B1
              AND LNNVL("N"<>:B2)))
   3 - filter(LNNVL("N"<>:B1))
   4 - access("SMALL"."U"=:B1)
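A note on the LNNVL() function appearing in these filter predicates: it returns TRUE when its argument evaluates to FALSE or UNKNOWN, which is exactly what an anti-join over a possibly-NULL comparison needs. A small three-valued-logic sketch in Python, with None standing in for SQL NULL (the helper names are invented for illustration):

```python
# Model SQL three-valued logic with None as NULL (illustrative helpers,
# not Oracle code).
def sql_ne(a, b):
    """a <> b under three-valued logic: UNKNOWN (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a != b

def lnnvl(p):
    """LNNVL(p): TRUE when p is FALSE or UNKNOWN, FALSE when p is TRUE."""
    return p is not True

print(lnnvl(sql_ne(1, 1)))     # True:  1 <> 1 is FALSE
print(lnnvl(sql_ne(1, 2)))     # False: 1 <> 2 is TRUE
print(lnnvl(sql_ne(None, 2)))  # True:  NULL <> 2 is UNKNOWN
```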

F21

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   999K|   142M|  1000K  (1)| 00:00:40 |
|*  1 |  FILTER                      |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | LARGE   |  1000K|   142M|  5789   (1)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| SMALL   |     1 |     6 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | SMALL_U |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."U"=:B1
              AND LNNVL("NN"<>:B2)))
   3 - filter(LNNVL("NN"<>:B1))
   4 - access("SMALL"."U"=:B1)

F23

--------------------------------------------------------------------------------------------
| Id  | Operation               | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                  |   999K|   147M|  5794   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI   |                  |   999K|   147M|  5794   (1)| 00:00:01 |
|   2 |   VIEW                  | index$_join$_002 |    10 |    60 |     2   (0)| 00:00:01 |
|*  3 |    HASH JOIN            |                  |       |       |            |          |
|   4 |     INDEX FAST FULL SCAN| SMALL_NN         |    10 |    60 |     1   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| SMALL_U          |    10 |    60 |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS FULL     | LARGE            |  1000K|   142M|  5788   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

   1 - access("SMALL"."U"="LARGE"."U" AND "NN"="NN")
   3 - access(ROWID=ROWID)

F24/F26

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |   999K|   142M|  1994K  (1)| 00:01:18 |
|*  1 |  FILTER                              |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL                  | LARGE    |  1000K|   142M|  5789   (1)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     1 |     6 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | SMALL_NU |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."NU"=:B1 AND
              LNNVL("N"<>:B2)))
   3 - filter(LNNVL("N"<>:B1))
   4 - access("SMALL"."NU"=:B1)

F25

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |   999K|   142M|  1994K  (1)| 00:01:18 |
|*  1 |  FILTER                              |          |       |       |            |          |
|   2 |   TABLE ACCESS FULL                  | LARGE    |  1000K|   142M|  5789   (1)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID BATCHED| SMALL    |     1 |     6 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | SMALL_NU |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."NU"=:B1 AND
              LNNVL("NN"<>:B2)))
   3 - filter(LNNVL("NN"<>:B1))
   4 - access("SMALL"."NU"=:B1)

F27

--------------------------------------------------------------------------------------------
| Id  | Operation               | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                  |   999K|   147M|  5794   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI   |                  |   999K|   147M|  5794   (1)| 00:00:01 |
|   2 |   VIEW                  | index$_join$_002 |    10 |    60 |     2   (0)| 00:00:01 |
|*  3 |    HASH JOIN            |                  |       |       |            |          |
|   4 |     INDEX FAST FULL SCAN| SMALL_NN         |    10 |    60 |     1   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| SMALL_NU         |    10 |    60 |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS FULL     | LARGE            |  1000K|   142M|  5788   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

   1 - access("SMALL"."NU"="LARGE"."U" AND "NN"="NN")
   3 - access(ROWID=ROWID)

F28

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |   999K|   144M|  5793   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI|         |   999K|   144M|  5793   (1)| 00:00:01 |
|   2 |   INDEX FULL SCAN    | SMALL_U |    10 |    30 |     1   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL  | LARGE   |  1000K|   142M|  5788   (1)| 00:00:01 |
--------------------------------------------------------------------------------

   1 - access("SMALL"."U"="LARGE"."U")

F29

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |   999K|   144M|  5793   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI|          |   999K|   144M|  5793   (1)| 00:00:01 |
|   2 |   INDEX FULL SCAN    | SMALL_NU |    10 |    30 |     1   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL  | LARGE    |  1000K|   142M|  5788   (1)| 00:00:01 |
---------------------------------------------------------------------------------

   1 - access("SMALL"."NU"="LARGE"."U")

PostgreSQL selects the following execution plans. Only the plans of queries F28/F29 fulfill the expectations; the others are really bad because they scan “small” once for every row of “large”.

F20-F23

 Seq Scan on large  (cost=0.00..598473.00 rows=500000 width=148)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (u = large.u)

F24-F27

 Seq Scan on large  (cost=0.00..598473.00 rows=500000 width=148)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Seq Scan on small  (cost=0.00..1.12 rows=1 width=4)
           Filter: (nu = large.u)

F28/F29

 Hash Anti Join  (cost=1.23..44849.14 rows=999990 width=148)
   Hash Cond: (large.u = small.u)
   ->  Seq Scan on large  (cost=0.00..32223.00 rows=1000000 width=148)
   ->  Hash  (cost=1.10..1.10 rows=10 width=4)
         ->  Seq Scan on small  (cost=0.00..1.10 rows=10 width=4)

Summary

The numbers of queries that the query optimizers handle correctly are the following:

  • Oracle Database 12.2: 72 out of 80
  • MySQL 8.0.3: 67 out of 80
  • PostgreSQL 10.0: 60 out of 80

Since not all queries are handled correctly, for best performance it is sometimes necessary to rewrite them.
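The classic trap in such rewrites is the three-valued logic of NOT IN over a nullable column: a single NULL in the subquery result makes every NOT IN test UNKNOWN, while NOT EXISTS simply ignores the NULLs. A minimal, database-agnostic illustration using Python's built-in sqlite3 (the table and column names only mirror the large/small shape used in the benchmark, they are not taken from it):

```python
import sqlite3

# Demonstrate why NOT IN and NOT EXISTS are not interchangeable when the
# subquery column is nullable.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE large(u INTEGER)")
con.execute("CREATE TABLE small(n INTEGER)")          # nullable column
con.executemany("INSERT INTO large VALUES (?)", [(i,) for i in range(1, 6)])
con.executemany("INSERT INTO small VALUES (?)", [(1,), (2,), (None,)])

not_in = con.execute(
    "SELECT u FROM large WHERE u NOT IN (SELECT n FROM small) ORDER BY u"
).fetchall()
not_exists = con.execute(
    "SELECT u FROM large WHERE NOT EXISTS"
    " (SELECT 1 FROM small WHERE small.n = large.u) ORDER BY u"
).fetchall()

print(not_in)      # []  (one NULL in the list makes every NOT IN test UNKNOWN)
print(not_exists)  # [(3,), (4,), (5,)]
```

This is why the benchmark distinguishes the nullable (n/nu) columns from the NOT NULL (nn) ones: only when NULLs are impossible can an optimizer safely turn the subquery into a plain anti-join.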

Jonathan Lewis's picture

Been There

It’s the end of the year and time for a retrospective of some sort so I was thinking of listing the top 10 most popular pages on my blog, but Connor McDonald beat me to it, so I decided to see if I could remember all the countries I’d visited since starting to work with the Oracle software, and here’s the list in alphabetical order:

Antigua
Australia
Austria
Belgium
Bosnia
Brunei
Bulgaria
Canada
China
Croatia
Czech Republic
Denmark
Dubai
Egypt
Estonia
Finland
France
Germany
Greece
Hungary
Iceland
India
Indonesia
Ireland
Israel
Italy
Japan
Latvia
Lithuania
Malaysia
Netherlands
New Zealand
Norway
Poland
Portugal
Romania
Russia
Serbia
Singapore
Slovakia
Slovenia
South Africa
Spain
Sweden
Switzerland
Thailand
Tunisia
Turkey
UAE
USA

A few of these were holidays rather than work, and I may have forgotten a couple, so if you’ve seen me in your country and it’s not on the list let me know.

The list can be a bit of a nuisance: I had to list “all the countries you’ve visited in the last 10 years” for both the US and Russian visas. The US form only allowed for 5 countries and the Russian one for 40; and the US expected me to list EVERY visit, with dates and cities!

connor_mc_d's picture

It’s not about ego … it’s about knowledge

Take a quick look at this blog post by Jonathan Lewis

https://jonathanlewis.wordpress.com/2017/12/30/nvarchar2/

Anyone who has been working with Oracle for any length of time probably knows that Jonathan has a great depth of knowledge in the Oracle database, and is a regular blogger. But this post is a good example to inspire anyone working with Oracle (or any technology, for that matter) to start blogging and sharing their experiences with the community, no matter what their level of experience is.

If you read the post, you’ll see that Jonathan presented a well-crafted test case, and presented a hypothesis about NVARCHAR2 and potential side effects of adding columns of this data type to an existing table.

Turns out the hypothesis was wrong, and the observations had nothing to do with NVARCHAR2 at all. A comment from a reader pointed out the true cause of the side effect.

But here’s the important thing.

Has the blog post been deleted? No.

Has the comment been deleted? No.

Publishing information for the community to digest is not (as we say in Australia) a pissing contest (https://en.wikipedia.org/wiki/Pissing_contest) to show who is the smartest or the fastest or the cleverest. It is about collectively growing the knowledge base of oneself and the community.

So don’t be afraid to publish your experiences so that all may benefit. If your findings or claims are incorrect, then good people in the community will correct you gently and professionally. And those not-so-good people who choose to point out errors in a condescending or derogatory tone… well, they’ll be doing a lot more damage to their own online reputations than they could ever possibly do to yours.

Happy New Year!

Franck Pachot's picture

12c Multitenant internals: PDB_PLUG_IN_VIOLATIONS

In the previous post, https://blog.dbi-services.com/12c-multitenant-internals-pdb-replay-ddl-for-common-users, I did some DDL on a common user to show how it is replayed later for PDBs that were not open at that time. But what happens when one of those DDL statements fails on a PDB?

PDB$LASTREPLAY

In the last post, the C##USER1 common user was created and all pluggable databases (PDB1 with con_id=3 and PDB2 with con_id=4) were opened and synchronized:

SQL> select con_id,scnwrp,scnbas,ctime,name,auxname1,auxname2,opcode,replay#,sqlstmt from containers(pdb_sync$) where opcode=-1 and bitand(flags,8)!=8 order by con_id,replay#;
 
CON_ID SCNWRP SCNBAS CTIME NAME AUXNAME1 AUXNAME2 OPCODE REPLAY# SQLSTMT
------ ------ ------ ----- ---- -------- -------- ------ ------- -------
1 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 10
3 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 10
4 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 10

When REPLAY# in the PDB is equal to the CDB$ROOT one, this means that there are no additional statements to replicate on the PDB.

I have PDB1 opened read write and PDB2 in read only:

SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
3 PDB1 READ WRITE NO
4 PDB2 READ ONLY NO

For the demo my user’s default tablespace is SYSTEM:

SQL> select con_id,username,common,default_tablespace from cdb_users where username='C##USER1' order by 1;
 
CON_ID USERNAME COMMON DEFAULT_TABLESPACE
------ -------- ------ ------------------
1 C##USER1 YES SYSTEM
3 C##USER1 YES SYSTEM
4 C##USER1 YES SYSTEM

Failure in opened containers

I want to change the default tablespace for C##USER1 and I have a USERS tablespace in CDB$ROOT (but not in the PDBs):

SQL> alter user C##USER1 default tablespace USERS;
 
Error starting at line : 50 File @ common-users-pdb-sync.sql
In command -
alter user C##USER1 default tablespace USERS
Error report -
ORA-65048: error encountered when processing the current DDL statement in pluggable database PDB1
ORA-00959: tablespace 'USERS' does not exist

As we have seen in the last post, the DDL is executed on all containers that are opened read write. Here it is fine on CDB$ROOT but fails on PDB1.

Then I create the USERS tablespace in PDB1:

SQL> alter session set container=PDB1;
Session altered.
SQL> create tablespace USERS datafile '/u01/oradata/CDB2/PDB1/users.dbf' size 5M;
Tablespace USERS created.
SQL> alter session set container=CDB$ROOT;
Session altered.

And now the statement is successful in CDB$ROOT and replicated on PDB1:

SQL> alter user C##USER1 default tablespace USERS;
User C##USER1 altered.

This is nice: the statement either succeeds in all containers or fails everywhere. When it is successful, the statements are recorded in PDB_SYNC$:

SQL> select con_id,scnwrp,scnbas,ctime,name,auxname1,auxname2,opcode,replay#,sqlstmt from containers(pdb_sync$) where con_id=1 and bitand(flags,8)!=8 order by con_id,replay#;
 
CON_ID SCNWRP SCNBAS CTIME NAME AUXNAME1 AUXNAME2 OPCODE REPLAY# SQLSTMT
------ ------ ------ ----- ---- -------- -------- ------ ------- -------
1 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 11
1 0 852610 26-jan-17 02:57:26 CTXSYS SYS 5 1 alter user CTXSYS account unlock identified by *^@
1 0 853177 26-jan-17 02:57:34 CTXSYS SYS 5 2 alter user CTXSYS password expire account lock^@
1 0 1405359 26-jan-17 03:31:31 SYSTEM SYS 5 3 alter user system account lock password expire^@
1 0 1408693 23-dec-17 11:34:43 SYS SYS 5 4 alter user sys account unlock identified by *^@
1 0 1408703 23-dec-17 11:34:43 SYSTEM SYS 5 5 alter user system account unlock identified by *^@
1 0 1466615 29-dec-17 09:26:56 C##PROFILE1 SYS 7 6 create profile C##PROFILE1 limit inactive_account_time 15 container=all^@
1 0 1466641 29-dec-17 09:26:57 C##ROLE1 SYS 3 7 create role C##ROLE1 container=all^@
1 0 1466748 29-dec-17 09:26:58 C##USER1 SYS 1 8 create user C##USER1 identified by * container=all^@
1 0 1466812 29-dec-17 09:26:59 C##USER1 SYS 5 9 alter user C##USER1 profile C##PROFILE1^@
1 0 1466853 29-dec-17 09:26:59 C##USER1 C##ROLE1 SYS 10 10 grant C##ROLE1 to C##USER1 container=all^@
1 0 1467010 29-dec-17 09:27:01 C##USER1 SYS 5 11 alter user C##USER1 default tablespace USERS^@

Failure in replay at open for closed containers

But PDB2 is not synchronized because it was not opened read write:

SQL> select con_id,scnwrp,scnbas,ctime,name,auxname1,auxname2,opcode,replay#,sqlstmt from containers(pdb_sync$) where con_id>1 and bitand(flags,8)!=8 order by con_id,replay#;
 
CON_ID SCNWRP SCNBAS CTIME NAME AUXNAME1 AUXNAME2 OPCODE REPLAY# SQLSTMT
------ ------ ------ ----- ---- -------- -------- ------ ------- -------
3 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 11
4 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 10

But I don’t have a USERS tablespace in PDB2, so the replay will fail:

SQL> alter pluggable database PDB2 close;
Pluggable database PDB2 altered.
 
SQL> alter pluggable database PDB2 open;
ORA-24344: success with compilation error
Pluggable database PDB2 altered.

This is a warning only. The SQLcl feedback is a bit misleading, mentioning a compilation error because this is where we used to have warnings, but the SQL*Plus message is clearer:

SQL> alter pluggable database PDB2 open;
Warning: PDB altered with errors.

PDB2 cannot be left closed, because you need to create the tablespace in it. But it cannot be opened to everyone either, because it is not in sync with CDB$ROOT. So the PDB is opened in restricted mode:

SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
3 PDB1 READ WRITE NO
4 PDB2 READ WRITE YES

Note that if you look at PDB_SYNC$ in the PDB at that time, it looks as if REPLAY# has already been increased to 11, but you also see rows for the statement that still has to be run. You have to connect to the PDB directly because containers() does not run in restricted-session containers:

SQL> alter session set container=PDB2;
Session altered.
 
SQL> select con_id,scnwrp,scnbas,ctime,name,auxname1,auxname2,opcode,replay#,sqlstmt from containers(pdb_sync$) where con_id>1 and bitand(flags,8)!=8 order by con_id,replay#;
 
CON_ID SCNWRP SCNBAS CTIME NAME AUXNAME1 AUXNAME2 OPCODE REPLAY# SQLSTMT
------ ------ ------ ----- ---- -------- -------- ------ ------- -------
4 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 11
4 0 1469022 29-dec-17 09:27:02 C##USER1 SYS 5 11 alter user C##USER1 default tablespace USERS^@
 
SQL> alter session set container=CDB$ROOT;
Session altered.

Actually, the attempt to sync has inserted the statements and pushed the last-replay indicator. Now the PDB has all the information it needs to sync without going back to CDB$ROOT. The DDL was not replayed, but it has been stored locally. Once the sync is successful, the statements are removed from the local PDB_SYNC$, leaving only the LASTREPLAY indicator.

PDB_PLUG_IN_VIOLATIONS

More info about the warning is stored in PDB_ALERT$, which you query through PDB_PLUG_IN_VIOLATIONS (the strange name is reminiscent of the TRANSPORT_SET_VIOLATIONS view used by DBMS_TTS):

SQL> select name, cause, type, status,action,message,time from pdb_plug_in_violations;
 
NAME CAUSE TYPE STATUS ACTION MESSAGE TIME
---- ----- ---- ------ ------ ------- -------------------------------
PDB2 Sync Failure ERROR PENDING Sync PDB failed with ORA-959 during 'alter user C##USER1 default tablespace USERS' 29-DEC-17 09.27.03.266780000 PM

Here you have the statement that failed and the error number, but no recommended ACTION. However, ORA-959 is “tablespace ‘%s’ does not exist” which gives a clue about the problem encountered.

As the PDB is open, a DBA (with the RESTRICTED SESSION privilege) can add the tablespace:

SQL> alter session set container=PDB2;
Session altered.
 
SQL> create tablespace USERS datafile '/u01/oradata/CDB2/PDB2/users.dbf' size 5M;
Tablespace USERS created.

But you cannot simply disable restricted session:

SQL> alter system disable restricted session;
 
Error starting at line : 74 File @ common-users-pdb-sync.sql
In command -
alter system disable restricted session
Error report -
ORA-65144: ALTER SYSTEM DISABLE RESTRICTED SESSION is not permitted
65144. 00000 - "ALTER SYSTEM DISABLE RESTRICTED SESSION is not permitted"
*Cause: An attempt was made to disable a restricted session while an unresolved error existed in PDB_PLUG_IN_VIOLATIONS.
*Action: Resolve all of the errors before trying to disable a restricted session.

One solution is to close and re-open the PDB to get the DDL replayed:

SQL> alter session set container=CDB$ROOT;
Session altered.
SQL> alter pluggable database PDB2 close;
Pluggable database PDB2 altered.
SQL> alter pluggable database PDB2 open;
Pluggable database PDB2 altered.

The other solution is to call DBMS_PDB.SYNC_PDB and disable restricted mode:

SQL> exec dbms_pdb.sync_pdb; commit;
PL/SQL procedure successfully completed.
 
SQL> alter system disable restricted session;
System DISABLE altered.

In both cases there is no warning and no restricted mode:

SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
3 PDB1 READ WRITE NO
4 PDB2 READ WRITE NO

and the PDB_PLUG_IN_VIOLATIONS is updated to flag the issue as resolved:

SQL> select name, cause, type, status,action,message,time from pdb_plug_in_violations;
 
NAME CAUSE TYPE STATUS ACTION MESSAGE TIME
---- ----- ---- ------ ------ ------- -------------------------------
PDB2 Sync Failure ERROR RESOLVED Sync PDB failed with ORA-959 during 'alter user C##USER1 default tablespace USERS' 29-DEC-17 09.27.04.093659000 PM

At that time, the local PDB_SYNC$ table in PDB2 contains only the PDB$LASTREPLAY row, with the same value as in the CDB$ROOT table. The rows with the statements have been deleted once the DDL has been successfully replayed:

SQL> select con_id,scnwrp,scnbas,ctime,name,auxname1,auxname2,opcode,replay#,sqlstmt from containers(pdb_sync$) where con_id>1 and bitand(flags,8)!=8 order by con_id,replay#;
 
CON_ID SCNWRP SCNBAS CTIME NAME AUXNAME1 AUXNAME2 OPCODE REPLAY# SQLSTMT
------ ------ ------ ----- ---- -------- -------- ------ ------- -------
4 0 0 26-jan-17 01:53:02 PDB$LASTREPLAY -1 11

So what?

The mechanism is simple: record what is done in CDB$ROOT, replicate it in the PDBs when possible (i.e. when they are opened read-write), try to replay it, and mark the last replay step. For containers that were not writeable, at open time the DDL is replicated to the PDBs that lag behind CDB$ROOT and the replay step is updated. Then the DDL is replayed. When successful, the statement is removed from the replicated DDL. When it fails, you get a warning and a message in PDB_PLUG_IN_VIOLATIONS, and the PDB is opened in restricted session mode to let you solve the problem.
If you can fix the issue so that the replayed DDL succeeds, you can just sync and disable restricted session, or simply close and re-open the PDB.
If you can’t fix it, I suppose you need to hack the statements in the local PDB_SYNC$ (with Oracle Support agreement, of course) and make sure that you arrive at a state which is consistent with the other containers, especially CDB$ROOT.
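As a rough mental model of that lifecycle, here is a toy sketch in Python. It is emphatically not Oracle code: the class, the statement encoding and the single failure mode are all invented; it only mirrors the behaviour described above (replay past a last-replay marker, log the failure, open restricted, retry later).

```python
# Toy model of the DDL replay lifecycle: statements are recorded in the
# root with an increasing replay#, each PDB replays the ones past its own
# last-replay marker, and a failure leaves the PDB restricted with the
# error logged for a later retry.
class ToyPDB:
    def __init__(self):
        self.last_replay = 0
        self.tablespaces = {"SYSTEM"}
        self.violations = []          # stands in for PDB_PLUG_IN_VIOLATIONS
        self.restricted = False

    def apply(self, stmt):
        # Only one failure mode is modelled: a missing tablespace.
        action, ts = stmt
        if action == "set_default_ts" and ts not in self.tablespaces:
            raise LookupError("tablespace '%s' does not exist" % ts)

    def sync(self, root_log):
        """Replay every root statement newer than our last-replay marker."""
        for replay_no, stmt in root_log:
            if replay_no <= self.last_replay:
                continue
            try:
                self.apply(stmt)
            except LookupError as err:
                self.violations.append((stmt, str(err)))
                self.restricted = True    # open, but restricted session only
                return False
            self.last_replay = replay_no
        self.restricted = False
        return True

root_log = [(11, ("set_default_ts", "USERS"))]
pdb2 = ToyPDB()
print(pdb2.sync(root_log))        # False: USERS missing, PDB2 restricted
pdb2.tablespaces.add("USERS")     # the DBA creates the tablespace
print(pdb2.sync(root_log))        # True: replay succeeds, restriction lifted
```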

 

The article 12c Multitenant internals: PDB_PLUG_IN_VIOLATIONS appeared first on Blog dbi services.

Jonathan Lewis's picture

nvarchar2

Here’s an odd little quirk that appeared when I was playing around with default values just recently. I think it’s one I’ve seen before, I may even have written about it many years ago but I can’t find any reference to it at present. Let’s start with a script that I’ll run on 12.2.0.1 (the effect does appear on earlier versions):


rem
rem     Script:         nvarchar2_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2017
rem

create table t1 (
        column1  varchar2(10),
        column2  nvarchar2(10)
);

create table t2 (
        column1  varchar2(10)
);

alter table t2 add column2 nvarchar2(10);

create table t3 (
        column1  varchar2(10),
        column2  nvarchar2(10) default 'xxxxxxxx'
);

create table t4 (
        column1  varchar2(10)
);

alter table t4 add column2 nvarchar2(10) default 'xxxxxxxx';

insert into t1(column1) values('a');
insert into t2(column1) values('a');
insert into t3(column1) values('a');
insert into t4(column1) values('a');

All I’ve done is create 4 tables which, when described, will all look the same:


 Name                    Null?    Type
 ----------------------- -------- ----------------
 COLUMN1                          VARCHAR2(10)
 COLUMN2                          NVARCHAR2(10)

There is a significant difference between the first two and the last two, of course, thanks to the specification of a default value, which means that the inserts will produce two possible results: the first two tables will have nulls in column2; the last two will have the nvarchar2 equivalent of ‘xxxxxxxx’ which, in my instance, will be a string of 16 bytes: “0,78,0,78,0,78,0,78,0,78,0,78,0,78,0,78”.
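That byte dump is just each ‘x’ (0x78) preceded by a zero byte, i.e. the two-byte-per-character national character set. A quick sanity check in Python, assuming the national character set is AL16UTF16 (which matches big-endian UTF-16 for these characters):

```python
# Reproduce the 16-byte dump shown above: under AL16UTF16 each ASCII 'x'
# (0x78) is stored as the two bytes 0x00 0x78.
encoded = 'xxxxxxxx'.encode('utf-16-be')
dump = ','.join(format(b, 'x') for b in encoded)
print(len(encoded))  # 16
print(dump)          # 0,78,0,78,0,78,0,78,0,78,0,78,0,78,0,78
```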

Surprisingly, though, there is a dramatic oddity between t3 and t4 which shows up when I query user_tab_cols:

select
        table_name, column_id, column_name,  segment_column_id, data_default
from    user_tab_cols
where   table_name like 'T_'
order by
        table_name, column_id
;

TABLE_NAME            COLUMN_ID COLUMN_NAME          SEGMENT_COLUMN_ID DATA_DEFAULT
-------------------- ---------- -------------------- ----------------- --------------------
T1                            1 COLUMN1                              1
                              2 COLUMN2                              2

T2                            1 COLUMN1                              1
                              2 COLUMN2                              2

T3                            1 COLUMN1                              1
                              2 COLUMN2                              2 'xxxxxxxx'

T4                            1 COLUMN1                              1
                              2 COLUMN2                              3 'xxxxxxxx'
                                SYS_NC00002$                         2

Table t4 has acquired two columns – a hidden column (which physically exists as the second column in the stored data and is declared as raw(126)) and the column which I had specified. You’ll note that the test shows two differences that may be significant: comparing t3/t4 we see that adding, rather than initially defining, the nvarchar2() column introduces the extra column; comparing t2/t4 we see that adding the nvarchar2() column without a default value doesn’t produce the same effect. Tentative assumption, therefore, is that there is something special about adding nvarchar2() columns with a default value. [Update: wrong, see comment 2 and the link it supplies]

Casting my mind back to various customers who have gone through multiple upgrades of 3rd party applications that invariably seem to add columns to tables, I wondered whether this extra column appeared every time you added an nvarchar2(). I’d not noticed anything in the past that suggested this might be the case, but it’s obviously worth checking: and in my simple tests it looked as if Oracle created just one extra column and used it to capture a value that seemed to be determined by the number and location of columns that had been added.

It’s a curiosity, and leaves room for further investigation – so if anyone has links to related articles please feel free to add them in the comments.