Oakies Blog Aggregator

pete's picture

PFCLScan - Version 3.0

We are very excited to announce that we are currently working on version 3.0 of PFCLScan, our flagship database security scanner for the Oracle database. It will be ready for sale in September, and development is going really....[Read More]

Posted by Pete On 11/07/19 At 03:33 PM

dbakevlar's picture

Oracle and Microsoft’s Cross-Cloud Partnership

A couple weeks back, Oracle and Microsoft announced their cross-cloud partnership.  This was wonderful news to me, as I’ve been working on numerous Oracle projects at Microsoft with Azure.

The Gist

Knowing that there is now a partnership between the two clouds, and that there’s also a large amount of documentation about working between them, is very helpful compared with working from just my own knowledge. Just as anyone appreciates a second set of eyes, I now have two companies’ worth!

If you missed the announcement and are curious what it’s about, Oracle has forged a partnership to provide cross-cloud support with Azure for the non-database tier of many of their products.  These products include:

There are multiple supported authentication options for EBiz on Azure.  The documentation linked above is for Oracle Access Manager, but it offers some insight into how in-depth Oracle and Microsoft have gone to make this a successful venture.

Oracle on Azure

Currently, most of my Oracle customers aren’t looking to leave Oracle; they just want to use the Azure cloud to house their database instances.  This isn’t a difficult option, as I can build out what they need on Azure infrastructure (VMs), covering almost anything they require.  The list is long (and not limited to):

  • Oracle Databases 12c-18c, with hopefully 19c to soon follow
  • Oracle Data Guard
  • Oracle Enterprise Manager
  • Oracle GoldenGate
  • Oracle Essbase
  • Oracle Hyperion
  • Oracle Data Integrator
  • APEX

I haven’t run into any databases that were too large for Azure to handle, and for the one customer using over 1.7TB of memory, a quick AWR report gave me the opportunity to show them how to eliminate a huge percentage of their memory (and CPU) needs.

RAC

Yes, you can build Oracle RAC on Azure, but it requires a third-party tool such as FlashGrid to provide the software clustering, and it isn’t certified for support by Oracle.  Don’t fret, though: before ever going down this path you should find out the business reason the customer is using RAC in the first place.  There is significant overhead to the software clustering, and often the goal isn’t met by the product.

  1. High availability: if the nodes all reside in one datacenter, does this really meet the HA requirement?
  2. Failover #1: many times the applications aren’t able to reconnect when there is a failover.
  3. Failover #2: due to the extra resource usage on each node in the cluster, a failover can cause further failures and evictions of more nodes.
  4. The cloud: Azure possesses a number of HA features already built in, and because of this Oracle Data Guard will often more than suffice instead of RAC.

So this is part of my new world at Microsoft, and it’s something I’ve foretold for over a decade.  No matter what platform you choose, there is always some mission-critical outlier that you need to manage.  With the introduction of the cloud, which creates easier access to other platforms and technology, our world just keeps getting smaller, only faster.  I’m alright with this and no, I won’t get bored this way.

 




Uwe Hesse's picture

Decision Support Benchmark TPC-H won by #Exasol

Oops, we did it again

Franck Pachot's picture

Hi Tuyen, as this removes a lot of files from the Oracle Home, many features will not work. Only what requires just the basic binaries and procedures will work.

Jonathan Lewis's picture

Assumptions

Over the last few days I’ve been tweeting little extracts from Practical Oracle 8i, and one of the tweets contained the following quote:

This led to the question:

connor_mc_d's picture

HIGH_VALUE (and other LONG columns) revisited

Just a quick post to ensure people don’t get caught out by a recent “trick” I saw on an AskTOM question for extracting the HIGH_VALUE out of its LONG storage in the dictionary into a more appropriate datatype. A reviewer (I’m sure in good faith) posted the following mechanism to extract the HIGH_VALUE:


SQL> create table t1 ( x timestamp )
  2  partition by range ( x )
  3  ( partition p1 values less than ( timestamp '2019-01-01 00:00:00' ),
  4    partition p2 values less than ( timestamp '2019-02-01 00:00:00' )
  5  );

Table created.

SQL>
SQL> set serveroutput on
SQL> declare
  2    v long;
  3    d date;
  4  begin
  5    for i in ( select high_value from user_tab_partitions where table_name = 'T1' )
  6    loop
  7      execute immediate 'select cast(:hv as date) from dual' into d using  i.high_value;
  8      dbms_output.put_line(d);
  9    end loop;
 10  end;
 11  /
01-JAN-19
01-FEB-19

PL/SQL procedure successfully completed.

You can see that we are casting the HIGH_VALUE as a date, and voila, out pops the values in a nice DATE datatype. That all seems well and good, but it only works for TIMESTAMP partitions and not the general case. Let’s replace our table with a DATE-based partitioning scheme:


SQL> drop table t1 purge;

Table dropped.

SQL> create table t1 ( x date )
  2  partition by range ( x )
  3  ( partition p1 values less than ( date '2019-01-01' ),
  4    partition p2 values less than ( date '2019-02-01' )
  5  );

Table created.

SQL>
SQL> set serveroutput on
SQL> declare
  2    d date;
  3  begin
  4    for i in ( select high_value from user_tab_partitions where table_name = 'T1' )
  5    loop
  6      execute immediate 'select cast(:hv as date) from dual' into d using i.high_value;
  7      dbms_output.put_line(d);
  8    end loop;
  9  end;
 10  /
declare
*
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected
ORA-06512: at line 6
ORA-06512: at line 6

There are plenty of methods to overcome the limitations of LONG. Here’s the one I typically use – just a small variant on the code above:


SQL> set serveroutput on
SQL> declare
  2    d date;
  3  begin
  4    for i in ( select high_value from user_tab_partitions where table_name = 'T1' )
  5    loop
  6      execute immediate 'select '||i.high_value||' from dual' into d;
  7      dbms_output.put_line(d);
  8    end loop;
  9  end;
 10  /
01-JAN-19
01-FEB-19

PL/SQL procedure successfully completed.

which I’ve also covered in video form here.

But just be careful using the CAST trick. It might cause you grief depending on the partitioning key you are using.
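For reference, another commonly used workaround (not part of this post, just a sketch) is to let the database render the LONG as text via DBMS_XMLGEN; because the literal is returned as a string rather than interpreted, it behaves the same whatever the partitioning key type:


-- Render each HIGH_VALUE as text by letting DBMS_XMLGEN run a per-row query
-- against the dictionary and extracting the text node from the generated XML.
select partition_name,
       extractvalue(
         dbms_xmlgen.getxmltype(
             'select high_value from user_tab_partitions'
          || ' where table_name = '''     || table_name     || ''''
          || ' and   partition_name = ''' || partition_name || ''''
         ),
         '//text()'
       ) as high_value_text
from   user_tab_partitions
where  table_name = 'T1';

You then get the stored literal back as text (the TO_DATE or TIMESTAMP expression) and can decide how to interpret it yourself.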

And if you’re wondering why the image associated with this post is that of a horse, it’s the old joke: “A horse walks into a bar, and the barman says: Why the LONG face?”

Bertrand Drouvot's picture

Sampling pg_stat_statements based on the active sessions and their associated queryid

Introduction

Now that we have the ability to sample and record the active sessions and their associated queryid with the pg_active_session_history view (see this blog post), it would be interesting to have insights about the query statistics at the time the sessions were active.

PostgreSQL provides query statistics through the pg_stat_statements view. We could query the pg_active_session_history and pg_stat_statements views and join them on the queryid field, but the query statistics would be:

  • cumulative
  • the ones at the time we would launch this query

So it would not provide the query statistics as they were at the time the sessions were active.

What’s new?

To get more granular query statistics, the pgsentinel extension has evolved so that it now samples pg_stat_statements:

  • at the same time it is sampling the active sessions
  • only for the queryids that were associated with an active session (if any) during the sampling

The samples are recorded into a new pg_stat_statements_history view.

This view looks like:

                    View "public.pg_stat_statements_history"
       Column        |           Type           | Collation | Nullable | Default 
---------------------+--------------------------+-----------+----------+---------
 ash_time            | timestamp with time zone |           |          | 
 userid              | oid                      |           |          | 
 dbid                | oid                      |           |          | 
 queryid             | bigint                   |           |          | 
 calls               | bigint                   |           |          | 
 total_time          | double precision         |           |          | 
 rows                | bigint                   |           |          | 
 shared_blks_hit     | bigint                   |           |          | 
 shared_blks_read    | bigint                   |           |          | 
 shared_blks_dirtied | bigint                   |           |          | 
 shared_blks_written | bigint                   |           |          | 
 local_blks_hit      | bigint                   |           |          | 
 local_blks_read     | bigint                   |           |          | 
 local_blks_dirtied  | bigint                   |           |          | 
 local_blks_written  | bigint                   |           |          | 
 temp_blks_read      | bigint                   |           |          | 
 temp_blks_written   | bigint                   |           |          | 
 blk_read_time       | double precision         |           |          | 
 blk_write_time      | double precision         |           |          | 

Remarks:

  • The field descriptions are the same as for pg_stat_statements (except for ash_time, which is the time of the active session history sampling)
  • As with pg_active_session_history, the pg_stat_statements_history view is implemented as an in-memory ring buffer where the number of samples to keep is configurable (thanks to the pgsentinel_pgssh.max_entries parameter)
  • The data collected are still cumulative metrics but you can make use of the window functions in PostgreSQL to compute the delta between samples (and then get accurate statistics for the queries between two samples)

For example, we could get, per queryid and ash_time, the rows per second, calls per second and rows per call this way:

select ash_time,queryid,delta_rows/seconds "rows_per_second",delta_calls/seconds "calls_per_second",delta_rows/delta_calls "rows_per_call"
from(
SELECT ash_time,queryid,
EXTRACT(EPOCH FROM ash_time::timestamp) - lag (EXTRACT(EPOCH FROM ash_time::timestamp))
OVER (
        PARTITION BY pgssh.queryid
        ORDER BY ash_time
        ASC) as "seconds",
rows-lag(rows)
        OVER (
        PARTITION BY pgssh.queryid
        ORDER BY ash_time
        ASC) as "delta_rows",
calls-lag(calls)
        OVER (
        PARTITION BY pgssh.queryid
        ORDER BY ash_time
        ASC) as "delta_calls"
    FROM pg_stat_statements_history pgssh) as delta
where delta_calls > 0 and seconds > 0
order by ash_time desc;

           ash_time            |      queryid       | rows_per_second  | calls_per_second | rows_per_call
-------------------------------+--------------------+------------------+------------------+----------------
 2019-07-06 11:09:48.629218+00 | 250416904599144140 | 10322.0031121842 | 10322.0031121842 |              1
 2019-07-06 11:09:47.627184+00 | 250416904599144140 | 10331.3930170891 | 10331.3930170891 |              1
 2019-07-06 11:09:46.625383+00 | 250416904599144140 | 10257.7574710211 | 10257.7574710211 |              1
 2019-07-06 11:09:42.620219+00 | 250416904599144140 |  10296.311364551 |  10296.311364551 |              1
 2019-07-06 11:09:41.618404+00 | 250416904599144140 | 10271.6737455877 | 10271.6737455877 |              1
 2019-07-06 11:09:36.612209+00 | 250416904599144140 | 10291.1563299622 | 10291.1563299622 |              1
 2019-07-06 11:09:35.610378+00 | 250416904599144140 | 10308.9798914136 | 10308.9798914136 |              1
 2019-07-06 11:09:33.607367+00 | 250416904599144140 |  10251.230955397 |  10251.230955397 |              1
 2019-07-06 11:09:31.604193+00 | 250416904599144140 | 10284.3551339058 | 10284.3551339058 |              1
 2019-07-06 11:09:30.60238+00  | 250416904599144140 | 10277.4631222064 | 10277.4631222064 |              1
 2019-07-06 11:09:24.595353+00 | 250416904599144140 | 10283.3919912856 | 10283.3919912856 |              1
 2019-07-06 11:09:22.59222+00  | 250416904599144140 | 10271.3534800552 | 10271.3534800552 |              1
 2019-07-06 11:09:21.59021+00  | 250416904599144140 | 10300.1104655978 | 10300.1104655978 |              1
 2019-07-06 11:09:20.588376+00 | 250416904599144140 | 10343.9790974522 | 10343.9790974522 |              1
 2019-07-06 11:09:16.583341+00 | 250416904599144140 | 10276.5525289304 | 10276.5525289304 |              1

  • It’s a good practice to normalize the statistics per second (as the sampling interval might change)
  • With that level of information we can understand the database activity in the past (thanks to the active sessions sampling) and get statistics per query at the time they were active
  • pgsentinel is available in this github repository
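As a further illustration (my own sketch, not from the post; column names other than ash_time and queryid are assumed from the pg_active_session_history definition), the two sampled views can be joined on ash_time and queryid to see, for each active-session sample, the statement statistics captured at that exact moment:


-- For each recent active-session sample, show the pg_stat_statements sample
-- that was recorded at the same instant for the same queryid.
select ash.ash_time,
       ash.pid,
       ash.wait_event_type,
       ash.wait_event,
       pgssh.calls,
       pgssh.total_time,
       pgssh.rows
from   pg_active_session_history  ash
join   pg_stat_statements_history pgssh
  on   pgssh.ash_time = ash.ash_time
 and   pgssh.queryid  = ash.queryid
order by ash.ash_time desc
limit 20;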

Conclusion

The pgsentinel extension now provides:

  • Active session history (through the pg_active_session_history view)
  • Query statistics history (through the pg_stat_statements_history view), recorded at the exact same time as their associated active sessions

Franck Pachot's picture

strace -k (built with libunwind)

strace -k (built with libunwind) prints the stack trace along with the system calls.

PostgreSQL is Open Source and you may think that it is not necessary to trace the system calls as we can read the source code. But even there, strace is a really nice tool for troubleshooting a running process.

https://postgreslondon.org/speaker/dmitrii-dolgov/

A little disclaimer here: attaching strace to a running process may hang it. Do not use it in production, except when this (small) risk is an acceptable trade-off for troubleshooting a critical problem.

At PostgresLondon 2019, attending Dmitrii Dolgov’s session “PostgreSQL at low level: stay curious!”, I learned something about strace: the -k argument displays the stack trace. That’s awesome: it can display the full stack of C functions in the software at the moment the system call is made. It’s an easy way to get the context of the system call and to understand the reason for it.

With Oracle, we are used to seeing the stack trace in dumps, but of course I immediately wanted to test this strace -k:

$ strace -k
strace: invalid option -- 'k'
Try 'strace -h' for more information.

OK, not available here (OEL 7.6), but man strace gives more information:

-k Print the execution stack trace of the traced processes after each system call (experimental). This option is available only if strace is built with libunwind.

OK, no problem, let’s build strace with libunwind.

This is what I did, as root, to replace my current strace with the new one:

yum install -y autoconf automake libunwind libunwind-devel
cd /var/tmp
git clone https://github.com/strace/strace.git
cd strace
./bootstrap
./configure --with-libunwind
make
sudo make install

The man page for strace is also updated:

-k Print the execution stack trace of the traced processes after each system call.

Ready to test. Here is an example, tracing the writes of the Oracle Log Writer:

strace -k -e trace=desc -y -p $(pgrep -f ora_lgwr_CDB1A)

When I have a stack from the oracle executable, I format it the way we are used to seeing short stacks in Oracle dumps:

strace -k -e trace=desc -y -p $(pgrep -f ora_lgwr_CDB1A) 2>&1 | 
awk '/^ > /{gsub(/[()]/," ");sub(/+0x/,"()&");printf "<-"$3;next}{sub(/[(].*/,"()");printf "\n\n"$0}'

and then I can paste it into Frits Hoogland’s http://orafun.info/stack/ to add some annotations about the Oracle C functions:

http://orafun.info/stack/ created by Frits Hoogland with a little help from Kamil Stawiarski

Again, there is always a risk in attaching strace to a running process, and there may be good reasons why the Linux distributions do not build it with libunwind, so be careful: use it in labs and non-critical environments, or in critical ones only when a blocking issue to troubleshoot justifies the risk.

fritshoogland's picture

Getting rid of annoying, repetitive messages in /var/log/messages

The primary source of information regarding any change or issue on a Linux system is the /var/log/messages file. I am often annoyed when a Linux system is set up in such a way that certain messages are written to syslog with a high frequency, swamping the messages file with information that is not important. This makes it very hard to spot the important information, because you have to skip through a lot of lines before you find it, especially if you do not know for sure whether there is a message in the first place.

Please mind this blogpost is based on a CentOS 7 server, which uses rsyslog.

There are a couple of ways to manage this. The standard syslog way is the following, which can be found in /etc/rsyslog.conf:

#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.*                                                 /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                              /var/log/secure

# Log all the mail messages in one place.
mail.*                                                  -/var/log/maillog


# Log cron stuff
cron.*                                                  /var/log/cron

# Everybody gets emergency messages
*.emerg                                                 :omusrmsg:*

# Save news errors of level crit and higher in a special file.
uucp,news.crit                                          /var/log/spooler

# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log

What this shows is that certain principal tasks have their own ‘facility’ (in this example ‘mail’, ‘cron’, ‘authpriv’, etc.), and each of these facilities is written to its own file.

But what about processes that do not have their own facility and just write to syslog? One example of this is dhclient, which produces messages like these:

Jun 30 03:28:32 ip-172-31-12-40 dhclient[19604]: DHCPREQUEST on eth0 to 172.31.0.1 port 67 (xid=0x5dfd88c4)
Jun 30 03:28:32 ip-172-31-12-40 dhclient[19604]: DHCPACK from 172.31.0.1 (xid=0x5dfd88c4)
Jun 30 03:28:35 ip-172-31-12-40 dhclient[19604]: bound to 172.31.12.40 -- renewal in 1726 seconds.
Jun 30 03:57:21 ip-172-31-12-40 dhclient[19604]: DHCPREQUEST on eth0 to 172.31.0.1 port 67 (xid=0x5dfd88c4)
Jun 30 03:57:21 ip-172-31-12-40 dhclient[19604]: DHCPACK from 172.31.0.1 (xid=0x5dfd88c4)
Jun 30 03:57:23 ip-172-31-12-40 dhclient[19604]: bound to 172.31.12.40 -- renewal in 1798 seconds.

It does this every 5 minutes, which is truly annoying…

Luckily, there is a solution: rsyslog has the option to filter based on more properties than the logging facility a process is using. This is done using a script in /etc/rsyslog.d/.

On my server, the majority of the messages are from the dhclient and systemd daemons, and in both cases the messages seem to be informational. In order not to miss anything, I still want that information to be logged, but not in the /var/log/messages file.

This can actually be accomplished quite simply using the following two scripts in /etc/rsyslog.d/:

dhclient.conf:

if $programname == 'dhclient' then /var/log/dhclient.log
&stop

systemd.conf:

if $programname == 'systemd' then /var/log/systemd.log
&stop

Once you have created these scripts, you need to make rsyslogd read the new configuration. I thought killall -HUP rsyslogd would accomplish this, but apart from a message in the /var/log/messages file saying it got the HUP signal, it didn’t pick up the new configuration.

However, executing:

systemctl stop rsyslog.service
systemctl start rsyslog.service

Does make rsyslog read the new configuration, and then both dhclient and systemd log to their own files and do not write to the messages file anymore!

There is one last thing that needs to be done: make sure the newly defined logfiles are cleaned up just like all the other files in /var/log. Otherwise these files will grow endlessly, eventually occupying all the space in the filesystem that /var/log is part of.

This too is really easy: the newly defined logfiles can be added to the list of syslog files for logrotate, which is defined in /etc/logrotate.d/syslog:

/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
/var/log/dhclient.log
/var/log/systemd.log
{
    missingok
    sharedscripts
    postrotate
	/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}

As you can see, I simply added /var/log/dhclient.log and /var/log/systemd.log to the list.

Please mind that the filter for dhclient and systemd is the executable name, so even if the severity of the logging message is high, a message from these daemons will still go to the file it is configured to log to.
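As a variation (my own sketch, not from the blogpost), rsyslog can also test the message severity in the same filter, so you could divert only the informational chatter and still let warnings and errors from these daemons reach /var/log/messages:

# Hypothetical variant of dhclient.conf: only divert messages of severity
# info (6) or debug (7); more severe messages keep following the normal
# rules and still end up in /var/log/messages.
if $programname == 'dhclient' and $syslogseverity >= 6 then /var/log/dhclient.log
& stop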

Jonathan Lewis's picture

DB links

A recent thread on the ODC SQL & PL/SQL forum raised the question of how data moves across a database link – is it row by row or array based (or other)? It’s a good question as it’s fairly common knowledge that distributed queries can be very much slower than the same query when executed on an equivalent set of local tables, so it’s worth having a little poke around to see if there’s anything in the way the data moves across the link that could be tweaked to improve performance.

It’s fairly easy to get some ideas by running a couple of queries and checking session activity stats and wait events – so here’s a little test conducted between a database running 12.2.0.1 and a database running 11.2.0.4. For this type of testing it’s probably sensible to use two databases rather than faking things with a loopback database link, in case the loopback introduces some special features (a.k.a. dirty tricks).


rem
rem     Script:         db_link_usage.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2019
rem
rem     Last tested 
rem             12.2.0.1 -> 11.2.0.4
rem

prompt  ==================================
prompt  SQL*Plus 20,000 rows, arraysize 20
prompt  ==================================

set feedback only
set arraysize 20

execute snap_my_stats.start_snap
execute snap_events.start_snap
select * from all_objects@d11g where rownum <= 20000;

set feedback on
execute snap_events.end_snap
execute snap_my_stats.end_snap

prompt  ====================================
prompt  SQL*Plus 20,000 rows, arraysize 2000
prompt  ====================================

set feedback only
set arraysize 2000

execute snap_my_stats.start_snap
execute snap_events.start_snap
select * from all_objects@d11g where rownum <= 20000;

set feedback on
execute snap_events.end_snap
execute snap_my_stats.end_snap

All I’ve done is select 20,000 rows from the view all_objects in a remote database with two different settings for the array fetch size. The “feedback only” option is a 12c option that means the session doesn’t output the data; it shows only the final “N rows selected” report. The two “snap” packages are some code I wrote a couple of decades ago to report changes in my session’s activity stats and wait events.
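If you don’t have a copy of those packages, here’s a minimal sketch (my own illustration, not Jonathan’s code) of the idea behind snap_events: record the session’s wait event counters at the start, then report the deltas at the end. It assumes SELECT has been granted directly on v_$session_event to the owning schema:


rem
rem     snap_events_sketch.sql -- minimal stand-in for the "snap" idea
rem     (illustration only, not the original package)
rem

create or replace package snap_events_sketch as
        procedure start_snap;
        procedure end_snap;
end snap_events_sketch;
/

create or replace package body snap_events_sketch as

        type t_waits is table of number index by varchar2(64);
        g_start_waits t_waits;

        procedure start_snap is
        begin
                -- remember the current wait counts for this session
                g_start_waits.delete;
                for r in (
                        select event, total_waits
                        from   v$session_event
                        where  sid = sys_context('userenv','sid')
                ) loop
                        g_start_waits(r.event) := r.total_waits;
                end loop;
        end start_snap;

        procedure end_snap is
                l_prev  number;
                l_delta number;
        begin
                -- report only the events whose wait count changed since start_snap
                dbms_output.put_line(rpad('Event',50) || 'Waits');
                dbms_output.put_line(rpad('-----',50) || '-----');
                for r in (
                        select event, total_waits
                        from   v$session_event
                        where  sid = sys_context('userenv','sid')
                        order by event
                ) loop
                        l_prev := 0;
                        if g_start_waits.exists(r.event) then
                                l_prev := g_start_waits(r.event);
                        end if;
                        l_delta := r.total_waits - l_prev;
                        if l_delta > 0 then
                                dbms_output.put_line(rpad(r.event,50) || l_delta);
                        end if;
                end loop;
        end end_snap;

end snap_events_sketch;
/

Usage mirrors the script above: call snap_events_sketch.start_snap, run the statement you’re interested in, then call snap_events_sketch.end_snap with serveroutput on.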

It turned out that there was very little of interest in the session activity stats, although the impact of SQL*Net compression is always worth a quick reminder (here and here). On the other hand, the wait events (specifically the wait counts) are most revealing:

================================== 
SQL*Plus 20,000 rows, arraysize 20
================================== 
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                       1,004           0          15.69        .016     414,828
SQL*Net message to dblink                         1,004           0           0.04        .000           0
SQL*Net message from dblink                       1,004           0          61.02        .061          11
SQL*Net message to client                         1,004           0           0.11        .000           0

====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                          14           0           4.88        .349     414,828
SQL*Net message to dblink                            13           0           0.00        .000           0
SQL*Net message from dblink                          13           0           7.72        .594          11
SQL*Net message to client                            14           0           0.00        .000           0

SQL*Net more data from dblink                        96           0           6.26        .065           6
SQL*Net more data to client                          96           0           0.16        .002           4

Doing a little arithmetic (and allowing for a few “metadata” messages that introduce small variations in the numbers), we can see that when we fetched the 20,000 rows with an arraysize of 20 this turned into 1,000 (fetch) calls from SQL*Plus to the server, with a matching 1,000 calls from the local server to the remote server. When the arraysize goes up to 2,000 rows, though, the SDU (session data unit) setting for SQL*Net is too small to hold the whole of a single fetch and we see a single fetch from SQL*Plus turning into one “SQL*Net message to client” accompanied by 7 or 8 “SQL*Net more data to client” waits, with exactly the same pattern of conversation between the local server and the remote server. You could imagine the conversation as something like:

  • Client to local server: “give me 2,000 rows”
  • local server to remote server: “give me 2,000 rows”
  • remote server to local server: “Here, I can manage 120 rows as the first installment”
  • local server to client: “Here’s a first installment”
  • Client to local server: “Got it, where’s the rest?”     *** but no message recorded in the session activity stats
  • Local server to remote server: “Where’s the rest?”
  • Remote server to local server: “You want more – here’s another 120 rows”
  • Local server to client: “Here’s a second installment”
  • Client to local server: “Got it, where’s the rest?”     *** but no message recorded in the session activity stats
  • etc.

In this example the volume of data sent back to the client in each message was limited by the SDU size negotiated between the local server and the remote server as the link opens. In my case I happened to have the default (apparently 8KB) as the SDU_SIZE in the 12c and a non-default 32KB setting in the 11g sqlnet.ora, so the negotiated SDU between servers was 8KB (the lower of the two).

Here’s what the figures looked like after I had restarted with the SDU set to 16KB in the 12c tnsnames.ora, and 16KB in the 11g listener.ora:


==================================
SQL*Plus 20,000 rows, arraysize 20
==================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                       1,004           0          24.23        .024         588
SQL*Net message to dblink                         1,009           0           0.06        .000           0
SQL*Net message from dblink                       1,010           0          77.76        .077           4
SQL*Net message to client                         1,004           0           0.15        .000           0

====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                          14           0           1.61        .115         588
SQL*Net message to dblink                            13           0           0.00        .000           0
SQL*Net message from dblink                          13           0           4.21        .324           4
SQL*Net message to client                            14           0           0.00        .000           0

SQL*Net more data from dblink                        45           0          13.53        .301           6
SQL*Net more data to client                          45           0           0.13        .003           0

The first set of figures (arraysize 20) doesn’t change. If an 8KB SDU is large enough to hold an array of 20 rows then it’s not going to make a difference when the SDU is increased. In the second set of figures we see that for each “SQL*Net message from dblink” we now have roughly 3 “SQL*Net more data from dblink” (with matching counts for “SQL*Net more data to client”). With an SDU of double the size it now takes a total of roughly 4 packets to transmit the array fetch rather than the 8 or 9 we saw with the smaller SDU size.

As a final test with SQL*Plus, what happens if we set the SDU size at 32K for the 12c database (and that means for the client calling in and the server calling out) and 8KB for the 11g database? The client negotiates a 32KB SDU with the 12c server, but the 12c server negotiates an 8KB SDU with the 11g server. Here are the stats for just the test with the large fetch size in this case:


====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                          14           0           5.30        .379     214,570
SQL*Net message to dblink                            13           0           0.00        .000           0
SQL*Net message from dblink                          13           0           4.09        .314          13
SQL*Net message to client                            14           0           0.00        .000           0

SQL*Net more data from dblink                        96           0          14.46        .151           6
SQL*Net more data to client                          20           0           0.08        .004           0

We get the same 10 (plus a few) “message to/from client/dblink” waits, but now the “more data” waits are dramatically different. When the client calls for the “next fetch” the local server has to call the remote server 4 or 5 times to get enough 8KB data packets to fill a single 32KB packet to return to the client. You can confirm this (and some of my descriptions of the earlier behaviour) by enabling extended tracing for wait states. Here’s an extract of 5 consecutive lines from a suitable trace file (you have to ignore the “#bytes” parameter on the “SQL*Net more data from dblink” waits; they’ve always been garbage):


WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 2793 driver id=1413697536 #bytes=7 p3=0 obj#=-1 tim=703316387674
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 34 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=703316388447
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 8 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=703316389134
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 16 driver id=1413697536 #bytes=12 p3=0 obj#=-1 tim=703316389818
WAIT #140028098478296: nam='SQL*Net more data to client' ela= 73 driver id=1413697536 #bytes=32671 p3=0 obj#=-1 tim=703316389960

Summary Note

Oracle will use the negotiated SDU to do array fetches across a database link. Since a larger arraysize can (for large volumes of data) reduce the total workload on the remote database and on the network, you may want the local server to have a specifically crafted tnsnames.ora entry and the remote server to expose a specific service with a matching SDU size, to help improve the efficiency of transferring data between the two databases.
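As an illustration of that advice (my own sketch; the alias, host name and paths are hypothetical), the SDU is requested per connect descriptor on the calling side and per endpoint on the listening side, and the value actually used is the lower of the two, e.g. for the 16KB test above:


# tnsnames.ora on the local (calling) server
D11G =
  (DESCRIPTION =
    (SDU = 16384)
    (ADDRESS = (PROTOCOL = TCP)(HOST = remote-host)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = d11g))
  )

# listener.ora on the remote server
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SDU = 16384)
      (SID_NAME = d11g)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0.4/dbhome_1)
    )
  )

Alternatively, DEFAULT_SDU_SIZE in sqlnet.ora sets the default for every connection made from that Oracle home.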

Lagniappe

If you want to look at other cases of how array fetching and SDU sizes interact, here are a couple of examples of using PL/SQL to execute SQL that fetches data across database links. The first is a simple, embedded “cursor for loop” that (apparently) does “row by row” processing – although an enhancement appeared many versions ago to make this type of code use array fetching of 100 rows under the covers. The second demonstrates the effects of an explicit cursor with “fetch, bulk collect, limit”:


execute snap_events.start_snap

declare
        m_total number := 0;
        m_ct    number := 0;
begin
        for c1 in (select * from all_objects@d11g where rownum < 20000) loop
                m_total := m_total + c1.object_id;
                m_ct    := m_ct + 1;
        end loop;
end;
/

execute snap_events.end_snap

execute snap_events.start_snap

declare

        cursor c1 is  select * from all_objects@d11g where rownum <= 20000;

        type c1_array is table of c1%rowtype index by binary_integer;
        m_tab c1_array;

        m_ct number := 0;

begin
        open c1;
        loop
                fetch   c1
                bulk collect
                into    m_tab
                limit   1000
                ;

                exit when c1%notfound;
                m_ct := m_ct + 1;

        end loop;
        close c1;
end;
/

execute snap_events.end_snap

Checking the session events for these two tests we see the following when the SDU_SIZE has been set at 16KB at both ends of the database link (which means the negotiated SDU will be 16KB):


Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                           4           0           0.66        .165         588 
SQL*Net message to dblink                           203           0           0.05        .000           0
SQL*Net message from dblink                         203           0          38.51        .190           5
SQL*Net message to client                             4           0           0.00        .000           0

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                           4           0           0.08        .021         588
SQL*Net message to dblink                            24           0           0.01        .000           0
SQL*Net message from dblink                          24           0           9.19        .383           5
SQL*Net more data from dblink                        40           0           8.47        .212           6
SQL*Net message to client                             4           0           0.00        .000           0

The critical information we can see in the first example is that getting 20,000 rows from the remote database “row by row” takes 200 (plus noise) round-trips – showing that Oracle really is converting our single-row processing loop into array fetches of 100 rows.

The second example shows us that (allowing for a little noise, again) it takes 20 messages to the remote database to fetch 20,000 rows – so 1,000 rows per message – but the SDU size is too small and we have to send each array in three consecutive packets, the “message from dblink” wait and two “more data from dblink” waits.

Footnote

I did point out that the session activity stats for these tests weren’t interesting. There is, however, one detail worth mentioning, since you could otherwise be fooled into thinking the number of packets travelling between the databases was smaller than it really was. There are two statistics about network roundtrips:


Name                                         Value
----                                         -----
SQL*Net roundtrips to/from client               18         
SQL*Net roundtrips to/from dblink               13

A “roundtrip” is a “message to / message from” pair. It doesn’t account for traffic due to “more data” being requested or sent.

 
