Troubleshooting

Jonathan Lewis's picture

Lost time

Here’s a little puzzle that came up in the ODC database forum yesterday – I’ve got a query that has been captured by SQL Monitor, and it’s taking much longer to run than it should but the monitoring report isn’t telling me what I need to know about the time.

Here’s a little model to demonstrate the problem – I’m going to join a table to itself (the self join isn’t a necessary feature of the demonstration, I’ve just been a bit lazy in preparing data). Here’s a (competely truthful) description of the table:

Jonathan Lewis's picture

IM_DOMAIN$

A few months ago Franck Pachot wrote about a recursive SQL statement that kept appearing in the library cache. I discovered the note today because I had just found a client site where the following statement suddenly appeared near the top of the “SQL ordered by Executions” section of their AWR reports after they had upgraded to 18c.


select domain# from sys.im_domain$ where objn = :1 and col# = :2

I found Franck’s article by the simple expedient of typing the entire query into a Google search – his note was the first hit on the list, and he had a convenient example (based on the SCOTT schema) to demonstrate the effect, so I built the tables from the schema and ran a simple test with extended SQL tracing (event 10046) enabled.

Franck Pachot's picture

Network troubleshooting with tcpdump and strace

Here is a little example using tcpdump and strace to troubleshoot a network issue with an Oracle connection. It may not be the best approach for all cases, but just an example, as this is a copy/paste of my screen after I analyzed it. I just changed the server names.

At some point, the communication between two servers was hanging, with both endpoints waiting on read() — like ‘SQL*Net message from client’ wait event on the server. This issue occurred only on some circumstances: a case that always reproduced was RMAN correctly connected to the catalog, but hanging when we did a ‘show all’.

Client netstat and strace

I have RMAN started and connected to the catalog.

I identify my client process, in this case the RMAN executable. I display its TCP connection with netstat and its current system calls with strace:

Jonathan Lewis's picture

Descending Problem

I’ve written in the past about oddities with descending indexes ( here, here, and here, for example) but I’ve just come across a case where I may have to introduce a descending index that really shouldn’t need to exist. As so often happens it’s at the boundary where two Oracle features collide. I have a table that handles data for a large number of customers, who record a reasonable number of transactions per year, and I have a query that displays the most recent transactions for a customer.

Jonathan Lewis's picture

Counting Rows

Here’s another little utility I use from time to time (usually for small tables) to check how many rows there are in each block of the table, and which blocks are used. It doesn’t do anything clever, just call routines in the dbms_rowid package for each rowid in the table:

Jonathan Lewis's picture

add_colored_sql

The following request appeared recently on the Oracle-L mailing list:

I have one scenario related to capturing of sql statement in history table..  Like dba_hist_sqltext capture the queries that ran for 10 sec or more..  How do I get the sql stmt which took less time say in  millisecond..  Any idea pleae share.

Jonathan Lewis's picture

Problem Solving

Here’s a little question that popped up on the Oracle-L list server a few days ago:

I am facing this issue running this command in 11.2.0.4.0 (also in 12c R2 I got the same error)

SQL> SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL;
SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL
ORA-01873: a precisão precedente do intervalo é pequena demais

 

How do you go about finding out what’s going on ? In my case the first thing is to check the translation the error message (two options):

Jonathan Lewis's picture

Case Study

A question about reading execution plans and optimising queries arrived on the ODC database forum a little while ago; the owner says the following statement is taking 14 minutes to return 30,000 rows and wants some help understanding why.

If you look at the original posting you’ll see that we’ve been given the text of the query and the execution plan including rowsource execution stats. There’s an inconsistency between the supplied information and the question asked, and I’ll get back to that shortly, but to keep this note fairly short I’ve excluded the 2nd half of the query (which is a UNION ALL) because the plan says the first part of the query took 13 minutes and 20 second and the user is worried about a total of 14 minutes.

Jonathan Lewis's picture

Error Logging

Error logging is a topic that I’ve mentioned a couple of times in the past, most recently as a follow-up in a discussion of the choices for copying a large volume of data from one table to another, but originally in an addendum about a little surprise you may get when you use extended strings (max_string_size = EXTENDED).

Jonathan Lewis's picture

Masterclass – 1

A recent thread on the Oracle developer community database forum raised a fairly typical question with a little twist. The basic question is “why is this (very simple) query slow on one system when it’s much faster on another?” The little twist was that the original posting told use that “Streams Replication” was in place to replicate the data between the two systems.

To prevent automated spam submissions leave this field empty.
Syndicate content