Troubleshooting

Jonathan Lewis's picture

Trouble-shooting

Here’s an answer I’ve just offered on the ODC database forum to a fairly generic type of problem.

The question was about finding out why a “program” that used to take only 10 minutes to complete is currently taking significantly longer. The system is running Standard Edition, and the program runs once per day. There’s some emphasis on the desirability of taking action while the program is still running with the following as the most recent statement of the requirements:

We have a program which run daily 10minutes and suddenly one day,it is running for more than 10minutes…in this case,we are asked to look into the backend session to check what exactly the session is doing.I understand we have to check the events,last sql etc..but we need to get the work done by that session in terms of buffergets or physical reads(in case of standard edition)

Jonathan Lewis's picture

Redo Dumps

A thread started on the Oracle-L list-server a few days ago asking for help analysing a problem where a simple “insert values()” (that handled millions of rows per day) was running very slowly. There are many reasons why this might happen, ranging from the trivial (someone has locked the table in exclusive mode), through the slightly subtle (we’re trying to insert a row that collides on a uniqueness constraint with an uncommitted insert from another session) to the subtle (Oracle has to read through the undo to check current versions of blocks against read-consistent versions) ending up at the esoteric (the ASSM space management blocks are completely messed up again).

Jonathan Lewis's picture

Execution Plan Puzzle

Here’s an execution plan that’s just been published on the ODC database forum. The plan comes from a call to dbms_xplan.display_cursor() with rowsource execution statistics enabled.

There’s something unusual about the execution statistics that I don’t think I’ve seen before – can anyone else see anything really odd, or (better still) anything which they would expect others to find odd but which they can easily explain.

A couple of hints:

Jonathan Lewis's picture

Occurence

Before you comment – I do know that the title has a spelling mistake in it. That’s because the Oracle code uses exactly this spelling in one of the little-used features of tracing.

Jonathan Lewis's picture

Lost time

Here’s a little puzzle that came up in the ODC database forum yesterday – I’ve got a query that has been captured by SQL Monitor, and it’s taking much longer to run than it should but the monitoring report isn’t telling me what I need to know about the time.

Here’s a little model to demonstrate the problem – I’m going to join a table to itself (the self join isn’t a necessary feature of the demonstration, I’ve just been a bit lazy in preparing data). Here’s a (competely truthful) description of the table:

Jonathan Lewis's picture

IM_DOMAIN$

A few months ago Franck Pachot wrote about a recursive SQL statement that kept appearing in the library cache. I discovered the note today because I had just found a client site where the following statement suddenly appeared near the top of the “SQL ordered by Executions” section of their AWR reports after they had upgraded to 18c.


select domain# from sys.im_domain$ where objn = :1 and col# = :2

I found Franck’s article by the simple expedient of typing the entire query into a Google search – his note was the first hit on the list, and he had a convenient example (based on the SCOTT schema) to demonstrate the effect, so I built the tables from the schema and ran a simple test with extended SQL tracing (event 10046) enabled.

Franck Pachot's picture

Network troubleshooting with tcpdump and strace

Here is a little example using tcpdump and strace to troubleshoot a network issue with an Oracle connection. It may not be the best approach for all cases, but just an example, as this is a copy/paste of my screen after I analyzed it. I just changed the server names.

At some point, the communication between two servers was hanging, with both endpoints waiting on read() — like ‘SQL*Net message from client’ wait event on the server. This issue occurred only on some circumstances: a case that always reproduced was RMAN correctly connected to the catalog, but hanging when we did a ‘show all’.

Client netstat and strace

I have RMAN started and connected to the catalog.

I identify my client process, in this case the RMAN executable. I display its TCP connection with netstat and its current system calls with strace:

Jonathan Lewis's picture

Descending Problem

I’ve written in the past about oddities with descending indexes ( here, here, and here, for example) but I’ve just come across a case where I may have to introduce a descending index that really shouldn’t need to exist. As so often happens it’s at the boundary where two Oracle features collide. I have a table that handles data for a large number of customers, who record a reasonable number of transactions per year, and I have a query that displays the most recent transactions for a customer.

Jonathan Lewis's picture

Counting Rows

Here’s another little utility I use from time to time (usually for small tables) to check how many rows there are in each block of the table, and which blocks are used. It doesn’t do anything clever, just call routines in the dbms_rowid package for each rowid in the table:

Jonathan Lewis's picture

add_colored_sql

The following request appeared recently on the Oracle-L mailing list:

I have one scenario related to capturing of sql statement in history table..  Like dba_hist_sqltext capture the queries that ran for 10 sec or more..  How do I get the sql stmt which took less time say in  millisecond..  Any idea pleae share.

To prevent automated spam submissions leave this field empty.
Syndicate content