Troubleshooting

oakroot's picture

[Oracle] Researching internal latch implementation (ksl_get_shared_latch, kslfre, kslgetsl_w) and crashing PMON

Introduction

Last week the DOAG 2014 conference took place in Nuremberg and it was a blast with a lot of useful presentations and especially great conversations and meet ups with Oracle friends. I had a nice talk about the Oracle latch implementation with a participant, who told me that his instance crashes every time, if he (manually) sets a shared latch in exclusive mode and tries to release it afterwards. It sounded really interesting as i have done this so many times without ever noticing such an issue. He also told me that this issue is reproducible at least on Oracle 11g R2 and 12c R1. I had no immediate answer or clue about the described issue and needed to research it furthermore.

Jonathan Lewis's picture

push_pred – evolution

Here’s a query (with a few hints to control how I want Oracle to run it) that demonstrates the difficulty of trying to solve problems by hinting (and the need to make sure you know where all your hinted code is):

Jonathan Lewis's picture

Parse Time

Here’s a little query I wrote some time ago to see where my time went while running a query. It’s nothing sophisticated, just one of those simple things you can do with v$active_session_history (or dba_hist_active_sess_history, if you don’t get to the crime scene in time).

Jonathan Lewis's picture

Line Numbers

One of the presentations I went to at the DOAG conference earlier on this month was called “PL/SQL Tuning, finding the perf. bottleneck with hierarchical profiler” by Radu Parvu from Finland. If you do a lot of PL/SQL programming and haven’t noticed the dbms_hprof package yet make sure you take a good look at it.

A peripheral question that came up at the end of the session asked about problems with line numbers in pl/sql procedures; why, when you get a run-time error, does the reported line number sometimes look wrong, and how do you find the right line. I can answer (or give at least one reason for) the first part, but not the second part; Julian Dontcheff had an answer for the second bit, but unfortunately I failed to take a note of it.

Jonathan Lewis's picture

Baselines

I’m not very keen on bending the rules on production systems, I’d prefer to do things that look as if they could have happened in a completely legal fashion, but sometimes it’s necessary to abuse the system and here’s an example to demonstrate the point. I’ve got a simple SQL statement consisting of nothing more than an eight table join where the optimizer (on the various versions I’ve tested, including 12c) examines 5,040 join orders (even though _optimizer_max_permutations is set to the default of 2,000 – and that might come as a little surprise if you thought you knew what that parameter was supposed to do):

Jonathan Lewis's picture

Plan puzzle

I was in Munich a few weeks ago running a course on Designing Optimal SQL and Troubleshooting and Tuning, but just before I flew in to Munich one of the attendees emailed me with an example of a statement that behaved a little strangely and asked me if we could look at it during the course.  It displays an odd little feature, and I thought it might be interesting to write up what I did to find out what was going on. We’ll start with the problem query and execution plan:

Jonathan Lewis's picture

Cardinality Feedback

A fairly important question, and a little surprise, appeared on Oracle-L a couple of days ago. Running 11.2.0.3 a query completed quickly on the first execution then ran very slowly on the second execution because Oracle had used cardinality feedback to change the plan. This shouldn’t really be entirely surprising – if you read all the notes that Oracle has published about cardinality feedback – but it’s certainly a little counter-intuitive.

oakroot's picture

[Oracle] - New diagnostic event "wait_event[]" (and others) for troubleshooting / researching purpose with Oracle 12c

Introduction

A long time has gone since my last blog post on SCN in March 2014, but i was quite busy with Oracle RAC implementations and troubleshooting performance issues in the last month. It was a quite interesting time for me and i have learned a lot of new stuff about Oracle 12c and so in consequence this is just a tiny blog post about a new diagnostic event called "wait_event[]", which was introduced with Oracle 12c R1 (12.1.0.1). Oracle has re-engineered its kernel diagnostics & tracing infrastructure with Oracle 11g, which allows you to be much more detailed and extensive by tracing and dumping diagnostic / low level (internals) information. Please check the reference section for more detailed information about that "new" kernel diagnostics & tracing infrastructure, if you have never heard of it until yet.

 

Jonathan Lewis's picture

sreadtim

Here’s a question that appeared in my email a few days ago:

 

Based on the formula: “sreadtim = ioseektim + db_block_size/iotrfrspeed” sreadtim should always bigger than ioseektim.

But I just did a query on my system, find it otherwise, get confused,

Jonathan Lewis's picture

Unusual Deadlock

Prompted by a question on OTN I came up with a strategy for producing an ORA-00060 deadlock that DIDN’T produce a deadlock graph (because there isn’t one) and didn’t get reported in the alert log (at least, not when tested on 11.2.0.4). It’s a situation that shouldn’t arise in a production system because it’s doing the sorts of things that you shouldn’t do in a production system: but possibly if you’re trying to do some maintenance or upgrades while keeping the system live it could happen. Here’s the starting code:

To prevent automated spam submissions leave this field empty.
Syndicate content