rac

rshamsud's picture

ORA-4031 and Shared Pool Duration

After reading my earlier post on the shared pool, A stroll through shared pool heap, one of my clients contacted me about an interesting ORA-4031 issue. The client was getting ORA-4031 errors even though the shared pool size was over 4GB (in a RAC environment). The client's DBA queried v$sgastat to show that there was plenty of free memory in the shared pool, and was confused as to how ORA-4031 errors could occur when a few GB of shared pool memory were free. We researched the issue, and it is worth blogging about.
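A toy model can make the apparent contradiction concrete. The sketch below is not Oracle's allocator: the area names, the chunk sizes, and the `can_allocate` helper are all hypothetical. It only assumes that a request must be satisfied by a single free chunk inside one area of the pool, so the summed free memory can look large while a specific request still fails.

```python
# Toy model with made-up names and sizes; this is NOT Oracle's real allocator.
# Free chunk sizes in bytes, grouped by a hypothetical (subpool, duration) area.
free_chunks = {
    ("subpool 1", "duration 0"): [4096] * 200_000,   # many small chunks
    ("subpool 1", "duration 1"): [1024] * 300_000,   # many even smaller chunks
    ("subpool 2", "duration 0"): [2_000_000_000],    # one very large chunk
}

def total_free(chunks):
    """Sum of all free bytes, the kind of figure an aggregate view reports."""
    return sum(sum(sizes) for sizes in chunks.values())

def can_allocate(chunks, area, request):
    """A request succeeds only if one chunk in the requested area is big enough."""
    return any(size >= request for size in chunks.get(area, []))

print(total_free(free_chunks))
print(can_allocate(free_chunks, ("subpool 1", "duration 1"), 4200))
print(can_allocate(free_chunks, ("subpool 2", "duration 0"), 4200))
```

In this model about 3GB is free in total, yet a 4200-byte request against the second area fails because no single chunk there is large enough.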

Heapdump Analysis

At this point, it is imperative to take a heap dump at level 2, which is the level for the shared pool heap. [Please be warned that it is not advisable to take shared pool heap dumps excessively, as the dump itself can cause performance issues. During an offline conversation, Tanel Poder said that a heap dump can freeze the instance, as his clients have experienced.] The dump creates a trace file in the user_dump_dest destination, and that trace file is quite useful for analyzing the contents of the shared pool heap. Tanel Poder has an excellent script, heapdump_analyzer. I modified that script, adding code for aggregation at the heap, extent and type levels to debug this issue further, and it is available as heapdump_dissect.ksh (published with special permission from Tanel).
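A level 2 heap dump is commonly requested with `alter session set events 'immediate trace name heapdump level 2';`. The aggregation idea (by heap, extent and chunk type) can be sketched in a few lines of Python. This is not heapdump_analyzer or heapdump_dissect.ksh; it is a minimal illustration that assumes the trace lines look like the shapes shown in the comments, and the sample lines are made up. Check the regular expressions against your own trace file before relying on them.

```python
import re
from collections import defaultdict

# Assumed line shapes in a level 2 heapdump trace (verify against your trace):
#   HEAP DUMP heap name="sga heap(1,0)"  desc=...
#   EXTENT 0 addr=...
#     Chunk  380000040 sz=     4096    freeable  "sql area"
HEAP_RE = re.compile(r'HEAP DUMP heap name="([^"]+)"')
EXTENT_RE = re.compile(r'^\s*EXTENT\s+(\d+)')
CHUNK_RE = re.compile(r'^\s*Chunk\s+\S+\s+sz=\s*(\d+)\s+(\S+)')

def summarize(lines):
    """Return {(heap, extent, chunk_type): [chunk_count, total_bytes]}."""
    totals = defaultdict(lambda: [0, 0])
    heap, extent = "unknown", "unknown"
    for line in lines:
        m = HEAP_RE.search(line)
        if m:
            # A new heap starts; its extents have not been seen yet.
            heap, extent = m.group(1), "unknown"
            continue
        m = EXTENT_RE.match(line)
        if m:
            extent = m.group(1)
            continue
        m = CHUNK_RE.match(line)
        if m:
            entry = totals[(heap, extent, m.group(2))]
            entry[0] += 1
            entry[1] += int(m.group(1))
    return dict(totals)

# Made-up sample lines in the assumed format, for illustration only.
sample = [
    'HEAP DUMP heap name="sga heap(1,0)"  desc=0x0',
    'EXTENT 0 addr=0x0',
    '  Chunk  380000040 sz=     4096    freeable  "sql area"',
    '  Chunk  380001040 sz=     1024    free      "               "',
    '  Chunk  380001440 sz=     4096    freeable  "sql area"',
    'EXTENT 1 addr=0x0',
    '  Chunk  390000040 sz=     2048    recreate  "library cache"',
]

for key, (count, size) in sorted(summarize(sample).items()):
    print(key, count, size)
```

Grouping on the (heap, extent, type) tuple keeps all three levels in one pass; coarser totals can be derived by summing over the parts of the key you do not need.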

Shared pool review

rshamsud's picture

RAC, parallel query and udpsnoop

I presented various performance myths in my ‘battle of the nodes’ presentation. One of the myths concerned how spawning parallel query slaves across multiple RAC instances can cause a major bottleneck in the interconnect. In fact, that myth was the direct result of a lessons-learnt presentation from a client engagement. The client was suffering from performance issues, with enormous global cache waits running to 30+ms average response time for global cache CR traffic and crippling application performance. Essentially, their data warehouse was running hundreds of parallel queries concurrently, with slaves spawning across the instances of a three-node RAC.

Of course, I had to hide the client details, so I simplified the problem into a test case to explain the myth. It looks like either a) my test case is bad, b) I encountered some sort of bug in version 9.2.0.5, or c) I made a mistake somewhere in my analysis. Most likely it is the last one :-( . Greg Rahn questioned that example, and this topic deserves more research to understand it a little further. At this point, I don’t have 9.2.0.5; the database is on 10.2.0.4, so we will test this in 10.2.0.4.

udpsnoop

UDP is one of the protocols used for cache fusion traffic in RAC, and it is the Oracle-recommended protocol. For this article, the size of the UDP traffic must be measured. Measuring global cache traffic using AWR reports was not precise enough, so I decided to use a tool from the DTrace toolkit, udpsnoop.d, to measure the traffic between RAC nodes. There are two RAC nodes in this setup. You can read more about udpsnoop.d. The tool can be downloaded from the DTrace toolkit. Output of this script is of the form:

karlarao's picture

Diagnosing and Resolving “gc block lost”

Last week, one of our clients had a sudden slowdown on all of their applications, which run on a two-node RAC environment.

Below is the summary of the setup:
– Server and Storage: SunFire X4200 with LUNs on EMC CX300
– OS: RHEL 4.3 ES
– Oracle 10.2.0.3 (database and clusterware)
– Database Files, Flash Recovery Area, OCR, and Voting disk are located on OCFS2 filesystems
– Application: Forms and Reports (6i and also lower)

As per the DBA, the workload on the database was normal and there were no changes on the RAC nodes and on the applications. Hmm, I can’t really tell because I haven’t really looked into their workload so I don’t have past data to compare.

dannorris's picture

Collaborate 09: Don’t miss these sessions

Collaborate 09 starts on Sunday, May 3 (a few days from now!) in Orlando. I’ve been offline for several weeks (more on that later), but will be returning to the world of computers and technology in full force in Orlando. I’ve had a few inquiries about whether or not I’ll be at Collaborate, so I thought I’d resurrect my blog with a post about where I’ll be and some of the highlights I see at Collaborate 09.

First, where I’ll be presenting:

karlarao's picture

Single Instance and RAC Kernel/OS upgrade

This document will serve as a guide for the Kernel and OS upgrade activities for:

  1. Single Instance on ASM using raw devices
  2. RAC with ASM (using ASMlib) and OCFS2

Upgrading the Kernel and OS is easy and needs just a few commands. The critical part is the dependencies once the Kernel gets updated. If you’re using ASMlib and OCFS2, you’ll notice that after the upgrade they’re not working anymore: you can’t start up ASM, and if your OCR and Voting Disk are on OCFS2, the CRS stack won’t start, all because the RPMs of ASMlib and OCFS2 are kernel dependent. Other components and software are kernel dependent in a similar way, so you have to check them and do a risk analysis before doing the upgrade.
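The dependency check described above can be sketched as a small pre-upgrade helper. This is a minimal sketch, not a complete risk analysis: it assumes the kernel-dependent driver RPMs are named with the kernel release embedded right after the `oracleasm-` and `ocfs2-` prefixes, that the installed list comes from something like `rpm -qa`, and that the target kernel string is what `uname -r` will report after the upgrade. The function name and the package versions in the example are illustrative.

```python
# Prefixes of RPMs assumed to embed a kernel release in their name
# (an assumption for this sketch, based on the ASMlib and OCFS2 drivers).
KERNEL_BOUND_PREFIXES = ("oracleasm-", "ocfs2-")

def missing_for_kernel(installed, target_kernel):
    """Return driver names with no installed package built for target_kernel."""
    missing = []
    for prefix in KERNEL_BOUND_PREFIXES:
        # The trailing "-" avoids treating 2.6.9-4 as a match for 2.6.9-42.
        wanted = f"{prefix}{target_kernel}-"
        if not any(pkg.startswith(wanted) for pkg in installed):
            missing.append(prefix.rstrip("-"))
    return missing

# Illustrative package list; names and versions are examples only.
installed = [
    "oracleasm-support-2.0.3-1",
    "oracleasmlib-2.0.2-1",
    "oracleasm-2.6.9-42.ELsmp-2.0.3-1",
    "ocfs2-tools-1.2.4-1",
    "ocfs2-2.6.9-42.ELsmp-1.2.5-1",
]

print(missing_for_kernel(installed, "2.6.9-42.ELsmp"))  # current kernel
print(missing_for_kernel(installed, "2.6.9-78.ELsmp"))  # planned kernel
```

Any name returned for the planned kernel is a driver package to obtain and install for that kernel before rebooting into it.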

dannorris's picture

ADV: RAC Attack Hands-on Event at Collaborate09

The RAC SIG, Oracle and IOUG are thrilled to present the hands-on event dubbed “RAC Attack!” at Collaborate09 in Orlando, FL. It is a half-day University Session in the IOUG Forum scheduled for the morning of Thursday, May 7th.

Each participant will have their own private RAC cluster to use. You’ll be able to install a new cluster, test session failover, perform backup and recovery and just about anything else you’d like to try (time permitting). The session will have lab outlines with very specific instructions that cater to beginners. Advanced users are welcome to test anything they like. If you try something that doesn’t work, we have mechanisms in place to help “reset” your cluster in 15 minutes and let you continue working and testing.

dannorris's picture

Congratulations New Oracle ACE, Jeremy Schneider!

I’ll be the first to offer a large congratulations to Jeremy Schneider on being the most recent appointment to the Oracle ACE program. He certainly deserves it (I nominated him, so I suppose I would think so) and I continue to look for great things to come.

dannorris's picture

Start Database Services automatically after instance startup

Those of us that have dealt with RAC environments for a while are familiar with the behavior of Oracle Services in an Oracle Cluster. Services are an essential component for managing workload in a RAC environment. If you’re not defining any non-default services in your RAC database, you’re making a mistake. To learn more about services, I strongly recommend reading the definitive whitepaper by Jeremy Schneider on the topic.

karlarao's picture

Security, Forecasting Oracle Performance and Some stuff to post… soon…

I’ve been busy this February “playing around/studying” on the following:

1) Oracle Security products (Advanced Security Option, Database Vault, Audit Vault, Data Masking, etc.). Well, every organization must guard its digital assets against any threat (external/internal), because once compromised it could lead to negative publicity, lost revenue, litigation, loss of trust.. and the list goes on.. I’m telling you, Oracle has a lot to offer (breadth of products and features, some of them are even free!) in this area, and you just need to have the knowledge to stitch them together..
