Oakies Blog Aggregator

Chris Antognini's picture

Does the Query Optimizer Cost PX Distribution Methods?

The short answer to this question is “yes”, it does. Unfortunately, the distribution costs are not externalized through the execution plans and, as a result, this limitation (yes, it is really a limitation in the current implementation, not a bug) confuses everyone that carefully look at the information provided in an execution plan of a [...]

jonah.harris's picture

NEXTGRES Gateway: MySQL Emulator for Oracle

So, a few people have asked me what NEXTGRES Gateway is.  My short answer, the ultimate database compatibility server. Sorry if this blog entry sounds very marketing-oriented, but I’ve been working on this personal project non-stop for the last 8 months and am really excited about it. NEXTGRES Gateway in a nutshell: Designed to assist […]

khailey's picture

The most common passwords!

32 Million passwords were analyzed at rockyou.com and these were the top 20

To put it a different perspective, if a hacker used the first 116 password on each account, they would break into 5% of the accounts. With the first 5000 passwords tried on each account, they'd break into 20% of the accounts!
Its so much more meaningful for me to see real life data.
For Oracle specific security, password hacking and protection see
Pete Finnigan's site at

Greg Rahn's picture

The Core Performance Fundamentals Of Oracle Data Warehousing – Partitioning

[back to Introduction] Partitioning is an essential performance feature for an Oracle data warehouse because partition elimination (or partition pruning) generally results in the elimination of a significant amount of table data to be scanned. This results in a need for less system resources and improved query performance. Someone once told me “the fastest I/O is the one that never happens.” This is precisely the reason that partitioning is a must for Oracle data warehouses – it’s a huge I/O eliminator. I frequently refer to partition elimination as the anti-index. An index is used to find a small amount data that is required; partitioning is used to eliminate vasts amounts of data that is not required. Main Uses For Partitioning I would classify the main reasons to use partitioning in your Oracle data warehouse into these four areas: Data Elimination Partition-Wise Joins Manageability (Partition Exchange Load, Local Indexes, etc.) Information Lifecycle Management (ILM) Partitioning Basics The most common partitioning design pattern found in Oracle data warehouses is to partition the fact tables by range (or interval) on the event date/time column. This allows for partition elimination of all the data not in the desired time window in queries. For example: If I have a [...]

khailey's picture

ASH simulation Scripts (SASH and SASHMON)


(see http://sites.google.com/site/embtdbo/wait-event-documentation/sash-1)

I've gotten a few questions on SASH and SASHMON recently, so I decided to repost the following info:

Active Sessoin history, aka ASH, is only available in Oracle 10g and higher and also requires the diagnostic package license. If you are are an older version of Oracle or don't have the diagnostic package license then you have the option of simulating ASH (S-ASH) yourself. Your milage may vary on the scripts below. No guarentees on them working correctly on your systems. Make sure you test and understand them.

For the scripts below it is highly recommended to put the repository on a different database than the database being monitored. I've only tested resource consumption for gathering data (less than 1% of 1 CPU) and not the resource usage of actually storing the data.

--------------------------------------------

(v1 is available at http://ashmasters.com/ash-simulation/)

V2 Installs on Windows or Unix on Standard Edition

The scripts below are newer so there will probably be some hickups in them (some modules such as SQL and Session statistics collection aren't implemented). Please comment on the blog on any issues and/or solutions you find. I wrote these scripts as both an option for installing on Windows (since the first version was a shell script) and for allowing Standard Edition as a repository while still supporting data purging using "poor man's partitioning", ie having separate tables for each day of the week, and using view with union all of the 7 tables in the data mining scripts. I find the following scripts easier to follow, read, understand and change than the single script above.

SASH creates a view called v$active_session_history and scripts that run on the "real" v$active_session_history should also work on SASH.v$active_session_history.

Repository Creation -

repo_0_user.sql - run as SYS or SYSTEM. Creates SASH user
repo_1_tables.sql - run as SASH only !! ( WARNING - Installs the SASH schema on repository database including simulated DBA_ and V$ views. Will cause problems if run as SYS or SYSTEM)

repo_2_pkg.sql - run as SASH, optional, create an automatic purge procedure on repository machine
repo_3_jobs.sql - run as SASH, optional, start a job on repository machine to purge oldest day of data

Monitored Database Setup (do this on each database to be monitored)

targ_1_pkg.sql - install collection package on each database to be monitored
targ_2_jobs.sql - start up collection in a job on each database to be monitored

Data Mining S-ASH

When running scripts written explicity for ASH on SASH data there are a couple of issues. (or running ASHMON)

1) WAIT GROUPS : SASH doesn't collect wait groups because its set up mainly for version 8 and 9 (since ASH is already on 10g+). The wait groups are required to run ASHMON or scripts that use "wait_class" in v$active_session_history.

repo_4_waitgroups.sql - run as SASH, sets up wait groups

2) CURRENT DATABASE : SASH collects data for multiple databases into the same schema, thus scripts and ASHMON have to filter by the correct DBID. I do this by having a table SASH_TARGET that contains the DBID that I'm interested in. Then views like v$active_session_history include a filter on DBID from SASH_TARGET. Thus to change databases, I just change the DBID in SASH_TARGET.

repo_5_curdb.sql - run as SASH, changes the DBID in SASH_TARGET (shows a list of monitored databases and prompts for the one to filter for)

ASHMON - graphical monitor

When running ASHMON, the user to connect to is SASH, ie the owner of the repository data


davidkurtz's picture

Performance Metrics and XML Reporting in PeopleSoft

I am working with a PeopleSoft system that makes extensive use of XML Publisher reporting.  Generally these reports are based on SQL queries that are defined in the PeopleSoft PS/Query utility. 

These queries are run by an Application Engine program PSXPQRYRPT that runs the SQL query and produces the report in a single process.  This is a generic application engine program that runs any XML report.  Line 45 (in PeopleTools 8.49) of step MAIN.ExecRPT executes the report with the ProcessReport PeopleCode command.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
&oRptDefn.ProcessReport(&TemplateId, &Languaged, &AsOfDate, &oRptDefn.GetOutDestFormatString(&OutDestFormat));

Batch Timings
Analysis of the Application Engine batch timings indicate that nearly all the time in this Application Engine program is spent in PeopleCode, and that this not SQL execution time.  This is misleading.  The ProcessReport command is PeopleCode, but behind the scenes it also issues the SQL in the report data source.  Not all the time is SQL, but the Application Engine Batch Timings does not count any of this as SQL because it is not in the PeopleCode SQL Class. 

Let’s look at an example Batch Timings report (I have edited it down, removing zero and insignificant timings).

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
PeopleSoft Application Engine Timings
(All timings in seconds)

C o m p i l e E x e c u t e F e t c h Total
SQL Statement Count Time Count Time Count Time Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
PeopleCode
Record.SelectByKey PSDBFIELD 8 0.1 8 0.0 0.1
SELECT PSPRCSRQST 2 0.1 2 0.0 0.1
SELECT PSXPTMPLDEFN 3 0.1 3 0.0 0.1
--------
0.3
AE Program: PSXPQRYRPT
MAIN.ExecRpt.H 1 0.0 1 0.0 1 0.0 0.0
--------
0.0
------------------------------------------------------------------------------------------
Call Non-SQL SQL Total
PeopleCode Count Time Time Time
------------------------------ ------- -------- -------- --------
AE Program: PSXPQRYRPT
MAIN.ExecRpt 1 643.2 0.3 643.5
-------- -------- --------
643.2 0.3 643.5
------------------------------------------------------------------------------------------
E x e c u t e
PEOPLECODE Builtin/Method Count Time
------------------------------------------------------- ------- --------
Boolean(Type 5) BuiltIns 90 0.1
DateTime(Type 11) BuiltIns 1 0.1
SQL(Type 524290) Methods 19 0.1
SQL(Type 524290) BuiltIns 9 0.1
Record(Type 524291) Methods 1104 0.1
Session(Type 524303) Methods 207 633.2
JavaObject(Type 524315) BuiltIns 6 2.2
PostReport(Type 524324) Methods 2 0.7
------------------------------------------------------------------------------------------
Total run time : 644.0
Total time in application SQL : 0.3 Percent time in application SQL : 0.0%
Total time in PeopleCode : 643.2 Percent time in PeopleCode : 99.9%
------------------------------------------------------------------------------------------
  • The total execution time of is 644 seconds.
  • 643.2s are reported as being in PeopleCode, and as coming from the MAIN.ExecRpt step.
  • Only 0.3s of that is SQL time, and that comes from PeopleCode functions in the SQL Class  (CreateRecord , CreateSQL etc.)
  • The ProcessReport method does not appear in the PeopleCode SQL analysis section of the report, because it is not classed as SQL.

Identifying the Report ID and Report Source
So, the next question is how can I find out which report this process is running.  Different instances of this report may be running different queries.

You can get the list files generated by a process from the Content Management PeopleTools table PS_CDM_FILE_LIST.  Application Engine processes usually produce various log files (with extensions .aet, .trc and .log), the name of the other file is the same as the name of the report followed by an extension that will vary depending on format.

You can look at the report definition on-line under Reporting Tools -> XML Publisher -> Report Definition, and that will show you the Data Source ID

The Data Source is defined in a component accessible from the previous entry in the same menu.

Analysing Execution Times
I have written this query to aggregate execution time for PSXPQRYRPT by output file name. 

  • PSPRCSRQST is the process request table, from which I get the run time of the process.
  • PS_CDM_FILE_LIST lists the files generated by the process which would be posted to the Report Repository. If I exclude the usual trace files, I am left with the report ID.
  • PSXPRPTDEFN is the report definition record, from which I can get the data source ID
  • PSXPDATSRC specifies the data source.  A data source type (DS_TYPE) of QRY indicates a PeopleSoft PS/Query.  If it is a private query, the OPRID will have a value.
#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
column report_defn_id heading 'Report ID'      format a12 
column ds_type heading 'Type' format a4
column ds_id heading 'Data Source ID' format a30
column oprid heading 'Owner' format a10
column processes heading 'Prcs' format 990
column reprots heading 'Reps' format 9,990
column secs heading 'Total|Seconds' format 999,990
column median heading 'Median|Seconds' format 999,990
column variance heading 'Variance' format 9990.0
SELECT d.report_defn_id, d.ds_type, d.ds_id, d.oprid
, SUM(y.processes) processes
, SUM(y.reports) reports
, SUM(y.secs) secs
, median(y.secs) median
, variance(y.secs) variance
FROM (
SELECT x.prcsinstance
, x.filename
, COUNT(DISTINCT x.prcsinstance) processes
, COUNT(*) reports
, SUM(x.secs) secs
FROM (
SELECT r.prcsinstance
, f.filename
, 86400*(r.enddttm-r.begindttm)*ratio_to_report(1) over (partition by r.prcsinstance) secs
FROM sysadm.psprcsrqst r
, sysadm.ps_cdm_file_list f
WHERE r.prcsname = 'PSXPQRYRPT'
AND r.prcsinstance = f.prcsinstance
AND NOT f.cdm_file_type IN('AET','TRC','LOG')
AND r.begindttm >= TRUNC(SYSDATE)
) x
GROUP BY x.prcsinstance, x.filename
) y
, sysadm.psxprptdefn d
WHERE d.report_defn_id = SUBSTR(y.filename,1,instr(y.filename,'.')-1)
GROUP BY d.report_defn_id, d.ds_type, d.ds_id, d.oprid
ORDER BY secs DESC
/

I can now see which report process has been executed run how many times, how many copies of the report have been produced, and where the processing time is being spent, and so which one I should look at first.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
Total   Median
Report ID Type Data Source ID Owner Prcs Reps Seconds Seconds Variance
------------ ---- ------------------------------ ---------- ---- ------ -------- -------- --------
XXX_WK_LATE QRY XXX_WKLY_LATENESS_RPT 20 20 2,973 148 3702.9
XXX_HRAM_CON QRY XXX_HRPD_CNT_AMD_RPT_QRY 92 92 2,677 27 108.4
XXX_CKOUT_RP QRY XXX_CHECKOUT_REPORT 47 47 2,043 41 347.7
XXX_BNK_RPT QRY XXX_BNK_DTLS_SCH_QRY 1 75 141 141 0.0

Conclusion

If you have a SQL performance problem with an XML report because the source PS/Query performs poorly, the batch timings will lead you to believe that you have a PeopleCode problem and not a SQL problem. 

Beware, this may not be the case.

Most of the execution time for XML Reports sourced from PS/Queries is almost certain to be SQL execution time.  The above query will tell you which queries are consuming how much time, and so inform your tuning effort.

davidkurtz's picture

Performance Metrics and XML Reporting in PeopleSoft

I am working with a PeopleSoft system that makes extensive use of XML Publisher reporting.  Generally these reports are based on SQL queries that are defined in the PeopleSoft PS/Query utility. 

These queries are run by an Application Engine program PSXPQRYRPT that runs the SQL query and produces the report in a single process.  This is a generic application engine program that runs any XML report.  Line 45 (in PeopleTools 8.49) of step MAIN.ExecRPT executes the report with the ProcessReport PeopleCode command.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
&oRptDefn.ProcessReport(&TemplateId, &Languaged, &AsOfDate, &oRptDefn.GetOutDestFormatString(&OutDestFormat));

Batch Timings
Analysis of the Application Engine batch timings indicate that nearly all the time in this Application Engine program is spent in PeopleCode, and that this not SQL execution time.  This is misleading.  The ProcessReport command is PeopleCode, but behind the scenes it also issues the SQL in the report data source.  Not all the time is SQL, but the Application Engine Batch Timings does not count any of this as SQL because it is not in the PeopleCode SQL Class. 

Let’s look at an example Batch Timings report (I have edited it down, removing zero and insignificant timings).

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
PeopleSoft Application Engine Timings
(All timings in seconds)

C o m p i l e E x e c u t e F e t c h Total
SQL Statement Count Time Count Time Count Time Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
PeopleCode
Record.SelectByKey PSDBFIELD 8 0.1 8 0.0 0.1
SELECT PSPRCSRQST 2 0.1 2 0.0 0.1
SELECT PSXPTMPLDEFN 3 0.1 3 0.0 0.1
--------
0.3
AE Program: PSXPQRYRPT
MAIN.ExecRpt.H 1 0.0 1 0.0 1 0.0 0.0
--------
0.0
------------------------------------------------------------------------------------------
Call Non-SQL SQL Total
PeopleCode Count Time Time Time
------------------------------ ------- -------- -------- --------
AE Program: PSXPQRYRPT
MAIN.ExecRpt 1 643.2 0.3 643.5
-------- -------- --------
643.2 0.3 643.5
------------------------------------------------------------------------------------------
E x e c u t e
PEOPLECODE Builtin/Method Count Time
------------------------------------------------------- ------- --------
Boolean(Type 5) BuiltIns 90 0.1
DateTime(Type 11) BuiltIns 1 0.1
SQL(Type 524290) Methods 19 0.1
SQL(Type 524290) BuiltIns 9 0.1
Record(Type 524291) Methods 1104 0.1
Session(Type 524303) Methods 207 633.2
JavaObject(Type 524315) BuiltIns 6 2.2
PostReport(Type 524324) Methods 2 0.7
------------------------------------------------------------------------------------------
Total run time : 644.0
Total time in application SQL : 0.3 Percent time in application SQL : 0.0%
Total time in PeopleCode : 643.2 Percent time in PeopleCode : 99.9%
------------------------------------------------------------------------------------------
  • The total execution time of is 644 seconds.
  • 643.2s are reported as being in PeopleCode, and as coming from the MAIN.ExecRpt step.
  • Only 0.3s of that is SQL time, and that comes from PeopleCode functions in the SQL Class  (CreateRecord , CreateSQL etc.)
  • The ProcessReport method does not appear in the PeopleCode SQL analysis section of the report, because it is not classed as SQL.

Identifying the Report ID and Report Source
So, the next question is how can I find out which report this process is running.  Different instances of this report may be running different queries.

You can get the list files generated by a process from the Content Management PeopleTools table PS_CDM_FILE_LIST.  Application Engine processes usually produce various log files (with extensions .aet, .trc and .log), the name of the other file is the same as the name of the report followed by an extension that will vary depending on format.

You can look at the report definition on-line under Reporting Tools -> XML Publisher -> Report Definition, and that will show you the Data Source ID

The Data Source is defined in a component accessible from the previous entry in the same menu.

Analysing Execution Times
I have written this query to aggregate execution time for PSXPQRYRPT by output file name. 

  • PSPRCSRQST is the process request table, from which I get the run time of the process.
  • PS_CDM_FILE_LIST lists the files generated by the process which would be posted to the Report Repository. If I exclude the usual trace files, I am left with the report ID.
  • PSXPRPTDEFN is the report definition record, from which I can get the data source ID
  • PSXPDATSRC specifies the data source.  A data source type (DS_TYPE) of QRY indicates a PeopleSoft PS/Query.  If it is a private query, the OPRID will have a value.
#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
column report_defn_id heading 'Report ID'      format a12 
column ds_type heading 'Type' format a4
column ds_id heading 'Data Source ID' format a30
column oprid heading 'Owner' format a10
column processes heading 'Prcs' format 990
column reprots heading 'Reps' format 9,990
column secs heading 'Total|Seconds' format 999,990
column median heading 'Median|Seconds' format 999,990
column variance heading 'Variance' format 9990.0
SELECT d.report_defn_id, d.ds_type, d.ds_id, d.oprid
, SUM(y.processes) processes
, SUM(y.reports) reports
, SUM(y.secs) secs
, median(y.secs) median
, variance(y.secs) variance
FROM (
SELECT x.prcsinstance
, x.filename
, COUNT(DISTINCT x.prcsinstance) processes
, COUNT(*) reports
, SUM(x.secs) secs
FROM (
SELECT r.prcsinstance
, f.filename
, 86400*(r.enddttm-r.begindttm)*ratio_to_report(1) over (partition by r.prcsinstance) secs
FROM sysadm.psprcsrqst r
, sysadm.ps_cdm_file_list f
WHERE r.prcsname = 'PSXPQRYRPT'
AND r.prcsinstance = f.prcsinstance
AND NOT f.cdm_file_type IN('AET','TRC','LOG')
AND r.begindttm >= TRUNC(SYSDATE)
) x
GROUP BY x.prcsinstance, x.filename
) y
, sysadm.psxprptdefn d
WHERE d.report_defn_id = SUBSTR(y.filename,1,instr(y.filename,'.')-1)
GROUP BY d.report_defn_id, d.ds_type, d.ds_id, d.oprid
ORDER BY secs DESC
/

I can now see which report process has been executed run how many times, how many copies of the report have been produced, and where the processing time is being spent, and so which one I should look at first.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
Total   Median
Report ID Type Data Source ID Owner Prcs Reps Seconds Seconds Variance
------------ ---- ------------------------------ ---------- ---- ------ -------- -------- --------
XXX_WK_LATE QRY XXX_WKLY_LATENESS_RPT 20 20 2,973 148 3702.9
XXX_HRAM_CON QRY XXX_HRPD_CNT_AMD_RPT_QRY 92 92 2,677 27 108.4
XXX_CKOUT_RP QRY XXX_CHECKOUT_REPORT 47 47 2,043 41 347.7
XXX_BNK_RPT QRY XXX_BNK_DTLS_SCH_QRY 1 75 141 141 0.0

Conclusion

If you have a SQL performance problem with an XML report because the source PS/Query performs poorly, the batch timings will lead you to believe that you have a PeopleCode problem and not a SQL problem. 

Beware, this may not be the case.

Most of the execution time for XML Reports sourced from PS/Queries is almost certain to be SQL execution time.  The above query will tell you which queries are consuming how much time, and so inform your tuning effort.

Richard Foote's picture

New Additions To My Blogroll

Thought it was time to update my blogroll with a couple of new additions. Both Charles Hooper and Randolf Geist have provided excellent advice and information on the OTN forums for quite some time and both have excellent blogs that are well worth checking out. They’re also both co-authors in a new Oracle book I’m keen to get [...]

Improving performance with pipelined table functions

Using pipelined functions as a performance tuning tool. January 2010

chenshap's picture

Build Less (DB Design Version)

37Signals, the company behind few highly successful web-based applications, has published a book about their business building experience. Knowing that the company is both successful and has an unconventional business development philosophy, I decided to browse a bit.

One of the essays that caught my attention is “Build Less”. The idea is that instead of having more features than the competition (or more employees or whatever), you should strive to have less. To avoid any sense of irony – the essay is very short :)

One of the suggestions I would add to the essay is:
“Keep less data”

Keeping a lot of data is a pain. Indexing, partitioning, tuning, backup and recovery – everything is more painful when you have terabytes instead of gigabytes. And when it comes to cleaning data out, it always causes endless debates on how long to keep the data (3 month? 7 years?) and different life-cycle options (move to “old data” system? archiving? how to purge? What is the process?).

What’s more, a lot of the time customers would really prefer we won’t keep the data. Maybe its privacy concerns (when we keep a lot of search history) or difficulty in generating meaningful reports or just plain confusion caused by all those old projects floating around.

Google taught us that all the data should be stored forever. But perhaps your business can win by keeping less data.

To prevent automated spam submissions leave this field empty.