A post by Jonathan Lewis inspired me to finally complete my version of the Oracle session performance snapper script, which main characteristics are
- it reports Oracle session level performance counter and wait information in real time
- it does NOT require any database objects to be created
If you are a DBA or consultant working on ad-hoc performance issues, you will like it!
Are you familiar with following situation?
(Monday morning)
Customer: Hey, we need your help! We have serious performance issues in our production environment. It’s a database with x000 online users, several parallel data feeds, continuous batch jobs and reporting going on.
Users have started experiencing occasional bad performance and some batch jobs as well.
You: Ok, lets see it. Do you have any performance monitoring tools installed?
Customer: Yes, we have Xxxxx Xxxxxxxxx installed which shows us a nice green or red light on big screen depending whether there are any problems or not.
You: What color is it showing now?
Customer: Green
You: But you still have the performance problem you described earlier?
Customer: Yes
You: Do you have ANY performance monitoring tools installed?
Customer: Well, we also have a statspack snapshot taken every morning and evening to capture the business workload.
You: Um… Ok… you know forget about it. Can I create a small package to capture some useful performance info on the problematic sessions?
Customer: Yes, but we need to put the DDL scripts through the change review board which gathers every Thursday… but we can’t wait that long!
You: Can we enable tracing?
Customer: Enabling tracing is a change and all changes must go through review board. Also, we don’t really know which exact sessions are affected, it just happens for seemingly random ones… and we can’t just trace every session, can we?
You: Ok, give me a sqlplus window and Excel, we’ll figure something out.
…And now follows a tedious manual SQL execution and copy & paste exercise from various V$ views for getting some meaningful performance information out of Oracle.
Well, not anymore, because The Oracle Session Snapper is in town!
If you know vmstat for Unix, you know it reports you various system level statistic counter deltas over a period you choose.
Well, the Oracle Session Snapper output looks somewhat similar, but it reports you session level deltas of Oracle v$sesstat counters, wait events and starting from 10g also session time model statistics.
All info can be reported in real time, without a need for running and timing multiple SQL scripts and manual calculation of deltas.
And the key unique point of the Session Snapper is – it does not require creation of any database objects, thus no changes in the database at all! Everything is done from within a sqlplus script or anonymous PL/SQL block.
This means that you will be able to get quick session-level performance snapshots even in heavily change-controlled environments, where no object creation whatsoever is allowed without going through a long process.
You can read all the usage details from the script header, but here’s one example of its output:
Tanel@Sol01> @snapper out 1 3 475 -- Session Snapper v1.03 by Tanel Poder ( ) -------------------------------------------------------------------------------------------------------------------------------------------- -- SID, SNAPSHOT START , SECONDS , TYPE, STATISTIC , DELTA, D/SEC, HDELTA, HD/SEC -------------------------------------------------------------------------------------------------------------------------------------------- DATA, 475, 20070820 01:17:47, 1, STAT, session logical reads , 88232, 88232, 88.23k 88.23k DATA, 475, 20070820 01:17:47, 1, STAT, consistent gets , 88233, 88233, 88.23k 88.23k DATA, 475, 20070820 01:17:47, 1, STAT, consistent gets from cache , 88232, 88232, 88.23k 88.23k DATA, 475, 20070820 01:17:47, 1, STAT, calls to get snapshot scn: kcmgss , 556, 556, 556 556 DATA, 475, 20070820 01:17:47, 1, STAT, no work - consistent read gets , 87677, 87677, 87.68k 87.68k DATA, 475, 20070820 01:17:47, 1, STAT, table scans (short tables) , 139, 139, 139 139 DATA, 475, 20070820 01:17:47, 1, STAT, table scan rows gotten , 7429598, 7429598, 7.43M 7.43M DATA, 475, 20070820 01:17:47, 1, STAT, table scan blocks gotten , 87676, 87676, 87.68k 87.68k DATA, 475, 20070820 01:17:47, 1, STAT, buffer is pinned count , 138, 138, 138 138 -- End of snap 1 DATA, 475, 20070820 01:17:48, 1, STAT, session logical reads , 87779, 87779, 87.78k 87.78k DATA, 475, 20070820 01:17:48, 1, STAT, consistent gets , 87772, 87772, 87.77k 87.77k DATA, 475, 20070820 01:17:48, 1, STAT, consistent gets from cache , 87772, 87772, 87.77k 87.77k DATA, 475, 20070820 01:17:48, 1, STAT, calls to get snapshot scn: kcmgss , 552, 552, 552 552 DATA, 475, 20070820 01:17:48, 1, STAT, no work - consistent read gets , 87210, 87210, 87.21k 87.21k DATA, 475, 20070820 01:17:48, 1, STAT, table scans (short tables) , 138, 138, 138 138 DATA, 475, 20070820 01:17:48, 1, STAT, table scan rows gotten , 7389897, 7389897, 7.39M 7.39M DATA, 475, 20070820 01:17:48, 1, STAT, table scan blocks gotten , 87211, 87211, 87.21k 87.21k DATA, 475, 20070820 01:17:48, 1, STAT, buffer is pinned count , 136, 136, 136 136 -- End of snap 2 DATA, 475, 20070820 01:17:49, 1, STAT, session logical reads , 87580, 87580, 87.58k 87.58k DATA, 475, 20070820 01:17:49, 1, STAT, consistent gets , 87587, 87587, 87.59k 87.59k DATA, 475, 20070820 01:17:49, 1, STAT, consistent gets from cache , 87587, 87587, 87.59k 87.59k DATA, 475, 20070820 01:17:49, 1, STAT, calls to get snapshot scn: kcmgss , 552, 552, 552 552 DATA, 475, 20070820 01:17:49, 1, STAT, no work - consistent read gets , 87046, 87046, 87.05k 87.05k DATA, 475, 20070820 01:17:49, 1, STAT, table scans (short tables) , 138, 138, 138 138 DATA, 475, 20070820 01:17:49, 1, STAT, table scan rows gotten , 7375781, 7375781, 7.38M 7.38M DATA, 475, 20070820 01:17:49, 1, STAT, table scan blocks gotten , 87041, 87041, 87.04k 87.04k DATA, 475, 20070820 01:17:49, 1, STAT, buffer is pinned count , 137, 137, 137 137 -- End of snap 3 PL/SQL procedure successfully completed.
The output contains 3 x 1 second snapshot of session 475 doing heavy nested looping. Note that even though the CPU time used was not updated, the logical IO counts for that session had still increased.
So this tool can be very valuable diagnosing what’s going on when the session seems to be 100% on CPU doing something.
And yes, as I mentioned earlier, this tool can also capture wait and session time model statistics, as seen below:
Tanel@Sol01> @snapper out,gather=wt 30 1 475 -- Session Snapper v1.03 by Tanel Poder ( ) -------------------------------------------------------------------------------------------------------------------------------------------- -- SID, SNAPSHOT START , SECONDS , TYPE, STATISTIC , DELTA, D/SEC, HDELTA, HD/SEC -------------------------------------------------------------------------------------------------------------------------------------------- DATA, 475, 20070820 01:24:26, 30, TIME, hard parse elapsed time , 1041, 35, 1.04ms 34.7us DATA, 475, 20070820 01:24:26, 30, TIME, parse time elapsed , 15383, 513, 15.38ms 512.77us DATA, 475, 20070820 01:24:26, 30, TIME, failed parse elapsed time , 1614, 54, 1.61ms 53.8us DATA, 475, 20070820 01:24:26, 30, TIME, DB CPU , 3217049, 107235, 3.22s 107.23ms DATA, 475, 20070820 01:24:26, 30, TIME, sql execute elapsed time , 18862882, 628763, 18.86s 628.76ms DATA, 475, 20070820 01:24:26, 30, TIME, DB time , 18903962, 630132, 18.9s 630.13ms DATA, 475, 20070820 01:24:26, 30, WAIT, db file sequential read , 512218, 17074, 512.22ms 17.07ms DATA, 475, 20070820 01:24:26, 30, WAIT, db file scattered read , 17562414, 585414, 17.56s 585.41ms DATA, 475, 20070820 01:24:26, 30, WAIT, SQL*Net message to client , 762, 25, 762us 25.4us DATA, 475, 20070820 01:24:26, 30, WAIT, SQL*Net message from client , 9549187, 318306, 9.55s 318.31ms DATA, 475, 20070820 01:24:26, 30, WAIT, SQL*Net break/reset to client , 1070, 36, 1.07ms 35.67us -- End of snap 1 PL/SQL procedure successfully completed.
The output is in CSV and is deliberately formatted the way it could be loaded into a spreadsheet or fed into text manipulation utilities should further processing be required. You should have the sqlplus window at least 140 characters wide when outputting to sqlplus client.
With “trace” option you can also output to your server process tracefile, which allows you to get convenient real-time output using tail -f on the tracefile. However you need execute rights on dbms_system for trace output as the script is using dbms_output.ksdwrt() for writing to tracefile.
The first 5 columns of output should be self-explanatory, the last four columns mean:
- DELTA – counter raw deltas between snapshots
- D/SEC – counter raw deltas per second
- HDELTA – human-readable delta (e.g. M instead of million and ms instead of thousand microseconds etc)
- HD/SEC – human-readable deltas per second
The script doesn’t aim to be an example of good programming style (as it started out as just a ad-hoc hack for quickly diagnosing a performance problem).
Also there are quite a few things still in TODO list, but I think this script already fills its purpose – providing quick and simple session level performance snapshots, without any changes to database whatsoever.
So, if you’re still interested, feel free to download The Session Snapper (in form of a single SQL script) from here: https://github.com/tanelpoder/tpt-oracle/blob/master/snapper.sql (Obviously you need to proofread it before using in any of your databases).
I have already used a predecessor of this tool in many places, so I hope you find the Snapper useful too. Feel free to feed me back any shortcomings or desired improvements so I could improve it further.
Update: Over time I have improved Snapper and optimized its output format. Here are the update and new features articles: