Have you experienced such situation before?
- A performance issue happens in production – let say some batch job has ran way over time
- You enable SQL trace on the problem session (while the problem is already ongoing)
- In tracefile you see lots of waits (or execs or fetches) caused by cursor X
- You grep for “PARSING IN CURSOR #X” above the waits in the tracefile but don’t find the corresponding parsing entry nor SQL text there (this is a “top-truncated” tracefile)
- You really want to know which SQL corresponds to all those WAIT #X/FETCH #X lines
An example output would be here:
WAIT #2: nam='db file scattered read' ela= 25190 file#=1 block#=863 blocks=2 obj#=113 tim=548703769817 WAIT #2: nam='db file sequential read' ela= 51397 file#=1 block#=864 blocks=1 obj#=113 tim=548719015123 WAIT #2: nam='db file scattered read' ela= 348553 file#=1 block#=5969 blocks=8 obj#=113 tim=548732315966 WAIT #2: nam='db file scattered read' ela= 32275 file#=1 block#=5972 blocks=5 obj#=113 tim=548771073619
In above case the cursor #2 was causing all these WAIT lines. The problem is, that if the cursor #2 was parsed before the SQL trace was enabled, we won’t have the SQL statement dumped in the tracefile! Thus, by looking into the tracefile only, there’s no way to know which exact SQL statement is causing those waits.
If this was a single long-running SQL statement, you could easily look into v$session.sql_hash_value and map this to SQL text using v$sql. However things aren’t as simple when you have lots of PARSE,EXEC,FETCH calls of other statements happening in between (as the sql_hash_value would be constantly changing due each such call.
Also, other v$ (or x$) views don’t show you anything about the WAIT #X mapping to SQL statement as this stuff is not stored in SGA.
As a workaround, you could take the obj# and map it back to object number the (io) wait happened on and make some guesses about SQL statement from there. Or you could dump the datablock using file# and block# numbers, look at the data and make some further guesses about SQL statement, but guesses are no good as they can be wrong and misleading – and you would be dependent on luck in your troubleshooting. But you want to be systematic, right? ;-)
So, in case of the top-truncated tracefile, if you want to be sure which WAIT# (or FETCH#,EXEC#) lines correspond to which SQL statement, we need to use some other way than v$views or trying to guess based on bits in the tracefile.
And guess what – its easy! Well, as long as the cursor of interest is still open… Time to talk about cursordump!
Since Oracle 10g there’s a new dump event, called CURSORDUMP in Oracle. Depending on level, it can dump either shared cursors from library cache or opened/cached cursors of a session.
As CURSORDUMP is a little documented event and as it crashes on some early Oracle 10g versions, I don’t recommend to use that.
But there are the widely known and Metalink-documented ERRORSTACK and PROCESSSTATE dumps which also dump the cursordump data!
When you dump ERRORSTACK at level 3 or PROCESSSTATE at level 10 with oradebug, you will have cursors from target process’es session open cursor array dumped to trace.
Among a lot of other info, you will see an overview section which mentions how many cursors are opened by that session and in which state they are (like parsing or fetch-ready state):
******************** Session Cursor Dump ********************** Current cursor: 2, pgadep: 0 Open cursors(pls, sys, hwm, max): 2(0, 0, 64, 300) NULL 0 SYNTAX 0 PARSE 0 BOUND 0 FETCH 2 ROW 0 Cached frame pages(total, free): 4k(11, 2), 8k(0, 0), 16k(0, 0), 32k(0, 0) pgactx: 257EED54 ctxcbk: 00000000 ctxqbc: 00000000 ctxrws: 257F5B8C
But the more interesting details, follow. Search for “Cursor#X” – where X is the number you see after WAIT #X or FETCH #X lines in SQL trace output. So, in my test case I had to search for “Cursor#2“, and here’s what I found:
Cursor#2(08CB0C64) state=FETCH curiob=08CCC1C4 curflg=46 fl2=0 par=00000000 ses=28B267BC sqltxt(24D31E04)=select count(*) from dba_source, dba_source where rownum <= :x <-- this is our query! hash=f6f68207617635b44139982d3d1ba449 parent=25A0616C maxchild=01 plk=263617CC ppn=n cursor instantiation=08CCC1C4 child#0(2844ABBC) pcs=25A06370 clk=263DDEAC ci=24AED5A0 pn=22D8DB44 ctx=257EED54 kgsccflg=0 llk[08CCC1C8,08CCC1C8] idx=0 xscflg=c0110476 fl2=5200009 fl3=42222008 fl4=0 Bind bytecodes Opcode = 2 Bind Twotask Scalar Sql In (may be out) Copy oacdef = 248dbf18 Offsi = 36, Offsi = 0 kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=03 fl2=1000000 frm=00 csi=00 siz=24 off=0 kxsbbbfp=08cccd6c bln=22 avl=06 flg=05 value=1234567890 < -- we can even see the current bind variable values! Frames pfr 08CCD0F4 siz=20196 efr 08CCD13C siz=20132 Cursor frame dump enxt: 9.0x0000000c enxt: 8.0x0000059c enxt: 7.0x00000f9c enxt: 6.0x00000d2c enxt: 5.0x00000c8c enxt: 4.0x00000de8 enxt: 3.0x000004f0 enxt: 2.0x00000014 enxt: 1.0x00000cc8 pnxt: 2.0x00000038 pnxt: 1.0x00000008 kxscphp 08CB6454 siz=1000 inu=0 nps=436 kxscdfhp 08CB65A0 siz=1000 inu=0 nps=56 kxscehp 08CB6834 siz=1000 inu=0 nps=108 kxscwhp 08CB693C siz=5279776 inu=0 nps=5274832 < -- by the way, these entries under "Cursor frame dump" give useful info about how much private memory a cursor is using for its different heaps (to be continued in a future post... ;)
So, extracting both the SQL statement and bind variable info is pretty easy with ERRORSTACK level 3 and PROCESSSTATE level 10 dumps.
By the way, this number X what you see in WAIT #X or Cursor#X lines in trace, is the slot number in open cursor array in UGA.
Update:
After Randolf’s comment below I think one thing needs elaboration – how to detect when a cursor was closed from a SQL trace file.
I’ve written a little note about it here: