There was a question in Oracle-L mailinglist recently, regarding excessive SQL*Net break/reset to client
waiting by a session.
A quote is below:
We are seeing an insert statement reporting
SQL*Net break/reset to client
as over 33% of its time.On the face of it this event suggests network issues but nothing else backs this up as the cause.
So I looked at the Java code in question and a trace of one of the sessions.
What is happening is that an attempt is made to insert a row, most of the time a duplicate error results, the code catches this exception and does an update.
I was wondering if its the duplicate error and the exception handling which results in this wait event showing up.
My answer to that was following:
Yes, a SQL*Net break/reset happens when an error/unhandled exception is raised during a call (which means that the call executed didn’t complete normally, thus the call state must be reset).
The approach of “insert -> if failed then update” is basically what MERGE does.
You need to change the application to use MERGE command. Alternatively you could use an “update -> if no-rows-updated then insert” approach, but MERGE makes much more sense nowadays.
So, these waits aren’t really indicating any network bottleneck, but rather just the fact that the client needs to be notified if it’s call has failed and Oracle has to clean up after the failed call. Apparently on some (network protocol) architectures or in some scenarios these operations have taken significant time (and may have required a system call), so a kernel developer has decided to wait instrument them.
In this blog entry I want to share a small test case which illustrates the point. First lets check how many times my session has waited for a SQL*Net break/reset wait
event:
SQL> select event, total_waits from v$session_event where event like '%reset%' and sid = (select sid from v$mystat where rownum = 1); no rows selected
As V$SESSION_EVENT (like also V$SYSTEM_EVENT) doesn’t display events for which nobody has waited yet, I see that my session’s wait count for that event is zero.
Now let’s create an error, but also let’s handle it in a PL/SQL exception handler:
SQL> begin 2 execute immediate 'drop table non_existent'; 3 exception 4 when others then null; 5 end; 6 / PL/SQL procedure successfully completed. SQL> select event, total_waits from v$session_event where event like '%reset%' and sid = (select sid from v$mystat where rownum = 1); no rows selected
Still no breaks/resets as the error was handled and the OPI call completed properly.
Now let’s run the same PL/SQL block without exception handler:
SQL> begin 2 execute immediate 'drop table non_existent'; 3 end; 4 / begin * ERROR at line 1: ORA-00942: table or view does not exist ORA-06512: at line 2 SQL> select event, total_waits from v$session_event where event like '%reset%' and sid = (select sid from v$mystat where rownum = 1); EVENT TOTAL_WAITS ---------------------------------------------------------------- ----------- SQL*Net break/reset to client 2
You see, there are 2 break/reset waits registered.
Also normal SQL (not fired from PL/SQL but from top level instead) causes breaks/resets if it errors:
SQL> select * from non_existent; select * from non_existent * ERROR at line 1: ORA-00942: table or view does not exist SQL> select event, total_waits from v$session_event where event like '%reset%' and sid = (select sid from v$mystat where rownum = 1); EVENT TOTAL_WAITS ---------------------------------------------------------------- ----------- SQL*Net break/reset to client 4 SQL>
Why 2 waits for a single error? I don’t know exact details, but the 10046 trace reveals that one of the waits is registered by a break operation and another one (apparently) not, thus it might be the call state reset operation just accounted separately:
===================== PARSE ERROR #10:len=27 dep=0 uid=0 oct=3 lid=0 tim=1967444453 err=942 select * from non_existent WAIT #10: nam='SQL*Net break/reset to client' ela= 2 driver id=1413697536 break?=1 p3=0 obj#=-1 tim=1967444525 WAIT #10: nam='SQL*Net break/reset to client' ela= 180 driver id=1413697536 break?=0 p3=0 obj#=-1 tim=1967444718 WAIT #10: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1967444752
Anyway, as I said in my Oracle-L reply, these breaks are caused by bad application design which allows too many unhandled exceptions to be propagated all the way up to the client.
The solutions would be to reduce or eliminate the number of errors occurring, or at least put that code into PL/SQL blocks where errors would be handled and not propagated back to client every time.