Reliable latch waits and a new blog

Tanel Poder

2009-01-20

Here’s a link to Alex Fatkulin’s blog if you haven’t seen it already: http://afatkulin.blogspot.com/

He has some good Oracle internals information in there, and I also like his research style.

Alex just blogged about a finding (on Oracle 11g on Linux): when an Oracle process doesn’t get a latch after spinning, it goes to sleep using the semop() system call and does not wake up until the semaphore is posted by another process. From past versions we remember that Oracle processes go to sleep for a short period of time, wake up, try to get the latch, and sleep again for a longer period if unsuccessful (up to _max_exponential_sleep centiseconds). This kind of sleeping with a timeout is done using the semtimedop() syscall on Linux.
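To make the older sleep-and-retry behaviour concrete, here is a minimal Python sketch of an exponential sleep schedule with a cap. The doubling rule, the 1 centisecond starting value and the cap of 200 are illustrative assumptions only; they are not Oracle's actual algorithm, nor a claim about the default value of _max_exponential_sleep.

```python
def sleep_schedule(attempts, first_cs=1, max_cs=200):
    """Return the sleep length (in centiseconds) for each failed latch get.

    Hypothetical model: the sleep doubles after every failed attempt and is
    capped at max_cs, which stands in for _max_exponential_sleep.
    """
    schedule = []
    current = first_cs
    for _ in range(attempts):
        schedule.append(current)
        # Double the next sleep, but never exceed the cap.
        current = min(current * 2, max_cs)
    return schedule


print(sleep_schedule(10))
# [1, 2, 4, 8, 16, 32, 64, 128, 200, 200]
```

The point is only the shape of the curve: short sleeps first, then longer ones, with an upper bound so a waiter always retries eventually.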

Also, for some latches, latch waiter posting was available. If a process failed to get a latch, it put a pointer to its state object into the waiter list for that latch and went to sleep for some centiseconds. When the latch holder released the latch, it scanned through the waiter list for that latch and posted the waiters, so that they would not have to sleep until the end of their x centisecond sleep. This used to be controllable using the _latch_wait_posting parameter, but since 9i this parameter has been removed and most latches have wait posting enabled by default.
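The waiter list idea can be sketched as a toy model in Python. This is not Oracle's implementation: the Latch class, its method names and the use of threading.Event as a stand-in for a per-process semaphore are all assumptions made for illustration.

```python
import threading


class Latch:
    """Toy latch with a waiter list and wait posting (illustration only)."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects _held and _waiters
        self._held = False
        self._waiters = []              # one Event per sleeping waiter

    def get(self, sleep_seconds=0.01):
        while True:
            me = threading.Event()
            with self._mutex:
                if not self._held:
                    self._held = True
                    return
                # Latch is busy: register in the waiter list before sleeping.
                self._waiters.append(me)
            # Sleep with a timeout, but wake up early if the holder posts us.
            me.wait(sleep_seconds)
            with self._mutex:
                if me in self._waiters:  # timed out without being posted
                    self._waiters.remove(me)

    def release(self):
        with self._mutex:
            self._held = False
            waiters, self._waiters = self._waiters, []
        # Post every registered waiter so none has to sleep out its timeout.
        for waiter in waiters:
            waiter.set()


latch = Latch()
state = {"counter": 0}


def worker():
    for _ in range(200):
        latch.get()
        state["counter"] += 1  # critical section protected by the latch
        latch.release()


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(state["counter"])
```

Each waiter registers itself before it sleeps, so a release that happens afterwards will always find it in the list and post it.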

With semaphores and posting, most Unixes have had problems in the past with missed posts/wakeups, sometimes due to bugs, sometimes just due to the implementation of signals and semaphore operations in the Unix kernel. That’s why past Oracle versions always had some kind of timeout for latch sleeps (and for most enqueue sleeps and buffer busy waits as well, as a matter of fact). If a process manages to miss the wakeup call, it will wake up after some timeout anyway. Performance suffers, but at least the process won’t hang forever. This was achieved using the semtimedop() system call (on Linux).
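Here is a small Python sketch of why the timeout matters when a post can get lost. It is a simulation under stated assumptions: threading.Event stands in for the semaphore, the "holder" frees the latch but deliberately never posts, and the 0.02 second timeout is an arbitrary illustrative value.

```python
import threading
import time


def wait_for_latch(is_free, posted, timeout_s):
    """Sleep until posted or until the timeout, then recheck the latch state.

    Returns the number of wakeups it took to see the latch free.
    """
    wakeups = 0
    while not is_free():
        posted.wait(timeout_s)  # timed sleep, in the spirit of semtimedop()
        posted.clear()
        wakeups += 1
    return wakeups


state = {"free": False}
posted = threading.Event()


def holder():
    time.sleep(0.05)
    # Release the latch, but simulate a lost post: posted.set() is never called.
    state["free"] = True


t = threading.Thread(target=holder)
t.start()
wakeups = wait_for_latch(lambda: state["free"], posted, timeout_s=0.02)
t.join()
print("saw the latch free after", wakeups, "timed wakeups")
```

With timeout_s=None the same waiter would sleep forever in this scenario, which is the hang the timed sleep protects against.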

So, how come Alex saw just semop() calls in his test?

The answer is that apparently the minimum Linux kernel required for Oracle 10.2+ supports reliable posting using semaphores, so Oracle takes advantage of this.

There is a new parameter called _enable_reliable_latch_waits in 10.2+, and (at least) on Linux it is true by default. When it’s true, Oracle trusts that all wakeup calls (through semaphores) are received by the sleeping processes, so there’s no need to wake up periodically to check whether the latch has become “available”.

Here’s a little test case:

I ran my lotshparses.sql script in three sqlplus sessions (on a 2-CPU machine) to create some shared pool latch contention (and sleeps on the latches).

Then I used strace to see what kind of semaphore operations one of the processes was using.

This is what I see when _enable_reliable_latch_waits is true (the default in 10.2+, reliable wakeups):

$ strace -e trace=semop,semtimedop -p 15423
Process 15423 attached - interrupt to quit
  
semop(622592, 0xbf8afc70, 1) = 0
semop(622592, 0xbf8b0f30, 1) = 0
semop(622592, 0xbf8afc1c, 1) = 0
semop(622592, 0xbf8b0f30, 1) = 0
semop(622592, 0xbf8b0f30, 1) = 0
semop(622592, 0xbf8b0f30, 1) = 0
semop(622592, 0xbf8afc70, 1) = 0

This is what I see when I set _enable_reliable_latch_waits = false (old-fashioned behaviour, non-reliable wakeups, thus the need to wake up every 300000000 nanoseconds, i.e. 0.3 seconds):

$ strace -e trace=semop,semtimedop -p 15423
Process 15423 attached - interrupt to quit
  
semtimedop(622592, 0xbf8af4b8, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8aef10, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8ac9c4, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8ac9c4, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8b23f8, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8b083c, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8b083c, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8b083c, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8b0f30, 1, {0, 300000000}) = 0
semtimedop(622592, 0xbf8b0468, 1, {0, 300000000}) = 0
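If you capture a longer trace, a few lines of Python can summarize it. This is a rough sketch: the summarize function is a hypothetical helper, and its regular expressions only match the strace output format shown in this post (newer strace versions may print the timeout differently).

```python
import re
from collections import Counter

# Matches e.g.: semtimedop(622592, 0xbf8af4b8, 1, {0, 300000000}) = 0
TIMED = re.compile(r"^semtimedop\(\d+, 0x[0-9a-f]+, \d+, \{(\d+), (\d+)\}\)")
# Matches e.g.: semop(622592, 0xbf8afc70, 1) = 0
UNTIMED = re.compile(r"^semop\(")


def summarize(lines):
    """Count semop/semtimedop calls and collect timeouts in centiseconds."""
    calls = Counter()
    timeouts_cs = set()
    for line in lines:
        line = line.strip()
        match = TIMED.match(line)
        if match:
            calls["semtimedop"] += 1
            sec, nsec = int(match.group(1)), int(match.group(2))
            # 1 centisecond = 10,000,000 nanoseconds
            timeouts_cs.add(sec * 100 + nsec // 10_000_000)
        elif UNTIMED.match(line):
            calls["semop"] += 1
    return dict(calls), sorted(timeouts_cs)


sample = [
    "semop(622592, 0xbf8afc70, 1) = 0",
    "semtimedop(622592, 0xbf8af4b8, 1, {0, 300000000}) = 0",
    "semtimedop(622592, 0xbf8aef10, 1, {0, 300000000}) = 0",
]
print(summarize(sample))
# ({'semop': 1, 'semtimedop': 2}, [30])
```

The {0, 300000000} timeout in the trace above works out to 30 centiseconds per sleep.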
