Monday, June 2, 2014

RDBMS-- _fastpin_enable, buffer pinning, latches, reference counting & CAS

In Oracle 11g there is an optimization for accessing buffers. It targets consistent gets and aims to reduce latch contention. It is controlled by the hidden parameter _fastpin_enable, which is set to 1 by default and enables reference count based fast pins.

As we know, a buffer can be pinned by a session and thus be revisited without performing another logical read in the same fetch call. Pinning a buffer also eliminates the need to acquire the cache buffers chains latch to locate the same buffer again. So we can say that pinning a buffer decreases logical reads.
We can also say that the more often the same block is revisited, the more pinning pays off, and the fewer logical reads are needed. A good clustering factor on an index can even lead Oracle to pin blocks, as it would visit the same blocks repeatedly to access the needed data that resides in the indexed table. :)
The "buffer is pinned count" and "buffer is not pinned count" statistics can be used to track this.
Anyway, _fastpin_enable is an optimization that enables fastpath buffer gets.
When this optimization is used, you will see the "consistent gets from cache (fastpath)" or "db block gets from cache (fastpath)" session statistics increase.

The description of this hidden parameter reads: _fastpin_enable : enable reference count based fast pins.
So what is this reference count based approach?
In computer science, reference counting is a technique of storing the number of references, pointers, or handles to a resource such as an object, block of memory, disk space or other resource.

Given the reference counting approach, my first guess was that Oracle may have implemented a new design for buffer pinning and built a routine that creates smart pointers for buffers to provide this fast pin optimization. But when I researched smart pointers, I did not see anything that would accelerate the pinning process.

The reference counting process works as follows:

Attach a counter to a memory region.
When a new pointer is connected to that memory, increment the counter.
When a pointer is removed, decrement the counter.
Any region with a 0 counter is garbage, and can be re-used.
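
The four steps above can be sketched in a few lines of C++. This is only a minimal illustration of the general technique; the Region type and its member names are hypothetical and have nothing to do with Oracle's internal structures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration: a memory region with an attached counter.
struct Region {
    std::size_t ref_count = 0;   // number of pointers currently attached

    void attach() { ++ref_count; }                        // a new pointer is connected
    void detach() { assert(ref_count > 0); --ref_count; } // a pointer is removed
    bool reusable() const { return ref_count == 0; }      // 0 means garbage
};

// Walks through the steps and returns the final counter value.
std::size_t refcount_demo() {
    Region r;        // counter attached to the region, starts at 0
    r.attach();      // first pointer connected  -> 1
    r.attach();      // second pointer connected -> 2
    r.detach();      // one pointer removed      -> 1
    bool still_in_use = !r.reusable();
    r.detach();      // last pointer removed     -> 0
    assert(still_in_use && r.reusable());
    return r.ref_count;
}
```

Note that nothing here is safe for concurrent use: two sessions incrementing the counter at the same time could lose an update, which is where an atomic instruction such as CAS would come in.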

Again, this does not seem related to fast pinning.
Then how does Oracle accelerate buffer pinning using a reference count based approach?
I am suspicious of the description.
I have read that fast pinning is achieved using a compare-and-swap (CAS) technique.

Here is a general definition of it:

CAS is used to implement synchronization primitives such as semaphores and mutexes, as well as more sophisticated lock-free and wait-free algorithms.
Algorithms built around CAS typically read some key memory location and remember the old value. Based on that old value, they compute some new value. Then they try to swap in the new value using CAS, where the comparison checks for the location still being equal to the old value.
In multiprocessor systems, it is usually impossible to disable interrupts on all processors at the same time. Even if it were possible, two or more processors could be attempting to access the same semaphore's memory at the same time, and thus atomicity would not be achieved. The compare-and-swap instruction allows any processor to atomically test and modify a memory location, preventing such multiple-processor collisions.
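
The read / compute / swap pattern described above can be sketched with the standard C++ std::atomic type. This is a generic illustration, not Oracle code; the function names cas_add and stale_cas_is_rejected are made up for this example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: adds `delta` to `value` with a CAS retry loop.
int cas_add(std::atomic<int>& value, int delta) {
    int old_value = value.load();            // read and remember the old value
    for (;;) {
        int new_value = old_value + delta;   // compute the new value from it
        // Swap in new_value only if `value` still equals old_value.
        // On failure, old_value is refreshed with the current contents.
        if (value.compare_exchange_weak(old_value, new_value))
            return new_value;
    }
}

// Shows that a CAS with a stale expected value fails and changes nothing.
bool stale_cas_is_rejected() {
    std::atomic<int> value{5};
    int stale = 4;                           // not what is in memory
    bool swapped = value.compare_exchange_strong(stale, 99);
    // The CAS failed: memory is untouched and `stale` now holds the real value.
    return !swapped && value.load() == 5 && stale == 5;
}
```

The important property is that the comparison and the store happen as one atomic step, so a concurrent change between the read and the swap makes the CAS fail instead of silently overwriting the other change.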

So, what does pin mean?

How can it be safe to visit a buffered block without first using a latch to protect it? The answer is that
you have to anticipate using it several times, so you use latching to acquire it the first time and latching to release it when you have finished with it, but pin it in the interim so that you can visit it several times without having to go through the CPU intensive process of competing for latches.  (Ref: Jonathan Lewis)

Buffers (technically buffer headers) can be pinned in exclusive mode or shared mode. If you pin a buffer in exclusive mode (which sets the mode_held in x$bh to the value 2) then other sessions that want to pin the buffer will attach their pins to a “waiters list” on the buffer header, and go into a “buffer busy waits” state.

Jonathan Lewis also explains the buffer pinning process as follows:

In order to pin a buffer header a session must first acquire a buffer handle, and the first step to doing this is to grab the cache buffer handles latch to protect the integrity of the x$kcbbf array. Of course if the session had to grab this latch every time it wanted to pin a buffer the latch would become a major point of contention - so each session is allowed to build a little “reserved set” or cache of handles. The limit on the number of reserved handles that a session can mark is 5 - set by the parameter _db_handles_cached; a session does not need to get the latch to use, and re-use, this small set of handles.

Also, about latches and pins, I found a very good explanation in the community.
Reference: Jonathan Lewis

Option 1:
Acquire the relevant "cache buffers chains" latch in shared mode (i.e. increment the read count)
walk the buffer chain to find the buffer
read the block
Release the latch (i.e. decrement the read count)

Option 2:
Acquire the relevant "cache buffers chains" latch in exclusive mode (i.e. set the write bit)
walk the buffer chain to find the buffer
attach a "pin" structure to the buffer
release the latch (i.e. clear the write bit)
read the block
-- the next steps may be postponed until the end of the current database call
Acquire the relevant "cache buffers chains" latch in exclusive mode
detach the "pin" structure
release the latch
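
To see why Option 2 pays off when a buffer is visited several times, here is a toy single-threaded model that simply counts latch gets for each option. Everything in it (the Latch and Buffer types, the function names) is hypothetical and only mirrors the steps listed above; it is not how Oracle implements latches or pins.

```cpp
#include <cassert>

// Toy model of a "cache buffers chains" latch that counts its gets.
struct Latch {
    int  readers   = 0;      // read count (shared holders)
    bool write_bit = false;  // set while held in exclusive mode
    long gets      = 0;      // how many times the latch was acquired

    void acquire_shared()    { assert(!write_bit); ++readers; ++gets; }
    void release_shared()    { assert(readers > 0); --readers; }
    void acquire_exclusive() { assert(!write_bit && readers == 0); write_bit = true; ++gets; }
    void release_exclusive() { assert(write_bit); write_bit = false; }
};

struct Buffer { int pins = 0; long reads = 0; };

// Option 1: take the latch in shared mode for every visit.
long visit_with_latch_only(Latch& latch, Buffer& buf, int visits) {
    for (int i = 0; i < visits; ++i) {
        latch.acquire_shared();   // increment the read count
        ++buf.reads;              // walk the chain and read the block
        latch.release_shared();   // decrement the read count
    }
    return latch.gets;
}

// Option 2: latch once to pin, visit freely, latch once to unpin.
long visit_with_pin(Latch& latch, Buffer& buf, int visits) {
    latch.acquire_exclusive();    // set the write bit
    ++buf.pins;                   // attach a pin to the buffer
    latch.release_exclusive();    // clear the write bit
    for (int i = 0; i < visits; ++i)
        ++buf.reads;              // read the block without touching the latch
    latch.acquire_exclusive();
    --buf.pins;                   // detach the pin
    latch.release_exclusive();
    return latch.gets;
}
```

In this model, ten visits cost ten latch gets with Option 1 but only two with Option 2. For a single visit Option 1 is cheaper, which matches the point that pinning is worthwhile only when you anticipate using the buffer several times.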


So, based on this information, I still cannot see how CAS or a reference count based approach accelerates the pinning process.
It must be something in the code, an optimization exactly in the code that acquires the pin.

Here is an example of a CAS loop (Reference: http://preshing.com):

void LockFreeQueue::push(Node* newHead)
{
    for (;;)
    {
        // Copy a shared variable (m_Head) to a local.
        Node* oldHead = m_Head;

        // Do some speculative work, not yet visible to other threads.
        newHead->next = oldHead;

        // Next, attempt to publish our changes to the shared variable.
        // If the shared variable hasn't changed, the CAS succeeds and we return.
        // Otherwise, repeat.
        if (_InterlockedCompareExchange(&m_Head, newHead, oldHead) == oldHead)
            return;
    }
}

So, putting these together, there are two assumptions.

When _fastpin_enable is set to 1:

Assumption 1:

If the buffer is not already pinned, Oracle pins it using latches. When acquiring these latches, Oracle uses compare and swap to check whether the latches are available. Reference counting is not used here. The method used to acquire the latches is what improves buffer pinning performance.

Or it may be something like the following:

Assumption 2:

If no one is using the buffer that a process wants to pin, the session/process sees this through reference counting (reference count = 0) and uses CAS to pin the buffer. (As CAS is atomic, no race condition can break concurrency here.) So no latches (or fewer latches) are needed, and thanks to the CAS instructions, buffer pinning performance increases.
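
Purely as an illustration of Assumption 2, a latch-free pin based on a reference count could look like the sketch below. This is my guess expressed in code, not Oracle's implementation; BufferHeader, try_fast_pin and unpin are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical buffer header carrying only a reference count.
struct BufferHeader {
    std::atomic<int> ref_count{0};   // 0 means nobody is using the buffer
};

// Tries to pin the buffer without any latch.
// Succeeds only if the reference count is still 0 at the moment of the CAS.
bool try_fast_pin(BufferHeader& bh) {
    int expected = 0;
    return bh.ref_count.compare_exchange_strong(expected, 1);
}

// Releases a pin taken by try_fast_pin.
void unpin(BufferHeader& bh) {
    int previous = bh.ref_count.fetch_sub(1);
    assert(previous > 0);            // unpin without a pin would be a bug
    (void)previous;                  // avoid an unused warning when asserts are off
}
```

If try_fast_pin returns false, someone else is using the buffer, and under this assumption the session would fall back to the ordinary latch-protected pin path.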


Note: I am still not sure how and where this fastpath pin code optimization takes place. I have sent an email to Tanel Poder. I will update this post when I get his reply.
