This post is about the "Private strand flush not complete" waits, which can be encountered in the redo log write phase of a running Oracle Database.
In the scope of redo logs, the most important waits are caused by the redo copy latches, the redo allocation latch and the redo writing latch. Note that Oracle latches are like memory locks that protect important memory structures from being logically damaged by concurrent access. In a way, they are like the mutexes that protect the critical section of a program's code.
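To make the mutex analogy concrete, here is a minimal Python sketch. It is only an analogy and not how Oracle implements latches: `SharedCounter`, `run_workers` and the `_latch` attribute are hypothetical names made up for this illustration.

```python
import threading


class SharedCounter:
    """Toy shared structure standing in for a memory structure a latch protects."""

    def __init__(self):
        self.value = 0
        # Hypothetical stand-in for a latch: an ordinary mutex.
        self._latch = threading.Lock()

    def increment(self):
        # Critical section: only one thread at a time may read-modify-write.
        with self._latch:
            current = self.value
            self.value = current + 1


def run_workers(n_threads=8, n_increments=10_000):
    """Start several threads that all update the same counter concurrently."""
    counter = SharedCounter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    # With the lock in place no update is lost.
    print(run_workers())
```

Because every read-modify-write happens while the lock is held, the final value always equals the number of threads times the number of increments.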
Oracle Database keeps track of changes and captures their history in the redo log files. They are like the first point of contact for database operations that modify, delete or insert data as well as metadata.
I will not go into detail about the redo logs, log writer processes and log buffers, as they are not the main subject of this post. Rather, I will focus on the "Private strand flush not complete" waits, which I have seen at some customer sites recently.
As we know, our processes write to the log buffer. The LGWR background process writes to the redo log files. Redo log files are switched when they fill up, or manually. DBWR works in the background to write the dirty buffers to disk.
The above sequence of events needs to be quick, needs to happen in a regular fashion and needs to be done without damaging the consistency of the logical process cycles.
So we can use the redo latches to understand the processing workflows.
For instance, when a process needs to write into the log buffer, it needs to acquire a redo copy latch. The latch name gives us a clue about the event: this write operation is a copy made from the process memory to the log buffer.
The redo allocation latch is acquired for allocating space in the log buffer. This latch and the redo copy latches seem alike, as an Oracle process acquires the redo allocation latch for small redo records and does the copy operation without needing to acquire a redo copy latch.
The redo writing latch prevents Oracle processes from concurrently sending LGWR the signal for a log buffer flush or a log switch operation.
So, as you see above, there are some important latches that can create waits on those events.
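The split between allocating space and copying into it can be sketched with a toy log buffer in Python. This is a simplified model under my own assumptions, not Oracle's actual code: `ToyLogBuffer`, its method names, the number of copy latches and the order in which the locks are taken are all made up for illustration.

```python
import threading


class ToyLogBuffer:
    """Toy model: one allocation lock reserves space, copy locks guard the copy."""

    def __init__(self, size, n_copy_latches=2):
        self.buf = bytearray(size)
        self.next_free = 0
        # Hypothetical stand-ins for the redo allocation and redo copy latches.
        self.allocation_latch = threading.Lock()
        self.copy_latches = [threading.Lock() for _ in range(n_copy_latches)]

    def allocate(self, nbytes):
        """Reserve nbytes in the buffer; held only for the short bookkeeping step."""
        with self.allocation_latch:
            if self.next_free + nbytes > len(self.buf):
                raise BufferError("toy log buffer is full")
            start = self.next_free
            self.next_free += nbytes
            return start

    def write_redo(self, record):
        """Reserve space, then copy the record into the reserved range."""
        start = self.allocate(len(record))
        # Pick one of the copy locks; several copies can run side by side
        # because each writer owns a different byte range.
        latch = self.copy_latches[threading.get_ident() % len(self.copy_latches)]
        with latch:
            self.buf[start:start + len(record)] = record
        return start


if __name__ == "__main__":
    log_buffer = ToyLogBuffer(64)
    print(log_buffer.write_redo(b"abc"), log_buffer.write_redo(b"defg"))
```

The point of the design is that the allocation lock is held only while the free pointer is advanced, so the longer copy step does not serialize all writers.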
For instance,
A log file sync wait occurs when a process waits for LGWR to flush the data from the log buffer to the redo log files. That is, the user process commits and waits on this event until LGWR finishes writing the data to the redo log files and sends a signal indicating that the flush request is finished.
If LGWR is idle, I mean only waiting, it will wait on rdbms ipc message.
If LGWR is updating the headers of the redo log files, you will see log file single write waits.
And if LGWR is writing the redo data from the log buffer to the redo log group, you will see log file parallel write waits, as this operation can be done in parallel.
So, after giving the information above, let's come back to our subject: "Private strand flush not complete".
You can see this message in the alert log of an 11gR2 Oracle database, and you may worry about your database performance, but according to Oracle, this is expected behavior.
Take a look at the following blog post of mine, where I explain strands in detail:
http://ermanarslan.blogspot.com.tr/2013/06/database-premature-archivelogs-log.html
So, as in the figure above, writes to the redo logs are made strand by strand. The redo log is divided into parts, each the size of a strand, and each strand is written to the relevant part of the redo log. In other words, the strands of the log buffer, which are in memory, are mapped to the redo log file.
According to the technical info above, all strands need to be flushed when a log switch is initiated.
That's why "checkpoint not complete" and "private strand flush not complete" are similar events.
These messages mean that there are still some dirty blocks and some active sessions when the log switch is initiated. A strand still has active transactions, whose redo needs to be flushed before this redo can be overwritten.
So it seems Oracle writes this information to the alert log and keeps going.
For example:
Sqlplus >
SQL> update erm set X='Y';
1 row updated
(Note that I do not commit, and my session is still active.)
Another Sqlplus >
SQL> alter system switch logfile;
Alert Log >
Thread 1 cannot allocate new log, sequence 11
Private strand flush not complete
Current log# 1 seq# 222 mem# 0: /Erman/redolog01.log
So this is not a problem, but expected behavior.
Nevertheless, check the time between the "Private strand flush not complete" message and the "Advanced to log sequence"/"Current log#" message.
If the time between those messages is significant, then I/O, DBWR or LGWR tuning may be required.
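To measure that gap without reading the alert log by eye, a small Python script can pair each "Private strand flush not complete" message with the next "advanced to log sequence" message. This is a rough sketch that assumes the text alert log prints a timestamp line such as `Thu Jun 13 10:00:01 2013` before each group of messages; the function names, the sample timestamps, the sequence numbers and the `redolog02.log` path are hypothetical and only there to make the example runnable.

```python
from datetime import datetime

# Assumed timestamp layout of the text alert log, e.g. "Thu Jun 13 10:00:01 2013".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_ts(line):
    """Return a datetime if the line is a timestamp line, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def strand_flush_delays(lines):
    """Seconds between each strand flush message and the next log advance."""
    delays = []
    current_ts = None  # most recent timestamp line seen
    pending = None     # timestamp of a flush message not yet matched
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            current_ts = ts
            continue
        text = line.strip().lower()
        if "private strand flush not complete" in text:
            if pending is None:
                pending = current_ts
        elif "advanced to log sequence" in text and pending is not None:
            delays.append((current_ts - pending).total_seconds())
            pending = None
    return delays


# Hypothetical alert log excerpt, made up for this example.
SAMPLE = """\
Thu Jun 13 10:00:01 2013
Thread 1 cannot allocate new log, sequence 223
Private strand flush not complete
  Current log# 1 seq# 222 mem# 0: /Erman/redolog01.log
Thu Jun 13 10:00:04 2013
Thread 1 advanced to log sequence 223 (LGWR switch)
  Current log# 2 seq# 223 mem# 0: /Erman/redolog02.log
"""

if __name__ == "__main__":
    print(strand_flush_delays(SAMPLE.splitlines()))
```

To use it on a real system, pass the lines of your own alert log file instead of `SAMPLE`, and adjust `TS_FORMAT` if your alert log prints timestamps differently.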
Hi Erman,
Great info. Nice way to explain.
I have come across a similar problem in a production database. We have noticed a frequent (every 3 minutes) message in the ASH report: "Private Strand Flush not complete".
As you mention in the blog, some tuning would be required in case the time between "private strand flush not complete" and the "Advanced to log sequence"/"Current log#" message is significant.
From my alert log, I found that the time between these two messages is around 3 to 4 seconds. Can it be considered significant? How much time should be considered significant?
Thanks in advance.
Regards,
Pankaj
Hi Pankaj,
It depends on your strand size, redo log size and transactions, actually.