Tuesday, August 26, 2014

RDBMS -- Undo tablespace in another aspect, sizing undo, undo_retention

Undo space become a throuble sometimes, especially for crowded databases..
We configure and control undo space using Undo tablespace size and Undo_retention parameter generally. That is , we size our undo according to the needs of our databsae sessions and monitor the usage..
When setting those parameters, we also concentrate on the Flashback needs of our database. In order to retain the undo needed for being able to flashback our database objects to a point in time, we need to adjust the undo parameters in a way that almost guarantees our flashback data to be there when we need it.
Note that , there are also ORA-01555 snaphot too old error may be waiting for us , if the adjustments are not done properly..So we need to know our database character very well, we need to analyze statistics  and then set those parameters to the optimal values that we gather from our analysis. (Note : the undo information for committed transactions can be overwritten , it is not related with that..)
Note that : We have seen and learned that ORA-01555 errors mostly caused by long queries which try to get overwritten--older images of changed blocks.. As these images are overwritten due to undo management mechanism, those queries have returned ORA-01555 errors. Also consider the unsual ORa-01555 scenarios : http://ermanarslan.blogspot.com.tr/2014/05/rdbms-unusual-ora-01555.html

There are script to find the optimal undo retention; like the following;
SELECT 

       ROUND((erm.undo_size / (to_number(f.value) *
g.undo_block_per_sec))) "OPTIMAL UNDO RETENTION [Sec]"
  FROM (
       SELECT SUM(a.bytes) undo_size
          FROM v$datafile a,
               v$tablespace b,
               dba_tablespaces c
         WHERE c.contents = 'UNDO'
           AND c.status = 'ONLINE'
           AND b.name = c.tablespace_name
           AND a.ts# = b.ts#
       ) erm,
       v$parameter e,
       v$parameter f,
       (
       SELECT MAX(undoblks/((end_time-begin_time)*3600*24))
              undo_block_per_sec
         FROM v$undostat
       ) g
WHERE e.name = 'undo_retention'
  AND f.name = 'db_block_size'
  
As you see above, it takes db_block_size and multiplies it with undo block per seconds to find the actual undo size that is produced in a second..
Then it divides undo size to the undo size per second to find the optimal undo retention value..
Actually, it finds optimal undo retention according to the current undo tablespace size..
It makes sense right? It finds generated undo in a sec, and then divides undo size to this value to find how many undo per sec(it is in bytes/or kbytes), it can retain with the current undo size..
In other words; "How many undo per sec (note that it s a quantity but in seconds..) can be retained with the current undo table space" gives us the optimal undo retention value..
Also, there is tool named Undo Advisor for adjusting the undo parameters. 
You may look the details if you are interested with it..

So far so good..

We size undo on disk but do we think of memory? Yes, we dont most of the times...
We know undo is cached in the buffer cache.. It is cached like any other data blocks.. 
Oracle Buffer cache mechanism flushes the blocks according to the LRU.. When the cache is almost full / or when the checkpoint occurs, Dbwr does the job and writes the dirty blocks to the disk..  So if we have a lot of dirty blocks , Dbwr must make a lot of operations to flush the buffer cache and  this will affect Dbwr performance..
Dbwr need to make these flush operations even when your query is working with the undo blocks for the first time.. That is if you have  a big tablespace and if your undo parameters let Oracle to use the new undo blocks rather than overwriting, Oracle will load the undo blocks to cache which will trigger Dbwr if there are no available cells in the cache for storing these undo blocks.  Maybe Dbwr will flush the dirty blocks and then fill the overwritable cells with the undo blocks and then flush again and so on..

You can look to the buffers in the buffer cache using the query below;

SELECT /*+ LEADING (BH DO) */
  DF.TABLESPACE_NAME,
  BH.OBJD,
  BH.FILE#,
  BH.BLOCK#,
  BH.STATUS,
  BH.CLASS#
FROM
  V$BH BH,
  DBA_OBJECTS DO,
  DBA_DATA_FILES DF
WHERE
  BH.OBJD=DO.DATA_OBJECT_ID(+)
  AND DO.DATA_OBJECT_ID IS NULL
  AND BH.FILE#=DF.FILE_ID;

Class number:
1       data or index blocks
2       sort blocks
3       save undo blocks
4       segment headers
5       save undo segment header blocks
6       free list blocks
7       system undo segment header blocks
8       system undo segment blocks
7+(n*2) undo segment n header block
8+(n*2) undo segment n block 

Anayways.. The solution to this problem, would be using the same bunch of undo blocks for undo operations.. So when we talk about optimal values in here , we need to consider Dbwr performance as well..  Also, while considering Dbwr performance and buffer cache activities, we need to consider the errors that might be hit because of under-valued undo parameters..(ORA-1555 and inability to flashback)
What I try to mean is, 
we need to set the optimal undo tablespace size and retention parameters in order to have the ability the flashback,  not to hit ORA-1555 and not to affect Dbwr performance..

To set undo tablespace size and undo_retention , we need to consider the following facts;

If the Undo Tablespace autoextend -  undo retention will be your low threshold. Oracle will never overwrite undo which is retained less than 30 minutes.
Also Oracle will tune the undo retention parameter by itself internally to supply the needed undo for the long queries in  the database.(if it have space)
Finding Tuned undo retention : SELECT TO_CHAR(BEGIN_TIME, 'MM/DD/YYYY HH24:MI:SS') BEGIN_TIME,
TUNED_UNDORETENTION FROM V$UNDOSTAT;

IF the Undo tablespace size is fixed - Oracle ignores undo_retention value.. It will automatically tune undo to have a maximum possible retention period.
Note that :  When non-autoextensible undo space is used, tuned_undoretention is calculated based on a percentage of the undo tablespace size. In some cases especially with large undo tablespace, This will make it to be calculated so large and this may result a large undo_retention..
To fix this behaviour, Set _smu_debug_mode=33554432:
With this setting, TUNED_UNDORETENTION is not calculated based on a percentage of the fixed size undo tablespace. Instead it is set to the maximum of (MAXQUERYLEN secs + 300) and UNDO_RETENTION.

So in brief, undo_retention declares the minimum threshold.. The database behaviour depends on the undo tablespace size and the undo datafile's autoextendability.

Okay, how to find the optimal undo tablespace size and undo retention ?
To find the optimal values; we need to analyze.
I mean, by knowing the facts and using the information gathered fro mthe analysis; we can set the optimal undo values for our Mission Critic databases..
We can start by making the undo tablespace  autoextend in the first place.. Then we can monitor and see till where it is extended? While our undo tablespace is being extended, we need to consider our buffer cache size, too .. As we dont want to exploit it.  We also need the consider our required Flashback time..
So for all of these, we can use undo advisor at this time...
On the other hand, If we dont have undo advisor or dont want to use undo advisor, we should follow the following  method, and set the optimal values for undo..


  • Tune your application queries.. Very Long processess may encounter failures while extending undo tablespace (even if your undo tablespace is big in size) , Very long queries may encounter failures while trying to gather an old undo image which may be overwritten.
  • In my opinion, dont use unlimited autoextend undo datafiles.. If you use autoextend undos , you may end up with a big undo tablespace and you may miss out the problems with processes, which use undo tablespace unreasonably.
  • The UNDO_RETENTION value should at least be equal to the length of longest running query (find : select max(maxquerylen) from v$undostat;) on a given database instance.
  • Consider your Flashback requirements while setting undo retention.. When you enable the Guarentee option, the database never overwrites unexpired undo data. That is undo data whose age is less than the undo retention period..
  • Also think about Guarantee option; if  flashback guarantee is a must for you..
  • If you have decided your minimum Undo_retention, then Set your undo tablespace using the following formula; " Undo_retention * Generated Undo Blocks per second * block size " .. The values used in this formula should be derived from the Peak times.. 
    • Supposing , Undo retention is already set and The blocksize is constant..
    • Generated Undo Blocks per second ->SELECT undoblks/((end_time-begin_time)*86400) "Peak Undo Block Generation" FROM v$undostat WHERE undoblks=(SELECT MAX(undoblks) FROM v$undostat); 

    Okay, subject began to fall apart :) Lets summarize the things mentioned in this blog post and come to a conclusion.. 

    In Brief, "decide your undo_retention based on the longest running query in your database and flashback requirements of yours .. Then enable autoextend with a reasonable limit on your undo datafiles to satisfy this undo retention."
    References:
FAQ – Automatic Undo Management (AUM) / System Managed Undo (SMU) (Doc ID 461480.1)
Expert Oracle Database 11g Administration, Sam Alapati
Asktom, Tom Kyte
How To Size UNDO Tablespace For Automatic Undo Management (Doc ID 262066.1)

How to Shrink the datafile of Undo Tablespace
Bug 5387030 - Automatic tuning of undo_retention causes unusual extra space allocation
Bug 5442919 - Expired extents not being reused (ORA-30036)
10g NEW FEATURE on AUTOMATIC UNDO RETENTION
Bug 4070480 - Unexpired extents used when there is free space available in the UNDO tablespace
Managing Rollback/Undo Segments in AUM (Automatic Undo Management)

No comments :

Post a Comment