Tuesday, September 15, 2020

ASM / GRID -- My Thoughts on the 15% free size rule --- rebalance, imbalance, calculations, bugs and all that

The rule states that, in order to be on the safe side in case of a cell or disk failure in Exadata/ASM environments, we need to have some free space in the relevant diskgroups. This free space guarantees that the rebalance, which must be done after a disk or cell failure, will be successful. The rule states that at least 15% of a diskgroup should be free.

Well, I found this magic number, or let's say this magic percentage (15%), a little interesting, and that's why I want to share my thoughts on it with you.

Normally, as you may already know, we have a metric named USABLE_FILE_MB. It may depend on the version, but normally this metric gives us the safe allocatable size considering a disk failure. In the old versions, it reported a safe allocatable size that could be taken as a reference for being safe even in a cell failure.

In simple logic, we can say that we have no risk, of course, if USABLE_FILE_MB has a positive value and we think it will stay positive even when we consider potential future allocations.

Moreover, USABLE_FILE_MB is derived by considering the REQUIRED_MIRROR_FREE_MB, which is the required size for a rebalance operation to complete in the worst case scenario.

The formulas are as follows:

Normal Redundancy
USABLE_FILE_MB = (FREE_MB – REQUIRED_MIRROR_FREE_MB) / 2

High Redundancy
USABLE_FILE_MB = (FREE_MB – REQUIRED_MIRROR_FREE_MB) / 3
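The two formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name and the sample numbers are my own assumptions, and in a real system the inputs would come from the FREE_MB and REQUIRED_MIRROR_FREE_MB values of the diskgroup.

```python
# Divisor per redundancy type, as used in the formulas above.
MIRROR_FACTOR = {"NORMAL": 2, "HIGH": 3}


def usable_file_mb(free_mb, required_mirror_free_mb, redundancy):
    """Return USABLE_FILE_MB for a NORMAL or HIGH redundancy diskgroup."""
    factor = MIRROR_FACTOR[redundancy.upper()]
    return (free_mb - required_mirror_free_mb) / factor


# Hypothetical numbers, for illustration only.
print(usable_file_mb(1000, 400, "NORMAL"))  # 300.0
print(usable_file_mb(1000, 1300, "HIGH"))   # -100.0 (negative: not enough reserve)
```

A negative result simply means that FREE_MB is smaller than REQUIRED_MIRROR_FREE_MB.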

If USABLE_FILE_MB is a negative value, then we can directly say that the normal redundancy environments are in danger, but in any case we can still check FREE_MB. If the value that we see in FREE_MB is bigger than the disk size (assuming the disk sizes are equal; if they are not, then FREE_MB should be bigger than the largest disk size), we can still rebalance in case of a disk failure.
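The FREE_MB check described above can be written as a one-line comparison. This is a minimal sketch of that rule of thumb only; the function name and the disk sizes are hypothetical, and the check looks at the diskgroup total, not at how the free space is spread over the individual disks.

```python
def can_absorb_disk_failure(free_mb, disk_sizes_mb):
    """True if FREE_MB is bigger than the largest disk in the diskgroup."""
    return free_mb > max(disk_sizes_mb)


# Hypothetical diskgroup with three equally sized disks.
disk_sizes_mb = [1000, 1000, 1000]
print(can_absorb_disk_failure(1200, disk_sizes_mb))  # True
print(can_absorb_disk_failure(800, disk_sizes_mb))   # False
```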

So far so good. These are all related to disk failures. (As I mentioned earlier, we need to check the version and conclude what USABLE_FILE_MB reports to us: the usable file MB even in the case of a disk failure, or the usable file MB even in the case of a cell failure.)

Of course, if we lose a cell and USABLE_FILE_MB considers only disk failures, the situation is different. We need to multiply the USABLE_FILE_MB by the count of disks in the cell.

This is independent of the redundancy being normal or high. For instance, if the USABLE_FILE_MB is 10, and it is reporting the usable file MB in the case of disk failures, and we have 12 disks in a cell, then we have to multiply that value of 10 by 12. This makes 120, and that's the minimum value that we need to see in USABLE_FILE_MB in order to be safe even in the case of a cell failure.

At this point and in this context, the following article by Emre Baransel might be a nice read.

https://www.doag.org/formes/pubfiles/8587254/2016-INF-Emre_Baransel-A_Deep_Dive_into_ASM_Redundancy_in_Exadata-Manuskript.pdf 

Up to this point, as you may have noticed, I have not mentioned the 15% rule. I have explained the subject while ignoring it, but this rule actually must not be ignored.

Now it is time to explain that rule:)

Well, let's first revisit the MOS note named "Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)".

In that MOS note, we have a script that calculates the reserve space and capacity for disk failure coverage. It uses a reserve factor of 0.15, and that's where the 15% rule comes in.

When we examine the script, we can say that it directly multiplies the raw total disk size by 15% and then subtracts that value from the raw total disk size.
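To make that step concrete, here is a minimal sketch of the calculation as I described it. It is not the MOS script itself; the function name, the constant name and the sample size are my own assumptions, and only the 0.15 reserve factor comes from the note.

```python
# Reserve factor mentioned in the MOS note.
RESERVE_FACTOR = 0.15


def capacity_after_reserve(raw_total_mb, reserve_factor=RESERVE_FACTOR):
    """Return (reserve_mb, remaining_mb) for a given raw total disk size."""
    reserve_mb = raw_total_mb * reserve_factor
    return reserve_mb, raw_total_mb - reserve_mb


# Hypothetical raw total size of a diskgroup, in MB.
reserve_mb, remaining_mb = capacity_after_reserve(100_000)
print(f"reserve: {reserve_mb:.0f} MB, remaining: {remaining_mb:.0f} MB")
```

Note that the reserve depends only on the raw total size; the number of disks and the redundancy type do not appear in this step.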

In my opinion, it shouldn't be that way. I mean, there shouldn't be a 15% rule, and I think this subject is a little buggy.

Note that, at the moment, we need to consider the 15% rule and we must follow it!

Anyway, if we reserve 15% of the space, are we safe? Well, probably. But the following bug says that even if we have 15% reserve space, we may still have problems during rebalance.

Bug 21083850 - ORA-15041 during rebalance despite having free space (Doc ID 21083850.8)

The cause of this bug is probably the imbalance during rebalance:

When a disk is force dropped, its partners lose a partner.
As a result, the partners of its partners get more extents relocated to them, causing an imbalance.
This imbalance results in the ORA-15041, because some disks run out of space faster than others.

In the document above, we see that a patch is referenced. However, in another Oracle script, we see a comment like the following: "Use the new 15% of DG size rule for single disk failure, regardless of redundancy type (Bug 21083850)"

This makes me think that this subject is buggy :) The 15% rule is there not only to address that specific bug, but due to other bugs as well. In my opinion, these kinds of rules exist because of other problems. In this specific case, it is probably because of imbalance, or let's say it is probably due to the ASM extents not being distributed properly.

Normally, when we lose a disk, ASM will distribute the mirror extents of that failing/lost disk to the other disks that are available in the relevant diskgroup (of course, according to the redundancy type). That comes from the logic of disk mirroring. However, ASM probably does not distribute these extents evenly and overloads some disks in some cases, and that's where we get ORA-15041.

This situation can also be explained by those disks being already overloaded even before the rebalance. So, as you may guess, if ASM uses them aggressively during the rebalance, they get full and the rebalance code returns an error.
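The idea that a single full disk can break the rebalance, even when the diskgroup as a whole has enough free space, can be shown with a toy example. This is only a sketch of the reasoning, not how ASM actually places extents; the disk names, the numbers and the function name are all hypothetical.

```python
def disks_that_would_fill(free_mb_by_disk, incoming_mb_by_disk):
    """Return the disks whose incoming extents exceed their own free space."""
    return [
        disk
        for disk, free_mb in free_mb_by_disk.items()
        if incoming_mb_by_disk.get(disk, 0) > free_mb
    ]


# Hypothetical surviving disks: D2 was already nearly full before the rebalance.
free = {"D1": 500, "D2": 50, "D3": 450}
# Hypothetical amount of data relocated to each disk during the rebalance.
incoming = {"D1": 100, "D2": 100, "D3": 100}

print(sum(free.values()) >= sum(incoming.values()))  # True
print(disks_that_would_fill(free, incoming))         # ['D2']
```

In this example the total free space (1000 MB) easily covers the relocated data (300 MB), yet D2 alone cannot take its share.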

Of course, imbalance may be normal in some cases, for instance when we have fail groups.

That is, when we have a fail group configuration, ASM will have a more difficult job during the rebalance. I mean, when we have fail groups, ASM will have fewer choices for distributing the mirror extents when a disk is dropped. Still, I don't think that these kinds of causes should be enough to justify such a rule (the 15% rule).

Well, these are my thoughts on this subject. Please feel free to comment and correct me if I'm wrong. Please share your thoughts on this subject by commenting on this blog post.

1 comment :

  1. While checking error ORA-15041, I found Doc ID 1367078.1. One sentence there caught my attention:

    From Doc:
    " If any one disk is short of free_mb, then the error might be seen, even if there is sufficient free space in the whole diskgroup."

    This supports your thesis, I believe, since the expected behavior of ASM is to distribute extents evenly, and it seems ASM was not smart enough to do that until version 19c.

    Also another line from doc;

    "Starting 10.2, the total size of the disk is taken into consideration for allocations. So there will be imbalanced IO to disks. A future task would be to add/drop disks to have all the disks of same size."

    So starting with 10.2, the allocation method uses the total disk size, not the individual one, which apparently causes imbalanced disks. And the future task might have been completed in 19c.

