Tuesday, April 7, 2015

ZFS -- NFS shares unreachable, IPMP, Probe Based vs Link Based Failure Detection

IPMP (IP Network Multipathing) supplies a network multipathing mechanism for ZFS storage. It is a feature of the Solaris operating system, and for me it is the network counterpart of device mapper multipath.
When there are two interfaces in your system, you can create an IPMP group in front of them and thus benefit from fault tolerance and load balancing.


IPMP detects problems on the interfaces and takes the necessary actions. For example, in an active-standby configuration, IPMP regularly checks the active interface and puts the standby into use if necessary. It is implemented at L3, so the switch does not know or need to know anything about it; we just plug the cables into the switch and don't do any configuration on the switch side.
So it differs from LACP, which has to be configured on the switch as well.


IPMP can make this detection using two different methods: it can use Probe Based Detection, or it can use Link Based Detection.

In Probe Based Detection, IPMP uses ICMP probes to check the interfaces.
So in order to work properly, Probe Based Detection requires at least one neighbour or a default gateway that can respond to ICMP probes to be present in the same subnet. Probe Based Detection gets activated when test addresses are used in the IPMP configuration. IPMP uses these test addresses to send the ICMP probes; that is, the IPMP daemon sends ICMP probes from the test address to one or more target systems on the same subnet. The target systems are determined dynamically: first, the routing table is scanned for gateways (routers) on the same subnet as the IP interface's test address, and up to five are selected. If no gateways are found on the same subnet, the system sends a multicast ICMP probe (to 224.0.0.1 for IPv4 or ff02::1 for IPv6) and selects the first five systems on the same subnet that respond.
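To make the target selection easier to follow, here is a small Python sketch of the rule described above. It is only my illustration, not the IPMP daemon's code: the function name select_probe_targets and its inputs (a list of gateway addresses taken from the routing table, and a list of hosts that answered the multicast probe, in arrival order) are hypothetical.

```python
import ipaddress

MAX_TARGETS = 5  # the description above says up to five targets are selected


def select_probe_targets(test_addr, gateways, multicast_responders):
    """Pick probe targets for one interface (illustrative sketch only).

    test_addr: the interface's test address with prefix, e.g. "10.10.10.21/24"
    gateways: gateway addresses from the routing table (hypothetical input)
    multicast_responders: hosts that answered the multicast probe,
        in the order they responded (hypothetical input)
    """
    subnet = ipaddress.ip_interface(test_addr).network

    # Step 1: prefer routers that sit on the same subnet as the test address.
    on_link = []
    for gw in gateways:
        if ipaddress.ip_address(gw) in subnet and gw not in on_link:
            on_link.append(gw)
    if on_link:
        return on_link[:MAX_TARGETS]

    # Step 2: no on-link router, so fall back to the first responders
    # to the multicast probe that are on the same subnet.
    responders = [h for h in multicast_responders
                  if ipaddress.ip_address(h) in subnet]
    return responders[:MAX_TARGETS]


# An on-link gateway wins over the multicast responders.
print(select_probe_targets("10.10.10.21/24",
                           ["10.10.10.1", "192.168.1.1"],
                           ["10.10.10.11"]))
```

Note what happens when both lists are empty: there is nothing to probe, which is exactly why Probe Based Detection depends on other systems in the subnet.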

In Link Based Detection, IPMP relies on the interface kernel drivers and checks for changes in the IFF_RUNNING flag on the interfaces. So it does not send ICMP probes and does not require any IP addresses to be allocated as test addresses.
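The idea of watching the IFF_RUNNING flag can be sketched in a few lines of Python. This is only an illustration of the flag check, not how the kernel or the IPMP daemon implements it. I assume IFF_RUNNING is the 0x40 bit, as it appears in the flags value printed by ifconfig; the function names and the sampled flag values are made up for the example.

```python
IFF_RUNNING = 0x40  # assumed bit value; check <net/if.h> on your platform


def link_is_up(flags):
    """True when the driver reports the link as running."""
    return bool(flags & IFF_RUNNING)


def link_events(flag_samples):
    """Return (sample_index, "up" or "down") for each change of IFF_RUNNING.

    flag_samples: interface flag values sampled over time (hypothetical input)
    """
    events = []
    previous = None
    for index, flags in enumerate(flag_samples):
        current = link_is_up(flags)
        if previous is not None and current != previous:
            events.append((index, "up" if current else "down"))
        previous = current
    return events


# 0x1000843 has the RUNNING bit set, 0x1000803 does not (example values).
samples = [0x1000843, 0x1000843, 0x1000803, 0x1000803, 0x1000843]
print(link_events(samples))
```

No packet is sent anywhere in this check, so a problem beyond the directly attached link never shows up in it.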

Link Based Detection is actually the default failure detection method in IPMP, but on ZFS storage it gets activated when "0.0.0.0/8" is used for the test addresses.
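If you want to audit a list of configured addresses against this rule, a tiny Python helper is enough. It is a sketch under the assumption stated above (an address in 0.0.0.0/8 means link based detection, anything else means a test address is in use, so probe based detection); the function name detection_mode is hypothetical and this is not an appliance API.

```python
import ipaddress

PLACEHOLDER_NET = ipaddress.ip_network("0.0.0.0/8")


def detection_mode(address):
    """Classify an IPv4 address by the rule described in this post (sketch only).

    address: address with prefix as configured on the interface,
        e.g. "0.0.0.0/8" or "10.10.10.21/24"
    """
    iface = ipaddress.ip_interface(address)
    if iface.ip in PLACEHOLDER_NET:
        return "link-based"
    return "probe-based"


for addr in ["0.0.0.0/8", "10.10.10.21/24"]:
    print(addr, "->", detection_mode(addr))
```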

So both mechanisms can be chosen while implementing IPMP on ZFS storage. But I have to say that Probe Based Detection is a little delicate.

I also need to add that in the Oracle white paper "Backup and Recovery Performance and Best Practices using the Sun ZFS Storage Appliance with the Oracle Exadata Database Machine", Link Based Detection is used for IPMP.

I have seen probing fail in an Exadata environment, and I think the following real-life example can increase our motivation for using Link Based Detection rather than Probe Based Detection in ZFS environments.

Note that in complex network architectures, Link Based Detection may be misleading, because it only notices a failure of the directly attached link and not a failure further along the path.
Check the following link to understand what I mean: http://www.c0t0d0s0.org/archives/6294-Less-known-Solaris-features-IP-Multipathing-Part-3-Foundations-2.html

Environment:
Oracle ZFS Storage ZS3-2 connected to Exadata via Ethernet, through a Cisco switch

Change:
A recent network change in the infrastructure. Everything is working properly except the ZFS shares.

Problem:
Unable to reach the NFS shares on the ZFS.
IPMP is down on the ZFS.
IPMP uses Probe Based Detection, so it depends on the other systems in the network, as it needs to send ICMP probes to them.
The network has ping problems.

64 bytes from 10.10.10.11: icmp_seq=13 ttl=255 time=0.120 ms
64 bytes from 10.10.10.11: icmp_seq=14 ttl=255 time=0.136 ms --- A gap.. from seq 14 to 27
64 bytes from 10.10.10.11: icmp_seq=27 ttl=255 time=0.123 ms
64 bytes from 10.10.10.11: icmp_seq=28 ttl=255 time=0.095 ms
64 bytes from 10.10.10.11: icmp_seq=29 ttl=255 time=0.147 ms
64 bytes from 10.10.10.11: icmp_seq=30 ttl=255 time=0.119 ms
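
A gap like the one above is easy to miss in a long ping output, so here is a small Python sketch that finds the jumps in icmp_seq. The function name find_seq_gaps is just my own helper for illustration; it only relies on the "icmp_seq=N" text that ping prints.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_seq_gaps(ping_lines):
    """Return (last_seen, next_seen, missing_count) for every jump in icmp_seq."""
    gaps = []
    previous = None
    for line in ping_lines:
        match = SEQ_RE.search(line)
        if not match:
            continue  # skip lines that are not ping replies
        seq = int(match.group(1))
        if previous is not None and seq > previous + 1:
            gaps.append((previous, seq, seq - previous - 1))
        previous = seq
    return gaps


output = """\
64 bytes from 10.10.10.11: icmp_seq=13 ttl=255 time=0.120 ms
64 bytes from 10.10.10.11: icmp_seq=14 ttl=255 time=0.136 ms
64 bytes from 10.10.10.11: icmp_seq=27 ttl=255 time=0.123 ms
64 bytes from 10.10.10.11: icmp_seq=28 ttl=255 time=0.095 ms
"""
print(find_seq_gaps(output.splitlines()))  # [(14, 27, 12)]
```

For the output in this post it reports one gap: 12 replies missing between seq 14 and seq 27.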

Effect:
Cannot back up the PROD database to the ZFS.

Diagnostics and Solution:
Check the logs using the CLI: ssh to the management IP
and issue the command -> "maintenance logs select alert show"

*Problem :
All Interfaces in group groupname have failed
Network connectivity via datalink ixgbe1 has been lost.
connectivity via interface ixgbe1 has been lost due to probe-based failure., Major alert
It seemed that IPMP marked all the interfaces as failed because of a single point of failure. Specifying specific hosts for probing might solve this, but there is no need. We have only one hop, which is the Exadata's Cisco switch, so Link Based Detection can be used here.

Reach the ZFS using the BUI (the CLI may be used too).
URL: https://management_ip_address_of_ZFS:215
Check the IPMP configuration and change it to use the Link Based failure detection mechanism.
Choose Network > Configuration > click on the IPMP group > click on the interfaces and update their IP addresses with 0.0.0.0/8.


This action activates Link Based Detection, which in turn brings the interfaces, and therefore the IPMP group, up and ready. As a result, the shares are available again.

In conclusion,
IPMP in the network stack of ZFS increases availability, and its failure detection mechanism makes it proactive. Probe Based failure detection in IPMP is more sophisticated than the Link Based one.
On the other hand, sometimes using the advanced method can put you at a disadvantage, as you see in this example. Although there was a problem with ICMP ping, the mount could have worked and RMAN could have written to the related NFS shares, but the sophisticated Probe Based failure detection algorithm sensed an error in the network and disabled the interfaces. This affected continuity, but there was a real problem in the network. A sophisticated mechanism can make us aware of this kind of problem.
Anyway, I would like to say that choosing the sophisticated method is not always the best way, but I can't. It stopped us from mounting the NFS share in this case; on the other hand, it made us recognize a real network problem.

The bottom line:
If you have a complex network architecture, I still recommend using Probe Based failure detection on ZFS storage. But I also recommend falling back to Link Based failure detection if Probe Based detection encounters a failure.
In a specific configuration such as connecting ZFS to Exadata through a Cisco switch, just use Link Based failure detection for IPMP, as it is already recommended by MAA.
Reference: An Oracle White Paper, April 2012, "Backup and Recovery Performance and Best Practices using the Sun ZFS Storage Appliance with the Oracle Exadata Database Machine"
