I recently struggled with an issue in a mission-critical environment: the VIPs kept relocating. It started all of a sudden, and diagnostics pointed to some kind of network problem.
The issue was related to failed pings. Oracle's ping target concept was in play and, for justified reasons, it was causing the VIPs to fail over to the secondary node of the RAC.
Some background information about the ping target: it was delivered with 12c (12.1.0.2) and is useful and relevant in virtualized environments. It is there to detect, and act on, network failures that are not recognized inside the guest VMs. It relates to the public network only, since private networks already have their own carefully designed heartbeat check mechanisms. So basically, if the target IP(s) cannot be pinged from a RAC node, or if there is a significant delay in those pings, the VIPs are failed over to the secondary node(s). The parameter is set via the srvctl modify nodeapps -pingtarget command. It seems innocent since it has nothing to do with the interconnect, but it is actually vital: VIP transfers and similar actions happen according to this routine.
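To make the command concrete, here is a minimal Python sketch that validates a list of candidate IPs and assembles the srvctl call mentioned above. The function name build_pingtarget_command and the example addresses are hypothetical, and passing several targets as one comma-separated value is an assumption to verify against the srvctl help on your own release.

```python
import ipaddress


def build_pingtarget_command(targets):
    """Return the srvctl argument list that would set the given ping targets."""
    if not targets:
        raise ValueError("at least one ping target is required")
    # ip_address() raises ValueError for anything that is not a valid IP.
    validated = [str(ipaddress.ip_address(t.strip())) for t in targets]
    # Assumption: multiple targets are passed as a single comma-separated value.
    return ["srvctl", "modify", "nodeapps", "-pingtarget", ",".join(validated)]


if __name__ == "__main__":
    # Documentation-range addresses, used here only as placeholders.
    print(" ".join(build_pingtarget_command(["192.0.2.1", "192.0.2.2"])))
```

The sketch only builds the command; it does not run it, so it is safe to try outside the cluster.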
In our case, a switch problem caused everything. The default gateway was set to the firewall's IP address, and the firewall's responses to pings were sometimes mixed up.
We were lucky that the ping target parameter can be set to more than one IP address (for fault tolerance), and that saved the day.
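Why a second target helps can be sketched in a few lines of Python. This is only an illustration of the fault-tolerance idea, not Oracle's actual check: the function should_relocate_vip is hypothetical, and the rule "fail over only when no target replies" is an assumption based on what we observed.

```python
def should_relocate_vip(ping_results):
    """Decide on a failover from the latest ping outcome of each target.

    ping_results maps a ping target IP to True (replied) or False (no reply).
    Assumption: the public network counts as down only if every target fails.
    """
    if not ping_results:
        return False  # no ping target configured, nothing to evaluate
    return not any(ping_results.values())


if __name__ == "__main__":
    # One flaky target alone would trigger a failover...
    print(should_relocate_vip({"192.0.2.1": False}))
    # ...while a second, healthy target keeps the VIP where it is.
    print(should_relocate_vip({"192.0.2.1": False, "192.0.2.2": True}))
```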
But here is an important thing to note: we should not set the ping target to IPs that go against the logic of this feature. The ping target must be set to the IP addresses of physical, stable devices that provide the connection to the outside world and that reliably respond to ping.
If more than one IP is given, those IP addresses must belong to devices that are directly related to the public network connections.
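Before choosing targets, it can help to check how reliably each candidate answers pings from the RAC nodes. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the ping flags (-c for count, -W for timeout) are the Linux ones, and the attempt count and loss threshold are arbitrary values to tune.

```python
import subprocess


def ping_once(ip, timeout_s=1):
    """Send one ICMP echo with the system ping (Linux flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def loss_ratio(ip, attempts=20, ping=ping_once):
    """Return the fraction of failed pings out of `attempts` tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    failures = sum(1 for _ in range(attempts) if not ping(ip))
    return failures / attempts


def stable_targets(candidates, attempts=20, max_loss=0.0, ping=ping_once):
    """Keep only the candidates whose loss ratio stays within max_loss."""
    return [ip for ip in candidates if loss_ratio(ip, attempts, ping) <= max_loss]
```

Calling stable_targets(["192.0.2.1", "192.0.2.2"]) on each node would return only the candidates that answered every ping. The ping parameter is injectable so the logic can be tried with a fake function before touching the real network.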
Also, a final note on this subject: when you set this parameter to more than one IP, there may be Oracle routines that cannot handle it. Of course, I am not talking about the DB or GI, but, for example, we faced this during an ODA DB System creation. The DB System creation could not continue while the ping target was set to more than one IP address. We had to temporarily set the parameter to a single IP address, and then set it back to multiple IP addresses once the DB System creation finished.
Well, the following is the error we got:
[Grid stack creation] - DCS-10001:Internal error encountered: Failed to set ping target on public network.\\\",\\\"taskName\\\":\\\"Grid stack
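The workaround we used can be summarized as two commands, one run before the DB System creation and one after it. Here is a small Python sketch that generates both; the function name workaround_commands is hypothetical, the addresses are placeholders, and the comma-separated form for multiple targets is an assumption to check on your release.

```python
def workaround_commands(targets):
    """Return the srvctl calls to run before and after DB System creation."""
    if len(targets) < 2:
        raise ValueError("the workaround only applies to two or more targets")
    base = ["srvctl", "modify", "nodeapps", "-pingtarget"]
    return {
        # Temporarily keep a single target while the DB System is created.
        "before": base + [targets[0]],
        # Restore the full list once the creation has finished.
        # Assumption: multiple targets form one comma-separated value.
        "after": base + [",".join(targets)],
    }


if __name__ == "__main__":
    for step, command in workaround_commands(["192.0.2.1", "192.0.2.2"]).items():
        print(step + ":", " ".join(command))
```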