Monday, February 15, 2021

RDBMS / RAC -- Interconnect: RDS, IPC, RAC relink/make, skgxpinfo & ORA-27300 - bind_fail

Recently my team reported this issue to me. An Oracle 11.2.0.4 database couldn't start on an Exadata with a freshly installed RAC Home. Startup was hitting ORA-27300 errors, and it was clear that we couldn't bind to an ip:port during the initial phase of the instance startup...

ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:bind failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: sskgxpsoc

or

ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:bind_fail failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvifconf
ORA-27303: additional information: requested interface X.X.X.X failed bind. Check output from ifconfig command

The function reported at the bottom of the error stack can be sskgxpsoc and/or skgxpvifconf, and according to the following MOS note, some patches are already available for these types of issues.

Getting ORA-00603, ORA-27504, ORA-27300, ORA-27301, ORA-27302; with "bind failed" with EAGAIN and "bind_fail failed", on "sskgxpsock" and "skgxpvifconf" (Doc ID 1524444.1)

We also have a workaround, and it is based on setting the cluster_interconnects parameter to the InfiniBand IP addresses.
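Just to make the workaround concrete, here is a minimal sketch that builds the pfile (init.ora) entry for cluster_interconnects. The helper name interconnect_entry, the SID PROD1 and the 192.168.10.x addresses are hypothetical placeholders of mine, not values from this environment; multiple addresses are joined with a colon, which is the format the parameter expects.

```shell
# Build a pfile line that pins one instance to specific interconnect addresses.
# Usage: interconnect_entry SID IP [IP ...]
interconnect_entry() {
    local sid=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: interconnect_entry SID IP [IP ...]" >&2
        return 1
    fi
    # With IFS set to ":", "$*" joins the remaining arguments with colons.
    local IFS=:
    printf "%s.cluster_interconnects='%s'\n" "$sid" "$*"
}

# Hypothetical SID and InfiniBand addresses:
interconnect_entry PROD1 192.168.10.1 192.168.10.2
```

The printed line can be placed in the pfile of the instance that fails to start. Use the addresses of your own InfiniBand interfaces there.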

In this blog post, I want to go a little bit further on this topic and give you some extra information about RDS, IPC and the related libraries linked or made available for Oracle to use for the interconnect, and about the relation between these types of errors and the RDS/IPC interconnect configuration.

It is all about oracle binaries, the libraries used by them and the options used while relinking/make'ing oracle...

If you relink ioracle with RDS (ipc_rds), then you need to have interfaces that support RDS in your environment, and in this kind of environment, you set cluster_interconnects and tell oracle -> use them!

Of course we are talking about an environment where there is no functionality like finding available interfaces automatically. I mean, if you don't have an InfiniBand-like interface, then you need to turn off RDS. Oracle just doesn't start even when we set cluster_interconnects to the IP addresses of a classical TCP/UDP ethernet interface. It just can't bind... This is what is seen even in single node environments.

If you are going to use a single node non-RAC database, you can link ioracle with rac_off.. I mean, you can disable the rac option, and this move solves our problem as well....
Eventually, relinking with rac_off removes the rds mode. Note that relinking with rac_on uses the configuration as is... (unless we have an ipc_none -- dummy configuration.. I will touch on this topic later)

I mean, if your oracle binary is linked with rds, it will stay linked with rds even after you relink your oracle with rac_on.. So rac_off disables the rds mode and enables the ipc_g mode, but rac_on leaves it as is.. (at least in 11.2.0.4 :))
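The rules above can be written down as a tiny model. This is only a sketch of the behavior described in this post for 11.2.0.4; it does not touch any Oracle Home, and the function name next_ipc_mode is my own hypothetical helper. Given the current IPC mode and a make target, it prints the mode we end up with.

```shell
# Model which IPC mode the oracle binary ends up with after a relink target.
# Usage: next_ipc_mode CURRENT_MODE MAKE_TARGET
next_ipc_mode() {
    local current=$1 target=$2
    case "$target" in
        ipc_rds|ipc_g|ipc_none)
            echo "$target" ;;          # an explicit IPC target sets the mode
        rac_off)
            echo ipc_g ;;              # rac_off switches to the non-RDS library
        rac_on)
            if [ "$current" = ipc_none ]; then
                echo ipc_g             # the dummy driver gets replaced by ipc_g
            else
                echo "$current"        # rds stays rds, ipc_g stays ipc_g
            fi ;;
        *)
            echo "unknown target: $target" >&2
            return 1 ;;
    esac
}

next_ipc_mode ipc_rds rac_on     # prints ipc_rds
next_ipc_mode ipc_rds rac_off    # prints ipc_g
```

So in an environment without RDS-capable interfaces, rac_on alone does not help a binary that is already linked with ipc_rds.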

You can understand this by reviewing the related makefile or the output of the relink command;

make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk rac_off ioracle
rm -f /u01/dbebs/PROD/db/tech_st/11.2.0/lib/libskgxp11.so
cp /u01/dbebs/PROD/db/tech_st/11.2.0/lib//libskgxpg.so .
/u01/dbebs/PROD/db/tech_st/11.2.0/lib/libskgxp11.so

Note that --> libskgxpg is the non-RDS one.. the generic IPC one.. So rac_off makes oracle use that..

RDS enabled or not, after those make operations the library name used by Oracle is libskgxp11.. However, its contents are different.. They depend on the relink options used.

If we use rds on (ipc_rds), the library named libskgxpr is used (I mean it is copied and renamed as libskgxp11). If we use no rds (ipc_g), the library named libskgxpg is used (it is copied and renamed as libskgxp11).
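As a small sketch of that mapping, the helper below returns the source library for each mode and prints (it does not run) the matching relink command. The function names are hypothetical and the .so suffix assumes Linux, as in the output above.

```shell
# Map an IPC make target to the library that is copied over libskgxp11.so.
skgxp_source_lib() {
    case "$1" in
        ipc_rds) echo libskgxpr.so ;;   # RDS flavour
        ipc_g)   echo libskgxpg.so ;;   # generic, non-RDS flavour
        *) echo "unsupported mode: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the relink command for a mode.
relink_command() {
    local lib
    lib=$(skgxp_source_lib "$1") || return 1
    echo "# copies $lib to libskgxp11.so"
    echo "make -f \$ORACLE_HOME/rdbms/lib/ins_rdbms.mk $1 ioracle"
}

relink_command ipc_g
```

If you decide to run the printed make command, do it as the Oracle Home owner and with all instances using that home shut down.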

If you want to see what oracle is currently linked to (or more precisely; to see which library oracle is currently configured to use for interconnect related things), you can use the command below ->

[oracle@ebstestdb lib]$ $ORACLE_HOME/bin/skgxpinfo -v
Oracle UDP/IP (generic)

the name skgxp is related to -> system kernel generic interface inter-process communications

skgxpinfo is just a little binary available in Oracle Home for getting that info...
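If you want to use that output in a script, a minimal sketch could look like the one below. The function name classify_skgxpinfo is hypothetical. The UDP text is the one shown above; I am assuming that an RDS-linked home prints a text containing "RDS" in the same place, so treat that branch as an assumption.

```shell
# Classify the text printed by "skgxpinfo -v" (read from stdin).
classify_skgxpinfo() {
    local out
    out=$(cat)
    case "$out" in
        *RDS*|*rds*) echo ipc_rds ;;   # assumption: RDS builds mention RDS
        *UDP*|*udp*) echo ipc_g ;;     # e.g. "Oracle UDP/IP (generic)"
        *) echo unknown; return 1 ;;
    esac
}

# On a real host: "$ORACLE_HOME/bin/skgxpinfo" -v | classify_skgxpinfo
echo "Oracle UDP/IP (generic)" | classify_skgxpinfo
```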

But! How do I check that manually? :)

If rds is off (ipc_g) ->

[oracle@ebstestdb lib]$ strings /u01/dbebs/PROD/db/tech_st/11.2.0/lib/libskgxp11.so |grep -i sskgxphack| awk '{print $16}'

output: skgxpg_rds_off.c -> This means oracle uses ipc_g :)

Actually, we have the following string in libskgxp11.so;

-> comment:Intel(R) C++ Compiler for applications running on Intel(R) 64, Version 10.1 Build 20100527 %s : skgxpg_rds_off.c

If rds is on (ipc_rds)

We can check with the same command that we used above and expect to see sskgxphack.c as the output (rather than skgxpg_rds_off.c)

[oracle@ebstestdb lib]$  strings /u01/dbebs/PROD/db/tech_st/11.2.0/lib/libskgxp11.so |grep -i sskgxphack| awk '{print $16}'

This way we get the info we need manually :) Note that we see sskgxphack.c when rds is on.

Actually, we have the following string in libskgxp11.so; ->comment:Intel(R) C++ Compiler for applications running on Intel(R) 64, Version 10.1 Build 20100527 %s : sskgxphack.c
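The two manual checks can be folded into one small helper. This is just a sketch built on the two marker strings shown in this post (skgxpg_rds_off.c and sskgxphack.c); the function name ipc_mode_from_strings is hypothetical, and other versions or platforms may embed different strings.

```shell
# Read "strings libskgxp11.so" output on stdin and report the IPC flavour.
ipc_mode_from_strings() {
    local text
    text=$(cat)
    case "$text" in
        *skgxpg_rds_off.c*) echo ipc_g ;;     # checked first: marks the non-RDS build
        *sskgxphack.c*)     echo ipc_rds ;;
        *) echo unknown; return 1 ;;
    esac
}

# On a real host:
#   strings "$ORACLE_HOME/lib/libskgxp11.so" | ipc_mode_from_strings
printf '%s\n' 'Build 20100527 %s : sskgxphack.c' | ipc_mode_from_strings
```

The rds-off marker is tested first on purpose, so that a library containing both names is still reported as ipc_g.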

One last note: we also have an entry named ipc_none in ins_rdbms.mk (the makefile), and this seems to be a dummy driver for IPC. As far as I can see, ipc_none is used while deinstalling Oracle software using the installer (OUI). Other than that, I don't see a point in using it.. When we relink an oracle binary that currently has the ipc_none configuration with rac_on, it is automatically configured as ipc_g.. Well, that's the last thing I want to mention on this topic :)
