I recently encountered an issue while doing a health check for an EBS on Exadata environment. The issue appeared in one of the phases where we used the orachk utility, which is pretty good by the way.
orachk reported the following statements about it:
Oracle RAC Communication is NOT using RDS protocol on Infiniband Network
Database Home is NOT properly linked with RDS library
So this was an issue for me, because RDS is the latest technology for interconnect communication in the world of Oracle, yet this database on Exadata was not using it.
This was pretty strange, but I found the reason: the database in question was not created during the Exadata deployment phase. It was an EBS database that must have been created manually, and not being a factory-standard Exadata database deployment seemed to be the cause. So watch out for this in your next EBS to Exadata migrations.
Anyway, it was pretty easy to check:
$ORACLE_HOME/bin/skgxpinfo -v (supposing your db version >= 11.2.0.2)
The output should be "rds". If it is anything else, "udp" for example, then the home is not linked correctly.
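As a minimal sketch, the check above can be wrapped in a small shell function. The name classify_ipc is hypothetical (it is not part of any Oracle tooling), and the function only assumes that the skgxpinfo output contains the protocol name (rds or udp) somewhere in its text; it does a simple case-insensitive substring match and relies on no other detail of the output format.

```shell
#!/bin/sh
# classify_ipc: hypothetical helper, not part of any Oracle tooling.
# Takes the text printed by skgxpinfo as $1 and reports whether it names RDS.
classify_ipc() {
  # Lowercase the input so "RDS" and "rds" are treated the same.
  proto=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$proto" in
    *rds*) echo "OK: Oracle binary is linked with RDS" ;;
    *udp*) echo "WARNING: Oracle binary is linked with UDP, relink needed" ;;
    *)     echo "UNKNOWN: unexpected skgxpinfo output: $1" ;;
  esac
}
```

On a real database node, with ORACLE_HOME set for the home being checked, it could be called as: classify_ipc "$("$ORACLE_HOME/bin/skgxpinfo" -v)"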
It was not so hard to correct either:
The recommendation was already there in the orachk HTML report output:
If the instance is not using the RDS protocol over InfiniBand, relink the Oracle binary using the following commands (with variables properly defined for each home being linked):
(as oracle) Shutdown any processes using the Oracle binary
If and only if relinking the grid infrastructure home, then (as root) GRID_HOME/crs/install/rootcrs.pl -unlock
(as oracle) cd $ORACLE_HOME/rdbms/lib
(as oracle) make -f ins_rdbms.mk ipc_rds ioracle
If and only if relinking the Grid Infrastructure home, then (as root) GRID_HOME/crs/install/rootcrs.pl -patch
Note: Avoid using the relink all command due to various issues. Use the make commands provided.
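To make the order of the steps and the grid-only branches easier to see, here is a hedged dry-run sketch. The function print_relink_plan is hypothetical and only prints the commands from the recommendation; it never runs them. It assumes the home path is passed as the first argument and the kind of home ("db" or "grid") as the second.

```shell
#!/bin/sh
# print_relink_plan: hypothetical dry-run helper. It only PRINTS the commands
# from the orachk recommendation; it never executes anything.
#   $1 = home to relink, $2 = "db" or "grid"
print_relink_plan() {
  home=$1
  kind=$2
  echo "# (as oracle) shut down all processes using the binaries in $home"
  if [ "$kind" = "grid" ]; then
    # The unlock step applies only to a Grid Infrastructure home.
    echo "# (as root)"
    echo "$home/crs/install/rootcrs.pl -unlock"
  fi
  echo "# (as oracle)"
  echo "cd $home/rdbms/lib"
  echo "make -f ins_rdbms.mk ipc_rds ioracle"
  if [ "$kind" = "grid" ]; then
    # The patch step re-locks the Grid Infrastructure home afterwards.
    echo "# (as root)"
    echo "$home/crs/install/rootcrs.pl -patch"
  fi
}
```

For a database home, print_relink_plan "$ORACLE_HOME" db prints only the shutdown reminder, the cd and the make command; with grid as the second argument the two rootcrs.pl steps are added around them.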
Also, the following is a good note on this topic:
Oracle Clusterware and RAC Support for RDS Over Infiniband (Doc ID 751343.1)
One last thing: you can get a better understanding of RDS by reading one of my previous posts.
Hi,
My exachk report notifies that I am on UDP. I need to configure it to use RDS.
Reference note :
How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC [ID 284785.1]
[oracle@dm01dbadm01 ~]$ ar -t $ORACLE_HOME/rdbms/lib/libknlopt.a|grep kcsm.o
kcsm.o
Oracle MOS suggested the below steps :
a) (as oracle) Shutdown any processes using the Oracle binary
b) (as oracle) cd $ORACLE_HOME/rdbms/lib
c) (as oracle) make -f ins_rdbms.mk ipc_rds ioracle
Can you advise whether there could be any impact, or anything I should check, before executing the command?
I do not have an Exadata setup elsewhere to check.
That note is an Oracle Support note, so it is supported by Oracle. If you are still asking about a negative impact, I would say the consequence may be not being able to start the database.
So, before relinking you can check the following:
Be sure that your interconnect is on InfiniBand links (ensure that the interface used for the interconnect is an InfiniBand interface).
Cross-check your interconnect route against your network interfaces (from node1, run route -v|grep "node2's private address", then ifconfig|grep "interface derived from route output"; it should be bondib0).
Check that the rds kernel module is loaded (the output should be similar to the following):
[root@exanode1 ~]# lsmod|grep rds
rds_rdma 117120 1741
rds 169864 3481 rds_rdma
rdma_cm 72852 2 rds_rdma,rdma_ucm
ib_core 112128 12 rds_rdma,ib_ipoib,rdma_ucm,rdma_cm,ib_ucm,ib_uverbs,ib_umad,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_mad
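The module check can be scripted as a rough sketch. The function rds_modules_loaded is hypothetical; it reads lsmod-style text on standard input and succeeds only when both rds and rds_rdma appear as module names in the first column. Requiring rds_rdma as well is an assumption based on the sample output above.

```shell
#!/bin/sh
# rds_modules_loaded: hypothetical helper. Reads lsmod-style text on stdin and
# returns success only if both the rds and rds_rdma modules are listed.
rds_modules_loaded() {
  # Match on the first column only, so "rds_rdma" in a dependency list
  # (last column) is not mistaken for a loaded module.
  awk '$1 == "rds"      { rds = 1 }
       $1 == "rds_rdma" { rdma = 1 }
       END { exit !(rds && rdma) }'
}
```

Usage on a database node: lsmod | rds_modules_loaded && echo "rds kernel modules are loaded"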
Thanks. I have updated the SR asking about the same (with respect to negative impact).
Sharing my outputs below:
First Output:
[root@dm01dbadm01 Exachk_Software]# route -v
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 192.168.101.1 0.0.0.0 UG 0 0 0 bondeth0
link-local * 255.255.128.0 U 0 0 0 ib0
169.254.128.0 * 255.255.128.0 U 0 0 0 ib1
ct-jasper-01 192.168.100.1 255.255.255.255 UGH 0 0 0 eth0
192.168.8.0 * 255.255.252.0 U 0 0 0 ib1
192.168.8.0 * 255.255.252.0 U 0 0 0 ib0
sfitolddb01.sna 192.168.100.1 255.255.255.255 UGH 0 0 0 eth0
192.168.100.0 * 255.255.255.0 U 0 0 0 eth0
192.168.101.0 * 255.255.255.0 U 0 0 0 bondeth0
< I don't see my node 2 private address anywhere >
Second Output :
[root@dm01dbadm01 Exachk_Software]# lsmod|grep rds
rds_rdma 121677 1937
rds 236128 3873 rds_rdma
ib_ipoib 94699 1 rds_rdma
rdma_cm 60865 2 rds_rdma,rdma_ucm
ib_core 75061 12 rds_rdma,ib_ipoib,rdma_ucm,ib_ucm,ib_uverbs,ib_umad,rdma_cm,ib_cm,iw_cm,mlx4_ib,ib_sa,ib_mad
Please advise on this. Thanks.
I think it is okay.
Also, your Exadata machine should be at least an X4, as the name used for the InfiniBand interface changed from BONDIB0 to IB0 and IB1 on Oracle Exadata Database Machine X4-2 systems using release 11.2.3.3.0 and later :)
Mine is X5-2. Thanks very much for your confirmation. :)