Thursday, June 4, 2015

Exadata -- Check your Oracle Home's RDS usage for Interconnect

I recently encountered an issue while doing a health check for an EBS on Exadata environment. It appeared in one of the phases where we used the orachk utility, which is pretty good by the way.

orachk reported the following statements:
Oracle RAC Communication is NOT using RDS protocol on Infiniband Network
Database Home is NOT properly linked with RDS library

So this was an issue for me, because RDS is the latest technology for interconnect communication in the world of Oracle, yet this database on Exadata was not using it.
This was pretty strange, but I found the reason: the database in question was not created during the Exadata deployment phase. It was an EBS database that must have been created manually, and not having a factory-standard Exadata database deployment seemed to be the cause. So watch out for this in your next EBS to Exadata migrations.

Anyway, it was pretty easy to check:

$ORACLE_HOME/bin/skgxpinfo -v   (supposing your db version >= 11.2.0.2)
The output should be "rds". If it is anything else, "udp" for example, then the home is not linked correctly.
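
The check above can be wrapped in a small script when several homes need to be verified. This is only a sketch: the function names classify_ipc and check_home_ipc are hypothetical, and the match looks for "rds" anywhere in the output (case-insensitively) on the assumption that the exact wording printed by skgxpinfo may vary.

```shell
#!/bin/sh
# Classify the text printed by skgxpinfo.
# Assumption: any output containing "rds" (in any case) means RDS is linked in.
classify_ipc() {
    proto=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$proto" in
        *rds*) echo "OK" ;;
        "")    echo "UNKNOWN" ;;
        *)     echo "NOT_RDS" ;;
    esac
}

# Run the check against one Oracle home (hypothetical helper).
check_home_ipc() {
    home=$1
    if [ ! -x "$home/bin/skgxpinfo" ]; then
        echo "UNKNOWN: $home/bin/skgxpinfo not found"
        return 2
    fi
    out=$("$home/bin/skgxpinfo" -v 2>/dev/null)
    echo "$(classify_ipc "$out"): $home reports '$out'"
}

# Example: check_home_ipc "$ORACLE_HOME"
```

Anything other than an OK result is a reason to look at how that home was linked before going any further.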

It was not so hard to correct either.
The recommendation was already there in orachk's HTML report output:
If the instance is not using the RDS protocol over InfiniBand, relink the Oracle binary using the following commands (with variables properly defined for each home being linked):
(as oracle) Shutdown any processes using the Oracle binary
If and only if relinking the Grid Infrastructure home, then (as root) GRID_HOME/crs/install/rootcrs.pl -unlock
(as oracle) cd $ORACLE_HOME/rdbms/lib
(as oracle) make -f ins_rdbms.mk ipc_rds ioracle
If and only if relinking the Grid Infrastructure home, then (as root) GRID_HOME/crs/install/rootcrs.pl -patch
Note: Avoid using the relink all command due to various issues. Use the make commands provided.
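
To keep the order of the steps straight, the recommendation above can be turned into a dry-run helper that only prints what to run and as which user. This is a sketch, not a supported procedure: relink_plan is a hypothetical name, the path in the example is made up, and nothing is executed.

```shell
#!/bin/sh
# Print (do not run) the relink steps for one home.
# $1 = home path, $2 = "grid" if it is the Grid Infrastructure home.
relink_plan() {
    home=$1
    kind=${2:-db}
    echo "(as oracle) shut down any processes using the Oracle binary in $home"
    if [ "$kind" = "grid" ]; then
        # Only for the Grid Infrastructure home: unlock it first.
        echo "(as root)   $home/crs/install/rootcrs.pl -unlock"
    fi
    echo "(as oracle) cd $home/rdbms/lib"
    echo "(as oracle) make -f ins_rdbms.mk ipc_rds ioracle"
    if [ "$kind" = "grid" ]; then
        # Only for the Grid Infrastructure home: lock it again afterwards.
        echo "(as root)   $home/crs/install/rootcrs.pl -patch"
    fi
}

# Example with a hypothetical path:
# relink_plan /u01/app/oracle/product/11.2.0.4/dbhome_1
# relink_plan /u01/app/11.2.0.4/grid grid
```

Printing the plan first makes it easy to review the steps with whoever owns the root account before anything is shut down.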

Also, the following is a good note on this topic:
Oracle Clusterware and RAC Support for RDS Over Infiniband (Doc ID 751343.1)

One last thing: you can get a better understanding of RDS by reading one of my previous posts.

5 comments :

  1. Hi ,

    My exachk report notifies me that I am on UDP. I need to configure it to RDS.
    Reference note :
    How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC [ID 284785.1]
    [oracle@dm01dbadm01 ~]$ ar -t $ORACLE_HOME/rdbms/lib/libknlopt.a|grep kcsm.o
    kcsm.o

    Oracle MOS suggested the below steps :
    a) (as oracle) Shutdown any processes using the Oracle binary
    b) (as oracle) cd $ORACLE_HOME/rdbms/lib
    c) (as oracle) make -f ins_rdbms.mk ipc_rds ioracle

    Can you suggest whether there could be any impact, or anything I should check, before executing the command?

    I do not have an exadata setup elsewhere to check.

  2. That note is an Oracle Support note, so it is supported by Oracle. If you are still asking about a negative impact, I would say the consequence may be not being able to start the database.

    So, before relinking, you can check the following:

    Be sure that your interconnect is on InfiniBand links (ensure that the interface used for the interconnect is an InfiniBand interface).
    Cross-check your interconnect route against your network interfaces: from node1, run route -v|grep "node2's private address", then ifconfig|grep "interface derived from the route output" (it should be bondib0).

    Check that the rds kernel module is loaded (the output should be similar to the following):
    [root@exanode1 ~]# lsmod|grep rds
    rds_rdma 117120 1741
    rds 169864 3481 rds_rdma
    rdma_cm 72852 2 rds_rdma,rdma_ucm
    ib_core 112128 12 rds_rdma,ib_ipoib,rdma_ucm,rdma_cm,ib_ucm,ib_uverbs,ib_umad,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_mad
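
The module check above can be scripted along these lines. It is a minimal sketch: rds_modules_loaded is a hypothetical name, and it assumes that both rds and rds_rdma should appear in the first column of the lsmod output, as they do in the sample.

```shell
#!/bin/sh
# Read lsmod-style text on stdin; succeed only if both the rds and
# rds_rdma modules are listed in the first column.
rds_modules_loaded() {
    awk '$1 == "rds"      { rds = 1 }
         $1 == "rds_rdma" { rdma = 1 }
         END { exit !(rds && rdma) }'
}

# Example: lsmod | rds_modules_loaded && echo "rds modules loaded"
```

Matching on the first column avoids a false positive from lines such as ib_core, which only mention rds_rdma in their "used by" list.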

    Replies
    1. Thanks. I have updated the SR asking about the same (with respect to negative impact).

      Sharing my outputs below :

      First Output:

      [root@dm01dbadm01 Exachk_Software]# route -v
      Kernel IP routing table
      Destination Gateway Genmask Flags Metric Ref Use Iface
      default 192.168.101.1 0.0.0.0 UG 0 0 0 bondeth0
      link-local * 255.255.128.0 U 0 0 0 ib0
      169.254.128.0 * 255.255.128.0 U 0 0 0 ib1
      ct-jasper-01 192.168.100.1 255.255.255.255 UGH 0 0 0 eth0
      192.168.8.0 * 255.255.252.0 U 0 0 0 ib1
      192.168.8.0 * 255.255.252.0 U 0 0 0 ib0
      sfitolddb01.sna 192.168.100.1 255.255.255.255 UGH 0 0 0 eth0
      192.168.100.0 * 255.255.255.0 U 0 0 0 eth0
      192.168.101.0 * 255.255.255.0 U 0 0 0 bondeth0

      < I dont see my node 2 Private address anywhere >


      Second Output :

      [root@dm01dbadm01 Exachk_Software]# lsmod|grep rds
      rds_rdma 121677 1937
      rds 236128 3873 rds_rdma
      ib_ipoib 94699 1 rds_rdma
      rdma_cm 60865 2 rds_rdma,rdma_ucm
      ib_core 75061 12 rds_rdma,ib_ipoib,rdma_ucm,ib_ucm,ib_uverbs,ib_umad,rdma_cm,ib_cm,iw_cm,mlx4_ib,ib_sa,ib_mad


      Please advise on this. Thanks.

  3. I think it is okay

    Also your Exa machine should be at least an Exadata X4, as the name used for the Infiniband interface changed from BONDIB0 to IB0 and IB1 on Oracle Exadata Database Machine X4-2 systems using release 11.2.3.3.0 and later :)

    Replies
    1. Mine is X5-2. Thanks very much for your confirmation. :)

