The environment that is the subject of this article is a Bare Metal ODA X5 machine.
In this blog post, I will go through a real-life case, where we needed to replace a failed hard drive of an ODA X5 machine.
The thing that made me write this blog post is actually a missing piece of crucial info. You will understand what I mean when I go into the details of our case, but first take a look at the process of changing a disk drive in an ODA machine.
The disk replacement on ODA is easy.
We basically follow the MOS document named: How to Replace an ODA (Oracle Database Appliance) FAILED/PredictiveFail Shared Storage Disk (Doc ID 1435946.1).
In the case of ODA, we (customers or consultants) replace the disks ourselves; there is no need for Oracle field engineers for that. That's why the disk components are called CRUs (Customer Replaceable Units).
The actions to be taken for replacing a failed disk can be summarized as follows;
- Identify the failed disk (do the diagnostics)
- Take out the failed disk
- Wait 2-3 mins
- Attach the new disk
- Check the conditions and take post actions if necessary
Physically replacing the disk is so easy that I will not concentrate on it in this article.
However, the steps of identifying the failed disk and taking the post actions are the interesting ones. So here in this post, I will give some details about them.
Before the replacement, we just check the failed disk and make sure that it is in a failed status (STATE_DETAILS=DiskRemoved or PredictiveFail).
We use the oakcli show disk command for checking the disks and their status.
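A quick way to pick out the problematic drives from that output is a simple grep (the filter below is just my own shortcut, not an oakcli option):
[root@ermanoda0 ~]# oakcli show disk | grep -Ei "FAILED|PredictiveFail|DiskRemoved"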
Note that the failed disk may not be there in the output at all, and if this happens, it is totally fine.
It means the failed disk is removed.
In this example, I'm doing my check for a failed disk drive in slot 10 (the slot number can be obtained by looking at the machine itself as well).
Here, I first use the oakcli show disk command and see that the disk in slot 10 is not listed at all.
[root@ermanoda0 ~]# oakcli show disk
NAME PATH TYPE STATE STATE_DETAILS
e0_pd_00 /dev/sdc HDD ONLINE Good
e0_pd_01 /dev/sdd HDD ONLINE Good
e0_pd_02 /dev/sde HDD ONLINE Good
e0_pd_03 /dev/sdf HDD ONLINE Good
e0_pd_04 /dev/sdg HDD ONLINE Good
e0_pd_05 /dev/sdh HDD ONLINE Good
e0_pd_06 /dev/sdi HDD ONLINE Good
e0_pd_07 /dev/sdj HDD ONLINE Good
e0_pd_08 /dev/sdaa HDD ONLINE Good
e0_pd_09 /dev/sdac HDD ONLINE Good
--- Attention: no output for e0_pd_10 ---
e0_pd_11 /dev/sdag HDD ONLINE Good
e0_pd_12 /dev/sdai HDD ONLINE Good
e0_pd_13 /dev/sdak HDD ONLINE Good
e0_pd_14 /dev/sdam HDD ONLINE Good
e0_pd_15 /dev/sdao HDD ONLINE Good
e0_pd_16 /dev/sdab SSD ONLINE Good
e0_pd_17 /dev/sdad SSD ONLINE Good
e0_pd_18 /dev/sdaf SSD ONLINE Good
e0_pd_19 /dev/sdah SSD ONLINE Good
e0_pd_20 /dev/sdaj SSD ONLINE Good
e0_pd_21 /dev/sdal SSD ONLINE Good
e0_pd_22 /dev/sdan SSD ONLINE Good
e0_pd_23 /dev/sdap SSD ONLINE Good
As the disk is not there in the output, I'm doing more checks to be sure that the disk is not seen by the OS or by any other software component on the ODA system.
-- I do my checks on both of the ODA nodes.
First, checking the multipath devices;
multipath -ll output:
HDD_E0_S10_992975636 (35000cca23b2f9b14) dm-14
size=7.2T features='0' hwhandler='0' wp=rw
"no disk paths listed here"
Well, the failed disk's multipath device is dm-14, and there are no disk paths listed for it.
I also don't see any slaves for it;
cd /sys/block/dm-14/slaves
ls -al
total 0
drwxr-xr-x 2 root root 0 Aug 1 10:28 .
drwxr-xr-x 8 root root 0 Mar 1 2016 ..
No devices...
The slaves directory should normally list the real device names.
For example, here is a working dm device:
cd /sys/block/dm-14/slaves
ls -al
total 0
drwxr-xr-x 2 root root 0 Aug 1 10:28 .
drwxr-xr-x 8 root root 0 Mar 1 2016 ..
lrwxrwxrwx 1 root root 0 Jan 3 18:19 sde -> ../../sde
lrwxrwxrwx 1 root root 0 Jan 3 18:19 sdo -> ../../sdo
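By the way, rather than checking the dm devices one by one, all of them can be scanned in a single pass. A minimal sketch, assuming the standard sysfs layout shown above:
for d in /sys/block/dm-*/slaves; do
  echo "== $d"    # print which dm device we are looking at
  ls "$d"         # an empty listing means no underlying disk paths
done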
[root@ermanoda0 mapper]# ls -lrt|grep S10
brw-rw---- 1 grid asmadmin 252, 14 Mar 1 2016 HDD_E0_S10_992975636
brw-rw---- 1 grid asmadmin 252, 49 Jul 24 17:26 HDD_E0_S10_992975636p2
brw-rw---- 1 grid asmadmin 252, 36 Sep 24 09:59 HDD_E0_S10_992975636p1
Well... the multipath device names are still there. But these are multipath devices, right? They are not physical devices, so this is not a bad thing.
Continuing my diagnostics..
Next, I check the OAK logs;
log/ermanoda0/oak/oakd.l45:2016-12-15 11:22:08.886: [CLSFRAME][4160715072]{0:35:2} payload=|OAKERR : 9009 : Couldn't find the resource: e0_pd_10||
log/ermanoda0/oak/oakd.l45:2016-12-15 11:22:08.886: [CLSFRAME][4160715072]{0:35:2} String params:CmdUniqId=|ServiceName=e0_pd_10|pname=Error|
log/ermanoda0/oak/oakd.l45:2016-12-15 11:22:08.886: [ OAKFW][4160715072]{0:35:2} PE sending last reply for: MIDTo:1|OpID:1|FromA:{Relative|Node:0|Process:35|Type:2}|ToA:{Relative|Node:0|Process:0|Type:1}|MIDFrom:4|Type:1|Pri2|Id:8:Ver:2Value params:payload=|OAKERR : 9009 : Couldn't find the resource: e0_pd_10||String params:CmdUniqId=|ServiceName=e0_pd_10|pname=Error|Int params:ErrCode=0|MsgId=4359|flag=2|sflag=64|
OAK says it can't find the e0_pd_10 resource, which actually corresponds to our failed disk. So this is normal.
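By the way, these OAKERR lines can be pulled out of the oakd logs directly with a grep. A hedged example, assuming the logs live under the usual /opt/oracle/oak/log/<hostname>/oak directory as they do on our machines:
grep -i "OAKERR" /opt/oracle/oak/log/ermanoda0/oak/oakd.* | grep "e0_pd_10"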
In OAKCLI logs, I see the Disk Removed state for our failed disk, which is totally expected.
log/ermanoda0/client/oakcli.log:2016-08-01 11:02:51.999: [ OAKCLI][2575886656] e0_pd_10 /dev/sdae HDD FAILED DiskRemoved
I check the physical device name, and it is not there (as expected)
--> ls -al /dev/sdae
ls: /dev/sdae: No such file or directory
I also check the fishwrap logs and see that the failed disk is deleted.
log/fishwrap/fishwrap.log:Sun Jul 24 17:27:39 2016: deleting an old disk: /dev/sg17
In Fishwrap log: Sun Jul 24 17:27:39 2016: Slot [10] sas-addr = 5000cca23b2f9b16
Sun Jul 24 17:27:39 2016: fwr_scsi_tree_topology_update finish, device num = 51
In this log, I see that the SCSI device count was 53 before the disk failure; after the failure it has become 51 (this is also expected, since the shared disk is seen through two controllers, so removing it removes two SCSI devices).
EARLIER:
Tue Mar 1 14:13:37 2016: Number of SCSI device found = 53, existing = 53
NOW:
Sun Jul 24 17:27:38 2016: Number of SCSI device found = 51, existing = 53
Sun Jul 24 17:27:38 2016: fwr_scsi_tree_topology_update: expander update start
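The storage diagnostics that come next can be collected with oakcli itself. A hedged example, assuming the standard oakcli stordiag syntax (it takes the disk resource name as its argument):
[root@ermanoda0 ~]# oakcli stordiag e0_pd_10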
I execute the Storage Diagnostics and see the following in its output;
8 : fwupdate
[INFO]: fwupdate does not see disk from both controllers
9 : Fishwrap
[INFO]: Fishwrap not able to discover disk
10 : Check for shared disk write cache status
[INFO]: Unable to find OS devices for slot 10
11 : SCSI INQUIRY
[INFO]: Unable to run scsi inquiry command on disk as OS device are absent
[INFO]: Unable to run scsi inquiry command on disk as OS device are absent
12 : Multipath Conf for device
multipath {
wwid 35000cca23b2f9b14
alias HDD_E0_S10_992975636
}
13 : Last few LSI Events Received for slot 10
[INFO]: No LSI events are recorded in OAKD logs
14 : Version Information
OAK : 12.1.2.4.0
kernel : 2.6.39-400.250.6.el5uek
mpt2sas : 17.00.06.00
Multipath : 0.4.9
15 : OAK Conf Parms
[INFO]: No scsi devices found for slot 10
In summary, only the multipath device names are still present for the failed disk, and everything else related to it has been removed.
That's why I conclude that the failed disk has been eliminated from the OS and from OAK.
So the disk is ready to be replaced, and the environment is totally in an expected state.
I did these diagnostics to ensure the disk was removed properly, because I could not see the failed disk with a failure status in the oakcli output at all... (Remember, oakcli show disk doesn't report this disk at all.)
So, OAK removed the disk and is not listing it anymore. Maybe this is its behaviour after a reboot, but anyway, this is a thing that needs to be added to the documentation, right? :)
Well, after the disk is replaced online, everything related to the OS and OAK is done automatically and transparently. Here is the status of the checks;
fwupdate list disk
===============
ID Manufacturer Model Chassis Slot Type Media Size(GiB) FW Version XML Support
-----------------------------------------------------------------------------------------------------------
c2d0 HGST H7280A520SUN8.0T 0 0 sas HDD 7325 P554 N/A
c2d1 HGST H7280A520SUN8.0T 0 1 sas HDD 7325 P554 N/A
c2d2 HGST H7280A520SUN8.0T 0 2 sas HDD 7325 P554 N/A
c2d3 HGST H7280A520SUN8.0T 0 3 sas HDD 7325 P554 N/A
c2d4 HGST H7280A520SUN8.0T 0 4 sas HDD 7325 P554 N/A
c2d5 HGST H7280A520SUN8.0T 0 5 sas HDD 7325 P554 N/A
c2d6 HGST H7280A520SUN8.0T 0 6 sas HDD 7325 P554 N/A
c2d7 HGST H7280A520SUN8.0T 0 7 sas HDD 7325 P554 N/A
c2d8 HGST H7280A520SUN8.0T 0 8 sas HDD 7325 P554 N/A
c2d9 HGST H7280A520SUN8.0T 0 9 sas HDD 7325 P554 N/A
c2d10 HGST H7280A520SUN8.0T 0 10 sas HDD 7325 P9E2 N/A
c2d11 HGST H7280A520SUN8.0T 0 11 sas HDD 7325 P554 N/A
c2d12 HGST H7280A520SUN8.0T 0 12 sas HDD 7325 P554 N/A
c2d13 HGST H7280A520SUN8.0T 0 13 sas HDD 7325 P554 N/A
c2d14 HGST H7280A520SUN8.0T 0 14 sas HDD 7325 P554 N/A
c2d15 HGST H7280A520SUN8.0T 0 15 sas HDD 7325 P554 N/A
c2d16 HGST HSCAC2DA4SUN400G 0 16 sas SSD 373 A29A N/A
c2d17 HGST HSCAC2DA4SUN400G 0 17 sas SSD 373 A29A N/A
c2d18 HGST HSCAC2DA4SUN400G 0 18 sas SSD 373 A29A N/A
c2d19 HGST HSCAC2DA4SUN400G 0 19 sas SSD 373 A29A N/A
c2d20 HGST HSCAC2DA6SUN200G 0 20 sas SSD 186 A29A N/A
c2d21 HGST HSCAC2DA6SUN200G 0 21 sas SSD 186 A29A N/A
c2d22 HGST HSCAC2DA6SUN200G 0 22 sas SSD 186 A29A N/A
c2d23 HGST HSCAC2DA6SUN200G 0 23 sas SSD 186 A29A N/A
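To focus on just the replaced slot in that long fwupdate listing, a plain grep is enough (this filter is my own shortcut, not an fwupdate option):
[root@ermanoda0 ~]# fwupdate list disk | grep -w c2d10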
[root@ermanoda0 ~]# oakcli show disk
NAME PATH TYPE STATE STATE_DETAILS
e0_pd_00 /dev/sdc HDD ONLINE Good
e0_pd_01 /dev/sdd HDD ONLINE Good
e0_pd_02 /dev/sde HDD ONLINE Good
e0_pd_03 /dev/sdf HDD ONLINE Good
e0_pd_04 /dev/sdg HDD ONLINE Good
e0_pd_05 /dev/sdh HDD ONLINE Good
e0_pd_06 /dev/sdi HDD ONLINE Good
e0_pd_07 /dev/sdj HDD ONLINE Good
e0_pd_08 /dev/sdaa HDD ONLINE Good
e0_pd_09 /dev/sdac HDD ONLINE Good
e0_pd_10 /dev/sdp HDD ONLINE Good
e0_pd_11 /dev/sdag HDD ONLINE Good
e0_pd_12 /dev/sdai HDD ONLINE Good
e0_pd_13 /dev/sdak HDD ONLINE Good
e0_pd_14 /dev/sdam HDD ONLINE Good
e0_pd_15 /dev/sdao HDD ONLINE Good
e0_pd_16 /dev/sdab SSD ONLINE Good
e0_pd_17 /dev/sdad SSD ONLINE Good
e0_pd_18 /dev/sdaf SSD ONLINE Good
e0_pd_19 /dev/sdah SSD ONLINE Good
e0_pd_20 /dev/sdaj SSD ONLINE Good
e0_pd_21 /dev/sdal SSD ONLINE Good
e0_pd_22 /dev/sdan SSD ONLINE Good
e0_pd_23 /dev/sdap SSD ONLINE Good
Even the multipath.conf file is updated automatically;
multipath {
wwid 35000cca2604a90ac --> this is the wwid of the newly added disk
alias HDD_E0_S10_1615499436
}
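To confirm that multipathd has really picked up the new wwid, alias and both paths, a quick hedged check (the alias name below is the one generated above) is:
[root@ermanoda0 etc]# multipath -ll HDD_E0_S10_1615499436
Both /dev/sdp and /dev/sdae should be listed as active paths for it.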
[root@ermanoda0 etc]# lsscsi
[0:2:0:0] disk LSI MR9361-8i 4.23 /dev/sda
[7:0:0:0] disk ORACLE SSM PMAP /dev/sdb
[8:0:0:0] enclosu ORACLE DE2-24C 0018 -
[8:0:1:0] disk HGST H7280A520SUN8.0T P554 /dev/sdc
[8:0:2:0] disk HGST H7280A520SUN8.0T P554 /dev/sdd
[8:0:3:0] disk HGST H7280A520SUN8.0T P554 /dev/sde
[8:0:4:0] disk HGST H7280A520SUN8.0T P554 /dev/sdf
[8:0:5:0] disk HGST H7280A520SUN8.0T P554 /dev/sdg
[8:0:6:0] disk HGST H7280A520SUN8.0T P554 /dev/sdh
[8:0:7:0] disk HGST H7280A520SUN8.0T P554 /dev/sdi
[8:0:8:0] disk HGST H7280A520SUN8.0T P554 /dev/sdj
[8:0:9:0] disk HGST H7280A520SUN8.0T P554 /dev/sdl
[8:0:10:0] disk HGST H7280A520SUN8.0T P554 /dev/sdn
[8:0:12:0] disk HGST H7280A520SUN8.0T P554 /dev/sdr
[8:0:13:0] disk HGST H7280A520SUN8.0T P554 /dev/sdt
[8:0:14:0] disk HGST H7280A520SUN8.0T P554 /dev/sdv
[8:0:15:0] disk HGST H7280A520SUN8.0T P554 /dev/sdx
[8:0:16:0] disk HGST H7280A520SUN8.0T P554 /dev/sdz
[8:0:17:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdab
[8:0:18:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdad
[8:0:19:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdaf
[8:0:20:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdah
[8:0:21:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdaj
[8:0:22:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdal
[8:0:23:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdan
[8:0:24:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdap
[8:0:25:0] disk HGST H7280A520SUN8.0T P9E2 /dev/sdp
[9:0:0:0] enclosu ORACLE DE2-24C 0018 -
[9:0:1:0] disk HGST H7280A520SUN8.0T P554 /dev/sdk
[9:0:2:0] disk HGST H7280A520SUN8.0T P554 /dev/sdm
[9:0:3:0] disk HGST H7280A520SUN8.0T P554 /dev/sdo
[9:0:4:0] disk HGST H7280A520SUN8.0T P554 /dev/sdq
[9:0:5:0] disk HGST H7280A520SUN8.0T P554 /dev/sds
[9:0:6:0] disk HGST H7280A520SUN8.0T P554 /dev/sdu
[9:0:7:0] disk HGST H7280A520SUN8.0T P554 /dev/sdw
[9:0:8:0] disk HGST H7280A520SUN8.0T P554 /dev/sdy
[9:0:9:0] disk HGST H7280A520SUN8.0T P554 /dev/sdaa
[9:0:10:0] disk HGST H7280A520SUN8.0T P554 /dev/sdac
[9:0:12:0] disk HGST H7280A520SUN8.0T P554 /dev/sdag
[9:0:13:0] disk HGST H7280A520SUN8.0T P554 /dev/sdai
[9:0:14:0] disk HGST H7280A520SUN8.0T P554 /dev/sdak
[9:0:15:0] disk HGST H7280A520SUN8.0T P554 /dev/sdam
[9:0:16:0] disk HGST H7280A520SUN8.0T P554 /dev/sdao
[9:0:17:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdaq
[9:0:18:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdar
[9:0:19:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdas
[9:0:20:0] disk HGST HSCAC2DA4SUN400G A29A /dev/sdat
[9:0:21:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdau
[9:0:22:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdav
[9:0:23:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdaw
[9:0:24:0] disk HGST HSCAC2DA6SUN200G A29A /dev/sdax
[9:0:25:0] disk HGST H7280A520SUN8.0T P9E2 /dev/sdae
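Since the replaced drive is, for the moment, the only one carrying the P9E2 firmware level, its two OS paths (one per controller) can be picked out of the lsscsi output with a simple filter:
[root@ermanoda0 etc]# lsscsi | grep P9E2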
[root@ermanoda0 etc]# oakcli show disk e0_pd_10
Resource: e0_pd_10
ActionTimeout : 1500
ActivePath : /dev/sdp
AsmDiskList : |e0_data_10||e0_reco_10|
AutoDiscovery : 1
AutoDiscoveryHi : |data:43:HDD||reco:57:HDD||redo:100
:SSD||flash:100:SSD|
CheckInterval : 300
ColNum : 2
DependListOpr : add
Dependency : |0|
DiskId : 35000cca2604a90ac
DiskType : HDD
Enabled : 0
ExpNum : 0
IState : 0
Initialized : 1
IsConfigDepende : false
MonitorFlag : 0
MultiPathList : |/dev/sdae||/dev/sdp|
Name : e0_pd_10
NewPartAddr : 0
OSUserType : |userType:Multiuser|
PrevState : UnInitialized
PrevUsrDevName : HDD_E0_S10_1615499436
SectorSize : 512
SerialNum : 001634PA07WV
Size : 7865536647168
SlotNum : 10
State : Online
StateChangeTs : 1483596721
StateDetails : Good
TotalSectors : 15362376264
TypeName : 0
UsrDevName : HDD_E0_S10_1615499436
gid : 0
mode : 660
uid : 0
}
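Note the AsmDiskList attribute above (e0_data_10 and e0_reco_10): it shows the disk has been put back into the ASM diskgroups as well. As an optional post-check (a hedged example, assuming you connect to the ASM instance on the node as usual), the rebalance triggered by the replacement can be monitored like this:
sqlplus / as sysasm
SQL> select group_number, operation, state, est_minutes from v$asm_operation;
-- "no rows selected" means the rebalance has already completed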
It is also seen in the OS logs... the disk is discovered properly by the OS;
/var/log/messages:
Jan 5 08:10:29 ermanoda0 kernel: mpt3sas0: detecting: handle(0x0024), sas_address(0x5000cca2604a90ae), phy(10)
Jan 5 08:10:35 ermanoda0 kernel: scsi 9:0:25:0: Direct-Access HGST H7280A520SUN8.0T P9E2 PQ: 0 ANSI: 6
Jan 5 08:10:35 ermanoda0 kernel: scsi 8:0:25:0: Direct-Access HGST H7280A520SUN8.0T P9E2 PQ: 0 ANSI: 6
Jan 5 08:10:35 ermanoda0 kernel: scsi 8:0:25:0: SSP: handle(0x0024), sas_addr(0x5000cca2604a90ae), phy(10), device_name(0x5000cca2604a90af)
Jan 5 08:10:35 ermanoda0 kernel: scsi 8:0:25:0: SSP: enclosure_logical_id(0x5080020001ecf27e), slot(80)
Jan 5 08:10:35 ermanoda0 kernel: scsi 8:0:25:0: serial_number(001634PA07WV VLHA07WV)
Jan 5 08:10:35 ermanoda0 kernel: scsi 8:0:25:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
Jan 5 08:10:35 ermanoda0 kernel: scsi 9:0:25:0: SSP: handle(0x0024), sas_addr(0x5000cca2604a90ad), phy(10), device_name(0x5000cca2604a90af)
Jan 5 08:10:35 ermanoda0 kernel: scsi 9:0:25:0: SSP: enclosure_logical_id(0x5080020001eceb7e), slot(80)
Jan 5 08:10:35 ermanoda0 kernel: scsi 9:0:25:0: serial_number(001634PA07WV VLHA07WV)
Jan 5 08:10:35 ermanoda0 kernel: scsi 9:0:25:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: Attached scsi generic sg17 type 0
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: [sdp] Enabling DIF Type 1 protection
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: [sdp] 15362376264 512-byte logical blocks: (7.86 TB/7.15 TiB)
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: [sdp] 4096-byte physical blocks
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: [sdae] Enabling DIF Type 1 protection
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: [sdae] 15362376264 512-byte logical blocks: (7.86 TB/7.15 TiB)
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: [sdae] 4096-byte physical blocks
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: [sdp] Write Protect is off
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: [sdae] Write Protect is off
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: [sdae] Write cache: disabled, read cache: enabled, supports DPO and FUA
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: [sdp] Write cache: disabled, read cache: enabled, supports DPO and FUA
Jan 5 08:10:35 ermanoda0 kernel: sdae:
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: Attached scsi generic sg32 type 0
Jan 5 08:10:35 ermanoda0 kernel: sdp:
Jan 5 08:10:35 ermanoda0 kernel: sd 9:0:25:0: [sdae] Attached SCSI disk
Jan 5 08:10:35 ermanoda0 kernel: sd 8:0:25:0: [sdp] Attached SCSI disk
On the other hand, there is one thing that still needs to be clarified, and that is the firmware of the newly added disk.
This part is actually the step named "Check the conditions and take post actions if necessary".
Here, I use the oakcli show version -detail command to see the installed and supported firmware versions of the ODA components.
[root@ermanoda0 device]# oakcli show version -detail
Reading the metadata. It takes a while...
System Version Component Name Installed Version Supported Version
-------------- --------------- ------------------ -----------------
12.1.2.4.0
Controller_INT 4.230.40-3739 Up-to-date
Controller_EXT 06.00.02.00 Up-to-date
Expander 0018 Up-to-date
SSD_SHARED {
[ c2d20,c2d21,c2d22, A29A A122
c2d23 ]
[ c2d16,c2d17,c2d18, A29A A122
c2d19 ]
}
HDD_LOCAL A720 Up-to-date
HDD_SHARED {
[ c2d0,c2d1,c2d2,c2d P554 Up-to-date
3,c2d4,c2d5,c2d6,c2d
7,c2d8,c2d9,c2d11,c2
d12,c2d13,c2d14,c2d1
5 ]
[ c2d10 ] P9E2 P554
}
ILOM 3.2.4.42 r99377 Up-to-date
BIOS 30040200 Up-to-date
IPMI 1.8.12.0 Up-to-date
HMP 2.3.2.4.1 Up-to-date
OAK 12.1.2.4.0 Up-to-date
OL 5.11 Up-to-date
GI_HOME 12.1.0.2.4(20831110, Up-to-date
20831113)
DB_HOME {
[ OraDb12102_home1 ] 12.1.0.2.4(20831110, Up-to-date
20831113)
[ OraDb11203_home1 ] 11.2.0.3.15(20760997 Up-to-date
,17592127)
}
As seen in the output above, while the firmware of all the other HDDs is P554, the newly added disk has the P9E2 firmware.
Well... the Supported Version column is showing the firmware P554 for it.
So here is the question that needs to be answered: is P9E2 newer than P554?
This question matters because, if the firmware version of the newly added disk is older than the supported one, then a firmware upgrade needs to be done using the "oakcli update -patch <patch_bundle_version> --infra" command.
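For example (a hypothetical invocation; the bundle version must be the one that ships the expected firmware, so substitute it accordingly):
[root@ermanoda0 ~]# oakcli update -patch 12.1.2.8.0 --infra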
Reference from the note "How to Replace an ODA (Oracle Database Appliance) FAILED/PredictiveFail Shared Storage Disk (Doc ID 1435946.1)":
"If the newly replaced disk has older firmware from what the ODA Software is expecting, you will need to update the firmware on this disk.
If the disk has newer firmware from the existing disks, this is fine, and the firmware does not need to be downgraded to match existing disks."!!!
However, there is no info about P9E2 on the Internet or in Oracle Support.
Actually, there is some info about P901.
Here in note Oracle Database Appliance FAQ: Common Questions Regarding Firmware Versions on ODA (Doc ID 2119003.1), it says:
The P901 firmware is a newer firmware version for the 8 TB drive, and firmware P554 is the one we released in 12.1.2.5.
So P901 > P554, right? I mean, 901 is a greater value.
By the same logic, P9E2 should be greater than P901, but we still created an SR for it.
Here is what Oracle Support (in the SR) says:
The firmware P9E2 of the replaced shared storage disk in slot 10 is the latest FW, which has been released with ODA image 12.1.2.8.0 in Sept 2016.
So, the problem is solved. No need for a firmware upgrade.
At the end of the day, we see that the FAQ is not up-to-date and there is no document for getting info about ODA HDD firmware versions (a compatibility matrix between ODA versions and component versions).
Fortunately, we at least have Oracle Support :)