Erman Arslan's Oracle Blog: RAC ASM -- Disaster Recovery solution (using storage techniques) snapshot/snapmirror, Netapp

Wednesday, May 17, 2017

RAC ASM -- Disaster Recovery solution (using storage techniques) snapshot/snapmirror, Netapp

This blog post is about an alternative Disaster Recovery solution that I recently built through a POC (Proof of Concept).
The source system was a 2 Node Oracle RAC (Grid + RDBMS both 11.2.0.4) and the target system residing in DR was a 1 Node environment (GRID + RDBMS both 11.2.0.4).
Both source and target was using ASM for storing all the db files.
This made me think a little bit, becuase this alternative disaster recovery solution was relying on Netapp snapshot and snapmiror mechanisms.
We were planning to put the database to the backup mode (hot backup mode
/alter database begin backup), take the snapshot on the local storage, transfer the snapshot to the DR storage using snapmirror , map the luns created by this snapshot/snapmirror to the target server, end backup in target and open the database successfully in the target.

ASM was my concern, because I had never done this before.
Actually I have implemented (even automated) this solution 6 years ago, but the filesystems were ext3 ( a cooked one) in that case.

So, I was afraid that ASM header info might get corrupted during the snapshot/snapmirror operations.
However, everything went well.

It is ofcourse not supported by Oracle. However, I can say that ASM diskgroups can be replicated using the snapshot/snapmirror operations. (ofcourse if you know what you are doing..)

I did 3 tests and didn't encounter any problems.

In this blog post, I will give you the things that I did to replicate a 2 Node RAC system(Grid/ASM+RDBMS) to a single Node System(Grid/ASM+RDBMS) using Netapp's snapshot and snapmirror functionalities.

Following is a representative picture, explaining the work done.
As shown in the following picture, we take the snapshot of the ASM disks, replicate it to the DR using snapmirror and then map the luns to the target server running on the DR site, and make ASM running on the target server to see those disks...

The operation that was done for this, requires both Linux admin and storage admin skills. ( I don't see a need for mentioning the required DBA skills :) )
It requires Linux admin skills, because we do stuff like formatting the drives (we are working on new luns since it is a POC), rescanning the scsi bus and etc.
It requires the Storage admin skills, because we do stuff like taking snapshots, initiating snapmirror, mapping luns and etc.
Lastly, it requires DBA skills, because we put the database to the backup mode, change some init.ora parameters, make the replicated disks be known by ASM, open our database succesfully in the target node (recovery may ve needed), check the log files in case of an error and so on.

Let's see what I did in this POC ->
Note that, this is a technical overview. (doesn't have to include all the steps that I have done and all the commands that I have used).

RAC to SINGLE STORAGE REPLICATION (ASM diskgroup which is built on top of 3 disks)

1. Storage team creates 3 ASM disks (each sized approx. 13gb) and maps these luns to our Linux RAC nodes. (source nodes)

We do the scsi rescan to make Linux see these luns/disks, and then we create partition on them. (oracleasm createdisk requires partition)

fdisk output just after the scsi rescan:

Disk /dev/sdbk: 11.8 GB, 11811160064 bytes

64 heads, 32 sectors/track, 11264 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x00000000

Disk /dev/sdbl: 12.9 GB, 12884901888 bytes

64 heads, 32 sectors/track, 12288 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x00000000

Disk /dev/sdbm: 14.0 GB, 13960740864 bytes

64 heads, 32 sectors/track, 13314 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x00000000

Attention, before discovering the disks we use fdisk to create partitions on them.
fdisk output just after the partition creations:

Disk /dev/sdbk: 11.8 GB, 11811160064 bytes

64 heads, 32 sectors/track, 11264 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x42926436

Device Boot Start End Blocks Id System

/dev/sdbk1 1 11264 11534320 83 Linux

Disk /dev/sdbl: 12.9 GB, 12884901888 bytes

64 heads, 32 sectors/track, 12288 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0xb900c4fd

Device Boot Start End Blocks Id System

/dev/sdbl1 1 12288 12582896 83 Linux

Disk /dev/sdbm: 14.0 GB, 13960740864 bytes

64 heads, 32 sectors/track, 13314 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x65f91faf

Device Boot Start End Blocks Id System

/dev/sdbm1 1 13314 13633520 83 Linux

2. We use oracleasm to create the ASM disks and again using oracleasm, we make ASM see them.

[root@source01 ~]# oracleasm createdisk DEV1 /dev/sdbk1

Writing disk header: done

Instantiating disk: done

[root@source01 ~]# oracleasm createdisk DEV2 /dev/sdbl1

Writing disk header: done

Instantiating disk: done

[root@source01 ~]# oracleasm createdisk DEV3 /dev/sdbm1

Writing disk header: done

Instantiating disk: done

[root@source01 ~]# oracleasm scandisks

Reloading disk partitions: done

Cleaning any stale ASM disks...

Scanning system for ASM disks...

[root@source01 ~]# oracleasm listdisks

DATA1

DATA2

DATA3

DATA4

DATA5

DATA6

DATA7

DATA8

DATA9

DEV1 ** new disk

DEV2 ** new disk

DEV3 ** new disk

FRA1

FRA2

TEST1

3. In order to make these new ASM disks available in source node 2 (remember this is RAC) , we rescan the scsi and use oracleasm scandisks in source node 2 as well..

root@source02 ~]# echo "- - -" > /sys/class/scsi_host/host0/scan
[root@source02 ~]# echo "- - -" > /sys/class/scsi_host/host1/scan
[root@source02 ~]# echo "- - -" > /sys/class/scsi_host/host2/scan
[root@source02 ~]# echo "- - -" > /sys/class/scsi_host/host3/scan
[root@source02 ~]# echo "- - -" > /sys/class/scsi_host/host4/scan
[root@source02 ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "DEV2"
Instantiating disk "DEV1"
Instantiating disk "DEV3"
[root@source02 ~]# oracleasm listdisks
DATA1
DATA2
DATA3
DATA4
DATA5
DATA6
DATA7
DATA8
DATA9
DEV1 ** new disk
DEV2 ** new disk
DEV3 ** new disk
FRA1
FRA2
TEST1

4. After these scsi and ASM disks become available on both of the RAC nodes, we create a diskgroup called +DEV on these new 3 disks using ASMCA . After that we create a 2 node RAC database called DEVRAC and we configure this database and make all its files (controlfiles,redologs,datafiles etc..) be put into +DEV diskgroup.

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 13:39:43 2017

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options

SQL> select instance_name from gv$instance;

INSTANCE_NAME

----------------

DEVRAC1

DEVRAC2

SQL> select name from v$datafile;

NAME

--------------------------------------------------------------------------------

+DEV/devrac/system01.dbf

+DEV/devrac/sysaux01.dbf

+DEV/devrac/undotbs01.dbf

+DEV/devrac/users01.dbf

+DEV/devrac/undotbs02.dbf

SQL> select name from v$controlfile;

NAME

--------------------------------------------------------------------------------

+DEV/devrac/control01.ctl

+DEV/devrac/control02.ctl

SQL> select member from v$logfile;

MEMBER

--------------------------------------------------------------------------------

+DEV/devrac/redo02.log

+DEV/devrac/redo01.log

+DEV/devrac/redo03.log

+DEV/devrac/redo04.log

5. At this point, in order to make the necessary checks in the target environment, we create a table called ERMAN in the source. This table resides on a tablespace called ERMAN. (all the datafiles of this tablespace called ERMAN reside in +DEV diskgroup)
So, we first create a tablespace called ERMAN by creating all its datafiles on diskgroup +DEV. Then, we create a table named ERMAN in the tablespace called ERMAN . (lots of ERMANs.., I know :)

SQL> CREATE TABLESPACE ERMAN DATAFILE '+DEV';

Tablespace created.

SQL> select name from v$datafile;

NAME

--------------------------------------------------------------------------------

+DEV/devrac/system01.dbf

+DEV/devrac/sysaux01.dbf

+DEV/devrac/undotbs01.dbf

+DEV/devrac/users01.dbf

+DEV/devrac/undotbs02.dbf

+DEV/devrac/datafile/erman.270.943623731

SQL> create table ERMAN tablespace ERMAN as select * from dba_objectS;

Table created.

SQL> insert into ERMAN select * from ERMAN;

86293 rows created.

SQL> commit;

Commit complete.

SQL> insert into ERMAN select * from ERMAN;

172586 rows created.

6. We put our database (DEVRAC) to backup mode/hot backup mode and while it is in backup mode, we create a storage snapshot and then do a snapmirror update. After the snapmirror update, we create the luns on the target storage system (using the snapmirrored data) and map these luns to the target database node. After the lun mapping, we rescan the scsi on target node (Linux), and make ASM (running on target node) see these new luns (using oracleasm)

root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host0/scan

[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host1/scan

[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host2/scan

[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host3/scan

[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host4/scan

Disk /dev/sdn: 11.8 GB, 11811160064 bytes

64 heads, 32 sectors/track, 11264 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x42926436

Device Boot Start End Blocks Id System

/dev/sdn1 1 11264 11534320 83 Linux

Disk /dev/sdo: 12.9 GB, 12884901888 bytes

64 heads, 32 sectors/track, 12288 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0xb900c4fd

Device Boot Start End Blocks Id System

/dev/sdo1 1 12288 12582896 83 Linux

Disk /dev/sdp: 14.0 GB, 13960740864 bytes

64 heads, 32 sectors/track, 13314 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 65536 bytes

Disk identifier: 0x65f91faf

Device Boot Start End Blocks Id System

/dev/sdp1 1 13314 13633520 83 Linux

[root@targetsinglenode ~]# oracleasm scandisks

Reloading disk partitions: done

Cleaning any stale ASM disks...

Scanning system for ASM disks...

Instantiating disk "DEV1"

Instantiating disk "DEV2"

Instantiating disk "DEV3"

7. After making the snapmirrored disks available on the target host, we modify the init.ora, which we copy from the source. We modify it to exclude the RAC related parameters, because the target node is a single node.
After modifying the init.ora, we mount the diskgroup named +DEV to the target node's ASM and we startup the DEVRAC database as single node database in our target node.
Note that, after opening the database, we check the table named ERMAN in the target instance to ensure that it is replicated successfully.

[oracle@targetsinglenode ~]$ sqlplus "/as sysasm"

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 14:26:35 2017

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option

SQL> alter diskgroup DEV mount;

SQL> startup mount;
ORACLE instance started.

Total System Global Area 2355015680 bytes
Fixed Size 2255472 bytes
Variable Size 587203984 bytes
Database Buffers 1761607680 bytes
Redo Buffers 3948544 bytes
Database mounted.
SQL> alter database end backup;

Database altered.

SQL> alter database open;

Database altered.

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DEV/devrac/system01.dbf
+DEV/devrac/sysaux01.dbf
+DEV/devrac/undotbs01.dbf
+DEV/devrac/users01.dbf
+DEV/devrac/undotbs02.dbf
+DEV/devrac/datafile/erman.270.943623731

6 rows selected.

SQL> select count(*) from ERMAN;

COUNT(*)
----------
172586

SQL> alter system checkpoint;

Note: After opening the database, we monitor the alert logs and check the related log files for both RDBMS and GRID and check whether any errors recorded in them.. In conclusion, we don't see any errors.

8. Just in case, we use the replicated database a little bit. We do some read write tests by inserting some data using a loop, switching the log files and so on.. During these tests, we do not encounter any abnormalities.

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 14:57:11 2017

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, Automatic Storage Management, OLAP, Data Mining

and Real Application Testing options

SQL> create table ERMAN2 as select * from ERMAN;

BEGIN

FOR x IN 1 .. 10 LOOP

INSERT INTO ERMAN2

SELECT * from ERMAN;

END LOOP;

END;

Table created.

SQL> 2 3 4 5 6

7 /

PL/SQL procedure successfully completed.

SQL>

SQL> commit;

Commit complete.

SQL> archive log all;

ORA-00271: there are no logs that need archiving

SQL> alter system checkpoint;

System altered.

SQL> alter system switch logfile;

System altered.

8. Actually, at this point the POC is successfully finished, but I want to give you a bonus info about the deletion of the databases and other things that we created during this POC.

Basically, we drop the databases using rman. We drop the diskgroups using the ASMCA and we clear these candidate disks using oracleasm(deletedisk).

You can see the "rman drop database method" in the the following example.. The method that I give in the following example is for a single node database, but it can be used in RAC environments as well.. ( note: in order to drop a RAC database using rman, the parameter named cluster_database must be set to "false" before executing the drop database command)

[oracle@targetsinglenode trace]$ sqlplus "/as sysdba"

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 15:01:06 2017

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, Automatic Storage Management, OLAP, Data Mining

and Real Application Testing options

SQL> shu immediaTE;

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL>

SQL> startup mount exclusive;

ORACLE instance started.

Total System Global Area 2355015680 bytes

Fixed Size 2255472 bytes

Variable Size 587203984 bytes

Database Buffers 1761607680 bytes

Redo Buffers 3948544 bytes

Database mounted.

SQL> alter system enable restricted session;

System altered.

[oracle@targetsinglenode trace]$ echo $ORACLE_SID

DEVRAC

[oracle@targetsinglenode trace]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Wed May 10 15:04:12 2017

connected to target database: DEVRAC (DBID=680719421, not open)

RMAN> drop database including backups;

database name is "DEVRAC" and DBID is 680719421

Do you really want to drop all backups and the database (enter YES or NO)? YES

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=25 device type=DISK

specification does not match any backup in the repository

released channel: ORA_DISK_1

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=25 device type=DISK

specification does not match any datafile copy in the repository

specification does not match any control file copy in the repository

List of Archived Log Copies for database with db_unique_name DEVRAC

=====================================================================

Key Thrd Seq S Low Time

------- ---- ------- - ---------

2 1 7 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_7.271.943623819

3 1 8 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_8.272.943623843

4 1 9 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_9.273.943626473

7 1 10 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_10.276.943628345

8 1 11 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_11.277.943628349

9 1 12 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_12.278.943628353

10 1 13 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_13.279.943628355

11 1 14 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_14.280.943628377

1 2 1 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_2_seq_1.269.943619773

5 2 2 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_2_seq_2.274.943626473

6 2 3 A 10-MAY-17

Name: +DEV/devrac/archivelog/2017_05_10/thread_2_seq_3.275.943626475

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_7.271.943623819 RECID=2 STAMP=943623818

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_8.272.943623843 RECID=3 STAMP=943623842

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_9.273.943626473 RECID=4 STAMP=943626472

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_10.276.943628345 RECID=7 STAMP=943628345

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_11.277.943628349 RECID=8 STAMP=943628349

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_12.278.943628353 RECID=9 STAMP=943628352

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_13.279.943628355 RECID=10 STAMP=943628356

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_14.280.943628377 RECID=11 STAMP=943628377

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_2_seq_1.269.943619773 RECID=1 STAMP=943619772

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_2_seq_2.274.943626473 RECID=5 STAMP=943626473

deleted archived log

archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_2_seq_3.275.943626475 RECID=6 STAMP=943626474

Deleted 11 objects

database name is "DEVRAC" and DBID is 680719421

database dropped

Well. That is it :) it is working.. Please feel free to comment.

3 comments :

Data Replication Disaster RecoveryJuly 29, 2018 at 10:33 AM
Nice blog... Data replication disaster recovery is the best way to protect and recover data. Replication is just simply the concept of copying data as it is changing over to another site.
ReplyDelete
Replies
Oracle Disaster Recovery SolutionsDecember 31, 2018 at 4:44 PM
The information shared on the oracle disaster recovery solutions is perfectly described in the blog, which is really appreciative, quality content and useful information. Thanks for sharing.
ReplyDelete
Replies
HemanthApril 12, 2020 at 10:05 AM
How DISK group information updated in ASM of DR site? Do you need to create diskgroup and replicated disks manually ? Please suggest!
ReplyDelete
Replies

Add comment

If you will ask a question, please don't comment here..

For your questions, please create an issue into my forum.

Forum Link: http://ermanarslan.blogspot.com.tr/p/forum.html

Register and create an issue in the related category.
I will support you from there.

Subscribe to: Post Comments ( Atom )