Tuesday, May 23, 2017

RDBMS/Exadata -- Estimating the HCC compression ratio and the gain // using DBMS_COMPRESSION.GET_COMPRESSION_RATIO

If you are considering buying an Exadata and wondering how much storage space you can save with the Exadata-specific compression methods, you should take a look at this one.

You can see the compression ratio and calculate the % gain for a table by executing dbms_compression.get_compression_ratio, specifying COMP_FOR_QUERY_HIGH (or another compression type, such as COMP_FOR_QUERY_LOW) as the compression type.

You can use this method to compute the compression ratio and the gain even if you do not currently have the hardware required to run HCC. (In other words, your database does not need to run on Exadata, nor do its files need to reside on Pillar or ZFS storage hardware.)

Note that the names of the compression type constants change between releases. For example, it is COMP_QUERY_HIGH if the database is 12c.

Anyway, by using this method, you can get an idea of what you can gain in terms of storage savings.

Of course, there are some requirements for using this method;
  • Your database must be at least 11gR2 (I recommend >= 11.2.0.2, because there are bug records for dbms_compression in 11.2.0.1).
  • The RDBMS compatible parameter must be set to at least 11.2.0.
  • The table to be estimated needs to have a minimum of 1 million rows.
The procedure reports the stats for both the uncompressed and the compressed version of the table (it creates those itself in the scratch tablespace that we specify as the "scratchtbsname").

The procedure creates 2 internal tables named DBMS_TABCOMP_TEMP_UNCMP (uncompressed) and DBMS_TABCOMP_TEMP_CMP (compressed). It computes the COMPRESSION_RATIO by comparing these two tables, and it drops these tables immediately after the estimation.

Let's see an example for QUERY HIGH compression->
  • We get a big table such as XLA_DIAG_SOURCES;

SQL> select owner,table_name,num_rows from dba_tables where num_rows>100000 order by num_rows desc;

OWNER                          TABLE_NAME                       NUM_ROWS
------------------------------ ------------------------------ ----------
XLA                            XLA_DIAG_SOURCES                226354744

  • The user that owns the table to be analyzed must have a quota on the scratch tablespace, so we grant a quota to that user. This is to avoid "ORA-01950: no privileges on tablespace 'XXERM'" errors.
SQL>

 ALTER USER XLA QUOTA UNLIMITED ON XXERM;

User altered.    
  • We execute the DBMS_COMPRESSION.GET_COMPRESSION_RATIO.
SQL>

set serveroutput on
declare
 v_scratchtbsname varchar2(32) := 'XXERM';
 v_ownname varchar2(32) := 'XLA';
 v_tabname varchar2(32) := 'XLA_DIAG_SOURCES';
 v_partname varchar2(32) := null;
 v_comptype number(10,0) := DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH;
 v_blkcnt_cmp pls_integer;
 v_blkcnt_uncmp pls_integer;
 v_row_cmp pls_integer;
 v_row_uncmp pls_integer;
 v_cmp_ratio number;
 v_comptype_str varchar2(4000);
begin
 DBMS_COMPRESSION.GET_COMPRESSION_RATIO (
 Scratchtbsname => v_scratchtbsname,
 ownname => v_ownname,
 tabname => v_tabname,
 partname => v_partname,
 comptype => v_comptype,
 blkcnt_cmp => v_blkcnt_cmp,
 blkcnt_uncmp => v_blkcnt_uncmp,
 row_cmp => v_row_cmp,
 row_uncmp => v_row_uncmp,
 cmp_ratio => v_cmp_ratio,
 comptype_str => v_comptype_str
 );
 dbms_output.put_line('Block count if compressed  :' || v_blkcnt_cmp);
 dbms_output.put_line('Block count if uncompressed: ' || v_blkcnt_uncmp);
 dbms_output.put_line('Row count per block if compressed: ' || v_row_cmp);
 dbms_output.put_line('Row count per block if uncompressed : ' || v_row_uncmp);
 dbms_output.put_line('Compression ratio :' || v_cmp_ratio);
 dbms_output.put_line('Gain (%) :' || round((v_blkcnt_uncmp-v_blkcnt_cmp)/v_blkcnt_uncmp*100,2));
 dbms_output.put_line('Method : ' || v_comptype_str);
end;
/

OUTPUT
--------------
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows

Block count if compressed: 353
Block count if uncompressed: 19745
Row count per block if compressed: 2833
Row count per block if uncompressed: 50
Compression ratio: 55.9
Gain (%): 98.21
Method: "Compress For Query High"
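As a sanity check, the ratio and the gain reported above can be reproduced from the two block counts alone. This is a minimal sketch in plain Python (not part of the advisor itself), using the same formula as the PL/SQL block above:

```python
# Block counts reported by Compression Advisor for COMP_FOR_QUERY_HIGH:
blkcnt_uncmp = 19745   # blocks if uncompressed
blkcnt_cmp = 353       # blocks if compressed

# Compression ratio = uncompressed blocks / compressed blocks
cmp_ratio = blkcnt_uncmp / blkcnt_cmp
print(round(cmp_ratio, 1))   # 55.9

# Gain (%) = saved blocks / uncompressed blocks * 100
gain = (blkcnt_uncmp - blkcnt_cmp) / blkcnt_uncmp * 100
print(round(gain, 2))        # 98.21
```

The same arithmetic applied to the QUERY LOW run further down gives its reported 23.6 ratio and 95.77% gain.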

Let's see an example for QUERY LOW compression->

SQL> set serveroutput on
declare
v_scratchtbsname varchar2(32) := 'XXERM';
v_ownname varchar2(32) := 'XLA';
v_tabname varchar2(32) := 'XLA_DIAG_SOURCES';
v_partname varchar2(32) := null;
v_comptype number(10,0) := DBMS_COMPRESSION.COMP_FOR_QUERY_LOW;
v_blkcnt_cmp pls_integer;
v_blkcnt_uncmp pls_integer;
v_row_cmp pls_integer;
v_row_uncmp pls_integer;
v_cmp_ratio number;
v_comptype_str varchar2(4000);
begin
DBMS_COMPRESSION.GET_COMPRESSION_RATIO (
Scratchtbsname => v_scratchtbsname,
ownname => v_ownname,
tabname => v_tabname,
partname => v_partname,
comptype => v_comptype,
blkcnt_cmp => v_blkcnt_cmp,
blkcnt_uncmp => v_blkcnt_uncmp,
row_cmp => v_row_cmp,
row_uncmp => v_row_uncmp,
cmp_ratio => v_cmp_ratio,
comptype_str => v_comptype_str
);
 dbms_output.put_line('Block count if compressed  :' || v_blkcnt_cmp);
 dbms_output.put_line('Block count if uncompressed: ' || v_blkcnt_uncmp);
 dbms_output.put_line('Row count per block if compressed: ' || v_row_cmp);
 dbms_output.put_line('Row count per block if uncompressed : ' || v_row_uncmp);
 dbms_output.put_line('Compression ratio :' || v_cmp_ratio);
 dbms_output.put_line('Gain (%) :' || round((v_blkcnt_uncmp-v_blkcnt_cmp)/v_blkcnt_uncmp*100,2));
 dbms_output.put_line('Method : ' || v_comptype_str);
end;
/

OUTPUT
--------------------
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
Block count if compressed: 836
Block count if uncompressed: 19745
Row count per block if compressed: 1196
Row count per block if uncompressed: 50
Compression ratio: 23.6
Gain (%): 95.77
Method: "Compress For Query Low"

Wednesday, May 17, 2017

RAC ASM -- Disaster Recovery solution (using storage techniques) snapshot/snapmirror, Netapp

This blog post is about an alternative Disaster Recovery solution that I  recently built through a POC (Proof of Concept).
The source system was a 2-node Oracle RAC (Grid + RDBMS, both 11.2.0.4) and the target system, residing in the DR site, was a single-node environment (Grid + RDBMS, both 11.2.0.4).
Both source and target were using ASM for storing all the db files.
This made me think a little bit, because this alternative disaster recovery solution was relying on the NetApp snapshot and snapmirror mechanisms.
We were planning to put the database into backup mode (hot backup mode / alter database begin backup), take the snapshot on the local storage, transfer the snapshot to the DR storage using snapmirror, map the luns created from this snapshot/snapmirror to the target server, end the backup on the target and open the database successfully there.

ASM was my concern, because  I had never done this before.
Actually, I had implemented (even automated) this kind of solution 6 years ago, but the filesystems were ext3 (a cooked filesystem) in that case.

So, I was afraid that ASM header info might get corrupted during the snapshot/snapmirror operations.
However, everything went well.

It is of course not supported by Oracle. However, I can say that ASM diskgroups can be replicated using snapshot/snapmirror operations. (of course, if you know what you are doing..)

I did 3 tests and didn't encounter any problems.

In this blog post, I will give you the things that I did to replicate a 2-node RAC system (Grid/ASM + RDBMS) to a single-node system (Grid/ASM + RDBMS) using NetApp's snapshot and snapmirror functionalities.

Following is a representative picture explaining the work done.
As shown in the picture, we take the snapshot of the ASM disks, replicate it to DR using snapmirror, map the luns to the target server running on the DR site, and make the ASM instance running on the target server see those disks...



The operation requires both Linux admin and storage admin skills. (I don't see a need to mention the required DBA skills :))
It requires Linux admin skills, because we do things like formatting the drives (we are working on new luns, since this is a POC), rescanning the scsi bus and so on.
It requires storage admin skills, because we do things like taking snapshots, initiating snapmirror, mapping luns and so on.
Lastly, it requires DBA skills, because we put the database into backup mode, change some init.ora parameters, make the replicated disks known to ASM, open our database successfully on the target node (recovery may be needed), check the log files in case of an error and so on.

Let's see what I did in this POC ->
Note that this is a technical overview. (It doesn't include every step that I performed or every command that I used.)

RAC to SINGLE STORAGE REPLICATION (an ASM diskgroup built on top of 3 disks)

1. The storage team creates 3 ASM disks (each sized approx. 13 GB) and maps these luns to our Linux RAC nodes (the source nodes).
We do a scsi rescan to make Linux see these luns/disks, and then we create a partition on each of them. (oracleasm createdisk requires a partition)

fdisk output just after the scsi rescan:

Disk /dev/sdbk: 11.8 GB, 11811160064 bytes
64 heads, 32 sectors/track, 11264 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x00000000


Disk /dev/sdbl: 12.9 GB, 12884901888 bytes
64 heads, 32 sectors/track, 12288 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x00000000


Disk /dev/sdbm: 14.0 GB, 13960740864 bytes
64 heads, 32 sectors/track, 13314 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x00000000

Attention: before discovering the disks, we use fdisk to create partitions on them.
fdisk output just after the partition creations:

Disk /dev/sdbk: 11.8 GB, 11811160064 bytes
64 heads, 32 sectors/track, 11264 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x42926436

    Device Boot      Start         End      Blocks   Id  System
/dev/sdbk1               1       11264    11534320   83  Linux

Disk /dev/sdbl: 12.9 GB, 12884901888 bytes
64 heads, 32 sectors/track, 12288 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0xb900c4fd

    Device Boot      Start         End      Blocks   Id  System
/dev/sdbl1               1       12288    12582896   83  Linux

Disk /dev/sdbm: 14.0 GB, 13960740864 bytes
64 heads, 32 sectors/track, 13314 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x65f91faf

    Device Boot      Start         End      Blocks   Id  System
/dev/sdbm1               1       13314    13633520   83  Linux


2. We use oracleasm to create the ASM disks and again using oracleasm, we make ASM see them.

[root@source01 ~]# oracleasm createdisk DEV1 /dev/sdbk1
Writing disk header: done
Instantiating disk: done
[root@source01 ~]# oracleasm createdisk DEV2 /dev/sdbl1
Writing disk header: done
Instantiating disk: done
[root@source01 ~]# oracleasm createdisk DEV3 /dev/sdbm1
Writing disk header: done
Instantiating disk: done

[root@source01 ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
[root@source01 ~]# oracleasm listdisks

DATA1
DATA2
DATA3
DATA4
DATA5
DATA6
DATA7
DATA8
DATA9
DEV1 ** new disk
DEV2 ** new disk
DEV3 ** new disk
FRA1
FRA2
TEST1

3. In order to make these new ASM disks available on source node 2 (remember, this is RAC), we rescan the scsi bus and run oracleasm scandisks on source node 2 as well..

[root@source02 ~]#  echo "- - -" > /sys/class/scsi_host/host0/scan
[root@source02 ~]#  echo "- - -" > /sys/class/scsi_host/host1/scan
[root@source02 ~]#  echo "- - -" > /sys/class/scsi_host/host2/scan
[root@source02 ~]#  echo "- - -" > /sys/class/scsi_host/host3/scan
[root@source02 ~]#  echo "- - -" > /sys/class/scsi_host/host4/scan
[root@source02 ~]# oracleasm  scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "DEV2"
Instantiating disk "DEV1"
Instantiating disk "DEV3"
[root@source02 ~]# oracleasm listdisks
DATA1
DATA2
DATA3
DATA4
DATA5
DATA6
DATA7
DATA8
DATA9
DEV1  ** new disk
DEV2  ** new disk
DEV3  ** new disk
FRA1
FRA2
TEST1

4. After these new ASM disks become available on both of the RAC nodes, we create a diskgroup called +DEV on these 3 new disks using ASMCA. After that, we create a 2-node RAC database called DEVRAC and configure it so that all of its files (controlfiles, redologs, datafiles etc..) are placed in the +DEV diskgroup.

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 13:39:43 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select instance_name from gv$instance;

INSTANCE_NAME
----------------
DEVRAC1
DEVRAC2

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DEV/devrac/system01.dbf
+DEV/devrac/sysaux01.dbf
+DEV/devrac/undotbs01.dbf
+DEV/devrac/users01.dbf
+DEV/devrac/undotbs02.dbf

SQL> select name from v$controlfile;

NAME
--------------------------------------------------------------------------------
+DEV/devrac/control01.ctl
+DEV/devrac/control02.ctl

SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
+DEV/devrac/redo02.log
+DEV/devrac/redo01.log
+DEV/devrac/redo03.log
+DEV/devrac/redo04.log


5. At this point, in order to make the necessary checks in the target environment, we create a table called ERMAN in the source. This table resides in a tablespace called ERMAN (all the datafiles of this tablespace reside in the +DEV diskgroup).
So, we first create a tablespace called ERMAN, creating all its datafiles on the +DEV diskgroup. Then, we create a table named ERMAN in the tablespace called ERMAN. (lots of ERMANs.., I know :)

SQL> CREATE TABLESPACE ERMAN DATAFILE '+DEV';

Tablespace created.

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DEV/devrac/system01.dbf
+DEV/devrac/sysaux01.dbf
+DEV/devrac/undotbs01.dbf
+DEV/devrac/users01.dbf
+DEV/devrac/undotbs02.dbf
+DEV/devrac/datafile/erman.270.943623731


SQL> create table ERMAN tablespace ERMAN as select  * from dba_objectS;

Table created.

SQL> insert into ERMAN select * from ERMAN;

86293 rows created.

SQL> commit;

Commit complete.

SQL> insert into ERMAN select * from ERMAN;

172586 rows created.

6. We put our database (DEVRAC) into backup mode (hot backup mode) and, while it is in backup mode, we create a storage snapshot and then do a snapmirror update. After the snapmirror update, we create the luns on the target storage system (using the snapmirrored data) and map these luns to the target database node. After the lun mapping, we rescan the scsi bus on the target node (Linux) and make the ASM instance running there see these new luns (using oracleasm).

[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host0/scan
[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host1/scan
[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host2/scan
[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host3/scan
[root@targetsinglenode ~]# echo "- - -" > /sys/class/scsi_host/host4/scan

Disk /dev/sdn: 11.8 GB, 11811160064 bytes
64 heads, 32 sectors/track, 11264 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x42926436

   Device Boot      Start         End      Blocks   Id  System
/dev/sdn1               1       11264    11534320   83  Linux
Disk /dev/sdo: 12.9 GB, 12884901888 bytes
64 heads, 32 sectors/track, 12288 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0xb900c4fd

   Device Boot      Start         End      Blocks   Id  System
/dev/sdo1               1       12288    12582896   83  Linux

Disk /dev/sdp: 14.0 GB, 13960740864 bytes
64 heads, 32 sectors/track, 13314 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x65f91faf

   Device Boot      Start         End      Blocks   Id  System
/dev/sdp1               1       13314    13633520   83  Linux

[root@targetsinglenode ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "DEV1"
Instantiating disk "DEV2"
Instantiating disk "DEV3" 

7. After making the snapmirrored disks available on the target host, we modify the init.ora that we copied from the source. We modify it to exclude the RAC-related parameters, because the target is a single node.
After modifying the init.ora, we mount the +DEV diskgroup in the target node's ASM, and we start up the DEVRAC database as a single-node database on our target node.
Note that, after opening the database, we check the table named ERMAN in the target instance to ensure that it was replicated successfully.

[oracle@targetsinglenode ~]$ sqlplus "/as sysasm"

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 14:26:35 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option

SQL> alter diskgroup DEV mount;


SQL> startup mount;
ORACLE instance started.

Total System Global Area 2355015680 bytes
Fixed Size                  2255472 bytes
Variable Size             587203984 bytes
Database Buffers         1761607680 bytes
Redo Buffers                3948544 bytes
Database mounted.
SQL> alter database end backup;

Database altered.

SQL> alter database open;

Database altered.

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DEV/devrac/system01.dbf
+DEV/devrac/sysaux01.dbf
+DEV/devrac/undotbs01.dbf
+DEV/devrac/users01.dbf
+DEV/devrac/undotbs02.dbf
+DEV/devrac/datafile/erman.270.943623731

6 rows selected.

SQL> select count(*) from ERMAN;

  COUNT(*)
----------
    172586

SQL> alter system checkpoint;

Note: After opening the database, we monitor the alert logs and check the related log files for both the RDBMS and GRID, looking for any errors recorded in them. In conclusion, we don't see any errors.

8. Just in case, we use the replicated database a little bit. We do some read/write tests by inserting some data using a loop, switching the log files and so on.. During these tests, we do not encounter any abnormalities.

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 14:57:11 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> create table ERMAN2 as select * from ERMAN;

Table created.

SQL> BEGIN
  2      FOR x IN 1 .. 10 LOOP
  3           INSERT INTO ERMAN2
  4           SELECT * from ERMAN;
  5      END LOOP;
  6  END;
  7  /

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> commit;

Commit complete.

SQL> archive log all;
ORA-00271: there are no logs that need archiving
SQL> alter system checkpoint;

System altered.

SQL> alter system switch logfile;

System altered.

9. Actually, at this point the POC is successfully finished, but I want to give you some bonus info about the deletion of the databases and the other things that we created during this POC.
Basically, we drop the databases using rman, we drop the diskgroups using ASMCA, and we clear the freed candidate disks using oracleasm (deletedisk).
You can see the "rman drop database" method in the following example. The method that I give here is for a single-node database, but it can be used in RAC environments as well. (Note: in order to drop a RAC database using rman, the parameter named cluster_database must be set to "false" before executing the drop database command.)

[oracle@targetsinglenode trace]$ sqlplus "/as sysdba"
SQL*Plus: Release 11.2.0.4.0 Production on Wed May 10 15:01:06 2017
Copyright (c) 1982, 2013, Oracle.  All rights reserved. 

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> shu immediaTE;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL> startup mount exclusive;
ORACLE instance started.

Total System Global Area 2355015680 bytes
Fixed Size                  2255472 bytes
Variable Size             587203984 bytes
Database Buffers         1761607680 bytes
Redo Buffers                3948544 bytes
Database mounted.
SQL> alter system enable restricted session;

System altered.

[oracle@targetsinglenode trace]$ echo $ORACLE_SID
DEVRAC
[oracle@targetsinglenode trace]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Wed May 10 15:04:12 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DEVRAC (DBID=680719421, not open)

RMAN> drop database including backups;

database name is "DEVRAC" and DBID is 680719421

Do you really want to drop all backups and the database (enter YES or NO)? YES

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=25 device type=DISK
specification does not match any backup in the repository

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=25 device type=DISK
specification does not match any datafile copy in the repository
specification does not match any control file copy in the repository
specification does not match any control file copy in the repository
List of Archived Log Copies for database with db_unique_name DEVRAC
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
2       1    7       A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_7.271.943623819

3       1    8       A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_8.272.943623843

4       1    9       A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_9.273.943626473

7       1    10      A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_10.276.943628345

8       1    11      A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_11.277.943628349

9       1    12      A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_12.278.943628353

10      1    13      A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_13.279.943628355

11      1    14      A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_1_seq_14.280.943628377

1       2    1       A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_2_seq_1.269.943619773

5       2    2       A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_2_seq_2.274.943626473

6       2    3       A 10-MAY-17
        Name: +DEV/devrac/archivelog/2017_05_10/thread_2_seq_3.275.943626475

deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_7.271.943623819 RECID=2 STAMP=943623818
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_8.272.943623843 RECID=3 STAMP=943623842
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_9.273.943626473 RECID=4 STAMP=943626472
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_10.276.943628345 RECID=7 STAMP=943628345
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_11.277.943628349 RECID=8 STAMP=943628349
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_12.278.943628353 RECID=9 STAMP=943628352
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_13.279.943628355 RECID=10 STAMP=943628356
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_1_seq_14.280.943628377 RECID=11 STAMP=943628377
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_2_seq_1.269.943619773 RECID=1 STAMP=943619772
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_2_seq_2.274.943626473 RECID=5 STAMP=943626473
deleted archived log
archived log file name=+DEV/devrac/archivelog/2017_05_10/thread_2_seq_3.275.943626475 RECID=6 STAMP=943626474
Deleted 11 objects

database name is "DEVRAC" and DBID is 680719421
database dropped

Well, that is it :) It is working.. Please feel free to comment.

Friday, May 5, 2017

Fractured Blocks- ORA-01578 & tail checks

Recently, I dealt with an ORA-01578 error, which can be an indication of a fractured block..
Because of this error, the database could not be opened.
The complete Oracle error stack recorded in the trace file was like the following;

Corrupt block relative dba: 0x0101342a (file 4, block 78890)
.....
........
.............
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 4, block # 78890)
ORA-01110: data file 4: '/ortest/testdata/system04.dbf'

So, without any further diagnostics, it was obvious that a block in system04.dbf was corrupted.
However, as the error was ORA-01578, this made me think that the problem was due to a physical corruption, which is the result of an OS or hardware IO error..
In addition, this physical corruption seemed to be caused by a fractured block.

I like to explain the fractured block concept every time it is asked about, because I find it interesting.. It is closely related to the other layers of the stack below Oracle (OS, storage and so on).

Oracle IO is done through the OS, and the size of a particular IO depends on the OS block size, which is typically 512 bytes on Linux.

So, if we have an Oracle database with an 8K block size and we modify one of its blocks (Oracle does its modifications at the block level), then we actually do multiple IOs, when we look at this from the OS perspective.

Almost all the backup-consistency-related solutions are built with this in mind. For example, hot backup mode (backup mode) is designed to guarantee consistency, i.e. to guarantee recovery even though there may be fractured blocks in the backed-up files.

So, a fractured block means that the database block is incomplete/physically corrupted.

Suppose we are doing a DB block IO. It is 8K, right? So, considering our OS block size is 512 bytes, a single DB block IO means multiple OS IOs (16 of them..).

Suppose the OS does 15 of those IOs, encounters a problem during the 16th and cannot complete it..

Now suppose Oracle cannot even catch it, or does not get the return code, or cannot handle it.. (or the OS says it is ok, although it is not..)

At this moment, our block is physically corrupt and inconsistent, since it has one part coming from the future and one part coming from the past..
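To make this scenario concrete, here is a minimal Python sketch (an illustration only; the sizes and the failure point are the ones assumed above): an 8K block is written as sixteen 512-byte sector IOs, the 16th write is lost, and the resulting on-disk image mixes new and old data:

```python
SECTOR = 512                     # OS IO / sector size
DB_BLOCK = 8192                  # 8K Oracle block size
N = DB_BLOCK // SECTOR           # 16 OS IOs per DB block write
print(N)                         # 16

old_image = b"\x01" * DB_BLOCK   # block version already on disk
new_image = b"\x02" * DB_BLOCK   # block version being written

# The write is interrupted: only 15 of the 16 sector writes complete.
completed = 15
on_disk = new_image[:completed * SECTOR] + old_image[completed * SECTOR:]

# Head of the block comes from the new write, tail is still the old image
# -> the block is fractured.
print(on_disk[0] == on_disk[-1])  # False
```

This head/tail mismatch is exactly what Oracle's tail check, described next, is designed to detect.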

On the other hand, Oracle knows about this danger and has mechanisms to check for it.

In order to check this, Oracle maintains a field at the end of each block. The size of this field is 4 bytes.

Oracle writes a number into these 4 bytes, which is used for checking the consistency of the tail of the block on subsequent reads and updates.

That is, Oracle updates this 4-byte field, located at the tail of each block, before writing the block down to disk completely. (before handing the block to the OS for IO)

The value stored in this 4-byte field is called the tailchk. It is a number built from information stored in the block itself (which Oracle reads from the header and remembers).

If the tailchk is aligned with the related data stored in the block header, Oracle knows that at least the tail of the block is consistent with the head.

The tailchk value is computed using the last 4 digits of the SCN base, the block type and the SCN sequence number.
Additional info can be found in Fractured Block explanation (Doc ID 1392417.1).

Note: The tailchk only verifies that the header and the tail of the block were written correctly. Complete (whole-block) verification can be achieved using DB_BLOCK_CHECKSUM.

So, in our case, as seen below, the block type = 0, the last 4 digits of the SCN are 0001 and the seq is 0x1.
So, our tailchk value should be -> 0x00010001; however, it is 0x19b90601, as you see below.
(--the block type of 0 is wrong in the first place, but I still wanted to show you how we calculate the tailchk--)
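Based on the composition described above (last 4 hex digits of the SCN base, then the block type, then the SCN sequence), the expected tailchk can be sketched in a few lines of Python. The exact bit layout here is my reading of the dump fields, so treat it as illustrative:

```python
def expected_tailchk(scn_base: int, block_type: int, seq: int) -> int:
    # last 2 bytes (4 hex digits) of the SCN base, then the type byte, then the seq byte
    return ((scn_base & 0xFFFF) << 16) | ((block_type & 0xFF) << 8) | (seq & 0xFF)

# Fields from the corrupt-block dump in this post:
#   last change scn: 0x056e.ea020001  -> SCN base ends in 0001
#   type: 0, seq: 0x1
print(f"0x{expected_tailchk(0xEA020001, 0x00, 0x01):08x}")   # 0x00010001

# The value actually found in the tail was 0x19b90601, not 0x00010001
# -> the tail is inconsistent with the head.
```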

Corrupt block relative dba: 0x0101342a (file 4, block 78890)
Bad header found during buffer read
Data in bad block:
 type: 0 format: 3 rdba: 0x0000000c
 last change scn: 0x056e.ea020001 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x19b90601
 check value in block header: 0x747
 computed block checksum: 0x8577
Reread of rdba: 0x0101342a (file 4, block 78890) found same corrupted data
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 4, block # 78890)
ORA-01110: data file 4: '/ortest/testdata/system04.dbf'

Also, the computed block checksum confirms that this block is corrupt, as it is different from 0x0.
Actually, the block type is 0 here. Type 0 is an unknown type, so this block is really a corrupt one; it is not only the computed block checksum that proves it. (The computed block checksum is actually meaningless here, because the type is 0.)
In addition, the content of the block actually belongs to a different block: the rdba value stored in the block is different from the requested dba value.
So this block is both corrupted and misplaced.

The methods that Oracle suggests for fixing these kinds of problems can be seen below;

Note 144911.1 RMAN : Block-Level Media Recovery - Concept & Example
Note 342972.1 How to perform Block Media Recovery (BMR) when backups are not taken by RMAN
ORA-1578 Methods to Skip Block Corruption (Doc ID 2199133.1)

The first 2 methods above rely on backups.. So we need to have RMAN or non-RMAN backups in order to fix these kinds of problems using them.

However, what if we don't have backups?

Well, there are 4 things that I can say about this ->

1) We can recreate the index, if the fractured block is an index block.

2) We can use methods for skipping block corruption (documented in Doc ID 2199133.1).

3) We can use the "bbed" utility to update the consistency value in the tail and the block checksum properly, in order to at least start our database (using bbed at our own risk). This method is a very advanced one.

4)There are cases for specific scenarios.. For example this one -> Error ORA-1578 reported on the database after changing compatible.rdbms >= 11 and resizing a datafile (Doc ID 965751.1)Taking a look to Oracle Support is always a good idea.
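As a sketch for option 2, the skip-corruption approach in 2199133.1 is based on the DBMS_REPAIR package: it marks the segment so that full scans skip software-corrupt blocks. (The schema and table names below are just placeholders; we also lose the data in the skipped blocks, so this is a last resort.)

```sql
-- Let full table scans skip corrupt blocks in this segment
BEGIN
  dbms_repair.skip_corrupt_blocks(
    schema_name => 'ERMAN',                        -- placeholder schema
    object_name => 'PROBLEM_TABLE',                -- placeholder table
    object_type => dbms_repair.table_object,
    flags       => dbms_repair.skip_flag);
END;
/
```

After this, the table can be queried (and its surviving rows exported) without hitting ORA-01578 on every scan.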

Tuesday, May 2, 2017

EBS -- OAF Page is not updatable - no errors received // caused by Accessibility features settings

Here is a strange issue encountered in an EBS 12.2 environment (EBS 12.2.4).
It was related to a specific OAF page (aka Self Service page) named GL_ASF_ALC_SUMMARY, which is shown in the following screenshot.


The problem was weird.. After rendering, the page just didn't respond to any user input.
We clicked the buttons, tried to use the LOVs and so on... We even tried different browsers and different clients, but couldn't make it work.
There were no errors in the OAF logs, no errors in the HTTP trace tools, and no warnings in our browsers.

In addition, the URL produced for this page seemed to be okay..
There were no personalizations for this page, and the classes and everything else were there in the filesystem, all looking good.

The url:
http://ermansrv:8000/OA_HTML/OA.jsp?page=/oracle/apps/gl/asflow/config/webui/GL_ASF_CONFIG_ALC_SUMMARY_PG&_ri=101&objectId=2070&objectType=PRIMARY&_ti=1375218639&retainAM=Y&addBreadCrumb=Y&OASF=GL_ASF_ALC_SUMMARY&oapc=27&oas=xMIKbnyHRvklsGwM8W3WcA..
However, the page just didn't work..
Okay.. I will keep this short..
Guess what the cause was..

"The accessibility features"...

The users encountering this problem had "Screen Reader Optimized" set in their user preferences (as the Accessibility Features setting). In other words, the profile option named "Self Service Accessibility Features" was set to Screen Reader Optimized for these users.

Note: when we set a preference, EBS actually updates the related profile option for our user (at the user level).

I said I would keep it short, but I just can't give the solution directly without giving some extra info :)

The Accessibility Features preference is used to better support the use of assistive technologies in EBS, such as screen readers for blind users.

When "Screen Reader Optimized" mode is set, EBS builds the pages in a way that better supports screen readers.
Here are the definitions of the values that can be set for the Accessibility Features preference:

* Standard Accessibility - pages are accessible to users using assistive technologies.
* Screen Reader Optimized - pages are optimized for screen readers. This may degrade the output for a sighted user.
* None - pages include behaviors that are not accessible to users using assistive technologies.

A screen reader is software that lets a blind or visually impaired person use the computer screen. Screen readers usually read the text displayed on the screen and convert it into a form those users can understand.. tactile, auditory, or a combination of both..

I have tested this screen reader thing using ChromeVox, which is an add-on for the Chrome browser, and I found it interesting, as it actually understands mouse moves, interprets the screen, and reads the contents accordingly.

Okay after giving this extra info, let's see what the solution was:

Solution:

Query the system profile "Self Service Accessibility Features".
Update the value to None (or Null) at the User level for the user that is getting this error.
Save and retest.
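To find which users have this preference set without clicking through the screens, we can also query the profile value at the user level directly. (A sketch; I'm assuming the internal name of the "Self Service Accessibility Features" profile option is ICX_ACCESSIBILITY_FEATURES, so verify it in your environment first.)

```sql
-- List users who have the accessibility profile set at the user level
SELECT u.user_name, v.profile_option_value
FROM   fnd_profile_options o,
       fnd_profile_option_values v,
       fnd_user u
WHERE  o.profile_option_name = 'ICX_ACCESSIBILITY_FEATURES'
AND    v.profile_option_id   = o.profile_option_id
AND    v.level_id            = 10004        -- 10004 = user level
AND    v.level_value         = u.user_id;
```

Any user returned here with the Screen Reader Optimized value is a candidate for the workaround above.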

Alternative solution:

Go to Preferences -> Accessibility Features and set it to None
Apply the changes
Retest the issue
One more thing.. You may ask why this error was encountered at all.
This question is natural, because Screen Reader Optimized mode should not normally break anything in EBS.
However, in the real world it does. It is a bug of course, and the solutions above are just workarounds.
I believe it is a bug in the failing page (in this case, GL_ASF_ALC_SUMMARY).
So I think the module should be upgraded to the latest code level in order to get the fix. (In this case, GL should be upgraded.)

Anyways, I'm giving you the related MOS documents here, and I promise I will revisit this post if I find something more about it.

Until then, you have the workarounds :)
  • Checklist for Assistive Technologies Issues with Oracle E-Business Suite (Doc ID 1100934.1)
  • http://www.oracle.com/us/corporate/accessibility/ebs-accessibility-white-paper-163185.pdf (E-Business Suite Accessibility , an Oracle Whitepaper)