Friday, April 29, 2016

RDBMS/ASM, separation of roles, interpretation of the Linux permissions

If we want to implement separation of roles for Oracle on Linux, which means having two separate OS users for Grid and RDBMS, then the Linux permissions, file owners and group owners of the ASM disk devices, as well as of the oracle binaries in both the Grid Home and the Oracle Home, become crucial.
Let's examine this from the Linux perspective by going through a working example, where the ASM device files are owned by grid:asmadmin, the oracle binary in the Grid Home is owned by grid:oinstall, and the oracle binary in the Oracle Database Home is owned by oracle:asmadmin.

In this working example, the permissions of the oracle binary in RDBMS_HOME/bin are as follows:

permissions  Links     owner     group        file
-rwsr-s--x       1           oracle    asmadmin   oracle

What do the columns mean?

Permission column: the permissions of the file and the type of the file (directory, regular file, etc.)
Links column: the number of links to the file.
Owner: the owner of the file.
Group: the group owner of the file.
File: the file name.

What does that actually mean?

The oracle binary file is owned by an OS user named oracle, and the group owner of the oracle binary file is asmadmin. The link count is 1, meaning the only link to the file is its own name.
When we look at the permissions, which are the most crucial part, they say: any OS user can execute this binary.
The users in the asmadmin group can read and execute the oracle binary, and the user oracle can read, write and execute it.
Because the setuid and setgid bits are set (the "s" characters in the permission string), when the oracle file is executed by any OS user who has permission to execute it (and here anyone can execute it), it runs with the privileges of the oracle user and of the asmadmin group.
In other words, what this tells the operating system is: when the oracle binary is executed, run it as if it were executed by the user oracle and by the group asmadmin.
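
A quick way to inspect these special bits from the shell is shown below (the octal mode is the point here; the file size and date are just examples):

[oracle@demoorcl bin]$ ls -l oracle
-rwsr-s--x 1 oracle asmadmin 232891110 May 19  2015 oracle
[oracle@demoorcl bin]$ stat -c "%a %U %G %n" oracle
6751 oracle asmadmin oracle

6751 means setuid (4) + setgid (2) in the leading digit, then rwx for the owner, r-x for the group and --x for others, which is exactly the -rwsr-s--x string above. That is also why, if these bits are ever lost (for example after a manual copy of the Oracle Home), chmod 6751 on the oracle binary is the usual way to restore them.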

The file permissions of the oracle binary located in the GRID HOME are the same as the permissions of the oracle binary in the RDBMS_HOME, but its owner and group are different (grid:oinstall):

permissions  Links     owner     group        file
-rwsr-s--x        1           grid      oinstall  /u01/app/11.2.0.4/grid/bin/oracle

So this basically means that everyone can execute the oracle binary stored in the GRID HOME, and when this binary is executed, it runs as if it were executed by the grid user and by the oinstall group (in terms of permissions).

The file permissions of the special device files that are used to reach the ASM disks are:

[oracle@daroravmsrv1 ~]$ ls -al /dev/mapper/HDD_E0_S*
permissions  Links     owner     group        file
brw-rw----      1             grid   asmadmin /dev/mapper/HDD_E0_S00_1461220864

"b" means, it is a block device which can be accessed randomly.
So, the disks are owned by grid:asmadmin and in order to able to read or write them, one should have grid user or asadmin user privileges or let's say, only the grid user or the users in asmadmin directory can read from or write to these files.
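
On many Linux systems this ownership is made persistent with a udev rule; the following is only a hedged sketch (the device name pattern, owner, group and mode are taken from the example above, while the rule file name is an arbitrary choice), not this system's actual configuration:

# /etc/udev/rules.d/99-oracle-asmdevices.rules  (hypothetical file name)
KERNEL=="dm-*", ENV{DM_NAME}=="HDD_E0_S*", OWNER="grid", GROUP="asmadmin", MODE="0660"

After adding or changing such a rule, "udevadm trigger" (or a reboot) re-applies the ownership and mode to the devices.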

What does this configuration provide us?
With this configuration, we have different users (oracle and grid) that can access the ASM disks. That is, when we create a file from an Oracle Database running out of the Oracle Home, the database can access the ASM devices and create the file there. On the other hand, if we try to do Grid administration work using the owner of the Oracle Database Oracle Home, we will not be able to.
So, because asmadmin is the group owner of the oracle binary located in the Oracle Home, and because the setuid and setgid bits are set on that binary, our database, or anything else that uses the oracle binary stored in the Oracle Home, can reach the ASM Disk Groups without any problems (it can read from and write to the ASM disk devices).
The oracle binary in the Grid Home, which is owned by grid, can already access the ASM Disk Groups, as the owner of the ASM disks is the grid user.
So separation of duties is in place without affecting the Oracle database's mandatory ASM work.

We may ask why we use the setuid and setgid bits in general; we see them in standard single node, non-ASM Oracle installations as well.
The answer, again, is that they are required for separation of duties.
Suppose we want the OS user erman to be able to make "sqlplus / as sysdba" local connections in an environment where the RDBMS owner is the OS user named oracle. In such a case, the OS user erman should be able to act like the OS user oracle, and this is accomplished by the setuid and setgid bits in the permissions of the $ORACLE_HOME/bin/oracle file.

Without the setuid and setgid bits, we end up with the following (although erman is in the dba group):

[root@demoorcl ~]# su - erman
[erman@demoorcl ~]$ id
uid=54323(erman) gid=54322(dba) groups=54322(dba),54321(oinstall)
[erman@demoorcl ~]$ sqlplus "/as sysdba"
SQL*Plus: Release 11.2.0.3.0 Production on Tue Apr 26 15:12:32 2016

Copyright (c) 1982, 2011, Oracle. All rights reserved
ERROR:
ORA-12547: TNS:lost contact

A workaround for this is to make a TNS listener based sysdba connection, like:

sqlplus sys@PROD as sysdba ... This way, the listener will bind us to a LOCAL=NO shadow process and we will not need the setuid/setgid based permissions. (In the case of a local connection, we need the permission to execute the oracle binary, because the oracle binary is executed by our own process.)
Almost the same effect appears when we change the permissions of the oracle binary to "700". Local connections from other users will again fail (this time with ORA-12546, as in the demo below), but the configuration will actually be more secure, as no one except the oracle OS user will be able to execute the oracle binary, and if someone in the dba group wants to connect as sysdba, he or she has to provide the password.

Here is a demo:

[oracle@demoorcl bin]$ ls -al oracle
-rwx------ 1 oracle oinstall 232891110 May 19  2015 oracle

[oracle@demoorcl bin]$ sqlplus "/as sysdba"
SQL*Plus: Release 11.2.0.3.0 Production on Fri Apr 29 19:37:11 2016
Copyright (c) 1982, 2011, Oracle.  All rights reserved
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> exit

[oracle@demoorcl bin]$ exit
logout
[root@demoorcl ~]# su - erman
[erman@demoorcl ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 11.2.0.3.0 Production on Fri Apr 29 19:37:21 2016
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
ERROR:
ORA-12546: TNS:permission denied

[erman@demoorcl ~]$ sqlplus sys/manager@CLONE as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Fri Apr 29 19:37:33 2016
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> 


Well, this post became a little unstructured, but it still gives lots of valuable info. I hope you will benefit from it.

Saturday, April 23, 2016

RDBMS -- standby controlfile and the db_file_name_convert, log_file_name_convert parameters

The db_file_name_convert and log_file_name_convert parameters are widely used in standby environments. Their main purpose is to convert the paths of the datafiles and redo log files that are created on the primary during managed recovery, so that they can be created on the standby database (when standby_file_management is set to AUTO).

In this post, I will mention a different use of these two parameters.
As you may already know, when we recreate the standby controlfile, the paths in the newly created standby controlfile reflect the datafile and redo log paths of the primary database. If the primary and standby file paths are the same, there is no issue, but when they are different we need to do something extra.
There are 3 things that we can do for this;

1)Set the filename conversion initialization parameters. (db_file_name_convert, log_file_name_convert parameters)
2)Rename the files manually using ALTER DATABASE statements.
3)Use a combination of conversion parameters and manual renames.(rename + db_file_name_convert, log_file_name_convert parameters)

So, in this post, I will cover the first method.

Here are the quick descriptions;

DB_FILE_NAME_CONVERT: Converts primary database datafile filenames to standby datafile filenames, for example from tbs_* to standbytbs_*.

LOG_FILE_NAME_CONVERT: Converts primary database redo log filenames to standby database redo log filenames, for example from log_* to standbylog_*.

STANDBY_FILE_MANAGEMENT: When set to AUTO, this parameter automates the creation and deletion of datafiles on the standby site, using the same filenames as the primary site. (If we want different file paths, we use the db_file_name_convert and log_file_name_convert parameters.) This parameter is actually not needed for db_file_name_convert and log_file_name_convert to convert the current filenames of the primary on the standby, but it is needed when a file is added or dropped later on.
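
As a reference, setting these parameters on the standby takes just a few ALTER SYSTEM commands. The sketch below uses made-up primary/standby paths and assumes an spfile is in use; db_file_name_convert and log_file_name_convert are static parameters, so a restart is needed for them to take effect:

SQL> alter system set db_file_name_convert='/u01/oradata/PROD/','/u02/oradata/STBY/' scope=spfile;
SQL> alter system set log_file_name_convert='/u01/oradata/PROD/','/u02/oradata/STBY/' scope=spfile;
SQL> alter system set standby_file_management=AUTO scope=both;
SQL> shutdown immediate
SQL> startup mount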

Making Oracle use the correct file paths in the standby environment can easily be done with the db_file_name_convert and log_file_name_convert parameters.

That is, when db_file_name_convert and log_file_name_convert are in place, Oracle updates the paths stored in the standby controlfile accordingly, so there is no need for renames or switch-to-copy-like RMAN operations. In addition, Oracle creates or deletes files in the standby database according to these parameters when they are created or deleted in the primary.

As far as I can see, what happens is this: during startup, when Oracle understands that the controlfile is a standby controlfile, it checks whether db_file_name_convert and log_file_name_convert are set, and if they are, it updates the paths it will use for accessing the datafiles and redo log files accordingly until the next startup. So, as long as the parameters are in place, Oracle does this on every startup, and there is no need to rename the files or use switch-to-copy-like RMAN operations.

These parameters are honoured only when the database is a standby database. So if you start to think, "okay, then I can do my migrations like this: I can move the datafiles to different paths and then, without touching the controlfile, open my database using db_file_name_convert and log_file_name_convert", well, that's not true. These parameters are only honoured on standby databases.

Also, with this in mind, as for switchover operations, it seems that Oracle updates the standby controlfile with the paths produced according to db_file_name_convert and log_file_name_convert, then converts the standby controlfile into a normal controlfile and opens the database. (Note that a standby controlfile and a normal controlfile have no big difference; only a flag changes.)

Note:
For environments where the primary and standby databases reside on ASM (managed with Oracle Managed Files), db_file_name_convert and log_file_name_convert cannot be used for these kinds of operations. (They can still be used for files added or dropped in the future, but they cannot be used for renaming the current file paths on the standby site after recreating the standby controlfile.)
This is because the filenames themselves (not only the paths) are different, due to the suffixes added by Oracle. So, with ASM and Oracle Managed Files, this can be accomplished by following the document "Step By Step Guide On How To Recreate Standby Control File When Datafiles Are On ASM And Using Oracle Managed Files (Doc ID 734862.1)".

EBS 11i/R12/12.2 -- using a non-default block size for a tablespace

In this blog post, I will share some recommendations and things to consider about using a non-default block size for a tablespace in Oracle EBS databases.

As you may already know, the block size of an EBS database is 8K and this is mandatory. However, as Oracle lets us have tablespaces with different block sizes in a database, we can have tablespaces with bigger block sizes in our EBS databases, even though the EBS database block size is set to 8K.

It is mandatory to have an 8K block size, in other words to have the db_block_size init.ora parameter set to 8192 in EBS databases. This means the standard tablespaces that come with EBS must remain at 8K, but it also means we can use a non-default tablespace block size for our custom application schemas, as non-default block sized tablespaces are supported by Oracle.
In the early days, the ability to have non-default block sized tablespaces may have been invented for transportable tablespaces, but as new releases came out (11g and 12c), this ability has evolved and it is now supported for general use.

One use case: when there is a need for a new schema in the database, and this new schema will be used by ETL processes or BI-like reports, we can create a 32K block sized tablespace and put the DB objects of this schema into it, to take advantage of the following documented points:

Larger Oracle block sizes (big tablespace block sizes or db_block_size) typically give fewer index levels and hence improved index access times to data. A single I/O will fetch many related rows, and subsequent requests for the next rows will already be in the data buffer. This is one of the major benefits of a larger block size. Another benefit is that it will decrease the number of block splits.
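
Creating such a 32K tablespace is straightforward, but a separate buffer cache for the non-default block size must exist before the CREATE TABLESPACE runs. The sketch below is only an illustration; the cache size, tablespace name and datafile path are assumptions, so size them according to your own environment:

SQL> alter system set db_32k_cache_size=512M scope=both;
SQL> create tablespace xx_custom_etl_32k datafile '/u01/oradata/PROD/xx_custom_etl_32k01.dbf' size 10G blocksize 32k;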

However, no one, including Oracle, can guarantee whether having a 32K tablespace for a custom schema in an EBS database is a good thing or a bad thing, as it depends on the load and profile of the environment.

For the standard EBS applications it is guaranteed by Oracle, as they made the 8K block size mandatory. On the other hand, for custom applications/schemas, having a non-default tablespace block size may produce one of 3 results: it may increase performance, it may decrease performance, or its effect may not be felt at all (almost no change in performance).

As a rule of thumb, there are 3 points:

Use db_block_size = 8192 for most transactional processing systems.
Use a db_block_size larger than 8KB for OLTP systems in which your data structures are large (to avoid chained and migrated rows).
Use a db_block_size larger than 8KB for systems in which undo generation is not a meaningful part of the workload (data warehouses).

So, should we use non-default block sizes for custom tablespaces that are used by reporting, BI and ETL applications?
Well, the answer is: it depends, and if we treat this question as an issue, then the solution is testing. It is supported, but in order to decide whether it will be good or bad in terms of performance, the only thing that can be done is to test it.

Friday, April 22, 2016

EBS R12/12.2- HR ORG CHART transaction context is lost , Bug 19955094, Workaround: Use virtual hosts

I recently encountered this problem in a new 12.2.5 customer environment.
The first page of HR Org Chart was displayed properly, but when we clicked the Talent Profile or Manager Actions buttons/links, a web page displayed the "Transaction Context is lost" error.


At this site, the EBS application tier and HR Org Chart were deployed on the same server, and the issue was actually an expected one, as it was already documented in "Transaction Context Is Lost when click on Org Chart from Talent Profile (Doc ID 1962008.1)".

The document clearly says something like:

The issue is that, from the perspective of the browser, the server test.local is the same "server" regardless of the two instances listening on different ports on the same host. To fix the issue, the standalone WLS server on which the Org Chart is deployed needs to be moved to a different host. In other words, as the solution, the WebLogic Server on which Org Chart is deployed needs to be on a host different from the EBS instance host.
What did we do?

As it was not feasible to install a new WebLogic on a new server and deploy HR Org Chart once again:

We created a virtual host (a different hostname) on the HR Org Chart WebLogic server and added it to the managed server (or admin server) that was running the HR Org Chart application.
We also added a DNS entry for this new virtual hostname and updated the related EBS profiles to redirect users to this new virtual hostname for accessing the HR Org Chart page.
These actions made the managed server running the HR Org Chart application listen on the new virtual hostname as well.
As we configured the EBS profiles with this new virtual hostname (different from the EBS application server's hostname), the issue disappeared. Note that it took only 5 minutes to activate this new virtual hostname.
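
To make the idea concrete: the whole point of the virtual hostname is that the browser now sees two different "servers", even though both names resolve to the same machine. The hostnames and IP address below are purely hypothetical:

10.10.10.20   ebsapps.example.com      ebsapps       --> existing EBS application tier hostname
10.10.10.20   hrorgchart.example.com   hrorgchart    --> new virtual hostname, used only for the HR Org Chart WLS

The Org Chart managed server is made to listen on the new virtual hostname (via the WebLogic virtual host definition mentioned above), and the related EBS profile options are pointed to hrorgchart.example.com instead of ebsapps.example.com.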

This workaround or solution is not available in Oracle Support, so I hope you will find it useful.

Wednesday, April 20, 2016

Oracle Linux -- boot problem, panic, mount: you must specify the filesystem type, fsck, Logical Volume, HP server

After replacing a failed disk drive, which was a member of a RAID 6 configuration, the root filesystem became read-only, and when we rebooted the server, the OS couldn't boot itself. (Note that the OS was Oracle Linux 6.5 64-bit.)



It was obvious that the mount command, which was part of the boot sequence, was failing, and it was the root filesystem that the mount command was failing for.

There were no fsck errors or warnings during the boot (it was the root filesystem, so maybe the OS couldn't even warn us), but we decided to make a check using an Oracle Linux CD.

So, we first burned an Oracle Linux DVD and used an external DVD drive to boot from it.
It was failing. Actually it didn't fail completely, but it displayed a warning and dropped us to the syslinux boot prompt, complaining about "No DEFAULT or UI configuration directive found".
When we saw the error, we thought there could be a problem between the Oracle Linux boot sequence and the external DVD drive, so we created a bootable Oracle Linux USB and booted Oracle Linux from there.
We were right: Oracle Linux could boot itself from the USB drive.

So, what we did to solve the error was the following (a consolidated sketch of these commands follows the list):
  • booted into rescue mode and dropped ourselves into a shell
  • activated all the volume groups (vgchange -a y)
  • ran the mount command against the root LV and saw it failing the same way the boot sequence was failing (mount /dev/Volumegroup_name/LVM_name /foobar)
  • used blkid to find the filesystem type of the root LV (blkid /dev/Volumegroup_name/LVM_name)
  • used fsck.<filesystem> (in our case it was ext4 -> fsck.ext4 -y) to repair the filesystem
  • and lastly, rebooted.
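
Here is that consolidated, hedged sketch (the volume group and logical volume names are placeholders; substitute your own):

vgchange -a y                                  # activate all volume groups
mount /dev/Volumegroup_name/LVM_name /foobar   # reproduce the mount failure seen during boot
blkid /dev/Volumegroup_name/LVM_name           # identify the filesystem type (ext4 in our case)
fsck.ext4 -y /dev/Volumegroup_name/LVM_name    # repair the filesystem
reboot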

Monday, April 18, 2016

DAC/BIAPPS-ETL mail problem, javax.mail.AuthenticationFailedException, use "serverSetupPrompt.sh"

Although your mail tests may be successful on the DAC client, you may encounter mail problems during your ETL runs.
This issue was encountered in a customer environment where BIAPPS was installed, and where the DAC server, DAC client, Informatica, OBIEE and an Oracle Database were delivering the BI solution.

The issue started after changing the email account used by the DAC server; although the change was done on the DAC client, the DAC server couldn't send any emails after it. (Normally, the DAC server sends notification emails when an ETL process is started, completed, or encounters any errors.)

The out file used by the DAC server was displaying the following authentication error:

javax.mail.AuthenticationFailedException
at javax.mail.Service.connect(Service.java:306)
at javax.mail.Service.connect(Service.java:156)
at javax.mail.Service.connect(Service.java:105)
at javax.mail.Transport.send0(Transport.java:168)
at javax.mail.Transport.send(Transport.java:98)
at com.siebel.etl.email.EmailManager.dispatchJMail(EmailManager.java:531)
at com.siebel.etl.email.EmailManager.dispatch(EmailManager.java:394)
at com.siebel.etl.email.EmailManager.safeDispatch(EmailManager.java:353)
at com.siebel.etl.engine.core.ETL.sendEmail(ETL.java:866)
at com.siebel.etl.engine.core.ETL.thisETLProcess(ETL.java:303)
at com.siebel.etl.engine.core.ETL.run(ETL.java:659)
at com.siebel.etl.engine.core.ETL.execute(ETL.java:911)
at com.siebel.etl.etlmanager.EtlExecutionManager$1.executeEtlProcess(EtlExecutionManager.java:210)
at com.siebel.etl.etlmanager.EtlExecutionManager$1.run(EtlExecutionManager.java:165)
at java.lang.Thread.run(Thread.java:662)
Apr 1, 2016 11:56:43 PM com.siebel.etl.engine.bore.NewDispatcher execute
SEVERE: Starting ETL Process.


It was clear that the DAC server and the DAC client were not using the same configuration at all.

The problem was partially documented in DAC Email Notification Not Working (Doc ID 1527685.1), but that note did not exactly point to the solution.

The solution was to use the serverSetupPrompt.sh script located in the DAC server's root directory (/u1/DAC in our case).

What I did was set the mail server IP, port, email account name and password, and specify whether SSL should be used in mail delivery (in our case it was non-SSL); and lastly, restart the DAC server after saving these settings. (The restart is important.)

Following is a demo explaining the fix;

[root@ermanserver DAC]# ./serverSetupPrompt.sh
OS detected: Linux

What can I do for you?
1 - Enter repository connection information
2 - Test repository connection
3 - Enter email account information
4 - Send test email
5 - Save changes
6 - Exit

Please make your selection: 3

Would you like to remove current value for User name? (Y/N) Y

Would you like to remove current value for Password? (Y/N) Y

Current value for Email server is smtp.gmail.com.
Press return to keep it or enter a new value.
> 10.5.5.242

Would you like to remove current value for Email server port? (Y/N) Y

Would you like to remove current value for Email address? (Y/N) Y

Current value for Needs authentication is true.
Press return to keep it or enter a new value.
>
Current value for Needs SSL is true.
Press return to keep it or enter a new value.
> false

What can I do for you?

1 - Enter repository connection information
2 - Test repository connection
3 - Enter email account information
4 - Send test email
5 - Save changes
6 - Exit

Please make your selection: 3

Please enter User name
> erman@someermanserver.com

Please enter Password
> PASSWORD

Current value for Email server is 10.5.5.242.
Press return to keep it or enter a new value.
>

Please enter Email server port
> 25

Please enter Email address
> erman@someermanserver.com

Current value for Needs authentication is true.
Press return to keep it or enter a new value.
>
Current value for Needs SSL is false.
Press return to keep it or enter a new value.
> false

What can I do for you?

1 - Enter repository connection information
2 - Test repository connection
3 - Enter email account information
4 - Send test email
5 - Save changes
6 - Exit

Please make your selection: 5

What can I do for you?

1 - Enter repository connection information
2 - Test repository connection
3 - Enter email account information
4 - Send test email
5 - Save changes
6 - Exit

Please make your selection: 4

Email test will fail if the current modifications to configuration is not saved. Continue? (Y/N)
Y

Sending test email...
Apr 18, 2016 11:24:00 AM com.siebel.etl.email.EmailManager <clinit>
INFO: About to load EmailManager class...
Apr 18, 2016 11:24:00 AM com.siebel.etl.email.EmailManager <init>
INFO: Starting Constructor of EmailManager...
Apr 18, 2016 11:24:00 AM com.siebel.etl.email.EmailConfig <clinit>
INFO: About to load EmailConfig class...
Apr 18, 2016 11:24:00 AM com.siebel.etl.email.EmailConfig loadEmailProperties
INFO: About to load email config.
Apr 18, 2016 11:24:01 AM com.siebel.etl.email.EmailManager getInstance
INFO: Returning an instance of EmailManager...
Apr 18, 2016 11:24:01 AM com.siebel.etl.email.EmailManager enableTestingMode
INFO: EmailManager is entering a testing mode
Apr 18, 2016 11:24:02 AM com.siebel.etl.email.EmailManager dispatchJMail
INFO: EmailManager.dispatch : Created a message...
Apr 18, 2016 11:24:02 AM com.siebel.etl.email.EmailManager dispatchJMail
INFO: Message successfully sent OK.

Test email successfully sent.

EBS 12.2 -- R12.AD.C.Delta.7 and R12.TXK.C.Delta.7 , forms-c4ws disabled

While installing the latest release of EBS 12.2 (12.2.5), you upgrade to AD.C.Delta.7 and R12.TXK.C.Delta.7. After you upgrade to Delta 7, when you log in to the WebLogic console, you will see that the forms-c4ws managed server is not running. Although the adstrtal and adstpall scripts don't report any errors, you will see that forms-c4ws stays stopped, not started.

Well, don't try to solve this, as it is not an issue or a problem.

With the Delta 7 upgrade, s_forms-c4wsstatus comes as disabled, so the start scripts don't start forms-c4ws at all.
This is a performance improvement, as the services provided by this managed server are not commonly used. One area where forms-c4ws is used is SOA Gateway, but as said, it is rare.
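
If you want to confirm this on your own instance, the context variable can be checked directly in the applications context file; a minimal sketch (only the variable name comes from the release notes, the prompt is illustrative):

[applmgr@ermanserver ~]$ grep -i "s_forms-c4wsstatus" $CONTEXT_FILE
(the returned line shows the current value -- "Disabled" after the Delta 7 upgrade)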

The situation is already documented in Oracle Support, but I just want you to be aware of it.

Reference: 
Oracle E-Business Suite Applications DBA and Technology Stack Release Notes for R12.AD.C.Delta.7 and R12.TXK.C.Delta.7 (Doc ID 2033780.1)
Oracle WebLogic Server Performance Enhancements:

Several related enhancements have been made to Oracle WebLogic Server:
A new -DserverType=wlx start argument for managed servers reduces their memory footprint, by preventing startup of the Enterprise JavaBeans (EJB), Java EE Connector Architecture (JCA), and Java Message Service (JMS) services.
The default value of s_forms-c4wsstatus is now set to 'Disabled'. Thus, the forms-c4ws servers are no longer started during a 'start all' operation.
To reduce oacore startup time, the Portlet Producer libraries are no longer deployed to the EBS domain. A new context variable, s_deploy_portlet, has been introduced to cater for cases where portlet-related configuration is required, such as in instances needing Webcenter integration.

EBS 11i/R12 - XML publisher, bursting , barcode and xdo.cfg

XML Publisher uses the mapping defined in the xdo.cfg file when producing outputs with barcodes. So, to get font mappings working in Bursting, one has to use xdo.cfg rather than the Administration settings.

Note that: XML/BI Publisher's bursting engine accepts a data stream and splits it based on multiple criteria, generates output based on a template, then delivers the individual documents through the delivery channel of choice.

If those mappings are not in place, you can end up with missing barcodes, or numbers displayed rather than barcode characters, in your XML Publisher based outputs, such as PDF outputs.

In order to have a proper barcode output,

1) The fonts for displaying the barcodes (3of9.ttf) should be available in $AF_JRE_TOP/jre/lib/fonts (11i) or $AF_JRE_TOP/lib/fonts (R12).
2) The required font mapping should be done in xdo.cfg:
Example mapping:
<font family="3 of 9 Barcode" style="normal" weight="normal"><truetype path="/u01/jre/lib/fonts/3of9.ttf" />
</font>

-- The locations of xdo.cfg:
11i: under $AF_JRE_TOP/jre/lib
R12.0: under $AF_JRE_TOP/lib
R12.1: under $XDO_TOP/resource
12.2: under fs_ne/EBSapps/appl/xdo/resource

The 2nd action may be necessary after any incident that recreates or overwrites the xdo.cfg file.
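
A quick way to verify both prerequisites from the shell is sketched below (R12 paths taken from the lists above; adjust them for your release, and note the prompt is illustrative):

[applmgr@ermanserver ~]$ ls -l $AF_JRE_TOP/lib/fonts/3of9.ttf          # 1) the barcode font is in place (R12)
[applmgr@ermanserver ~]$ grep -i "3 of 9" $XDO_TOP/resource/xdo.cfg    # 2) the font mapping survived in xdo.cfg (R12.1 location)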

References: 
In XML Publisher Bursting Barcode Font is Ignored (Doc ID 1136423.1)
Bursting in EBS XML Publisher Consolidated Reference (Doc ID 1574210.1)
What Is The Recommended Location Of Xdodelivery.cfg or XDO.cfg In An R12.2 Environment? (Doc ID 1915791.1)
Oracle Business Intelligence Publisher User's Guide Release 10.1.3.2

Friday, April 8, 2016

HP GEN9 servers- Oracle Linux 6.5, kernel panic, panic+0xc4/0x1e4, __perf_cgroup_move+0xe/0x20

It is important to check the hardware certification web pages for the minimum compatible OS-hardware configuration before doing any Oracle Linux installations.

In the case of HP, what should be checked is "HPE Servers Support & Certification Matrices - Oracle Linux Minimum Operating System Requirements", available via http://h17007.www1.hpe.com/us/en/enterprise/servers/supportmatrix/exceptions/oel_exceptions.aspx#.VweuRPmLTIU, as there Oracle Linux 6.6 to 7.0 appears certified with the HP DL380 Gen9 (for instance) --> the accurate information.

"The HPE Servers Support & Certification Matrices - Oracle Linux Certification and Support" page, available via the URL http://h17007.www1.hpe.com/us/en/enterprise/servers/supportmatrix/oracle_linux.aspx, is a little misleading, as there Oracle Linux 6 (without declaring any minimum version) appears certified with the HP DL380 Gen9 (for instance) -> misleading information.

So, if you install Oracle Linux 6.5 on an HP DL380 Gen9, you get a kernel panic, as it is not supported and certified.


Trace:
panic+0xc4/0x1e4
__perf_cgroup_move+0xe/0x20
find_new_arepear+0x17c/0x180

Workarounds are possible, of course.
One workaround is to use the Red Hat compatible kernel, but we can't recommend using an unsupported OS-hardware configuration.
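
If you need to see which kernel a server is actually running, or to fall back to the Red Hat compatible kernel temporarily, something like the following can be used (a hedged sketch; the prompt and version string are just examples):

[root@hpgen9 ~]# uname -r
3.8.13-16.2.1.el6uek.x86_64     <-- "uek" in the name means the Unbreakable Enterprise Kernel is active
To boot the Red Hat compatible kernel instead, point the "default" entry in /boot/grub/grub.conf to the non-UEK kernel and reboot -- but staying on a certified OS-hardware combination (Oracle Linux 6.6 or later here) is the real fix.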

So, for installing Oracle Linux 6 on an HP DL380 Gen9, the minimum supported version is Oracle Linux 6.6.

Thursday, April 7, 2016

EBS R12/12.2 - concurrent manager facts: cache, process count sleep times

In this post, I will share some facts about configuring concurrent managers.
My reference is Maris Elsins's blog; thanks to him for pointing out this valuable information.
This is tested for EBS R12. (Recently, tested/revisited for EBS 12.2 as well.)
https://www.pythian.com/blog/performance-settings-of-concurrent-managers/
https://me-dba.com/2016/04/12/internals-of-querying-the-concurrent-requests-queue-revisited-for-r122/

Here are the facts:

  • cache size is per process
  • sleep time is per process
  • A concurrent manager process waits for <sleep time> and then checks the fnd_concurrent_requests table only when it is idle.
  • A concurrent manager process that is busy running a concurrent request checks the fnd_concurrent_requests table directly, without waiting for the <sleep time> to pass, as soon as it completes the running request. (The queue-level values of these settings can be checked with the query sketched right after this list.)
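
Here is that query; it is a hedged sketch, with the column names as I recall them from FND_CONCURRENT_QUEUES:

SQL> select concurrent_queue_name, max_processes, sleep_seconds, cache_size from fnd_concurrent_queues where concurrent_queue_name = 'STANDARD';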

So, suppose we have no concurrent requests pending to be run, and suppose we configure the Standard Manager with 30 processes, a sleep time of 30 seconds and a cache size of 30; let's see what happens.

30 separate Standard Manager processes start. Each of them checks the fnd_concurrent_requests table every 30 seconds.

Let's suppose they run these checks sequentially, and see what happens.
What about the sleep times?

in second 1, process 1 checks the fnd_concurrent_requests
in second 2, process 2 checks the fnd_concurrent_requests
in second 3, process 3 checks the fnd_concurrent_requests
in second 4, process 4 checks the fnd_concurrent_requests
...
...
in second 30, process 30 checks the fnd_concurrent_requests

in second 31, process 1 checks the fnd_concurrent_requests (again)
in second 32, process 2 checks the fnd_concurrent_requests (again)
in second 33, process 3 checks the fnd_concurrent_requests (again)
...
...
in second 59, process 29 checks the fnd_concurrent_requests (again)

So, in 60 seconds, almost 60 checks are done against fnd_concurrent_requests; that makes roughly 1 check per second.

This is a little high.

What is the correct setting?

It depends on the situation and can be derived using the formula that Maris Elsins shared in his blog.

Sleep time= "# of process" * (1-Avg Utilization percentage ) * Avg Time(seconds) for a request allowed to be pending

# of processes = 30
Suppose we have only 1 hour of peak time, i.e. 1 hour a day in which we have important concurrent requests.
Suppose these concurrent requests are critical and they should wait at most 10 seconds.

30 * (1 - 1/24) * 10 ≈ 287.5 -> the sleep time should be roughly 280 seconds.

So, as we have 30 processes and they are idle most of the time, they will still check the queue roughly every 280/30 ≈ 9 seconds on average.

If this formula produces a sleep time lower than the "average time a request is allowed to stay pending", it means the concurrent manager process count is not enough, and the solution is to add concurrent manager processes.
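
The same arithmetic as a quick shell check (the values are just the assumptions used above, and the prompt is illustrative):

[oracle@ermanserver ~]$ echo "30 * (1 - 1/24) * 10" | bc -l     # prints ~287.5 -> a sleep time of roughly 280 seconds
[oracle@ermanserver ~]$ echo "287.5 / 30" | bc -l               # prints ~9.6  -> the queue is still polled every ~9-10 seconds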

What about the caching?

process 1 caches 30 requests (1-30) and starts executing request 1
process 2 caches 30 requests (2-31) and starts executing request 2, as request 1 is already being run by process 1
process 3 caches 30 requests (3-32) and starts executing request 3, as request 2 is already being run by process 2
....
.....
process 30 caches 30 requests (30-59) and starts executing request 30, as requests 1-29 are already being run by the other processes
...
suppose now process 1 finishes executing request 1;
process 1 checks request 2 and sees it is locked, as it is being executed by another process.
process 1 checks request 3 and sees it is locked, as it is being executed by another process.
process 1 checks request 4 and sees it is locked, as it is being executed by another process.
process 1 checks request 5 and sees it is locked, as it is being executed by another process.
....
process 1 checks request 30 (the last request in its cache) and sees it is locked, as it is being executed by another process.
What does process 1 do then?
It caches again; it caches another 30 requests. So where is the benefit of caching?

Well, we can conclude that caching gives us a benefit when there are only a few concurrent manager processes available. For environments where we have several concurrent processes, like 10 Standard Manager processes or more, a generally recommended setting for the cache size is 1.

Exadata X6 released -- quick info

Here is an overview of the new generation Exadata.

The X6-2 was released and announced on oracle.com on April 5, 2016.

For each Database node:

2x 22 core Intel Xeon E5-2699 v4 “Broadwell"
Memory is DDR4 (256GB by default, expandable to 768GB)
Local storage: 4 drives by default, 4 x 600GB (10,000 RPM)
No Flash
2x Infiniband ports QDR
3x 1/10GB ethernet port for client connections
1x 1/10GB ethernet port for management connections
2x 10GB optical Ethernet ports
1x ILOM port

For each Storage node:

High Capacity: (HC)

2x 10 core Intel Xeon E5-2630 v4 processor
128GB memory 
4x 3.2 TB Sun Accelerator Flash F320 NVMe PCIe cards 12.8 TB per server
2x Infiniband ports QDR
1x ILOM port

Extreme Flash: (EF)

2x 10 core Intel Xeon E5-2630 v4 processor
128GB memory 
8x 3.2 TB Sun Accelerator Flash F320 NVMe PCIe cards, 25.6 TB per server.
2x Infiniband ports QDR
1x ILOM port

Key Benefits:

  • Uncompressed I/O bandwidth of up to 350 GB/second per full rack from SQL 
  • Ability to perform up to 5.6M database 8K read I/O operations or 5.2M database 8K write I/O operations per second 
  • Easily upgrade to meet the needs of any size application
  • Scale by connecting multiple Exadata Database Machine X6-2 racks or Exadata Storage Expansion Racks. Up to 18 racks can be connected by simply connecting via InfiniBand cables and using internal switches. Larger configurations can be built with external InfiniBand switches 
  • Pre-configured system optimized for all database applications

What is introduced with the Exadata X6 software:

  • In memory columnar formats in storage servers : same format as the database in memory, super fast vector processing.
  • Aggregation in storage : aggregation offload to storage
  • Set-membership using new kind of Storage Index :  set membership offloaded to storage
  • Smart Fusion Block Transfer for moving block between nodes : new protocol for hot block transfer in RAC.
  • Automated rolling upgrades across the full stack: from the DB to the OS, the entire process is automated.
  • 2x faster disk recovery
  • Hierarchical Snapshots 
  • 2x application connections
  • Automated VLAN creation
  • Extended Distance Clusters
  • 2.5X faster software updates
  • High Redundancy on Quorum Disks on 1/4 and 1/8 racks
  • Preserve storage index on rebalance.
Capacity-On-Demand Licensing:

For Exadata X6-2 at least  14 cores must be enabled per server.
For Exadata X6-8 at least  56 cores must be enabled per server.

RDBMS -- getting correct explain plans

AUTOTRACE and EXPLAIN PLAN may not produce the actual execution plans in some cases.
This is because AUTOTRACE and EXPLAIN PLAN do not peek at binds, and they also don't take the data types of the columns into account when deciding whether an index on those columns can be used or not.

The MOS note "EXPLAIN PLAN and SQL*PLUS AUTOTRACE may not generate actual plans (Doc ID 1268111.1)" has the details. This note applies to: Oracle Server - Enterprise Edition - Version 8.0.3.0 and later.

Kerry Osborne also showed that AUTOTRACE may sometimes produce wrong plans (http://kerryosborne.oracle-guy.com/2010/02/autotrace-lies/), and that EXPLAIN PLAN may sometimes produce wrong plans as well (http://kerryosborne.oracle-guy.com/2008/10/explain-plan-lies).
By the way, what do these two tools have in common? -> AUTOTRACE uses EXPLAIN PLAN.

So, in conclusion, we can say that the actual execution plan is the one that is prepared/created during execution.
We should not always rely on the plans that we generate using AUTOTRACE or EXPLAIN PLAN.
We should rely on the plans that are prepared for currently running queries, or that are still available through the cursor information stored in the data dictionary views for queries executed earlier.

In conclusion, the best ways for getting the actual explain plans are;

1)Tracing the session; using SQL TRACE (10046 trace)
2)If we know sql_id of the query, using V$SQL_PLAN or DBMS_XPLAN.DISPLAY_CURSOR.

"Autotrace" and "EXPLAIN PLAN" are easy to use, but they may not generate the correct explain plans everytime.

So, DBMS_XPLAN.DISPLAY_CURSOR is recommended.
This method requires the SQL statement to be (or have been) in the shared pool, so it requires at least one execution.
Suppose we have written a SQL statement and want to check the actual execution plan created by the optimizer; we just have to do the following:

Execute the statement. (We can cancel the execution just after starting it if it is a long-running one; the statement will already be parsed, so we still get what we need.)
Find the SQL statement in v$sql.
Execute DBMS_XPLAN.DISPLAY_CURSOR for that SQL statement. (A minimal sketch of these steps follows below.)
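
Here is that sketch (the table name and the comment tag inside the SQL text are made up for illustration):

SQL> select /* erman_plan_test */ count(*) from hr.employees where department_id = 10;
(the execution can be cancelled as soon as it starts; the statement is already parsed)

SQL> select sql_id, child_number from v$sql where sql_text like '%erman_plan_test%' and sql_text not like '%v$sql%';

SQL> select * from table(dbms_xplan.display_cursor('<sql_id from the previous query>', 0));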