Saturday, January 31, 2015

EBS 12.2 / 12.1-- UTF8, too long error in Forms screen, ORA-12899: value too large for column, DMU conversion ...

In UTF8 EBS 12.2 database; you may encounter value too long errors while performing some operations..
We have recently faced with an ORA-12899 in a LOV..
The column length inside the database was 40, the problematic data was 25 chars.. It had 2 turkish characters , so its byte length was 29.  (In UTF8 , turkish chars are 2 bytes in lenght)
So, based on above information, we should not encounter any errors, becuase we have the 40 bytes as column length , and we try to insert only 29 bytes(25 chars)..
After some analysis, we have found a control in the form itself.. The form was the problem point. It has the control and the control did not let us to insert more than 25 bytes on the relevant table..
We have used a workaround and did not use any turkish characters in this form..
Note that : The fix should be a patch.. (no such a patch available in Oracle Support, and SR was needed)

The interesting thing is that the problem have started after UTF8 migration.
This system was a EBS vm template and the database inside of this template was ASCII..
So , we needed to convert it .. We had convert it to uniform (UTF8 or AL32UTF8), because there were some application (like Hyperion) which were planned to use this database for storing their repositories, and these applications wants their database to be UTF8.
Also , we could not convert the database to WEISO8859P9 (turkish) charset ,because convert tool (DMU) has the capability for converting to uniform only.. Moreover; Converting to P9 was not supported in the first place. Besides, the valid reasons to choose WEISO8859P9 instead of UTF8 can be performance or db size..(except Thai chars in EBS , if using Thai chars in EBS, it might be a good to choice to use thai specific charset rather than utf8)
So, In general; column length should not be an issue. For packaged applications, application owner should already configured the table columns to support uniform char sizes. Also for in-house developed applications, column lengths may be increased accordingly. Ofcourse, In EBS actions like increasing column lenghts or changing NLS_LENGTH_SEMANTICS parameter to "char " , are not supported..

Anyways; the problem was obvious.. Possible workarounds and fix are as stated above..  The problem was not the database , or it was not the DMU.. The convertion was successful, but still I want to give you some info about the problems which may be encountered in these layers..
Okay.. Lastly, I want to share the information that I have gathered and filtered during my analysis and researches.. This info may be helpful for you in the future.

  • Despite Note 1283764.1 which allows the database character conversion from UTF8/AL32UTF8 to another character set, this is not supported for E-Business Suite as it will result in data loss.
  • Globalization Guide for Oracle Applications Release 12 (Doc ID 393861.1)
It is good to be aware of some problems which may be encountered while using DMU for Db charset conversions.

  • Existing data may expand beyond the length of its column width when converting to Unicode. For instance, WE8ISO8859P1 is a single-byte character set, so every character is stored in one byte; a 20 byte column holds 20 characters. In UTF8, some of those characters will take up more than one byte, and then the 20 characters may require more than 20 bytes of storage space. The impact of data expansion varies according to the character sets, languages, and amount of data involved.
  • If the Data Expansion is reported and the expanding data is Oracle seed data, contact Applications Support; if customer data, use the E-Business Suite UI to shorten the data. If you cannot navigate to the data, contact Applications Support.
  • Never modify the column size attribute in order to accommodate the existing data. Although the DMU allows this, Oracle E-Business Suite does not allow it and will not support a database with a modified column size.
  • Never truncate the expanding data.You can display ROWID in the DMU cleansing editor, but this is only for locating the exceptional data. E-Business Suite does not support cleansing data through the DMU.

Also , I have found a problem record for MTL_SYSTEM_ITEMS_B Materialized view. It was as follows;
"The DESCRIPTION column in the MTL_SYSTEM_ITEMS_B was converted from VARCHAR2(240) to VARCHAR2(240 Char)." 

After Converting Database To AL32UTF8 Materialized View MTL_SYS_ITEMS_SN Fails With ORA-12899: Value Too Large For Column ORA-12008 (Doc ID 1493499.1)

Linux -- running fsck on readonly filesystem -- especially about readonly root filesystem

Running e2fsck , fsck on readonly filesystems makes sense.. It makes sense, because you can be sure that the filesystem is consisten while fsck is reading it, and also while fsck is correcting it..
Moreover; I did these kind of operations couple of times recently and did not encounter any problems..

On the other hand; it is not recommended..
Lets have a quick look at the fsck man page:
Here is says;
e2fsck(8) - Linux man page
Note that in general it is not safe to run e2fsck on mounted filesystems.

So it is not safe according to the application owner.
The question is why? 
Altough, I could not find a certain answer for this, I think that it may be related with the kernel and filesystem.. I mean the filesystem remains consistent in itself.. That 's true. On the other hand; the whole picture(kernel<->fs) will not be consistent..  What I try to say is ; when we correct a readonly filesystem using e2fsck, we actually make write operations on it.. This may lead problems when we mount it read/write again. The reason is that kernel may find some unexpected data in the fs.. Unexpected data may be a result of writing in to the fs while it was readonly..  
The concern is sspecially for root filesystem...  Such a problematic scenario in a root fs may trigger a hang or crash..
So what we need to do? What should be our action plan if we need to run e2fsck on filesystems?
I think, you can run e2fsck , while the system is online, by taking the risk, suppposing you cant reboot and also able to take the risk.... root filesystem, on the other hand; should be an exception. 
Altough, we can technically run e2fsck on root filesystem without reboot; it is not safe. 
So in case of root fs; we should reboot and make an fsck during the boot phase.
In such a situation; when you reboot ; recent versions of Oracle / Redhat Linux will direct you to make fsck while booting; anyways.

Tuesday, January 27, 2015

Exadata -- Find Exadata Rack Model (Eight/Quarter/Half ..)

In order to find the rack model( eigth , quarter, half or full rack) of Exadata, you can use the databasemachine.xml file.. This file is created by the onecommand process during provisioning of 
Oracle Exadata..

For example:
cd /opt/oracle.SupportTools
grep -i MACHINETYPES databasemachine.xml
Output : X4-2 Eighth Rack HP 1.2TB

As you see here, MACHINETYPE tag gives us the Rack Model, type of the machine (HP for High performance, HC for High Capactiy) and storage size, as well. 

Keep in mind that; You can gather more info using databasemachine.xml file.. 
Admin ip, MACHINEUSIZE, ILONNAMES are just some of those information stored in this file ..
"databasemachine.xml" file is actually used by Oracle Enterprise Manager 12c  during the discovery process and also used to gather the information to render the schematic diagram in the Enterprise Manager's Exadata home page.

Monday, January 26, 2015

EBS 12.2 -- Cloning in Virtualized Oracle Database Appliance ODA x4-2

You may need to clone your EBS VM machines which reside on a virtualized ODA X4-2 .. You can always create new VM machines and copy the database and application tiers like you do in a standard rapid clone process.. On the other hand, need for cloning EBS 12.2 in such a virtualized environment may make you want to take advantage of VM technologies.. I m  talking about Oracle VM Server here.. ODA X4-2 comes with Oracle VM Server, thus you can clone existing VM machines using standard copy commands and oakcli utility.

To clone the VM machines; use my approach in the following link;
http://ermanarslan.blogspot.com.tr/2015/01/oda-x4-2-cloning-in-virtualized.html

Cloning VMs will save your time as you will not have to scp your application files and duplicate your database.. But even if you clone your VMs , there will be some more things to do for creating a proper EBS 12.2 clone.
First of all ; you will need to change your database name in the target..I recommend recreating controlfile for this.
After you change the name; just execute adcfgclone dbTier on the database site, this will create the environment file , update the necessary apps tables and configure the files like tnsnames and listener.ora .. Note that : your ip and hostname will also be different than the source where this database comes from.. That 's why a new configuration needed...
Once you finish configuring your database in your cloned database server , proceed with the apps tier.
In apps tier cloning phase; take a preclone first; use adpreclone.pl.. This is needed because your apps Tier may be changed and your common top clone directory may not be up-to-date.. For example: If you have made an EBS 12.2.4 upgrade recently, you need to execute adpreclone to create an uptodate clone directory..(run preclone in source and scp it to the cloned env)
Otherwise, you will end up with unexpected errors in the post clone phase..

I will not go in the details of cloning, as this is beyond  of the scope of this blog post..
For the details about cloning in EBS 12.2, you can follow my cloning documents. They are detailed, and they are especially for EBS 12.2
Theory: http://ermanarslan.blogspot.com.tr/2014/07/ebs-122-cloning-how-to-clone-ebs-122.html
Real Life : http://ermanarslan.blogspot.com.tr/2014/07/ebs-122-cloning-real-life-example.html

So, as the purpose of this post is to explain the logic for cloning EBS 12.2 in virtualized platforms; lets walk through the problems and their solutions which you may encounter during such a cloning process..

1)You need to execute preclones before executing post clone in appsTier. Otherwise you will encounter something like this;
START: Instantiating the OHS Config.
Running /jit/jit12/fs1/FMW_Home/webtier/perl/bin/perl -I/jit/jit12/fs1/FMW_Home/webtier/perl/lib/5.10.0 -I/jit/jit12/fs1/FMW_Home/webtier/perl/lib/site_perl/5.10.0 -I/jit/jit12/fs1/EBSapps/appl/au/12.0.0/perl -I/jit/jit12/fs1/FMW_Home/webtier/ohs/mod_perl/lib/site_perl/5.10.0/x86_64-linux-thread-multi /jit/jit12/fs1/EBSapps/appl/fnd/12.0.0/patch/115/bin/txkSetOHSConfig.pl cfgCloneInstance -ctxfile=/jit/jit12/fs1/inst/apps/JIT12_fxsljit03/appl/admin/JIT12_fxsljit03.xml -archvloc=/jit/jit12/fs1/EBSapps/comn/clone/FMW/OHS/ohsarchive.jar -mvpln=/jit/jit12/fs1/EBSapps/comn/clone/FMW/OHS/moveplan.xml -t2plogloc=/jit/jit12/fs1/inst/apps/JIT12_fxsljit03/admin/log/clone/ohsT2PApply -logdir=/jit/jit12/fs1/inst/apps/JIT12_fxsljit03/admin/log/clone -console=off -promptmsg=hide
Script Executed in 49055 milliseconds, returning status 1
ERROR: Script failed, exit code 1

2)If you will not change the paths in the application tier; remove the following oracle homes from the inventory; oracle_common,webtier;Oracle_EBS-app1
Dont set your env.
cd /u02/r122/fs1/FMW_Home/oracle_common/oui/bin
./runInstaller -detachhome ORACLE_HOME=/u01/install/APPS/fs1/FMW_Home/oracle_common
./runInstaller -detachhome ORACLE_HOME=/u01/install/APPS/fs1/FMW_Home/webtier
./runInstaller -detachhome ORACLE_HOME=/u01/install/APPS/fs1/FMW_Home/Oracle_EBS-app1

3)Dont set EBSapps.env before running post clone in Apps.
Because there is a file which is sourced in EBSapps.env and it has hostname dependencies .. Hostname is changed in cloning process as you may imagine, thus sourcing the environment create problems indirectly..
If the env is set, then you can't even switch user to applmgr , you cant su - applmgr(it will hang), even if you can; you will not able to do anything..
/u01/install/APPS/fs2/FMW_Home/Oracle_EBS-app1/applications/oacore/APP-INF/node_info.txt

4) Dont set environment before running database post clone in database tier.. The env script is much simpler than the env script of apps , but you dont set it -- just in case..
/u01/install/PROD/11.2.0/ORATEST_daroratestdbsrv1.env

5) Delete the fmw folder before running post clone script in Apps tier.. fmw is packaged in common_top/clone dir, so it will be create during post clone..

So, keep in mind the 5 thing that I have mentioned above.. Other than these 5 things; method for changing hostname and ip address in EBS 12.2 is the same as the method for cloning. 
In a virtualized environment, the process is consist of 1) clone the vms 2) open them and change their ip/hostname 3) run postclone on db and apps tier
There isn't any public document for changing ip address & hostnames, but as I have mentioned above, cloning will do the job..

ODA X4-2 -- Cloning in Virtualized Platform , oakcli -- Oracle VM Server

I have checked a lot of documents and could not see a document that specifies the exact method for cloning an existing VM guest to a new VM guest. The need has arised in an EBS 12.2 project in which we are using Virtualized Oracle Database Appliance X4-2 for hosting our EBS nodes.
After checking the docs; I have conclude that, we can follow the method for restoring backups of VM guest for our cloning purposes..
Note that: Using VM manager is not supported on ODA. Also we dont have snapshot option for oakcli in ODA X4-2..

Here is the action plan:

cd to the shared_repo_path/VirtualMachines/yourexistingvm
tar -cvzf shared_repo_path/VirtualMachines/yourexistingvm.tgz ./*
oakcli import vmtemplate thenewone -files shared_repo/VirtualMachines/yourexistingvm.tgz -repo yoursharedreponame
oakcli clone vm thenewmachine -vmtemplate thenewone -repo yoursharedreponame

Note that: As this will be a clone of our existing vm machine; the ip address and hostname of it will be the same as the existing machine. That's why , we need to shutdown the existing machine, then open up this clone and change the hostnames accordingly.. (this has to be done in order to prevent an ip and hostname conflict)
That is ;
shutdown the existing vm.
open the new clone vm
change the hostname and ip address of new clone vm
reboot the new clone vm
start the existing vm

That 's it.. You will have a cloned Oracle VM if you follow this procedure..

On the other hand; what I did was a little different than this.
I just created a vm machine from an already imported VM template and then copied the image disk of the existing VM machine into the VM directory of this newly created Virtual machine, edited the vm.cfg accordingly and done.! ...

Note that : while performing tar or copy operations , the source /existing vm must be shutdown..

Hope you ll find it useful..

Linux -- I/O hang : io_schedule+0x42/0x5c, corrupted controlfile , scsi error , sd 0:0:0:0: timing out command, waited 360s

This post will  be about an I/O error which caused a  controlfile of an EBS Oracle Database to be corrupted.
Recoverying the controlfile was not a big challenge, as we had multiplexed controlfiles for this database.
Recoverying a controlfile problem: shutdown the databases; delete the problematic controlfile; copy a proper controlfile with the name of the deleted controlfile; open the database.. That's it.

As I said; recoverying from a controlfile corruption was not a big deal..
On the other hand, diagnosing this kind of errors was a little tricky. 

In this post; you will find diagnostics in OS level (linux) .. Diagnostics for finding the cause of such an error on I/O subsystem..  We have to diagnose these kind of errors.. We cant let it go, because it is a production system and there is no guarantee that we will not end up with a corrupted system datafile at next time the error occurs..

The system info:

Production
OS: Oracle Linux 5.8 64 bit
Kernel : 2.6.32.300.10.1.el5uek
File systems : ext3 , db is on non-asm ext3 filesystems.
Multipath was not configured.

Diagnostics:
  
INFO: task kjournald:2372 blocked for more than 120 seconds.
Jan 26 06:55:55 ermandb kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 26 06:55:55 ermandb kernel: kjournald     D 0000000000000000     0  2372      2 0x00000000
Jan 26 06:55:56 ermandb kernel:  ffff8802358ffc40 0000000000000046 ffff8802358ffc10 ffffffffac0a666e
Jan 26 06:55:56 ermandb kernel:  ffff88022ec74040 ffff8802372da880 ffff88022ec74410 ffff880028200000
Jan 26 06:55:56 ermandb kernel:  ffff8802358ffcd0 ffff88002801d0b0 ffff8802358ffc20 ffff88022ec74040

Seemed kjournald was in hung state.. kjournald is the deamon for the journaling feature of EXT 3 filesystem. Journaling in Linux works like redolog mechanism in Oracle.. Kjournal works like LGWR and writes the changes in to a special area—the journal— So After a crash, recovery simply involves reading the journal from the file system and replaying changes from this journal until the file system is consistent again.

It seemed like kjournald is not the only thing which is hung state; Here is an oracle process which was in hung state caused by exactly the same problem;

INFO: task oracle:5608 blocked for more than 120 seconds.
Jan 26 06:55:56 ermandb kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 26 06:55:56 ermandb kernel: oracle D ffff8802295fc7c0 0 5608 1 0x00000080
Jan 26 06:55:56 ermandb kernel: ffff880226eddb58 0000000000000086 ffff880226eddb18 ffffffffac0a666e
Jan 26 06:55:56 ermandb kernel: ffff880226b72300 ffff88001839a180 ffff880226b726d0 ffff880028200000
Jan 26 06:55:56 ermandb kernel: ffff880226eddbe8 ffff88002802a9e8 ffff880226eddb38 ffff880226b72300

Last executed routine was io_schedule. Here is the call trace of the hung processes;

Jan 26 06:55:56 ermandb kernel: [<ffffffff8145480c>] io_schedule+0x42/0x5c
Jan 26 06:55:56 ermandb kernel: [<ffffffff8114044c>] sync_buffer+0x2a/0x2e
Jan 26 06:55:56 ermandb kernel: [<ffffffff81454d3b>] __wait_on_bit+0x4a/0x7c
Jan 26 06:55:56 ermandb kernel: [<ffffffff81140422>] ? sync_buffer+0x0/0x2e
Jan 26 06:55:56 ermandb kernel: [<ffffffff81140422>] ? sync_buffer+0x0/0x2e
Jan 26 06:55:56 ermandb kernel: [<ffffffff81454de0>] out_of_line_wait_on_bit+0x73/0x80
Jan 26 06:55:56 ermandb kernel: [<ffffffff81077039>] ? wake_bit_function+0x0/0x2f
Jan 26 06:55:56 ermandb kernel: [<ffffffff811c1206>] ? __journal_temp_unlink_buffer+0x1a/0xf4
Jan 26 06:55:56 ermandb kernel: [<ffffffff8114039b>] __wait_on_buffer+0x24/0x26
Jan 26 06:55:56 ermandb kernel: [<ffffffff811c2fa5>] wait_on_buffer+0x31/0x35
Jan 26 06:55:56 ermandb kernel: [<ffffffff811c3555>] journal_commit_transaction+0x4d9/0xe38
Jan 26 06:55:56 ermandb kernel: [<ffffffff8104b334>] ? finish_task_switch+0x52/0xab
Jan 26 06:55:56 ermandb kernel: [<ffffffff8106511b>] ? lock_timer_base+0x2b/0x4f
Jan 26 06:55:56 ermandb kernel: [<ffffffff810651b2>] ? try_to_del_timer_sync+0x73/0x81
Jan 26 06:55:56 ermandb kernel: [<ffffffff811c7057>] kjournald+0xcd/0x211
Jan 26 06:55:56 ermandb kernel: [<ffffffff81076ffc>] ? autoremove_wake_function+0x0/0x3d
Jan 26 06:55:56 ermandb kernel: [<ffffffff811c6f8a>] ? kjournald+0x0/0x211
Jan 26 06:55:56 ermandb kernel: [<ffffffff81076c53>] kthread+0x6e/0x76
Jan 26 06:55:56 ermandb kernel: [<ffffffff81012dea>] child_rip+0xa/0x20
Jan 26 06:55:56 ermandb kernel: [<ffffffff81076be5>] ? kthread+0x0/0x76
Jan 26 06:55:56 ermandb kernel: [<ffffffff81012de0>] ? child_rip+0x0/0x20
Jan 26 06:55:56 ermandb kernel: INFO: task oracle:5608 blocked for more than 120 seconds.
Jan 26 06:55:56 ermandb kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 26 06:55:56 ermandb kernel: oracle D ffff8802295fc7c0 0 5608 1 0x00000080
Jan 26 06:55:56 ermandb kernel: ffff880226eddb58 0000000000000086 ffff880226eddb18 ffffffffac0a666e
Jan 26 06:55:56 ermandb kernel: ffff880226b72300 ffff88001839a180 ffff880226b726d0 ffff880028200000
Jan 26 06:55:56 ermandb kernel: ffff880226eddbe8 ffff88002802a9e8 ffff880226eddb38 ffff880226b72300

The cause have appeared in the following lines ;

Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: timing out command, waited 360s
Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: [sda] Unhandled error code
Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 1c d1 bb 3d 00 00 20 00
Jan 26 06:55:59 ermandb kernel: end_request: I/O error, dev sda, sector 483507005
Jan 26 06:55:59 ermandb kernel: Buffer I/O error on device sda4, logical block 47383555
Jan 26 06:55:59 ermandb kernel: lost page write due to I/O error on sda4
Jan 26 06:55:59 ermandb kernel: Buffer I/O error on device sda4, logical block 47383556
Jan 26 06:55:59 ermandb kernel: lost page write due to I/O error on sda4
Jan 26 06:55:59 ermandb kernel: Buffer I/O error on device sda4, logical block 47383557
Jan 26 06:55:59 ermandb kernel: lost page write due to I/O error on sda4
Jan 26 06:55:59 ermandb kernel: Buffer I/O error on device sda4, logical block 47383558

So , it was obvious.. The problem was caused by an I/O error.
Strange thing was that ; we were seeing CIFS error in dmesg output.. 
These errors seemed to be encountered just before the I/O errors in /dev/sda..

CIFS VFS: No response for cmd 114 mid 8857
CIFS VFS: No response for cmd 114 mid 8878
CIFS VFS: No response for cmd 114 mid 8883
CIFS VFS: No response for cmd 114 mid 25488
CIFS VFS: No response for cmd 114 mid 25565
CIFS VFS: No response for cmd 114 mid 25583
CIFS VFS: No response for cmd 114 mid 25612
CIFS VFS: No response for cmd 114 mid 25629
CIFS VFS: No response for cmd 114 mid 25662
CIFS VFS: No response for cmd 114 mid 25667
CIFS VFS: No response for cmd 114 mid 25696
CIFS VFS: No response for cmd 114 mid 25729
CIFS VFS: No response for cmd 114 mid 25758
CIFS VFS: No response for cmd 114 mid 25799


There were bugs records about CIFS in the kernels below 2.6.32-400.33.1
Bug 18248478 : CIFS LEADS TO KERNEL PANIC --> Upgrade to UEK1 kernel-2.6.32-400.33.1 or later.
SAMBA DATA (CIFS MOUNT) CORRUPTION REPORTED ON UEK2 (Doc ID 1949715.1)

There were bugs but we did not have a kernel panic.. Besides; we had corruptions/IO errors in local disk partitions... So these CIFS problems were not related with the issue..

So what was the cause then?  Was it OS or SAN (the disks were on storage) ?

We have seen an issue record similar to ours, the issue record was saying that it was an OS - Filesystem problem...
File System issues lead to RAC Instance hangs (Doc ID 1510727.1)

On the other hand; according to that record, we had to see the following in /var/log/messages : kernel: attempt to access beyond end of device..  But we did not have such a line in /var/log/messages..

What we have seen actually was the Scsi errors in dmesg output above... They were important errors which have affected our decision about the cause of this problem..

Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: timing out command, waited 360s
Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: [sda] Unhandled error code
Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jan 26 06:55:59 ermandb kernel: sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 1c d1 bb 3d 00 00 20 00
 
So actually we had a problem in scsi level.. We have started to concantrate on this one actually..
Here is a bug record for this. It was for Exadata but it did not matter , at the end of the day; it gave us the clue.
Exadata Compute Node RAID Controller Failed (Doc ID 1362174.1)..

So as we had errors is scsi level ; then we thought that we had the problem in the levels below scsi.. HBA driver, HBA ,storage controller or physical disks..


There was also an opportunity that the error could be caused by the scsi drivers, but most probably the cause was a non-responsive scsi target..
Anyways, at this point; we have redirected the problem to the SAN admins.. We have requested them to check the SAN itself and all the I/O Paths used for reaching the disks on it.

Conclusion: 

The error was caused by the SANs.. SANs were rebooting theirselves , and as there was no multipath configuration which could tolerate this , we ended up with the errors in filesystems, corruptions in Oracle's controlfiles and dead EBS application processes which were caused by the read only filesystems...

Wednesday, January 21, 2015

EBS R12 -- Gather Info about Installed Modules, Product installation statuses, Patchset Levels and more -- adutconf.sql

In order to check installed, not installed or shared modules in EBS , you can use adutconf.sql located in  $AD_TOP/sql directory.
Besides the information about the installation status of EBS products, adutconf.sql provides information about the patch level of the products, their associated Db Schema Names, localization module information , registered data groups , Base language , additional languages and NLS settings, as well..

To use the adutconf script;

Just connect the application server using Application Owner OS user (i.e applmgr)  and use sqlplus to connect the database with apps user..

Example run:

sqlplus apps/apps @$AD_TOP/sql/adutconf.sql

Here is an example output snip of the adutconf.sql , which was executed in an EBS 12.2.4 environment.


As you see in the above picture, this output seems quite handy for checking the patchset level of the EBS products while performing a prereq checks and/or compatability checks while making EBS patching analysis..

It can also be included in the health-check reports.

Friday, January 16, 2015

RDBMS -- Acknowledgement & Solution for ACL(ORA-24247) & Relay Access Denied errors

Normally writing about this kind of errors is not my style; but I will write about this one, because it is good to have this plsql to be available on my blog in case I need it + sometimes there are cases where we need to send email directly from the database without using notification mailer..

It is needless to say that ORA-24247 is an ACL error, and can be corrected by defining necessary ACLs..
Note that; the ACLs are there to protect the network services from being accessed by unprivileged users.
Stardart Oracle packages which can be used to access network services are : UTL_TCP, UTL_SMTP, UTL_MAIL, UTL_HTTP, UTL_INADDR and DBMS_LDAP..
On the other hand; we usually call these standart packages from our custom packages/function or procedures, whereas it is hard to find the source that actually triggers the ORA-24247 errors, sometimes.

If this is the case; we can use 24247 trace event to identify the source of these errors..
I mean we can set an event in the database to find the actual cause of an ORA-24247 error, as follows;
alter system set events='24247 trace name errorstack level 3';
After setting the event, there should be a new trace file generated in the trace directory,when the error reappears.. By analyzing the trace file (especially search for the word "user"  and "PL/SQL Call Stack"), we can see the failing session and the plsql callstack of it..

Okay. So far so good. So lets proceed with the solution ;
To overcome this type of errors, we connect into our database using SYSTEM accound and  check our current ACLs in the first place as follows;

SELECT host, acl, lower_port lport, upper_port uport
FROM DBA_NETWORK_ACLS ;

We check the output and we 'll see that there is no ACLs defined for enabling access to the server that we are trying reach.

Then, we create our ACL and associate it with the relevant user to make that user access the desired IP/host from our Oracle Database;

BEGIN
  DBMS_NETWORK_ACL_ADMIN.CREATE_ACL(acl         => 'mailserver.xml',
                                    description => 'mailserver icin xml',
                                    principal   => 'APPS',
                                    is_grant    => true,
                                    privilege   => 'connect');

  DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE(acl       => 'mailserver.xml',
                                       principal => 'APPS',
                                       is_grant  => true,
                                       privilege => 'resolve');

  DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL(acl  => 'mailserver.xml',
                                    host => '.....ip or hostname here....');
END;
/
COMMIT;

That 's it.. This should resolve the error...

Okay.. So far so good. But there is more..
After solving ORA-24247, we may encounter a different error.. One of these errors can be the one about the relays..
I m talking about "relay access denied" errors..
This errors may be caused by a wrong relay configuration in the SMTP server.. If this is the case; just contact the Mail Server Admin to make the configurations in the mail server, accordingly.
But, also we may encounter these relay errors, because of our code..
I mean; we may be missing some lines of plsql there..  Especially the lines needed for authentication..
That is ; some Mail Servers does not accept anonymous logins, that 's why we need to supply user/pass info after opening our connection to the Mail Server..
If this is the case; we just add the following lines in to our plsql.. We add them just after the line that we open our connection, and we modify the required strings according to our environment..

User/Pass Authentication in UTL_SMTP
utl_smtp.command( l_mail_conn, 'AUTH LOGIN');
utl_smtp.command( l_mail_conn, utl_raw.cast_to_varchar2( utl_encode.base64_encode( utl_raw.cast_to_raw( 'our Mail UserName' ))) );
utl_smtp.command( l_mail_conn, utl_raw.cast_to_varchar2( utl_encode.base64_encode( utl_raw.cast_to_raw( 'our Mail Password' ))) );

That's all for now.. I hope you 'll find it useful.

Thursday, January 15, 2015

EBS 12.2 -- Automatic Start / Stop Scripts

In this post; you will find start and stop scripts for EBS 12.2 .
This scripts are capable for starting and stopping Weblogic services as well.
Shutdown script kills the remaining application processes after it finish stopping apps services.
Scripts must be run using application owner OS user (for ex : applmgr).
Note that : This scripts are tested and verified in a Development envrionment. The risk is yours if you want to use them..
Also these scripts can be modified and make better ..
I will make some enhancements in a couple of days.. Enhancements like adding some lines for unlocking the domain , adding some exceptions and adding some echso for having a better output..

START SCRIPT:
### ERMAN ARSLAN EBS 12.2 APPS START automated
### 12/30/2013
## supply appspassword and weblogicpassword in sequence as command line arguments
### reviewed and tested 1/15/2015
if [ `whoami` == "root" ]
then
echo you can not run this script with root!
exiting
else
. /u01/apps/EBSapps.env run
{ echo apps; echo $1; echo $2; } | sh $ADMIN_SCRIPTS_HOME/adstrtal.sh -nopromptmsg
fi


SHUTDOWN SCRIPT:
### ERMAN ARSLAN EBS 12.2 APPS SHUTDOWN automated
### 12/30/2013
## supply appspassword and weblogicpassword in sequence as command line arguments
### reviewed and tested 1/15/2015
if [ `whoami` == "root" ]
then
echo you can not run this script with root!
exiting
else
. /u01/apps/EBSapps.env run
{ echo apps; echo $1; echo $2; } | sh $ADMIN_SCRIPTS_HOME/adstpall.sh -nopromptmsg
sleep 10
export APPLICATION_USER=`whoami`
kill -9 `ps -ef|grep appldev|grep -v grep |grep -v bash|grep -v sshd |grep -v "ps -ef"| grep -v shutdownapps | awk '{print $2}'`
fi

Example Run:

FOR START using application owner user -> sh startupapps.sh apps welcome123
FOR STOP using application owner user -> sh shutdownapps.sh apps welcome123

Note: Dont use the shutdown script on a single node environment which has a shared user as the owner of both database and application files. If you have such an environment (by the way; you shouldnt have such a confiuration ) and still if you want to use the shutdown script, comment the line with kill command in the shutdown script

Linux -- Disabling a path in a Multipath configuration

You may need to disable a path in the multipath enabled Linux system. Maybe you may need to simulate a path failure.. Such a need may arise to prove that multipath can handle a path failure..
Also , it may be required for making some sort of perfomance diagnostics in a active-active multipath configuration.

I have recently done such an operation , and want it to share with you..

We start with checking multipath paths using multipath -ll command.

Here is an example:
Suppose we want to disable path which is through 3:0:3:0, device /dev/sdp

Our multipath.conf

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
        path_grouping_policy    failover
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            "/bin/true"
        path_checker            tur
        rr_min_io               100
        rr_weight               uniform
        failback                immediate
        no_path_retry           12
        user_friendly_names     yes
}


mpath9 (36006016029b0380029dbd37cf734e411) dm-4 DGC,VRAID
[size=1.5T][features=1 queue_if_no_path][hwhandler=1 emc][rw]
\_ round-robin 0 [prio=1][active]
\_ 3:0:3:0 sdp 8:240 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:5:0 sdh 8:112 [active][ready]

Note that:

active - Path group currently receiving I/O requests.
enabled - Path groups to try if the active path group has no paths in the ready state.
disabled - Path groups to try if the active path group and all enabled path groups have no paths in the active state.The disabled state only exists for certain storage arrays.

As you see the devices /dev/sdp and /dev/sdh  are enabled. Altough they are not working as active/active (because we use failover not multibus as path_grouping_policy) , lets proceed by executing lsscsi and lsscsi to see the source of these devices to ensure that their transport protocol is FCP.
(this step is optional)

lsscsi -H; lsscsi -g
[0]    ata_piix      
[1]    ata_piix      
[2]    qla2xxx       
[3]    qla2xxx       
[0:0:0:0]    cd/dvd  TEAC     DV-28S-W         C.2C  /dev/scd0  /dev/sg0 
[2:0:0:0]    disk    DGC      RAID 5           0223  /dev/sda   /dev/sg1 
[2:0:1:0]    storage HP       HSV300           0953  -          /dev/sg2 
[2:0:1:1]    disk    HP       HSV300           0953  /dev/sdb   /dev/sg3 
[2:0:1:2]    disk    HP       HSV300           0953  /dev/sdc   /dev/sg4 
[2:0:1:3]    disk    HP       HSV300           0953  /dev/sdd   /dev/sg5 
[2:0:2:0]    storage HP       HSV300           0953  -          /dev/sg6 
[2:0:2:1]    disk    HP       HSV300           0953  /dev/sde   /dev/sg7 
[2:0:2:2]    disk    HP       HSV300           0953  /dev/sdf   /dev/sg8 
[2:0:2:3]    disk    HP       HSV300           0953  /dev/sdg   /dev/sg9 
[2:0:3:0]    storage HP       MSA CONTROLLER   7.20  -          /dev/sg10
[2:0:4:0]    storage HP       MSA CONTROLLER   7.20  -          /dev/sg11
[2:0:5:0]    disk    DGC      VRAID            0533  /dev/sdh   /dev/sg12
[3:0:0:0]    disk    DGC      RAID 5           0223  /dev/sdi   /dev/sg13
[3:0:1:0]    storage HP       HSV300           0953  -          /dev/sg14
[3:0:1:1]    disk    HP       HSV300           0953  /dev/sdj   /dev/sg15
[3:0:1:2]    disk    HP       HSV300           0953  /dev/sdk   /dev/sg16
[3:0:1:3]    disk    HP       HSV300           0953  /dev/sdl   /dev/sg17
[3:0:2:0]    storage HP       HSV300           0953  -          /dev/sg18
[3:0:2:1]    disk    HP       HSV300           0953  /dev/sdm   /dev/sg19
[3:0:2:2]    disk    HP       HSV300           0953  /dev/sdn   /dev/sg20
[3:0:2:3]    disk    HP       HSV300           0953  /dev/sdo   /dev/sg21
[3:0:3:0]    disk    DGC      VRAID            0533  /dev/sdp   /dev/sg22

lsscsi -t
[0:0:0:0] cd/dvd ata: /dev/scd0
[2:0:0:0] disk fc:0x500601604be027be,0x010b00 /dev/sda
[2:0:1:0] storage fc:0x50014380013c6cf9,0x010000 -
[2:0:1:1] disk fc:0x50014380013c6cf9,0x010000 /dev/sdb
[2:0:1:2] disk fc:0x50014380013c6cf9,0x010000 /dev/sdc
[2:0:1:3] disk fc:0x50014380013c6cf9,0x010000 /dev/sdd
[2:0:2:0] storage fc:0x50014380013c6cfd,0x010100 -
[2:0:2:1] disk fc:0x50014380013c6cfd,0x010100 /dev/sde
[2:0:2:2] disk fc:0x50014380013c6cfd,0x010100 /dev/sdf
[2:0:2:3] disk fc:0x50014380013c6cfd,0x010100 /dev/sdg
[2:0:3:0] storage fc:0x500508b30093e201,0x010700 -
[2:0:4:0] storage fc:0x500508b30093e209,0x010900 -
[2:0:5:0] disk fc:0x5006016808603b82,0x010a00 /dev/sdh
[3:0:0:0] disk fc:0x500601684be027be,0x010700 /dev/sdi
[3:0:1:0] storage fc:0x50014380013c6cf8,0x010000 -
[3:0:1:1] disk fc:0x50014380013c6cf8,0x010000 /dev/sdj
[3:0:1:2] disk fc:0x50014380013c6cf8,0x010000 /dev/sdk
[3:0:1:3] disk fc:0x50014380013c6cf8,0x010000 /dev/sdl
[3:0:2:0] storage fc:0x50014380013c6cfc,0x010100 -
[3:0:2:1] disk fc:0x50014380013c6cfc,0x010100 /dev/sdm
[3:0:2:2] disk fc:0x50014380013c6cfc,0x010100 /dev/sdn
[3:0:2:3] disk fc:0x50014380013c6cfc,0x010100 /dev/sdo 
[3:0:3:0] disk fc:0x50014380013c6cfc,0x010100 /dev/sdp 

Fibre Channel (FC)...
Okay, their transport protocol is FCP ..

Then , we continue with disabling the multipath device by using the following;
echo "offline" > /sys/block/sdp/device/state
This will offline the disk , and multipath will see the associated path as faulty. 

To clear the path completely from multipath -ll output, we remove the device from the SCSI subsystem as follows;
echo "scsi remove-single-device 3 0 3 0" > /proc/scsi/scsi

Lastly, we execute multipath -ll command to see that mpath9 now uses only one path...

mpath9 (36006016029b0380029dbd37cf734e411) dm-4 DGC,VRAID
[size=1.5T][features=1 queue_if_no_path][hwhandler=1 emc][rw]
\_ round-robin 0 [prio=0][active]


If we need to enable the path again;

we need to make linux rescan the scsi bus  and use multipath -v2 command to make multipath detect the changes..

echo "- - -" > /sys/class/scsi_host/${HBA}/scan     -> we may scan all the HBA hosts..
multipath -v2

Wednesday, January 14, 2015

Rdbms -- Oracle Client-Server / Dblink compatability matrix -- 11g to 8i dblink

Following is a compatability matrix gathered from Oracle Support.


It is client-server compatability but it is also applicable for server to server dblinks..

Here is the full version of the matrix;


For database links between different Oracle versions connections must be supported in BOTH directions in the matrix above.
Reference:  
Client / Server Interoperability Support Matrix for Different Oracle Versions (Doc ID 207303.1)

So , what we can do for reacing an 8i Oracle Database from 11gR2 then?

Since the database links directly from 11g to 8i is not supported (same for : from 8i to 11gR2) , we can implement an indirect solution for this.
I mean, we can put a database which is supported with these versions(11gr2 and 8i) in terms of db links, and use it as our intermediate database to connect from 11gR2 to 8i using db links.

Using a 9iR2 and 10gR2 seems to be okay for this.

The flow can be something like the following;

11gR2 db ( reach the synonyms in intermediate database. These synonyms will be directing us to the tables in 8i database using dblinks defined in the intermediate database) --->INTERMEDIATE db (Snyonyms directing to the 8i database using dblinks) ---> 8i database


Note that ; even if we can build a solution using the intermediate database method explained above, we can still face with problems in some specific operations. But it is worth to try.

Anyways; I will test this configuration, and update this post as soon as possible.

--Update
 Yes! 10gR2 can be used as an Intermediate database.. We have used 10gR2 as Intermediate database; and could able to connect a 11gR2 database to a 8i database using dblinks, with the method described above..

Monday, January 12, 2015

EBS R12 -- Missing class: _AppsLocalLogin error, jsp auto-compilation, Allowed JSP feature in EBS 12.2

We may encounter a login problem in both EBS 11i and R12
Our services may be up and running, our Http server may be responding, but in the background our oacore server may be in doubt with the error : Missing Class  _AppsLocalLogin ..

We can see this by analyzing the log files of oacore server.
Note that:
For all login problems or lets say for all oaf problem, we need to analyze the oacore server's log file beforehand...

In EBS R12 (12.0 and 12.1), the logfile is located in $LOG_HOME/ora/10.1.3/j2ee/oacore/oacore_default_group_1 or
$INST_TOP/logs/ora/10.1.3/j2ee/oacore/oacore_default_group_1

In 12.2, however, we have FMW infrastructure, thus it is in $EBS_DOMAIN_HOME/servers/oacore_server1/logs/

In this particular problem  (Missing Class  _AppsLocalLogin) , we are actually  against a cache corruption problem.
Clearing the cache and restarting the services , will do the job.

I mean;
The solution is;
perl $FND_TOP/patch/115/bin/ojspCompile.pl --compile --flush -p 2
Restart the application services
Clear browser caches (just in case) and retry

Okay .. That's it..
but there is someting wrong with this post :) It is so boring, right?

So , lets go a little bit further by examining the jsp caching process in EBS R12 ..
In EBS R12 (except 12.2) the jsp pages are cached in $COMMON_TOP/_pages directory.
In EBS 12.2 cached classes are stored in  $EBS_APPS_DEPLOYMENT_DIR/oacore/html/WEB-INF/classes/_pages
Anyways, when we execute ls in$ COMMON_TOP/_pages,  we will see the following;

ls -al |grep Apps
-rw-r--r-- 1 applprod dba  25723 Nov 25  2008 _AppsChangePassword.class
-rw-r--r-- 1 applprod dba   9304 Apr 21  2011 _AppsLocalLogin.class
-rw-r--r-- 1 applprod dba   8805 Apr 21  2011 _AppsLocalLogout.class
-rw-r--r-- 1 applprod dba   5441 Jan  5  2007 _SSOAppsLogout.class

As you see, the name of the classes in this cache directory starts with "_" :)
So, that 's why we clear the cache and compile jsps(just in case) 

Okay.. Also , we know that In Release 12 the (automatic) compilation of a JSP is disabled and, at runtime, only the pre-compiled JSP's are picked up. Since this requires fewer checks to be done, the performance is improved and therefore this is the recommended and default setting for a Production environment where JSP's will only be replaced occasionally.
Okay... This explains the behavior of EBS..
Now it is clear..That is, if the precompiled jsp is corrupt in cache, then we cant use it.. 
In order to get rid of this, we need to modify the jsp file(if it is corrupt in its source location) , delete the cached file, compile the modified jsp and bounce oacore services.

This routine can be changed, though. We can enable auto-compilation and make JSP recompiled automatically when the JSP is changed (at least the time stamp). We dont need to restart oacore or anything.

Following is the action plan for enabling auto-compilation:

The objective is to change the setting for main_mode from justrun to recompile in
$INST_TOP /ora/10.1.3/j2ee/oacore/application-deployments/oacore/html/orion-web.xml

Login into E-Business suite and select System Administrator responsibility
Select function AutoConfig (under Oracle Applications Manager) (*)
For each web tier server perform the following:
Click on pencil icon under Edit Parameters
Select tab System
Expand section jtff_server
Change value for the entry s_jsp_main_mode from justrun to recompile
Confirm the change by clicking Save button
Run AutoConfig to propagate the changes to the configuration files

Verify that the $INST_TOP/ora/10.1.3/j2ee/oacore/application-deployments/oacore/html/orion-web.xml has the following:
<init-param> <param-name>main_mode</param-name> <param-value>recompile</param-value> </init-param>

Restart the web tier services
Request a JSP in the browser which is compiled. See that a new _<jspname>.class is created in _pages
Make a change in the JSP file
Request it again in the browser. See that _<jspname>.class is 'refreshed' in _pages and the change is seen in the browser.

(*) If the Autoconfig function is not available it can also be accessed by selecting other function for Oracle Application Manager and then select Site Map > System Confiration - Autoconfig. Also the Autoconfig function can be added to the menu as follows:
Log in as System Administrator and select System Administrator
Application>Menu
Put the system into query\
In Menu enter OAM_ADMIN_MENU
Run the query: This should return Oracle Applications Manager Administrator menu
Add new line
Enter Prompt = Autoconfig + Function select OAM_AD_CONFIG_FILES_TABLE and ensure Grant is ticked
Save
After recompilation (and possibly bounce) the new function will be available to be used in the future


This, however; can not be implemented in 12.2(The Enhancement Request 18187079 "REQUEST FOR COMPILATION ON THE FLY FOR JSP PAGES IN RELEASE 12.2" (INTERNAL) ).
On the other hand; EBS 12.2 brings new security feature named  Allowed JSP..  
This new feature gives a white-list of JSP allowed to be called and any other JSP will be blocked..

You can read more about Allowed JSP feature in Oracle E-Business Suit Security Guide Release 12.2 Part No. E22952-09..

Okay, now this post looks ok :) 
I hope you 'll find it useful.

EBS R12 -- unexpected error when Attempting to Login -- java.lang.NullPointerException OAException.java:912

Really unexpected... A login problem related with workflow..
In a manner, we can say that, we cant login because of a little workflow problem.
It has been 6 years since I saw a similar thing in an Production environment...  It was year 2007, and an Apex application stopped working because of a problem in the corporate Mail Server.. The problem was obviously caused by a design bug.. Simply it meant that " if you cant send a mail , then do not work" :)
This  java.lang.NullPointerException which we may encounter in login, seems to be coming from the same family..
Anyways, if we face with this, we will end up with the following;



When we take a look at the call stack(by just clicking the here link in "click here for exception") , we can understand the problem better, because we will see Worklist objects there.

Like for example : NtfWorklist and wf.worklist..

So, just a little thinking will make us see the cause here.. It is caused by the work list which is displayed in the Home Page .. Work list is a workflow dependent UI object and any problem in workflow may affect it.. Likewise, a problem in a UI object affects the application.. Lastly a problem in the login phase of the application affects almost all the users :)

Okay.. Lets take a look at the solution for this problem;
The Concurrent Program : Synchronize WF LOCAL Tables and restart Apache..
Yes, that 's the one and only solution..

On the other hand; there is one more problem.. We cant login, so how will you run this concurent program?
The answers is;

Run the concurrent program "Synchronize WF LOCAL Tables" using the CONCSUB utility
or
Run the concurrent program "Synchronize WF LOCAL Tables" using PLSQL.

Lastly , I will give an example for running "Synchronize WF LOCAL Tables" using PLSQL.

select *
from wf_local_user_roles
--where user_name like 'SYSADMIN%'
where role_name like '%SYSTEM_ADMINISTRATOR|STANDARD%'

select *
from fnd_Responsibility_vl
where responsibility_id=20420

select *
from fnd_concurrent_programs_vl
where user_CONCURRENT_PROGRAM_NAME like 'Sync%WF%'

select *
from fnd_application_vl
where application_id=0

select *
from fnd_concurrent_requests
where concurrent_program_id=42845

declare
 l_req_id number;
BEGIN
 FND_GLOBAL.apps_initialize (0, 20420, 1);
      l_Req_id:=    fnd_request.submit_request (application   => 'FND',
                                     program       => 'FNDWFLSC',
                                     sub_request   => FALSE,
                                     argument1     => 'ALL',
                                     argument2     => '0',
                                     argument3     => 'LOGGING',
                                     argument4     => null,
                                     argument5     => 'Y'  );
commit;
END;   
--                                  
--                                 
That 's all.. 
Restart Apache and you are good to go..

Friday, January 9, 2015

EBS 12.2 -- form compile FRM-18108 and custom synchronization driver file

You may encounter FRM-18108 while compiling fmb files in EBS 12.2 or 12.1

FRM-18108: Failed to load the following objects.

Source Module:APPSTAND.fmb
  Source Object: STANDARD_FOLDER
Source Module:APPSTAND
  Source Object: STANDARD_PC_AND_VA
Source Module:APPSTAND
  Source Object: STANDARD_TOOLBAR
Source Module:APPSTAND
  Source Object: STANDARD_CALENDAR

This issue is actually an expected one. 
It is caused by FORMS_PATH environment variable or we can also say the issue caused by the form file to be compiled is not in the right place. An errornous deployment action ...

Normally, when you login your Application server with your Application OS user (lets say applmgr) and source your environment file (using for example in 12.2 -> . EBSapps.env run ), you will have FORMS_PATH set something like the following;

/apps/fs1/EBSapps/appl/au/12.0.0/resource:/apps/fs1/EBSapps/appl/au/12.0.0/resource/stub

As you see above, there are $AU_TOP/resource and $AU_TOP/resource/stub directories.. 
This is by design and the environment file is not incomplete.
On the other hand, if you want to build or compile some forms , you need to add an extra directory which stores the standard forms inside..
This directory is $AU_TOP/forms/YOUR_LANGUAGE
So if you want to compile a US form.
You need to set your FORMS_PATH before invoking your forms compiler as follows;
export FORMS_PATH=$FORMS_PATH:$AU_TOP/forms/US

Or if you want to compile a NLS form (for example turkish)
Then you have to set your FORMS_PATH as follows;
export FORMS_PATH=$FORMS_PATH:$AU_TOP/forms/TR

Alternatively , you can copy your fmb file to the $AU_TOP/forms/"your_LANG" directory and execute frmcmp_batch to compile your fmb file.. Note that: your current directory should be $AU_TOP/forms/your_LANG in this case.

You need to add $AU_TOP/forms/LANG(US or any other lang dir)  to your FORMS_PATH variable , only when compiling or building form files.
There is no need to add $AU_TOP/forms/LANG(US or any other lang dir) in the standard environment file ,as the needed standart things are already included in the fmx files during the compile phase.
This method is the method that Oracle suggests..

Lastly , here is the statement of Oracle about compiling forms in EBS 12.2

All the form .fmb files are staged under $AU_TOP/forms/<LANG>, for example, $AU_TOP/forms/US. The compiled forms (.fmx files) are staged under $PROD_TOP/forms/<LANG>, for example, $FND_TOP/forms/US.

In the case of custom forms created by a customer, the .fmb files are staged under $AU_TOP/forms/US. The compiled forms (fmx files) are staged under lt;CUSTOM>_TOP/forms/US.

One last thing, if your form is a custom one then; you should add entries for all your custom files to the custom synchronization driver file located at $APPL_TOP_NE/ad/custom/adop_sync.drv (%s_ne_base%/EBSapps/appl/ad/custom/adop_sync.drv). The adop utility uses this driver file to synchronize files between the run file system and the patch file system.
Add your entries in the section marked by the '#Begin Customization' and '#End Customization' comments.
When adding your entries, follow the syntax of the examples provided in the %s_adtop%/admin/template/adop_sync_drv.tmp template file. For example, if you have custom java class files under the $JAVA_TOP/<Company identifier>/* directory, and if all the files under this directory need to be synchronized between the patch file system and the run file system, then add the following entry in the custom synchronization driver file:
rsync -zr %s_current_base%/EBSapps/comn/java/classes/<Company identifier> %s_other_base%/EBSapps/comn/java/classes

You can use context variables in the entries you add. The syntax for a context variable is: %s_sample_var%
Any paths you include in your entries should be specified relative to s_current_base and s_other_base.

Wednesday, January 7, 2015

EBS R12/Linux -- frmweb , segfault error 4 in libsosdw.so.0 --investigation

In a EBS R12 environment, segfaults for frmweb may appear in sys log (/var/log/messages) .
I have faced with this errors in a EBS 12.1.3 environment, which was running on Oracle Linux 6.5 64 bit.
Anyways, the reason that makes me write this blog post, is not the segfault itself. I m writing this blog post to show you the method to investigate these types of errors further..
Okay.. The error itself does not give us detailed information.  We need to have at least a call stack of frmweb while it was encountering the segfault.
  • Here ; the error is reported in /var/log/messages;
kernel: frmweb[9727]: segfault at 6974655e ip 00000000f758f2c1 sp 00000000ffdd9b58 error 4 in libsosdw.so.0[f758b000+b000]
  • But the coredump can not be produced because of the configuration of abortd;
Jan 5 17:38:02 erpr12app abrt[20281]: Saved core dump of pid 29744 (/u1/oracle/R12PROD/apps/tech_st/10.1.2/bin/frmweb) to /var/spool/abrt/ccpp-2015-01-05-17:38:01-29744 (58826752 bytes)
Jan 5 17:38:02 erpr12app abrtd: Directory 'ccpp-2015-01-05-17:38:01-29744' creation detected
Jan 5 17:38:02 erpr12app abrtd: Executable '/u1/oracle/R12PROD/apps/tech_st/10.1.2/bin/frmweb' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Jan 5 17:38:02 erpr12app abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-01-05-17:38:01-29744' exited with 1
  • So we configure the abort daemon to create the coredump for the unpackaged applications like frmweb.
To accomplish that; we modify the file /etc/abrt/abrt-action-save-package-data.conf file as follows;
ProcessUnpackaged = yes
Then, we restart the abrtd ..-> service abrtd restart 
  • After the restart of abrtd, we see that coredumps are generated for frmweb segfaults.
Jan  5 22:30:54 erpr12app kernel: frmweb[22841]: segfault at 6974655e ip 00000000f75d42c1 sp 00000000fff34e98 error 4 in libsosdw.so.0[f75d0000+b000]
Jan  5 22:30:55 erpr12app abrt[24120]: Saved core dump of pid 22841 (/u1/oracle/R12PROD/apps/tech_st/10.1.2/bin/frmweb) to /var/spool/abrt/ccpp-2015-01-05-22:30:54-22841 (47157248 bytes)
Jan  5 22:30:55 erpr12app abrtd: Directory 'ccpp-2015-01-05-22:30:54-22841' creation detected
  • We use gdb(GNU debugger) to reach the callstack stored in the coredump;
cd  /var/spool/abrt/ccpp-2015-01-05-22:30:54-22841
gdb /u1/oracle/R12PROD/apps/tech_st/10.1.2/bin/frmweb coredump
(gdb) bt

-->Program terminated with signal 11, Segmentation fault.
#0 0xf75b12c1 in siehjmpterm () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libsosdw.so.0
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.i686 libXp-1.0.0-15.1.el6.i686 libXt-1.1.3-1.el6.i686 libgcc-4.4.7-4.el6.i686 libuuid-2.17.2-12.14.el6.i686 multi_lib32-4.8.2-1_Oso.x86_64 openmotif21-2.1.30-11.EL6.i686
(gdb) bt
#0 0xf75b12c1 in siehjmpterm () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libsosdw.so.0
#1 <signal handler called>
#2 0xf77d742e in __kernel_vsyscall ()
#3 0xf6a825d3 in __read_nocancel () from /lib/libpthread.so.0
#4 0xf758fe00 in ixhgr_GenRead () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#5 0xf758f059 in ixhrdh_ReadHeader () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#6 0xf758fa5e in ixhhsm_HTTPStateMachine () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#7 0xf758f51e in ixhhrd_HTTPRead () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#8 0xf7595a76 in ixncrd_Read () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#9 0xf759614b in ixncru1_ReadUB1 () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#10 0xf7594054 in ixncrme_ReadMsg () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#11 0xf7595004 in ixncrq_ReadQueue () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#12 0xf7595328 in ixncpl_PutList () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libixw.so.0
#13 0xf76bc53c in iifwcpl_PutList () from /u1/oracle/R12PROD/apps/tech_st/10.1.2/lib/libiifw.so.0
#14 0xf4b9dfa6 in afm_msg_send () from /u1/oracle/R12PROD/apps/apps_st/appl/fnd/12.0.0/bin/fndfmxit.so
  • After obtaining the call stack, we check the Oracle Support .. We check the bugs, and Knowledge base to find a document or bug record that matches the call stack.
  • In this particular case; the solution was applying patch: 8940272 , as the general error definition and call stack declared in the related document and bug record was the same as ours.

Tuesday, January 6, 2015

Linux -- Soft vs Hard Limits -- from Oracle's perspective

As known, we set the limits for our operating system user using limits.conf , ulimit command or similar techniques..
We set two type of limits.. I mean, Soft limits and hard limits.
When we look from the Oracle's perspective; we set these limits according to our Oracle Product's needs.

For example; Oracle documents says set files limits as follows;

oracle soft nofile 1024
oracle hard nofile 65536

The explanation of these setting is ;

A process that is owned by oracle user can open 1024 files at a time, but if the process that is owned by oracle user wants the increase this limit, it can increase this limit up to 65536.

Another example:

oracle soft nproc 2047
oracle hard nproc 16384

nproc is a little different..
oracle user can create 2047 processes at a time, but if oracle user want the increase this limit, it can increase this limit up to 16384.

Focus on the words : if oracle user want the increase this limit

So, we set the soft limit and hard limit right?  Then , both of them are used  ..
Lets consider the scenario; 
Suppose we login to the server using oracle Os user, 
Note that :Our soft open file limit is 20 and our hard open file limit is 20000 .

ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 94934
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 20    ---> our open file limit is set to soft open files limit initially (in our bash when we login)
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 16384
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

As seen above, the soft limit for our session is 20 and it is our current open files limit when we login to the server with oracle user..
lets start the database and trace the system calls;

getrlimit(RLIMIT_NOFILE, {rlim_cur=20, rlim_max=20000}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=20000, rlim_max=20000}) = 0

You see, the session is under control of the soft limit. 
During startup 
(
I mean 
sqlplus "/as sysdba "
SQL>startup
)
oracle binary first uses getrlimit system call to see the current and max limits.. Then uses setrlimit system call to set the max number of files limit to be equal to the hard limit value.

Oracle does this even if soft proc is 39000 and hard proc is 40000.. It basically sets the number of file limit as equal to the hard limit..

getrlimit(RLIMIT_NOFILE, {rlim_cur=39000, rlim_max=40000}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=40000, rlim_max=40000}) = 0

LOCAL=NO , background processes and beq connections also behaves the same.. they set the hard limit as their soft limit... 
 (note that with local=no , the listener process limits applies) .. So if you change the limits of your  shell, remember to restart your listener. ( or lets say restart every related thing :))

Then why do we have soft limits for open files or lets say soft limit at all?
I guess there are some oracle processes which cant modify their soft limits , I mean there may be some oracle processes which cant increase their open file limits up to hard limit.. So, that's why oracle recommends a soft limit such as 16384 for soft limit of open files. 
Or
Maybe there are some points in the code, that can not break.. The soft limits are there to guarantee these points to be executed without any problems..
Or
It may be related with the db_file init.ora parameter (or with maxdatafiles) .. I mean the required soft nofile limit specified in the Oracle documents is 1024, so it may be derived by these parameters + a bunch of library files mapped by the process
In addition to that, these limits are supplies security in a manner.

Anyways,  what we can say is that ; with the settings in Oracle docs; an Oracle process will have open files limit as 1024(at worst) , when it starts to its execution.
This seems to be sufficient in the first start.. But then an Oracle process sets its soft nofile limit to be the same as its hard nofile limit, to support more complicated/complex operations ,which may require more files to be opened in process level.

At the end of the day; Oracle recommends that...
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
oracle soft stack 10240

Linux -- setting limits in .bash_profile vs 90-nproc.conf file

I already wrote a post about setting limits in Redhat/Oracle Linux 6..
http://ermanarslan.blogspot.com.tr/2014/11/ebslinux-fork-retry-resource.html
There is bug introduced by using nproc.conf file to prevent the fork bombs, and Application admins should add their limits settings into .bash_profile of the relevant Os user to workaround this problem..
But yesterday, I have seen a problem while making a switch user operation(su - osuser)..
The problem was that , altough we have the required settings in the user's .bash_profile, "su - osuser" can not be completed.. The error was something like resource unavailable.
Then I have started to investigate the problem , and end up with the following;

When executing su - osuser operation;
90-nproc.conf (which is causing the problems as explained above) is read after all other limits related files(for example : limits.conf)
After reading the file , the limits setting are done according to the values which are set in 90-nproc.conf file.

open("/etc/security/limits.d/90-nproc.conf", O_RDONLY) = 3  -> we read the file
fstat(3, {st_mode=S_IFREG|0644, st_size=316, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5726262000
read(3, "# Default limit for number of us"..., 4096) = 316
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f5726262000, 4096)            = 0
setrlimit(RLIMIT_NPROC, {rlim_cur=1024, rlim_max=1024}) = 0  -> we set the limits to the values which we read from the 90-nproc.conf file
setrlimit(RLIMIT_NOFILE, {rlim_cur=40384, rlim_max=40384}) = 0   -> we set the limits to the values which we read from the 90-nproc.conf file
setpriority(PRIO_PROCESS, 0, 0)         = 0

The important thing is the number of process setting in that file is 1024, and su - osuser sets it to 1024..
Thus , we get resource unavailable errors in our crowded systems while switching to our OS user.. 
--eventhough we have the needed limits settings in our .bash_profile..

Normally, what happens is ;

In Time 0  -> 
we issue a su - osuer command and during the su operation we have low limits  ..

ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515291
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 40384
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

In Time 1 ->

But after the su - osuser operation completes, our .bash_profile  is read, and our high limit setting commands are executed , and our limits are set as we want them to be set. 
Because we have someting like following written in our .bash_profile..

if [ $USER = "your_os_user" ]; then

        if [ $SHELL = "/bin/ksh" ]; then

              ulimit -p 16384

              ulimit -n 65536

        else

              ulimit -u 16384 -n 65536

        fi

fi

In Time 2 ->
We start our application processes, and everyting works well..

But sometimes, we cant even complete our su - osuser operation.. 
If that is the case; the solution is to set the limits in 90-nproc.conf file , too..

Note that:  Being not able to make switch user does not affect the application in this case..
I mean, your application may run without problems, if you have defined the limits in the application OS user's .bash_profile.. 90-nproc.conf plays part in the su - osuser phase..

So we can say that; setting the limits in .bash_profile affects application and setting the limits in 90-nproc.conf affects Linux admins :)

Okay.. That's enough for this topic :). I hope you 'll find it useful.

Monday, January 5, 2015

EBS 12.2 -- 12.2.3 to 12.2.4 Upgrade

Current Release: 12.2.3 
Target Release  : 12.2.4
Language other than American English : Turkish
Operating System : Oracle Linux 6
Documents Reference:
o   Oracle E-Business Suite Release 12.2.4 Readme (Doc ID 1617458.1)
o   Oracle E-Business Suite Release 12.2: Consolidated List of Patches and Technology Bug Fixes (Doc ID 1594274.1)
o   Applyng the Latest AD and TXK Release Update Packs to Oracle E-Business Suite Release 12.2 (Doc ID 1617461.1)

Important Note:
Patches may differ according to your patch levels.. Also Oracle may release new patches and update the upgrade documents.
This document should be followed just to understand the general process of EBS 12.2.3 to 12.2.4 upgrade.. Patches and actions should be crosschecked with the Oracle Documents, which I have given as references above.  Also env file paths may differ according to your environment..

1.  Oracle E-Business Suite Release 12.2: Consolidated List of Patches and Technology Bug Fixes (Doc ID 1594274.1)

a.      Database Patches and Bug Numbers
Database 11.2.0.3 Patches and Bug Numbers
Following database patches have been applied using opatch utility:
· 16989137
· 18685209
· 17600719
· 19078951
· 17693770
· 19393542
· 17875948
· 17912217
· 18116376
· 18419770
· 18614015

b.      Application Tier Technology Patches and Bug Numbers
Following technology patches have been applied using opatch utility:

Forms and Reports 10.1.2.3 Patches and Bug Numbers
export PATH=$ORACLE_HOME/OPatch:$PATH

· 7140405
· 8943095
· 15900214
· 5659594
· 17645157
· 17653437
· 17907988
· 18186693
· 18620223
· 19434967
· 12837860

Patches and bug numbers for FMW webtier 11.1.1.6
 export ORACLE_HOME=/u01/install/APPS/fs1/FMW_Home/webtier/
 export PATH=$ORACLE_HOME/OPatch:$PATH
· 13055259
· 17639414
· 17555224
· 19945419

Patches and bug numbers for 11.1.0.7 RSF within FMW webtier
· 7695070
· 12949905
  
Patches and bug numbers for FMW oracle_common 11.1.1.6
export ORACLE_HOME=/u01/install/APPS/fs1/FMW_Home/oracle_common/
export PATH=$ORACLE_HOME/OPatch:$PATH
· 13490778
· 13947608
· 9905685
· 17394319
· 18989444
· 19462638
· 17284368

Patches and bug numbers for Oracle Weblogic Server 10.3.6.0
· 17495356
· 17893334
· 19600486

2.  Applying the Latest AD and TXK Release Update Packs to Oracle E-Business Suite Release 12.2 (Doc ID 1617461.1)
- Sourced the run edition environment file and Start new online patching cycle.
· $ . <EBS_ROOT>/EBSapps.env run
· $ adop phase=prepare

- Downloaded and unzipped the following patches:
· Patch 18283295:R12.AD.C.Delta.5
· Patch 20034256:R12.AD.C
· Patch 20075842:R12.AD.C
· Patch 18288881:R12.TXK.C.Delta.5
· Patch 20043910:R12.TXK.C
· Patch 19803392:R12.TXK.C
· Patch 20158287:R12.TXK.C
· Patch 19259764:R12.FND.C

- Execute adgrants by following the instructions in the readme of  Patch 20034256:R12.AD.C.

$ adop phase=apply patches=18283295,20034256,20075842 merge=yes
$ adop phase=apply patches=18288881,20043910,19803392,20158287 merge=yes
$ adop phase=apply patches=19259764
$ . <EBS_ROOT>/EBSapps.env run
$adop phase=finalize
$ adop phase=cutover
$ . <EBS_ROOT>/EBSapps.env run
$ adop phase=cleanup
$ adop phase=fs_clone

3.  Path B - Existing Customers Release 12.2.3 upgrading to 12.2.4        Doc ID 1617458.1: 6.2
In this step, preffered to follow Path B, as the system was already in service and it have been used by the clients for 3 months.
Sourced the run environment and started new online patching cycle.
· $ . <EBS_ROOT>/EBSapps.env run
· $ adop phase=prepare

Applied Oracle E-Business Suite Release 12.2.4 Patch 17919161
· $ adop phase=apply patches=17919161

Completed the online patching cycle and Synchronized file system
· $ adop phase=finalize
· $ adop phase=cutover
· $ adop phase=cleanup
· $ adop phase=fs_clone

Applied 12.2.4 Online Help Patch 17919162 merged with Patch 19290141 using adop hotpatch mode on the run file system.
·  $ adop phase=Applied patches=17919162,19290141 hotpatch=yes merge=yes

4.  Performed NLS-related Step
Sourced the run edition environment file and Started new online patching cycle.

· $ . <EBS_ROOT>/EBSapps.env run
· $ adop phase=prepare

Applied NLS Patch 17919161
· $ adop phase=apply patches=17919161_TR:u17919161.drv

Applied NLS 12.2.4 Online Help Patch 17919162
· $ adop phase=apply patches=17919162_TR:u17919162.drv

Applied HRGlobal Patch
· $ adop phase=apply patchtop=$PER_TOP/patch/115 patches=driver:hrglobal.drv options=nocopyportion,nogenerateportion,forceapply

Completed the online patching cycle and Synchronized file system

· $ adop phase=finalize
· $ adop phase=cutover
· $ adop phase=cleanup
· $ adop phase=fs_clone