Wednesday, November 26, 2014

Linux,Netapp luns-- Disk Alignment

These days I am working with NetApp LUNs.
While going through my document archive, I found a disk alignment approach that can be used when NetApp LUNs carry ext3/ext4 filesystems on Linux. I want to share it with you.

Using fdisk's expert mode in Linux, we define the disk geometry for the mapped LUNs.

For example:

fdisk /dev/mapper/NetAppU1150g

We use "x" to enter expert mode and set the number of heads, sectors per track and cylinders as follows:

head=256
sector=56
cylinder= (The size of disk in bytes)/512/256/56

Then we return to normal mode and create our new partition, accepting the defaults.
Finally we use "w" to write the partition table and exit.
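The cylinder formula above is plain integer arithmetic, so it can be scripted. The following is only a sketch: lun_cylinders is a made-up helper name, the 150 GiB size is a hypothetical example, and the division truncates any remainder. Note that with 56 sectors per track, one track is 56 x 512 = 28672 bytes, which is exactly 7 x 4096.

```shell
# Sketch: cylinder count for fdisk expert mode, given the LUN size in bytes.
# Geometry from the text: 512-byte sectors, 256 heads, 56 sectors per track.
lun_cylinders() {
    size_bytes=$1
    echo $(( size_bytes / 512 / 256 / 56 ))
}

# Hypothetical 150 GiB LUN (150 * 1024^3 bytes); prints 21942.
lun_cylinders 161061273600

# On a real host the size could come from the device itself, for example:
#   lun_cylinders "$(blockdev --getsize64 /dev/mapper/NetAppU1150g)"
```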

Extra info:

The host filesystem needs to be aligned with the storage LUN in order to achieve the best performance.
This is accomplished by specifying the correct starting sector, or logical block address (LBA), for the partition. The filesystem should also be configured to use a block size that is equal to, or a multiple of, the storage block size.

On Linux, you can configure the partition using fdisk (x option).

Alignment is important and should be considered for storage IO performance. Otherwise, you can end up with the following:

Misaligned filesystem and LUN:



With a misaligned partition, a single OS IO can straddle two storage blocks, so it is almost equal to doing 2 storage IOs for 1 OS IO.
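Whether a partition start is aligned can be checked with a little arithmetic. This is a minimal sketch and not a NetApp tool: is_aligned is a hypothetical helper, and the 512-byte sector and 4096-byte storage block sizes are assumptions that are not stated in the text above.

```shell
# Sketch: is a partition start sector aligned to the storage block size?
# Assumptions: 512-byte sectors; storage block size defaults to 4096 bytes.
is_aligned() {
    start_sector=$1
    block_bytes=${2:-4096}
    [ $(( start_sector * 512 % block_bytes )) -eq 0 ]
}

# 56 is one track in the 256/56 geometry, 63 is the classic DOS default start.
for s in 56 63 64; do
    if is_aligned "$s"; then
        echo "sector $s: aligned"
    else
        echo "sector $s: misaligned"
    fi
done
```

The start sector of an existing partition can be read from the output of fdisk -lu.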

RAC,Linux -- asm libraries , Do we need to install asmlib, ASM kernel driver and ASM support in OEL6?

Normally, there are 3 packages that need to be installed in a Linux environment that will run Oracle ASM.
These packages are:

oracleasm (The kernel driver)
oracleasmlib (ASM libraries)
oracleasm-support  (ASM Support scripts)

We normally install these one by one, and then use our ASM instances in our RAC environment.
However, I have realized that things are a little different when you use Oracle Enterprise Linux 6 for your RAC environment.
There are 3 facts to know for a RAC installation based on Oracle Enterprise Linux 6:

1) The oracleasm kernel driver is built into the Unbreakable Enterprise Kernel for Oracle Linux 6 and does not need to be installed manually.
2) oracleasm-support can be chosen while installing Oracle Linux 6, so we do not need to install it manually.
If we do want to install it manually, the oracleasm-support package can be downloaded from the Unbreakable Linux Network (ULN) if you have an active support subscription, or from http://public-yum.oracle.com if you do not.
3) The oracleasmlib package still needs to be installed manually. It can be downloaded from "http://www.oracle.com/technetwork/server-storage/linux/asmlib/ol6-1709075.html" -> 


Intel EM64T (x86_64) Architecture


Tuesday, November 25, 2014

EBS,Linux -- fork: retry: Resource temporarily unavailable, limits.conf has no effect in Redhat

The "Resource temporarily unavailable" error is actually self-explanatory. It means that the OS has the kind of resource the process needs, but that resource cannot be given to the process at the moment.
As Linux admins, our interest should be the cause of this error and, of course, the solution to it.
This error may have critical effects, such as:

EBS services cannot be started:
/u01/fs2/inst/apps/ERMAN_ERBANT/admin/scripts/adstrtal.sh: fork: retry: Resource temporarily unavailable

Basic commands cannot be run:
 ps aux
-bash: fork: retry: Resource temporarily unavailable

ps -ef |grep pmon
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable

As written in the error lines, the process cannot fork. As you know, fork() creates a new process by duplicating the calling process.
The cause of the error comes from the OS limits, and here is the checklist for finding it.

First, we check /etc/sysctl.conf and /etc/security/limits.conf, and try to find a clue in these two files. We check for undersized parameter values such as the maximum process count and the maximum file descriptor count.
Note that fork makes me think directly of the process count, but a file descriptor check is nice to have in this kind of OS limit error.

Note that activities such as checking memory with the free command or checking disk space with the df command are irrelevant here.

Okay, let's explain the problem determination and the solution by walking through an example scenario:

Problem :
/u01/fs2/inst/apps/ERMAN_ERBANT/admin/scripts/adstrtal.sh: fork: retry: Resource temporarily unavailable


Here is an example sysctl.conf: 

net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Disable netfilter on bridges.
# net.bridge.bridge-nf-call-ip6tables = 0
# net.bridge.bridge-nf-call-iptables = 0
# net.bridge.bridge-nf-call-arptables = 0

# Controls the default maxmimum size of a mesage queue
# kernel.msgmnb = 65536
# Controls the maximum size of a message, in bytes
# kernel.msgmax = 65536
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
# kernel.shmall = 4294967296
kernel.sem = 256 32000 100 142
kernel.shmall = 2097152
kernel.shmmni = 4096
kernel.msgmax = 8192
kernel.msgmnb = 65535
kernel.msgmni = 2878
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range=9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576

We are interested in fs.file-max above. It is 6815744, which is actually pretty good.

Here is an example of limits.conf:

* hard nofile 65536
* soft nofile 4096
* hard nproc 16384
* soft nproc 1024
* hard stack 16384
* soft stack 10240

Okay. In limits.conf, we have a soft nproc of 1024 and a hard nproc of 16384.

These values seem okay when you take the Oracle Database installation document as a reference, but there is an important point: they are actually suggested minimum values, so they should be increased according to the situation, or let's say according to the load and concurrency.

Okay. After this introduction, here are the fd counts, the process counts and the ulimits of the problematic OS user at the problematic moment:

cat /proc/sys/fs/file-nr
26016 0 6815744

26016 (file handles currently allocated)
0 (allocated but unused file handles)
6815744 (maximum number of file handles, i.e. fs.file-max)

lsof | wc -l (note that lsof reports a somewhat higher number than the allocated file descriptors, so it is consistent with the file-nr output)
33572

However, this check is not the right one, because 33572 is the total number of open files on the system. Remember that our nofile limits are per process.
To monitor the open files per process, we may use:
for p in /proc/[0-9]* ; do echo "$(ls $p/fd 2>/dev/null | wc -l) $(tr '\0' ' ' < $p/cmdline)" ; done | sort -n | tail
On a crowded EBS application server, I see at most 650-660 files opened by a single process.

ps -eLF | grep oracle| wc -l ( !!!! the user oracle has 1029 processes)
1029

Note that the nproc limit is per user (not per process), and it counts threads as well, which is why ps -eLF (one line per thread) is used above.

Let's look at the limits of the oracle user:

ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 95145
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

As you may guess, the oracle OS user had reached its process limit at the problematic moment, and it seems this is what creates our fork problem.
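The comparison made here (1029 tasks against a soft limit of 1024) can be scripted for monitoring. This is only a sketch: nproc_status is a hypothetical helper name, and the live command in the trailing comment assumes the procps version of ps and a bash login shell.

```shell
# Sketch: compare a user's task count against a soft nproc limit.
nproc_status() {
    used=$1
    limit=$2
    if [ "$limit" = "unlimited" ]; then
        echo "OK: no nproc limit"
    elif [ "$used" -ge "$limit" ]; then
        echo "AT LIMIT: $used tasks, limit $limit (fork may fail)"
    else
        echo "OK: $used tasks, limit $limit ($(( limit - used )) left)"
    fi
}

# Values from the example in this post.
nproc_status 1029 1024

# On a live system, run as the affected user:
#   nproc_status "$(ps -L -u "$(id -un)" -o pid= | wc -l)" "$(ulimit -u)"
```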

The solution here is to increase the nproc values and restart our shell processes. A reboot also works for this purpose, but it is not necessary. What we need to do is edit limits.conf as root, then log in to our oracle account again and start our application processes. In addition, we need to be sure that the line "session required pam_limits.so" is present in our PAM login configuration (for example /etc/pam.d/login).



Note that:
Watch out for extra PAM configuration.
In some releases, PAM ships an additional file which overrides the limits.conf setting for the number of processes. If this is the case, the modification should be done in that file.
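To see which file wins, it helps to list every nproc line in the order the files are read. The sketch below assumes the usual pam_limits layout, where /etc/security/limits.conf is read first and the files under /etc/security/limits.d are read afterwards and can override it. On Red Hat 6 style systems the overriding file is typically /etc/security/limits.d/90-nproc.conf, but treat that name as an assumption to verify on your own release. nproc_settings is a hypothetical helper name.

```shell
# Sketch: print every active nproc line from the given files, in order.
# Later lines for the same user and type override earlier ones.
nproc_settings() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # Skip comment lines; the third field of a limits line is the item name.
        awk -v file="$f" '$1 !~ /^#/ && $3 == "nproc" { print file ": " $0 }' "$f"
    done
}

nproc_settings /etc/security/limits.conf /etc/security/limits.d/*.conf
```

If a "* soft nproc 1024" line shows up from a file under limits.d, raising nproc only in limits.conf will not change what ulimit -u reports for the oracle user.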

This behaviour is described in a Red Hat bug record:




One last thing:
An alternative is to put the following script in the .bash_profile of the problematic OS user:

if [ "$USER" = "oracle" ]; then
        if [ "$SHELL" = "/bin/ksh" ]; then
              ulimit -p 16384
              ulimit -n 65536
        else
              ulimit -u 16384 -n 65536
        fi
fi

RAC -- Preparing for the installation -- requests

RAC installation is a complex process which consists of several sub-processes.
Preparing for RAC is the starting point among these.
In the preparation phase, we as Database Administrators / RAC admins deliver our system requirements to the client, and we do so by briefly explaining RAC and the related system concepts.
The following is a system requirements paper that I delivered. It was written pretty quickly and delivered after a RAC pre-install meeting with the client. It also contains a brief explanation of the migration process.
The RAC system mentioned in the paper was supposed to be an 11gR2 Linux RAC consisting of two nodes.

->
RAC nodes will be Oracle Linux 6.5 64 bit, so the Oracle Linux 6.5 ISO file should be ready for us. It should be burned to DVD or mounted to the servers as an image file using ILOM, etc. before the installation.

After the OS installation, the Oracle Grid and RDBMS software installations will be done.
There will be 2 disk groups, DATA and RECO, and the system will be ready for migration at this point.

RMAN and rconfig will be used for the migration. (The RMAN convert option and rconfig will be used together for the Windows single instance to Linux RAC migration here.)

The whole process will take 3-4 days.

A diagrammatic explanation of Oracle RAC:

The disks presented by the SAN should be in the form of 8 LUNs (for example, rather than having 1 x 1600 GB, we prefer 8 x 200 GB LUNs). We want 8 LUNs because the environment has multipath with 2 active paths, so we prefer (number of active paths) x 4 LUNs for ASM. The disks should be aligned for ASM use.

On the network side, the IP addresses and hostnames should be ready before the installation day.
Firewalls should be configured properly for the RAC public IPs and VIPs, for client-application-database communication.

Network Requirements:

Public IPs, Private IPs, Virtual IPs and SCAN IPs:

On each of the nodes:

We'll have 1 public, 1 private and 1 virtual IP address.
We'll have 3 SCAN IP addresses and one SCAN hostname defined for these 3 IP addresses. The SCAN IPs and hostname will exist only in DNS, and the DNS must have round robin capability.
We'll use bonding on the Linux tier for both the private and public interfaces, so network configuration, cabling and switch configuration should be done accordingly.

An Example Configuration:

Note: Public IPs, Virtual IPs and SCAN IPs must be on the same subnet.
Node 1:

Public hostname: osrvdb01, public ip: 10.35.120.xx /255.255.255.0 --configured on the server before the installation, and in DNS as well.
Private hostname: osrvdb01-priv, private ip: 192.168.10.x /255.255.255.0 --configured on the server before the installation, in its own subnet. No DNS configuration.
Virtual hostname: osrv01-vip, virtual ip: 10.35.120.yy /255.255.255.0 --configured in DNS before the installation.


Node 2:

Public hostname: osrvdb02, public ip: 10.35.120.zz /255.255.255.0 --configured on the server before the installation, and in DNS as well.
Private hostname: osrvdb02-priv, private ip: 192.168.10.x /255.255.255.0 --configured on the server before the installation, in its own subnet. No DNS configuration.
Virtual hostname: osrv02-vip, virtual ip: 10.35.120.tt /255.255.255.0 --configured in DNS before the installation.

--once the cabling is done, we'll configure the network interfaces and bonding on the Linux tier.
Bonding will be active-backup. It can be active-active (load balanced) if requested.


Additionally;

3 SCAN entries defined in DNS:

erm-scan : 10.35.120.aa /255.255.255.0 --should be configured on DNS before the installation.
erm-scan : 10.35.120.bb /255.255.255.0 --should be configured on DNS before the installation.
erm-scan : 10.35.120.cc /255.255.255.0 --should be configured on DNS before the installation.
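The rule that public IPs, virtual IPs and SCAN IPs must share a subnet can be checked before the installation day. The following is a minimal sketch; ip_to_int and same_subnet are hypothetical helper names, and the addresses in the demo are made-up stand-ins for the xx/yy/aa placeholders used in the paper.

```shell
# Sketch: convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# True when both addresses fall in the same network for the given netmask.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Hypothetical public IP against a hypothetical VIP, then against a private IP.
same_subnet 10.35.120.11 10.35.120.21 255.255.255.0 && echo "public and vip: same subnet"
same_subnet 10.35.120.11 192.168.10.1 255.255.255.0 || echo "public and private: different subnets"
```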

Sunday, November 23, 2014

EBS and Oracle Database 12c certifications

At the moment, Oracle Database 12c is certified to work with the following E-Business Suite versions.
I'm happy to see 12.1.3 there. Also, I was not expecting to see 11.5.10.2, but it is pleasing, because there are several customers still using 11.5.10.2.

Certification info is as follows:

  • Oracle Database 12.1.0.2.0 is certified with Oracle E-Business Suite 12.1.3.
  • Oracle Database 12.1.0.1.0 is certified with Oracle E-Business Suite 12.2.4, 12.2.3, 12.2.2 , 12.1.3 , 12.0.6 and 11.5.10.2.
It is also interesting to see that Database 12.1.0.2.0 is certified only with EBS 12.1.3.

Anyway, I'm sharing the roadmap for using the latest Oracle Database and E-Business Suite together, with the documents to be followed in sequence ->


12.2.0 installation:
Oracle E-Business Suite Installation and Upgrade Notes Release 12 (12.2) for Linux x86-64 (Doc ID 1330701.1)

12.2.4 Upgrade: 
Oracle E-Business Suite Release 12.2.4 Readme (Doc ID 1617458.1)


12c DB upgrade:
Interoperability Notes Oracle EBS 12.2 with Oracle Database 12c Release 1 (Doc ID 1926201.1)


Thursday, November 20, 2014

Toad for Oracle -- OCI cant initialize error

Normally, I don't write about Toad. Actually, I don't write about any client tools, but this time I'm writing about Toad because I like to write about undocumented things :)

The error is the famous "OCI cant initialize" error. OCI stands for Oracle Call Interface, and it is the comprehensive, high performance, native C language interface to Oracle Database.
Although the cause seems to be related to a problem in OCI initialization, the OCI failure is actually a result, not the cause.



The error is encountered when we try to connect to a database, any database actually.
It may appear one day without any apparent reason.
It may also affect only some users. Even if all the users are using a single Toad binary through a Windows Terminal Server, the error may be seen by only some of them.


The solution is to delete the directory named AppData\Roaming\Quest Software\Toad for Oracle for the problematic users. Note that if you delete this directory, your license info, your passwords and your connection information will be deleted too.
An alternative is to delete the directory of the problematic user and copy the same directory from another user, for example copying C:\Users\erman\AppData\Roaming\Quest Software\ to C:\Users\ali\AppData\Roaming\Quest Software

AppData\Roaming in Windows is where programs on your machine store data that is specific to your user account. The folder is normally hidden, so you need to display hidden files to work on it.
So the cause must be something like a logical corruption, or a security problem introduced with a new configuration, in the Toad files located in the Roaming folder.
Anyway, you don't need to reinstall Toad if you get this OCI cant initialize error. Try the solution that I have mentioned above first :)

Weblogic -- AppSecurityContext.setApplicationID.null error when using Nodemanager //Diagnostics

You may encounter the AppSecurityContext.setApplicationID.null error when starting your Weblogic application from the Weblogic console.
It happened to me, and it took me 6 hours of diagnostics to find the cause of this error.
On the other hand, once the cause is found, it is easy to take the corrective action.

Actually, in my opinion, we should not need to deal with this kind of error as Apps DBAs, or even as Weblogic admins.
On the other hand, dealing with this kind of error and correcting it makes me happy in a way :)
But I must admit that I have started to think that, day by day, things in the Oracle world are getting more difficult rather than easier.

Okay, let's take a look at the error.

The error stack is something like the following:
java.security.AccessControlException: access denied (oracle.security.jps.JpsPermission AppSecurityContext.setApplicationID.null)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:374)
at java.security.AccessController.checkPermission(AccessController.java:546)

The error appears when starting an Enterprise Application deployed in Weblogic.

The error only appears if the Managed Server that is configured to run the Enterprise Application is started using the Node Manager. In other words, if we start the Managed Server from the command line (sh startManagedServer ERMANServer t3://ermanhost:7003), the error is not seen and the Enterprise Application can be started without any problems.

So, as you may guess, the problem is related to the Node Manager, but it is hard to find the root cause as this is a specific java error.

Fortunately, it is possible to understand that it is related to java security, as it comes from the java.security package.

Anyway, I will keep it short.
One way or another, I found that the problem is caused by a missing environment variable or a missing command line argument of the java process that represents the Managed Server.
When the process (Managed Server) is started using Node Manager, there must be a missing or erroneous environment variable or java command line argument which causes this error.

To find this missing argument, or let's say environment variable, I made a test.
I started the Managed Server two times and recorded its command line using ps -ef.

First, I started the Managed Server using Node Manager and recorded its command line.
Then I started the Managed Server using the startManagedServer script and recorded its command line.
Lastly, I compared the two with each other and saw the difference.

The command line of the Managed Server process which was started using startManagedServer.sh was longer.
Using this testing technique, I reached the conclusion:
"The problem was caused by the start command that was used by the Node Manager to start the Managed Server."

The start command was missing some command line arguments, as it was shorter than the command line produced by the startManagedServer script.
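Comparing two long java command lines by eye is painful, so the comparison can be scripted. This is only a sketch of the idea, not what I ran that day: args_only_in_second is a hypothetical helper, the two sample command lines are made up for illustration, and arguments that contain spaces would be split.

```shell
# Sketch: print the arguments present in the second recorded command line
# but missing from the first. Each file holds one command line.
args_only_in_second() {
    first_sorted=$(mktemp)
    second_sorted=$(mktemp)
    # One argument per line, sorted, so comm can compare the two sets.
    tr -s ' ' '\n' < "$1" | sort -u > "$first_sorted"
    tr -s ' ' '\n' < "$2" | sort -u > "$second_sorted"
    # comm -13 keeps only the lines unique to the second file.
    comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}

# Made-up command lines standing in for the two ps -ef recordings.
work=$(mktemp -d)
echo "java -Xms512m -Dweblogic.Name=ERMANServer weblogic.Server" > "$work/nodemanager.txt"
echo "java -Xms512m -Dweblogic.Name=ERMANServer -Doracle.security.jps.config=/u01/domain/config/fmwconfig/jps-config.xml weblogic.Server" > "$work/script.txt"
args_only_in_second "$work/nodemanager.txt" "$work/script.txt"
rm -rf "$work"
```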

Then I realized that the error is related to security, and that security is configured using jps-config.xml. This config file is the Java Platform Security (JPS) configuration; it is used to configure the Policy, Credential and Key Stores, the Login Modules and the User Identity Store location.
It is located in $DOMAIN_HOME/config/fmwconfig, and it is supplied to the java processes using the parameter -Doracle.security.jps.config.

So, when started from the Node Manager, it was clear that the Managed Server was missing this config, and the application deployed on this Managed Server could not be started because of this.

The solution became clear when I looked at the file setDomainEnv.sh.
This file is sourced before the Managed Server is started using startManagedServer.sh.

In setDomainEnv.sh, I saw an export of the environment variable EXTRA_JAVA_OPTIONS.
This environment variable was used to set some additional command line arguments, like jps-config!

There were several other environment variables referenced in EXTRA_JAVA_OPTIONS though.
It was like EXTRA_JAVA_OPTIONS=/erman/java/blabla:$ERMAN:$ERMAN2... and so on.

To get its plain text, I used echo $EXTRA_JAVA_OPTIONS just after sourcing setDomainEnv.sh.
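The "source, then echo" step can be done in a subshell so that the current session is not polluted with the domain environment. A minimal sketch follows; expanded_value is a hypothetical helper, and the demo uses a small stand-in script with made-up values rather than a real setDomainEnv.sh.

```shell
# Sketch: source an environment script in a subshell and print one variable
# with all nested variables expanded.
expanded_value() {
    (
        . "$1" >/dev/null 2>&1
        eval "printf '%s\n' \"\${$2}\""
    )
}

# Stand-in for setDomainEnv.sh; names and values are illustrative only.
demo=$(mktemp)
cat > "$demo" <<'EOF'
ERMAN=/erman/conf
EXTRA_JAVA_OPTIONS="-Dexample.config=${ERMAN}/example.xml"
export EXTRA_JAVA_OPTIONS
EOF
expanded_value "$demo" EXTRA_JAVA_OPTIONS
rm -f "$demo"
```

Against a real domain, the call would look like expanded_value "$DOMAIN_HOME/bin/setDomainEnv.sh" EXTRA_JAVA_OPTIONS, assuming the script lives in the bin directory of the domain.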

Once I got the plain text, I opened the Weblogic console and wrote it into the Server Start tab of the problematic Managed Server.

After I did this, the error disappeared. Our ADF Enterprise Application could start even when its Managed Server was started using Node Manager.



Another thing I have learned is that Node Manager reads the needed command line arguments from the server-start options of the related Managed Server. If it can't find any server-start options for a particular Managed Server, it starts the Managed Server with default java arguments, which leads to this error.

The following is the server-start configuration that I made to overcome this error.


I hope you will find this useful.

Weblogic -- HACMP Configuration, IP Address Change

I have done several Oracle implementations in AIX environments. From Oracle's perspective, the technical work in AIX implementations is similar to that in Oracle on Linux implementations. Of course, there are some differences which affect our implementations, like using a different package manager and executing different commands.
On the other hand, these differences are not so big that a senior Linux admin can't handle them.
Even clustering in AIX is quite similar. I'm talking about HACMP here.

As you may know, HACMP (High Availability Cluster Multiprocessing) is IBM's solution for high availability. It can be used on AIX and on Linux on IBM System p.
Without any other clustering software like Weblogic clustering or Oracle RAC, an active/passive OID cluster can be built on top of an IBM HACMP cluster.
The key point here is the service IP used by the HACMP cluster nodes. By using the service IP, applications and databases can run on any node of the cluster after a failover, without any problem and without any manual intervention.
Of course, some big companies, like those in the banking sector, use HACMP for high availability, and they expect our Oracle products to be HACMP aware. They actually expect Oracle products to start and work without any reconfiguration, even after a node switch operation.
I have done this type of configuration for Oracle Database, OID 11g, SSO 10g and so on, but this time I will explain the behaviour of a pure Weblogic in this kind of HACMP node switch situation.

Normally, for OID 11g, SSO and the others, we need to make the configuration ourselves. Take OID 11g as an example: it runs on its own Weblogic, but it has components still controlled by opmn, which is why extra effort is required to configure OID 11g to work with HACMP. The configuration is required so that an application like OID can start on a new node when a cluster node switch occurs. Once this configuration is done, the Weblogic OID services can be started on the new node without any need for reconfiguration.
The following action plan is a good example of such a configuration:

1) Log in to the Weblogic administration console > add a new machine with its listen address set to the service-hostname > associate the Admin Server and the wls_ods1 managed server with this machine.
2) In the Weblogic administration console, change the listen addresses set for the Admin Server and the wls_ods1 managed server to the service-hostname.
3) Change the component registration setting in opmn with the following:
$ORACLE_INSTANCE/bin/opmnctl updatecomponentregistration -adminHost localhost -host <SERVICE-HOSTNAME> -adminPort 7002 -adminUsername weblogic -componentType OID -componentName oid1 -Port 3060 -Sport 3131
4) Use manageDIPServerConfig to set the DIPServer oidhostport to service-hostname:port
For example: manageDIPServerConfig set -h localhost -p 7006 -D weblogic -attr oidhostport -value <SERVICE-HOSTNAME>:3060
5) Change all the hostnames declared in $DOMAIN_HOME/opmn/topology.xml to reflect the service-hostname.
6) Look into the start/stop scripts and change the hostname if it is declared anywhere.

On the other hand, for a standalone Weblogic which provides an application server environment for an ADF application, no effort is needed to make it cluster aware.

In an application server like Weblogic, cluster awareness is supplied through the cluster service-name or the cluster service IP. When your application server and its services listen on the service name, or let's say bind to it, they become cluster aware. I mean, when a node switch occurs, they can be started on the new node without reconfiguration, because this service-name and its interface are available on all of the nodes in the cluster.

As said above, for a pure Weblogic there is no need to make any configuration to support this.

Okay. Let's prove it.

When you run netstat on a Weblogic Server running on AIX to see the address bound to the Admin Server's port or to the Managed Server's port, you will see that Weblogic listens on all of the available IP addresses on the machine.

The following is an example of a Weblogic which listens on all the available IP addresses on an AIX machine that is a member of a HACMP cluster:

local ip, service ip and localhost...

Admin Server is listening on all the available ip addresses.

           LISTEN
tcp        0      0  10.87.23.62.7006       *.*                    LISTEN
tcp        0      0  10.87.23.65.7006       *.*                    LISTEN
tcp        0      0  127.0.0.1.7006         *.*                    LISTEN

Managed Server is listening on all the available ip addresses.

          LISTEN
tcp        0      0  10.87.23.62.7009       *.*                    LISTEN
tcp        0      0  10.87.23.65.7009       *.*                    LISTEN
tcp        0      0  127.0.0.1.7009         *.*                    LISTEN


So, as Weblogic listens on all available addresses, it also listens on the service IP.
Thus we don't need to reconfigure our Weblogic Server to be cluster aware, it is already cluster aware :)
This behaviour of Weblogic is actually caused by not specifying a listen address in the configuration of either the Admin Server or the Managed Server.
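The same check can be repeated after a node switch by filtering the netstat output for a port. A small sketch, assuming the AIX netstat -an layout shown above, where the local address is printed as ip.port in the fourth column; listen_addrs is a hypothetical helper name.

```shell
# Sketch: list the local addresses listening on a given port,
# reading AIX-style netstat -an output from stdin.
listen_addrs() {
    awk -v port="$1" '
        $NF == "LISTEN" && $4 ~ ("\\." port "$") {
            addr = $4
            sub("\\." port "$", "", addr)
            print addr
        }'
}

# Sample lines taken from the Admin Server output above.
listen_addrs 7006 <<'EOF'
tcp        0      0  10.87.23.62.7006       *.*                    LISTEN
tcp        0      0  10.87.23.65.7006       *.*                    LISTEN
tcp        0      0  127.0.0.1.7006         *.*                    LISTEN
EOF
```

On a live node this would be netstat -an | listen_addrs 7006, and the service IP should appear in the list.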


Weblogic Server MBEAN Reference:

If a server's listen address is undefined, 
clients can reach the server through an IP address of the computer that hosts the server, 
a DNS name that resolves to the host, or the localhost string. 
The localhost string can be used only for requests from clients that running on the same computer as the server.

So, based on this info, a pure Weblogic is compatible with the HACMP cluster.
Nevertheless, we may want to configure both the Admin Server and the Managed Server to listen on the service IP address only.
This can be accomplished by modifying config.xml or by using the Weblogic console.
On the other hand, I don't see any need for this :)

Wednesday, November 19, 2014

EBS 12.2 - Country-Specific Functionalities

One of my colleagues asked me the purpose of the Country-Specific Functionalities we choose during the installation of EBS 12.2.

We are talking about the following rapidwiz installer screen here:


As you see above, the installer wants us to choose a region. Normally I choose Turkey here (as our customers are located in Turkey, and their employees are in Turkey and are Turkish), but doing these things blindly is not a good thing, which is why I'm writing this post about this particular screen.

Okay, this screen gives us an opportunity to activate the standard Product Localizations delivered by the RapidWiz installer. These localizations are delivered in the standard product by Oracle Applications Development. That is why the localizations which we install after the installation are called Add-on Localizations.

Actually, there are 3 types of localization in EBS:

  • Product Localizations: Delivered in the standard product by Oracle Applications Development
  • Add-on Localizations: Delivered by the Regional Field Centers (Add-on Localization Teams) via My Oracle Support
  • Partner Localizations: Delivered by partners including ISVs and system integrators
So, we choose the Product Localizations in this screen. As I mentioned before, Product Localizations are standard product features and are installed when you run Rapid Install. You simply need to license those you wish to use; this can be done from Rapid Install during the initial install, or from License Manager in Oracle Applications Manager at a later stage.
You may also notice that there are several countries to choose from in this installation screen.

One more thing about this screen: the product localizations are actually grouped. I mean, when you choose a country here, you actually activate the product localization of a particular region.

Product Localizations are available for Asia Pacific, Europe, Americas, China and India. They are divided into four main categories:

Regional Localizations (JG)
Asia/Pacific Localizations (JA)
European Localizations (JE)
Latin America Localizations (JL)


So, when you choose Argentina here, the Latin America Localizations (JL) will be activated.

In addition to that, there are other legal requirements that are common to several countries. Those "shared functionalities" are included in the Regional Localizations (JG). JG is automatically activated for any country that requires localizations.

For EBS 12.2 installations and upgrades, you may follow my installation documents in parallel with the Oracle Support documents:
http://ermanarslan.blogspot.com.tr/2014/04/ebs-oracle-ebs-1220-fresh-installation.html
http://ermanarslan.blogspot.com.tr/2014/03/ebs-oracle-ebs-122-vision-installation.html
http://ermanarslan.blogspot.com.tr/2014/06/ebs-122-non-rac-asm-installation.html
http://ermanarslan.blogspot.com.tr/2014/04/ebs-1220-to-1223-upgrade-adcdelta4.html
http://ermanarslan.blogspot.com.tr/2014/06/ebs-122-ebs-on-virtualized-oda-x4-2.html
http://ermanarslan.blogspot.com.tr/2014/05/ovm-oracle-vm-server-328-installation.html
... and more.. Just search EBS 12.2 in my blog , you will find several documents and real life examples about EBS 12.2 installations & problems-solutions & upgrades & migrations and so on...


Friday, November 14, 2014

Linux/Java -- Java Class Permissions, watch out for chmod 77 !

We had an unexpected error in the EBS login flow. It was obvious that there was a problem while executing the java classes. The error was displayed in the browser as Unexpected, but there were no details about it. Apache was reporting class not found errors for a class located in $JAVA_TOP, but the class name was not written in the Apache error log file.

As I knew the login flow, I checked the java class directories, actually the classes which play a role in the login flow of EBS, and I soon saw something weird in the permissions of the SessionMgr.class file.

The permissions of SessionMgr.class were as follows:

----rwxrwx 1 applprod applprod 33642 Nov 14 16:32 SessionMgr.class

Okay. When we decode it, it is 000 111 111 in binary, and it corresponds to 077 in the language of the chmod utility.

Then I checked the history and found the following;

chmod 77 SessionMgr.class

So, it was obvious that someone had accidentally used chmod 77 instead of chmod 777.
As you may know, these numbers are used to specify the permissions according to the modes.
First digit is for : Owner
Second digit is for : Group
Third digit is for : Others

And for each digit, in binary:

first bit is for: read
second bit is for: write
third bit is for: execute

So, chmod 777 means -> Owner, Group and Others can read, write and execute this file.

On the other hand, when you change the permissions of the file using chmod 77, it is interpreted as chmod 077, not as 770. Thus, the owner cannot read, write or execute the file, and the problem arises.
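The left padding is easy to see in a small sketch that decodes a numeric mode the same way. octal_to_rwx is a hypothetical helper written for this post; it assumes a mode of at most three octal digits and ignores special bits such as setuid.

```shell
# Sketch: turn a numeric chmod mode into an rwx string, left-padding with zeros.
octal_to_rwx() {
    mode=$1
    while [ ${#mode} -lt 3 ]; do mode="0$mode"; done
    out=""
    rest=$mode
    while [ -n "$rest" ]; do
        digit=${rest%"${rest#?}"}   # first remaining digit
        rest=${rest#?}
        case $digit in
            0) out="$out---" ;;
            1) out="$out--x" ;;
            2) out="$out-w-" ;;
            3) out="$out-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
        esac
    done
    printf '%s\n' "$out"
}

octal_to_rwx 77     # ---rwxrwx
octal_to_rwx 777    # rwxrwxrwx

# The same thing on a real (temporary) file:
f=$(mktemp)
chmod 77 "$f"
ls -l "$f" | cut -c1-10
rm -f "$f"
```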

Okay...
It is also important to know the required Java class permissions on a server.

Again, let's see the 077 effect on a Java class:
[applprod@erman tmp]$ ls -al HelloWorld.class 
----rwxrwx 1 applprod applprod 427 Nov 14 17:04 HelloWorld.class

 java -classpath /tmp HelloWorld
Exception in thread "main" java.lang.NoClassDefFoundError: HelloWorld
Caused by: java.lang.ClassNotFoundException: HelloWorld
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Caused by: java.io.FileNotFoundException: /tmp/HelloWorld.class (Permission denied)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:106)
        at sun.misc.URLClassPath$FileLoader$1.getInputStream(URLClassPath.java:1001)
        at sun.misc.Resource.cachedInputStream(Resource.java:59)
        at sun.misc.Resource.getByteBuffer(Resource.java:154)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:249)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
        ... 6 more
Could not find the main class: HelloWorld.  Program will exit.

Now see with the chmod 400;

[applprod@erman tmp]$ ls -al HelloWorld.class 
-r-------- 1 applprod applprod 427 Nov 14 17:04 HelloWorld.class
[applprod@tegvoracle tmp]$ java -classpath /tmp HelloWorld
Hello, World!

So , java works with read permission. Only with read permission.
It seems this is because the class file is read and interpreted by the java command. Java does not execute the class file directly, so it does not need execute permission on it..

So for a class to be executed, chmod 400 is enough. I mean; the owner of the file should be the one who needs to execute it , and the permissions can be 400 ...

It is like sh..  In order to execute a script with sh, you only need to have read permission on it; like the following;

-r-----r-- 1 applprod applprod     9 Nov 14 17:25 erm3
[oraprod@erman tmp]$ sh erm3    (note: a different user, but with read permission)
erm

Keep that in mind..
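
The sh behaviour can be reproduced with a short sketch. It assumes nothing beyond a POSIX shell and mktemp; make_readonly_script is a helper name invented for this example.

```shell
#!/bin/sh
# Create a scratch script with mode 400 (owner read only) and print its path.
make_readonly_script() {
    path=$(mktemp)
    echo 'echo erm' > "$path"
    chmod 400 "$path"
    echo "$path"
}

script=$(make_readonly_script)

# The interpreter only has to read the file, so this works:
sh "$script"

# Running the file directly needs the execute bit, so this is refused:
if "$script" 2>/dev/null; then
    echo "direct execution worked"
else
    echo "direct execution refused"
fi

rm -f "$script"
```

The first call prints erm, the direct call is refused. It is the same idea as java reading a class file: the program that interprets the file needs read permission, and only a file that is started directly needs execute permission.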

Linux -- Using Linux Dialog and a handy Bash Script for Database Interaction

This script is written to manipulate the data of a single table.. It works like a dialog, but it is not actually a dialog; it is a script in which we use case, functions and other commands like sqlplus..
It is a good example of writing bash scripts in an orderly way, since it is function based..
The script is written to add, remove and read the data in an Oracle database table.
It takes the password from the user, but it does not display it on the terminal for security reasons.

In past years, I have done this kind of work using Linux dialog.
Using Linux Dialog, I could provide a form on the terminal. The users can take actions using selections on the form; like the following;


The dialog-based script above was written to ease system administration in the EBS apps tier..
Users of this interface do not need to be system administrators. Everyone can use this application and perform administration operations.
The user connects to the system using ssh, and this apps tech interface menu is displayed in the terminal (putty, ssh secure shell or zoc, it does not matter)..
Whenever the user quits the application, the connection is closed too. So Linux acts like a kiosk machine.
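
A minimal sketch of such a dialog menu could look like the following. It is not the original interface: the menu entries, the commands behind them and the function names (action_for, show_menu, run_menu) are placeholders invented for this example, and it assumes the dialog utility is installed.

```shell
#!/bin/bash
# Map a menu tag to the command it stands for. These entries are placeholders,
# not the menu of the original interface.
action_for() {
    case "$1" in
        1) echo "df -h" ;;
        2) echo "uptime" ;;
        *) return 1 ;;
    esac
}

# Show the menu once. Returns non-zero when the user quits or cancels.
show_menu() {
    local choice cmd
    # --stdout makes dialog print the chosen tag on standard output.
    choice=$(dialog --stdout --menu "Apps tech interface" 12 50 3 \
        1 "Show disk usage" \
        2 "Show system load" \
        3 "Quit") || return 1
    cmd=$(action_for "$choice") || return 1
    clear
    sh -c "$cmd"
    read -r -p "Press Enter to go back to the menu" _
}

# Keep showing the menu until the user quits.
run_menu() {
    while show_menu; do :; done
}
```

The block only defines the functions; calling run_menu at the bottom of the script starts the menu. For the kiosk behaviour, one option is to end the user's ~/.bash_profile with an exec of this script (for example exec /home/appsuser/menu.sh, a made-up path). The login shell is then replaced by the menu, so the ssh session ends as soon as the menu exits.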


So, when you compare the following script with the script above, you can say that the following script is a primitive one.
On the other hand, I want to share it because it is well ordered and it is a good example for beginners.

The script is as follows;

#START OF THE SCRIPT
# Run with bash: "read -s" (silent input) is a bash feature.
echo "Welcome to Table Data management"

# Print the menu, read the user's choice and dispatch it.
mainmenu () {
echo "What would you want to do?"
echo "Enter 1 to add data to the exception list"
echo "Enter 2 to delete data from the exception list"
echo "Enter 3 to display data in the exception list"
echo "Enter 4 to exit"
read -r selection
processtheselection
}

# Insert the given username (in upper case) into the table.
insertuser()
{
echo "Please enter the username that you want to add to the exception list"
read -r username
echo "Please supply me the APPS schema password for this operation:"
# -s keeps the password from being echoed on the terminal
read -r -s apps_schema_password
sqlplus -s "apps/$apps_schema_password" <<EOF
insert into blabla_table values (upper('$username'));
commit;
EOF
}

# Delete the given username from the table (case-insensitive match).
deleteuser()
{
echo "Please enter the username that you want to delete from the exception list"
read -r username
echo "Please supply me the APPS schema password for this operation:"
read -r -s apps_schema_password
sqlplus -s "apps/$apps_schema_password" <<EOF
delete from blabla_table where upper(username)=upper('$username');
commit;
EOF
}

# List the usernames stored in the table.
displaylist()
{
echo "Please supply me the APPS schema password for this operation:"
read -r -s apps_schema_password
echo "Users in the Exception list:"
sqlplus -s "apps/$apps_schema_password" <<EOF
set pagesize 100
set linesize 120
select upper(username) as "EXCEPTION LIST USERS" from blabla_table;
EOF
}

exitapp()
{
echo "Exiting from the Application"
exit
}

# Run the function that matches the selection, then show the menu again.
processtheselection()
{
case $selection in
    1)  echo "You have selected  --add an user--..."
        insertuser
        mainmenu
      ;;
    2)  echo "You have selected --delete an user--..."
        deleteuser
        mainmenu
      ;;
    3)  echo "You have selected --display the exception list--..."
        displaylist
        mainmenu
      ;;
    4)  exitapp
      ;;
    *)  echo "Please enter a valid number"
        mainmenu
      ;;
esac
}

mainmenu
##END OF THE SCRIPT


Example Run:



Welcome to SSO User Management
What would you want to do?
Enter 1 to add user to the exception list
Enter 2 to delete a user from the exception list
Enter 3 to display the users in the exception list
Enter 4 to exit
3
You have selected --display the exception list--...
Please supply me the APPS schema password for this operation:
Users in the Exception list:

EXCEPTION LIST USERS
----------------------------------------------------------------------------------------------------
ERBANT
ALI
2 rows selected.

What would you want to do?
Enter 1 to add user to the exception list
Enter 2 to delete a user from the exception list
Enter 3 to display the users in the exception list
Enter 4 to exit
1
You have selected --add an user--...
Please enter the username that you want to add to the exception list
ERMAN
Please supply me the APPS schema password for this operation:

1 row created.
Commit complete.

What would you want to do?
Enter 1 to add data to the exception list
Enter 2 to delete data from the exception list
Enter 3 to display data in the exception list
Enter 4 to exit
3
You have selected --display the exception list--...
Please supply me the APPS schema password for this operation:
Users in the Exception list:

EXCEPTION LIST USERS
----------------------------------------------------------------------------------------------------
ERBANT
ALI
ERMAN
3 rows selected.

What would you want to do?
Enter 1 to add data to the exception list
Enter 2 to delete data from the exception list
Enter 3 to display data in the exception list
Enter 4 to exit
2
You have selected --delete an user--...
Please enter the username that you want to delete from the exception list
ERMAN
Please supply me the APPS schema password for this operation:

1 row deleted.
Commit complete.

What would you want to do?
Enter 1 to add data to the exception list
Enter 2 to delete data from the exception list
Enter 3 to display data in the exception list
Enter 4 to exit
3
You have selected --display the exception list--...
Please supply me the APPS schema password for this operation:
Users in the Exception list:

EXCEPTION LIST USERS
----------------------------------------------------------------------------------------------------
ERBANT
ALI
2 rows selected.

What would you want to do?
Enter 1 to add data to the exception list
Enter 2 to delete data from the exception list
Enter 3 to display data in the exception list
Enter 4 to exit
4
Exiting from the Application

Thursday, November 13, 2014

EBS R12 -- Login flow (java) -- from the programmer's perspective

It is a good thing to know what happens behind the scenes when you press the login button in the EBS R12 login screen.. Having this know-how makes you more capable of diagnosing login problems and also gives you the opportunity to customize the EBS login process.. Such a customization will not be supported by Oracle and will most likely be overwritten by a tech patch, but it is still good to have the knowledge..

Okay, I'll briefly write about the login flow in EBS.
Note that : this flow is tested and verified.

When you press the login button on the EBS login screen, the LoginCO controller in the webui package is triggered (oracle.apps.fnd.sso.login.webui.LoginCO).
So, when the login button is pressed, the controller LoginCO passes control to the standard servlet (APPS_VALIDATION_SERVLET).

The servlet is registered with a class in the backend. Actually APPS_VALIDATION_SERVLET is the function_name and the AuthenticateUser class is the servlet here..
select function_name,web_html_call from fnd_form_functions where function_name like '%APPS_VALIDATION_SERVLET%'

Function_name                 Web_Html_Call
APPS_VALIDATION_SERVLET       AuthenticateUser

So the AuthenticateUser web html call is triggered here, which actually calls the oracle.apps.fnd.login.AuthenticateUser servlet class..
In other words; when the AuthenticateUser web call is made, the oracle.apps.fnd.login.AuthenticateUser class is executed, and this class calls the SessionMgr class, which in turn calls the SessionManager class and the db package for the validation. The validation is made in the db package, and the work is actually done in the fnd_web_sec.validate_login function.. The package returns "Y" to java when the user/pass info is correct.. Thus java continues and logs the user in.
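
The database side of this check can also be tried from sqlplus. The sketch below only builds the SQL text; it assumes that fnd_web_sec.validate_login takes the user name and the password as its two arguments, and validate_login_sql is a helper name made up for this example. The user name and password in the last line are placeholders.

```shell
#!/bin/sh
# Print a SQL statement that calls fnd_web_sec.validate_login for the given
# user name and password. Single quotes are doubled so the literals stay valid.
validate_login_sql() {
    user=$(printf '%s' "$1" | sed "s/'/''/g")
    pass=$(printf '%s' "$2" | sed "s/'/''/g")
    printf "select fnd_web_sec.validate_login('%s','%s') from dual;\n" "$user" "$pass"
}

# Placeholder values, only to show the generated statement.
validate_login_sql "ERMAN" "welcome1"
```

The printed statement can then be piped to sqlplus -s with an APPS connection. A result of Y corresponds to the successful validation described above.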

To sum up, the login process is as described above.. We have plsql, java and a few web related redirections in the login process of EBS R12..
Okay.. Lastly, I want to write a few sentences about customization... This process can be customized using a personalization, classes extended from the standard java classes and a clean plsql package.. On the other hand, any amateur customization attempt will be a failure.. Any code commented out in the standard classes may trigger "invalid session" or "no longer active" errors at any time, so if you want to customize this process, you need to know what you are doing, and you have to support it yourself..

Monday, November 10, 2014

Oracle ADF -- notes

ADF is the Application Development Framework developed by Oracle.. It is a framework built on top of Java EE, and it is commonly used in Oracle Fusion Applications.. So Oracle itself uses it in its own applications.
ADF applications are developed using Jdeveloper.
ADF and Jdeveloper versions usually match.. That is; we use Jdeveloper 12c to develop ADF 12c applications.
ADF 12c renders the UI with HTML5.. On the other hand; in ADF 11g there are still some components which are rendered using Flash..

If we are interested in developing web applications, we need to talk about ADF..
There is also another framework named MAF (Mobile Application Framework), and it is used for developing mobile applications.. In MAF, HTML5 and Javascript technologies are used for rendering the UI components..
To develop ADF applications, we only need Jdeveloper as an IDE.. We do everything from design to deployment using Jdeveloper..
The developments in SOA and Webcenter are also done using Jdeveloper. For example, the UI for SOA is done using ADF.
ADF uses the Model-View-Controller architecture.


When we are developing in ADF, we usually work on XML files. We don't deal with java code so often..

ADF Business Components are the database access technology. It is based on the object-relational model, like other frameworks such as Hibernate, Spring, Toplink etc..
With this model, generally speaking; we declare our tables as java objects and access them through these defined objects.



We have more than 150 Faces components in ADF. Most of the Faces components come with Ajax support. ADF brings Ajax, XML and Javascript together to supply partial page rendering.. With partial page rendering, only the region which needs to be refreshed gets refreshed..
In other words; the whole page does not need to be refreshed when something changes in a particular field..

ADF 11g components include ADF Faces, ADF Taskflow, ADF Model/Binding, ADF Business Components and ADF Security..

ADF Faces supplies the UI with Ajax support.
ADF Taskflow supplies the mechanism to declare the flows and reusable web pages.
Model & Bindings bind the UI to Business Services.
Business Components provide the database access.
ADF Security provides authentication and authorization services for our applications.

In an ADF taskflow, we can use method calls or routers ..
Login -> Dashboard -> Method Call -> Router -> commit
                                            |-> return

Metadata Services (MDS) is used for customizations


We can build different pages according to the persons or sites.. A repository is needed for MDS. This repository can be file based.. It can also be a database (the db schemas are created using rcu).

By using MDS, application customizations or user customizations can be done.. Design at runtime can also be implemented (for designing dashboards).

As I mentioned before; we use Jdeveloper to develop our applications..


The installation is very easy, next and next.. :)
After the installation we start our Jdeveloper and we start the built-in Weblogic server that comes with the installation using Run>Start Server Instance.. So we use Jdeveloper to manage the Weblogic Server that comes with the installation.. Afterwards, we can connect to the weblogic console using http://localhost:7010/console. Default admin user = weblogic , Default password= weblogic1




Jdeveloper's aim is to let us design our projects visually. We can use drag and drop operations to add components to our projects. Jdeveloper supplies a What You See Is What You Get (WYSIWYG) kind of UI design.

We can design our application visually, but we can also go into the code if we need to..
There are a lot of features in the IDE. For example, we can do dependency analysis or we can use contextual linking (we can see the hierarchy by clicking on the source).
In addition, we can do EJB (Enterprise Java Beans) modelling using Jdeveloper. Oracle Toplink can be used as well. Using XML, the data sources in the databases are mapped to Java as java objects.
Note that : JSF 1.2 is supported in ADF 12C..

Also, in Jdeveloper, we can create web services easily, and we can run / test these web services inside Jdeveloper. We can even transform an ADF Business Component into a web service.
Note that: Web services are described using a language named WSDL.

Team development can also be implemented in Jdeveloper. We can use tools like CVS, Serena and Subversion inside Jdeveloper to support team development.
Besides java, we can also do database development in Jdeveloper. Sql Developer comes embedded in Jdeveloper.
A database connection can be declared and the database objects can be reached using the Database Navigator of Jdeveloper. We have a query builder feature in the IDE and we can even check the explain plans of our queries.

The developments done using ADF can be deployed to the application server as jar files. We can declare direct or DataSource based DB connections in Jdeveloper.. Naturally, there are advantages to using a DataSource.. (like connection pooling)
In ADF security, besides pages, taskflows can also be secured by the security mechanisms.

ADF Business Components generate events.. They are like bridges between our applications and Business Processes.



We can use the Groovy scripting language to make elementary calculations (we can take the salary from the database and make add / divide / compare operations using Groovy).

ADF Faces includes a set of over 150 Ajax enabled JSF components for building a richer web UI.
Dynamic and interactive applications can be developed using it (drag-drop, Ajax enabled, complete JavaScript API, partial page rendering etc..). It follows Web 2.0 standards.
Validation can be done in the UI without going to the server.. So, the inputs entered from the UI can be checked at the UI level. There is also listener logic.. When a button is pressed, an action listener runs. There are graphs, Flash based components and also SVG rendering..
Besides, we have template logic in ADF. Templates can be created and used for every page. (For example: a template can be created and used in every page for writing the copyright or for displaying login-logout buttons)

Normally, the pages are jsf in ADF, but they can be broken up into page fragments. These fragments can be used in several pages. This gives us the opportunity to spend less effort in development. For example: if we need to have a form in every page of our application, then we may build a page fragment which includes this form and reuse this fragment in several pages.

The logic behind the page regions is similar. The primary reason to use regions is reuse.. For example: we can build a workflow and put it into the page regions of several pages.

Another component for reusability is declarative components. For example: we can put our calendar, button etc.. into a panel and then transform this panel into a component to reuse everywhere we want.

Okay, let's talk about the Controller layer in ADF. It is like a bridge between the Model and the View. It provides a mechanism to control the flow of the web application.

Let's explain this with an example:
When a button is pressed, the controller determines the action that needs to be taken.. When a submit button is pressed, the controller determines the action, like saving the inputs entered and navigating to the next page.

When we open Jdeveloper for the first time, it makes us choose a role.. The capabilities of the IDE are determined according to the role we choose. In Jdeveloper, there is a planning capability. We can have a checklist.. So, we can follow the project to see whether the datasource is created or the business services are created.. We can follow this kind of activity by checking their statuses (started / in progress / not started).
This application overview is a capability of the IDE. Like tutorials, we also have a description/how-to for accomplishing the steps. There is also tutorial style information in Help.

Jdeveloper can deploy the projects to a remote WLS server, too.
In Jdeveloper, we can preview the files without actually opening them.. We can do this even for images.
The ADF components can be accessed using the Component Palette of JDeveloper. Drag and drop can be used to put the components into the pages.
In the Resource Palette, all the DB and application connections can be seen.

In ADF, the object-table mappings are made through the Entity Objects.


So the Entity Objects transform the database into code, and to make insert, update, delete against the Entity Objects, View Objects need to be created.
View Objects supply read-only queries or a way to update the data in the database through the Entity Objects. Note that: read-only View Objects reach the database directly, not through the Entity Objects.
When we update and commit a View Object, the View Object updates the cache of the Entity Object.. So the Entity Object makes the update.

Multiple View Objects can be created using a single Entity Object. View Objects also let us look at the Entity Objects from different perspectives.
View Links can be created between the View Objects. In addition to that; associations can be created between Entity Objects. Thus the relationships inside the DB layer can be represented using View Links and Associations.

Refactoring is another capability of Jdeveloper. When we rename an object, we can use refactoring. Using this refactoring functionality, we can change the object name everywhere in the project where the renamed object is referred to (its dependencies).

The synchronization between Entity Objects and the database is done automatically. On the other hand, we need to make manual refreshes for the View Objects. (Talking about the data here.. We need a manual refresh for DDL anyway)

There is also an option to create Joint View Objects. That is, we can build one View Object using columns from several Entity Objects.


Another thing to consider is -> In Jdeveloper, it is easy to create web services.
We can add the @WebService annotation just above the class definition in the Java code, thus the web service will be created automatically.


Also, we put the @WebMethod annotation just above the method definition.


Here, we can find an example for that: https://docs.oracle.com/cd/E18941_01/tutorials/jdtut_11r2_52/jdtut_11r2_52_1.html

Okay. We can say that; Oracle Applications & Enterprise Manager use ADF Faces, too.
Note that : We do not need to use ADF Business Components in order to use ADF Faces. ADF Business Components is not a must for ADF Faces. We can build the UI using ADF Faces but build our backend using something totally different from Business Components.
ADF comes with a strong JavaScript API. Javascript plays a big role in building rich content.
Javascript is a scripting language. It works on the client, and lets us do some of the work on the client without going to the server.
In ADF, the javascripts are transferred from the server in compressed form.

When we think of validation in ADF; we can say that the first validation is done in the browser.. (For example, whether the user put a date into a date field)
In the backend, there is a central validation in the ADF BC components.
In the context of validation, we can display exceptions, roll back transactions and display default pages. Note that we can build our own custom exceptions for this.
Note that: In JSF there is no client side validation. On the other hand; in ADF there is client side validation using Javascript.

When we talk about ADF Data Controls, we can list the following;

ADF Business Components.
Java Classes
EJB
URL
Web Services
Essbase
Place Holder
Our own..

Data Controls are the bridges between the data source and the user interface in ADF web applications.
View Objects reside in Data Controls. Also, we can build our own methods and place them in Data Controls. In other words, the objects that let us reach the data reside in the Data Control.

User Interface <--> Data Bindings <--> Data Controls <--> Business Service

Using Bindings, the UI components can be bound to the Data Control objects (for ex: when the button is pressed, do the commit)

The bindings for a page are stored in the PageDef.xml file.

Page.jspx -------------------------------- DataBindings.cpx-------------PageNameDef.xml

DataBindings.cpx : This file defines the Oracle ADF binding context for the entire application and provides the metadata from which the Oracle ADF binding objects are created at runtime.
PageNameDef.xml : These XML files define the Oracle ADF binding container for each web page in the application. The binding container provides access to the bindings within the page. We will have one XML file for each databound web page.

When we add a button to the form and, using drag/drop, put the "commit action" into the action listener of the button, we are done. That is the binding..

We can reach the bindings using Expression Language (EL), which is a java standard.

#{bindings.Product.rangeSize}

Note that : Product is the view object and rangeSize is the option of the Entity Object that the View Object is related with. (for ex: Show me the rows in the Product table but in range 1-10)

or

actionListener="#{bindings.Commit.execute}"

Okay. That's all for now. 
I will continue to write about ADF .. Next ADF post will be through examples..

Sunday, November 9, 2014

EBS -- Java Cache & Block Corruption in Workflow Tables

EBS has a caching mechanism for java, and since 11.5.10 CU2 several EBS products have started to use Java caching.
Our OC4J or JVM processes, like oacore, use this java cache for objects and data to give us faster access to pages and the related data structures. Like the SGA in the database tier that provides us the buffer cache, the Java cache provides us caching in the JVM's memory.
When our JVM instances like oacore access the EBS database and get some data from it, the data is cached in the related JVM memory. In the case of oacore it is the oacore's process memory.
The oacore process resides in the middle tier, so the memory for jvm caching is in the middle tier.. The cache starts to be filled up when data is first accessed by the JVM/oacore..

Following is taken from Oracle.. It describes the Java Caching Mechanism used in EBS.


Also the java cache in EBS is distributed.. That means, if you have multiple JVMs in the middle tier; let's say multiple oacore; then they will all have their own java caches.. These caches are synchronized, though.
The following diagram shows the distributed java caching mechanism in EBS.




As you see in the diagrams above, the cache is located in the middle tier, between Apache and the Oracle Database. Even updates from outside the jvm that invalidate the data cached in the jvm are recognized, using Business Events (invalidation messages).

The cache accesses the database both ways, for read and for write.. In the second diagram, you can see the reads from the db and the writes into the db better. So the cache is updated first and the database is updated second.. It works something like write-back.

It is a good thing to have such a cache in the middle tier, because it saves us the time of accessing the database every time we need a piece of information. One of the important things is that if an object is invalidated in a JVM, it will be reread from the database when next accessed. The new version will then be stored in the JVM's cache. That's why we can conclude that; if the data in the cache is not invalid, then the cache provides the data.. And if the data is invalid in the cache, it will be reread from the database.

So far so good, now we know the necessary information about the Java Cache used in EBS.

Let's continue with the incident that made me write this post;

Recently an issue was escalated to me.. It was encountered in an EBS 11i with an 11gR2 Oracle Database running on an Exadata X3.

The error was "Unable to generate forwarding URL. Exception: oracle.apps.fnd.cache.CacheException".
Some oaf pages could not be displayed because of this error.
The issue was critical because the related oaf pages were used for approvals.
After the initial diagnostics, I understood that SYSADMIN could open the pages without any problems.. The problem arose when the business users tried to open the oaf pages..
So, it should be something related with the data.. Maybe a profile.. But the issue was interesting because the failing code was in the Cache package. The customer was encountering caching exceptions..

The log file of java cache is located under $COMMON_TOP/rgf/<Instance >_<Hostname> and it provides error messages specific to Java Caching.

So , I decided to have a look at that.

I have found the following in error stack;

oracle.apps.jtf.base.resources.FrameworkException
at oracle.apps.fnd.cache.GenericCacheLoader.load(GenericCacheLoader.java:232)
at oracle.apps.fnd.cache.GenericCacheLoader.load(GenericCacheLoader.java:199)
at oracle.apps.fnd.cache.GenericCacheLoader.load(GenericCacheLoader.java:174)
at oracle.apps.fnd.cache.GenericCacheLoader.load(GenericCacheLoader.java:149)
at oracle.apps.jtf.cache.GenericCacheLoader.load(GenericCacheLoader.java:87)
at oracle.ias.cache.CacheHandle.findObject(Unknown Source)
at oracle.ias.cache.CacheHandle.locateObject(Unknown Source)
at oracle.ias.cache.CacheAccess.get(Unknown Source)
at oracle.apps.jtf.cache.IASCacheProvider.get(IASCacheProvider.java:656)
at oracle.apps.jtf.cache.CacheManager.getInternal(CacheManager.java:4794)
at oracle.apps.jtf.cache.CacheManager.get(CacheManager.java:4617)
at oracle.apps.fnd.cache.AppsCache.get(AppsCache.java:216)
at oracle.apps.fnd.functionSecurity.User.getUser(User.java:336)
at oracle.apps.fnd.functionSecurity.FunctionSecurity.getUser(FunctionSecurity.java:527)
at oracle.apps.fnd.functionSecurity.RunFunction.createURL(RunFunction.java:1190)
at oracle.apps.fnd.functionSecurity.RunFunction.init(RunFunction.java:389)
at oa_html._RF._jspService(_RF.java:81)
at oracle.jsp.runtime.HttpJsp.service(HttpJsp.java:119)
at oracle.jsp.app.JspApplication.dispatchRequest(JspApplication.java:417)
at oracle.jsp.JspServlet.doDispatch(JspServlet.java:267)
at oracle.jsp.JspServlet.internalService(JspServlet.java:186)
at oracle.jsp.JspServlet.service(JspServlet.java:156)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:588)
at org.apache.jserv.JServConnection.processRequest(JServConnection.java:456)
at org.apache.jserv.JServConnection.run(JServConnection.java:294)
at java.lang.Thread.run(Thread.java:662)
Caused by: oracle.apps.jtf.base.resources.FrameworkException: ORA-01578: ORACLE data block corrupted (file # 36, block # 495018)
ORA-01110: data file 36: '+DATA_OSRV/dev/datafile/apps_ts_tx_data.470.839415343'
ORA-26040: Data block was loaded using the NOLOGGING option
at oracle.apps.jtf.base.resources.FrameworkException.convertException(FrameworkException.java:607)
at oracle.apps.jtf.base.resources.FrameworkException.addException(FrameworkException.java:585)
at oracle.apps.jtf.base.resources.FrameworkException.<init>(FrameworkException.java:66)
at oracle.apps.jtf.base.resources.FrameworkException.<init>(FrameworkException.java:88)
at oracle.apps.jtf.base.resources.FrameworkException.<init>(FrameworkException.java:202)
at oracle.apps.jtf.base.resources.FrameworkException.<init>(FrameworkException.java:218)
at oracle.apps.jtf.base.resources.FrameworkException.<init>(FrameworkException.java:249)


Yes... I found an ORA - error coming from the database..

Caused by: oracle.apps.jtf.base.resources.FrameworkException:
ORA-01578: ORACLE data block corrupted (file # 36, block # 495018)
ORA-01110: data file 36: '+DATA_OSRV/dev/datafile/apps_ts_tx_data.470.839415343'
ORA-26040: Data block was loaded using the NOLOGGING option.

So, the error ORA-01578 was indicating that: the corruption was a software block corruption, that is, a block formerly found to be logically corrupt and then formally marked as corrupt. ORA-26040 adds that the block had been loaded using the NOLOGGING option.

No matter what; it was a corruption error.. So the cache was trying to read the data in the corrupted block, but it could not read it because the block was corrupted.. If we had no cache in EBS, we would maybe see an Apache error.. Maybe the error would be "You have encountered an unexpected error. Please contact the System Administrator for assistance."
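
As a side note, the ORA- lines can be pulled out of such log files with a small shell function. This is only a sketch: ora_errors is a name made up for this example, and the directory argument is whatever holds the Java cache logs in your environment.

```shell
#!/bin/sh
# Print the distinct ORA- messages found in the files under a directory.
# -r searches the directory recursively, -h drops the file names and
# -o keeps only the part of each line that starts with the ORA- code.
ora_errors() {
    grep -rhoE 'ORA-[0-9]{5}.*' "$1" | sort -u
}
```

For example, ora_errors "$COMMON_TOP/rgf" would list each distinct ORA- message once, without the surrounding java stack lines.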

Okay, I was happy to find the cause; but it didn't last long.
Unfortunately, we had no rman backup in this environment (as it is DEV).. No cold backups or anything.. Besides, the partition was nologging in Dev.

Anyways.. I searched the database to find the object that this corrupted block belonged to;

select segment_name, segment_type, owner
from dba_extents
where file_id = 36
and 495018 between block_id
and block_id + blocks -1 ;

It was a table: APPLSYS.WF_LOCAL_ROLES. It was a workflow related table, and it could be synchronized using the "Synchronize WF LOCAL tables" concurrent program..
It was obvious that the problem was arising in the OAF pages which were trying to access the WF_LOCAL_ROLES table.. This also showed that oacore first takes the data into its cache, and then builds the web page and sends it to the client.
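
The file and block numbers for the dba_extents query come straight from the ORA-01578 message, so the query text can be generated from it. A minimal sketch, where extent_query is a name invented for this example:

```shell
#!/bin/sh
# Read an ORA-01578 message on standard input and print the dba_extents
# lookup for the file and block numbers it mentions.
# Lines without "file # N, block # M" produce no output.
extent_query() {
    sed -n 's/.*file # \([0-9][0-9]*\), block # \([0-9][0-9]*\).*/select segment_name, segment_type, owner from dba_extents where file_id = \1 and \2 between block_id and block_id + blocks - 1;/p'
}

# The message from the error stack above.
echo "ORA-01578: ORACLE data block corrupted (file # 36, block # 495018)" | extent_query
```

This prints the same query as the one used above, with file_id = 36 and block 495018 filled in.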

Firstly, I tried to repair the block using DBMS_REPAIR package.

BEGIN
DBMS_REPAIR.ADMIN_TABLES (
TABLE_NAME => 'REPAIR_TABLE',
TABLE_TYPE => dbms_repair.repair_table,
ACTION => dbms_repair.create_action,
TABLESPACE => 'APPS_TS_TX_DATA');
END;
/

BEGIN
DBMS_REPAIR.ADMIN_TABLES (
TABLE_NAME => 'ORPHAN_KEY_TABLE',
TABLE_TYPE => dbms_repair.orphan_table,
ACTION => dbms_repair.create_action,
TABLESPACE => 'APPS_TS_TX_DATA');
END;
/

SET SERVEROUTPUT ON

DECLARE num_corrupt INT;
BEGIN
num_corrupt := 0;
DBMS_REPAIR.CHECK_OBJECT (
SCHEMA_NAME => 'APPLSYS',
OBJECT_NAME => 'WF_LOCAL_ROLES',
REPAIR_TABLE_NAME => 'REPAIR_TABLE',
CORRUPT_COUNT => num_corrupt);
DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR(num_corrupt));
END;
/

SELECT OBJECT_NAME, BLOCK_ID, MARKED_CORRUPT
FROM SYS.REPAIR_TABLE;

SET SERVEROUTPUT ON
DECLARE num_fix INT;
BEGIN
num_fix := 0;
DBMS_REPAIR.FIX_CORRUPT_BLOCKS (
SCHEMA_NAME => 'APPLSYS',
OBJECT_NAME=> 'WF_LOCAL_ROLES',
OBJECT_TYPE => dbms_repair.table_object,
REPAIR_TABLE_NAME => 'REPAIR_TABLE',
FIX_COUNT=> num_fix);
DBMS_OUTPUT.PUT_LINE('num fix: ' || TO_CHAR(num_fix));
END;
/

As I expected, DBMS_REPAIR found the corrupted blocks and recorded them, but could not fix them. That was normal, because DBMS_REPAIR does not actually repair anything. It just lets us mark the corrupt blocks and skip them if we want to.

So I used DBMS_REPAIR.SKIP_CORRUPT_BLOCKS as follows:

BEGIN
DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
SCHEMA_NAME => 'APPLSYS',
OBJECT_NAME => 'WF_LOCAL_ROLES',
OBJECT_TYPE => dbms_repair.table_object,
FLAGS => dbms_repair.skip_flag);
END;
/
SELECT OWNER, TABLE_NAME, SKIP_CORRUPT FROM DBA_TABLES
WHERE OWNER = 'APPLSYS';

After skipping the corrupt blocks, the problematic OAF pages started working again. The cache problem went away, as expected.
Lastly, I executed the concurrent program "Synchronize WF LOCAL tables" to synchronize the WF_LOCAL_ROLES table again.

Strange, isn't it? Sometimes simple things like a block corruption in the workflow tables can cause critical user functions to stop working.

Friday, November 7, 2014

EBS 12.2 -- adop hangs -- adop problem-- case sensitive hostnames

Adop may hang (it does not respond and waits in a read call, as if waiting for input) if your hostname does not match the hostname in $CONTEXT_FILE.

I have seen this even with the following scenario:

The machine's hostname is set to "VisionR12", but the hostname in $CONTEXT_FILE is "visionr12".
adop hangs, waiting on the read system call.
Even a difference in letter case may trigger this problem.
We have to pay attention to that.
Don't use capital letters in your hostnames at all. There is no need.
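
A quick check for this situation can be sketched in shell. This is only an illustration: compare_hostnames is a hypothetical helper, and it takes both names as plain arguments instead of reading them from the system or from $CONTEXT_FILE.

```shell
#!/bin/sh
# Hypothetical helper (not part of adop or EBS): compare the OS hostname
# with the hostname recorded in the context file, both passed as strings.
compare_hostnames() {
    os_name=$1
    ctx_name=$2
    if [ "$os_name" = "$ctx_name" ]; then
        echo "match"
        return 0
    fi
    # Lowercase both sides to see whether only the letter case differs.
    os_lower=$(printf '%s' "$os_name" | tr '[:upper:]' '[:lower:]')
    ctx_lower=$(printf '%s' "$ctx_name" | tr '[:upper:]' '[:lower:]')
    if [ "$os_lower" = "$ctx_lower" ]; then
        echo "case-mismatch"
    else
        echo "mismatch"
    fi
    return 1
}

# The scenario described above: same name, different letter case.
compare_hostnames "VisionR12" "visionr12" || true
```

In a real check, the first argument would be the running hostname and the second the value you read from your context file.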

Linux-- Setting the hostname FQDN or Short? --a detailed approach , + a look from EBS perspective

EBS Application Tier processes running on Linux may encounter problems because of a wrong hostname setting in the operating system. Thus the hostname we set for Linux must be appropriate.
Why appropriate? Because FNDLIBR uses the hostname it gathers from the kernel.
Let's use the strace utility to see what the process is doing when our start script executes FNDLIBR:
strace FNDLIBR FND FNDCPBWV apps/apps SYSADMIN 'System Administrator' SYSADMIN
Okay, I will not copy and paste the entire trace here, but the obvious thing is that FNDLIBR makes a uname call and gets the hostname:
uname({sys="Linux", node="ermanhost.domain.com", ...}) = 0
Note that the uname command also gets the hostname using the uname system call. On success, zero is returned.


The hostname comes from the structure below:
struct utsname {
char sysname[]; /* Operating system name (e.g., "Linux") */
char nodename[]; /* Name within "some implementation-defined
network" */
char release[]; /* Operating system release (e.g., "2.6.28") */
char version[]; /* Operating system version */
char machine[]; /* Hardware identifier */
#ifdef _GNU_SOURCE
char domainname[]; /* NIS or YP domain name */
#endif
};


So using uname, FNDLIBR obtains the hostname from the kernel.
To demonstrate, I'll write a little C program and execute it while tracing it with strace.
Our program gets the utsname struct using uname and displays some of its fields:

#include <stdio.h>
#include <sys/utsname.h>
int main ()
{
struct utsname u;
uname (&u);
printf ("%s release %s (version %s) on %s\n", u.sysname, u.release, u.version, u.machine);
return 0;
}

We compile it:
gcc /tmp/ourprogram.c

We run it, and it displays the following:
./a.out
Linux release 2.6.32-100.26.2.el5 (version #1 SMP Tue Jan 18 20:11:49 EST 2011) on x86_64

When we trace it using strace:
strace ./a.out
execve("./a.out", ["./a.out"], [/* 22 vars */]) = 0
brk(0)                                  = 0xd2d000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb55dc02000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb55dc01000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=107884, ...}) = 0
mmap(NULL, 107884, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb55dbe6000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\332a\0335\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1722304, ...}) = 0
mmap(0x351b600000, 3502424, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x351b600000
mprotect(0x351b74e000, 2097152, PROT_NONE) = 0
mmap(0x351b94e000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14e000) = 0x351b94e000
mmap(0x351b953000, 16728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x351b953000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb55dbe5000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb55dbe4000
arch_prctl(ARCH_SET_FS, 0x7fb55dbe46e0) = 0
mprotect(0x351b94e000, 16384, PROT_READ) = 0
mprotect(0x351b41b000, 4096, PROT_READ) = 0
munmap(0x7fb55dbe6000, 107884)          = 0
uname({sys="Linux", node="ermanhost.domain.com", ...}) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb55dc00000
write(1, "Linux release 2.6.32-100.26.2.el"..., 90Linux release 2.6.32-100.26.2.el5 (version #1 SMP Tue Jan 18 20:11:49 EST 2011) on x86_64
) = 90
exit_group(0)  

We see that this program calls the uname syscall, and it returns the same thing as FNDLIBR's syscall:

uname({sys="Linux", node="ermanhost.domain.com", ...}) = 0

So we can say that, at the lowest level, FNDLIBR retrieves the hostname the same way our little program does.

Note that the hostname information is available in the proc file system too.
cat /proc/sys/kernel/hostname
ermanhost.domain.com
Also, the hostname command can be used to display the hostname.
hostname -f
ermanhost.domain.com
hostname
ermanhost.domain.com

But wait... You see, hostname -f and hostname return the same thing on this system. The -f argument is used to display the FQDN, but what about the output of hostname with no arguments?

Actually they shouldn't return the same thing, because:
hostname prints the name of the system as returned by the gethostname function.
The FQDN is the name gethostbyname returns for the host name returned by gethostname.

So on this system gethostname and gethostbyname return the same thing -> the FQDN.
In other words, on this system hostname returns the FQDN everywhere.
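
To spot this situation quickly, a small shell sketch can classify the name the kernel holds. The name_kind helper is hypothetical, and it uses a simple assumption: a name that contains a dot is treated as an FQDN.

```shell
#!/bin/sh
# Hypothetical helper: a name containing a dot is treated as an FQDN,
# anything else as a short machine name.
name_kind() {
    case $1 in
        *.*) echo "fqdn" ;;
        *)   echo "short" ;;
    esac
}

# uname -n prints the nodename field returned by the uname system call,
# which is the same value we saw in the strace output.
kernel_name=$(uname -n)
echo "kernel nodename: $kernel_name ($(name_kind "$kernel_name"))"
```

On the system in this article, it would report the kernel nodename as an FQDN, which is the thing we are investigating.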

Let's see what these gethostname and gethostbyname functions are...

int gethostname(char *name, size_t len);
gethostname() returns the null-terminated hostname in the character array name, which has a length of len bytes. If the null-terminated hostname is too large to fit, then the name is truncated, and no error is returned. POSIX.1-2001 says that if such truncation occurs, then it is unspecified whether the returned buffer includes a terminating null byte.

struct hostent *gethostbyname(const char *name);
The gethostbyname() function returns a structure of type hostent for the given host name. Here name is either a hostname, or an IPv4 address in standard dot notation. If name is an IPv4 or IPv6 address, no lookup is performed and gethostbyname() simply copies name into the h_name field and its struct in_addr equivalent into the h_addr_list[0] field of the returned hostent structure...
Basically, gethostbyname returns the FQDN for the host name returned by gethostname.

When we trace the commands hostname and hostname -f, we see that "hostname -f" reads the nsswitch.conf and host.conf files. So it is network aware.

read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1698
open("/etc/host.conf", O_RDONLY) ;

The nsswitch.conf file (the Name Service Switch (NSS) configuration file), /etc/nsswitch.conf, is used by the GNU C Library to determine the sources from which to obtain name-service information in a range of categories, and in what order.

When we open /etc/nsswitch.conf, we see the following line:
hosts:      files  dns

So this nsswitch.conf says that the system first attempts to resolve host names and IP addresses by querying files (/etc/hosts), and if that fails, it tries querying a DNS server.

So gethostbyname, which is used by the hostname -f command, reads /etc/nsswitch.conf and /etc/host.conf to decide whether to look in /etc/hosts or to query DNS.
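
The lookup order can also be read out with a few lines of shell. This is just a sketch: hosts_lookup_order is a hypothetical helper, and it prints every word after "hosts:" as it is, so action items such as [NOTFOUND=return] would be listed too.

```shell
#!/bin/sh
# Hypothetical helper: print the sources listed on the "hosts:" line of
# an nsswitch.conf-style file, one per line, in lookup order.
hosts_lookup_order() {
    conf_file=${1:-/etc/nsswitch.conf}
    # Use the first line whose first field is exactly "hosts:" (comment
    # lines do not match) and print the remaining fields.
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) print $i; exit }' "$conf_file"
}

# Show the order on this machine, if the file is readable.
if [ -r /etc/nsswitch.conf ]; then
    hosts_lookup_order /etc/nsswitch.conf
fi
```

With the line shown above, it prints files first and dns second.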

Note that we have the fully qualified hostname defined in /etc/sysconfig/network:
cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=ermanhost.domain.com

We have a lot of lines in the startup scripts that use this file:
cd /etc/rc.d
grep -R /etc/sysconfig/network *|wc -l
317

When we open /etc/host.conf, we see the following line:

order hosts,bind

This means -> "first, use /etc/hosts to retrieve the hostname; if you can't find it, then try a DNS query"

So nsswitch.conf and host.conf say pretty much the same thing here. So why do we have both of them?
It seems this is because the older Linux standard library, libc, used /etc/host.conf as its master configuration file, but the newer GNU standard library, glibc, uses /etc/nsswitch.conf.

The interesting thing is that the hostname -s command, which returns the short name of the server, also uses the /etc/nsswitch.conf and /etc/host.conf files, and it returns the short hostname as expected.

So what do we have so far?

hostname -s returns ermanhost   (GOOD)
hostname -f returns ermanhost.domain.com (GOOD)
hostname returns ermanhost.domain.com  (BAD)

The hostname command should not return ermanhost.domain.com (the FQDN) when it is called without any arguments.
It uses gethostname as expected, but it should not return the FQDN.

As I mentioned above, this might be related to the HOSTNAME defined as an FQDN in /etc/sysconfig/network.

Let's explore this.

When we execute the hostname command, the hostname is derived directly from uname (without going to nsswitch.conf or host.conf), and uname derives this info from the kernel structure. So we need to know:

What sets this hostname as an FQDN in the structure?
When is it set?
What is the configuration file used by the code that sets the hostname as an FQDN in the kernel structure? (I suspect this is /etc/sysconfig/network, by the way.)

So we need to have a look at the boot process of Red Hat based Linux, and here it is.

I'm not going to start from the BIOS :)
Here is the info we need:
When the init command starts, it becomes the parent or grandparent of all of the processes that start up automatically on the system. First, it runs the /etc/rc.d/rc.sysinit script....

When we look at the startup scripts, we see the following lines in the /etc/rc.sysinit file:

if [ -f /etc/sysconfig/network ]; then
    . /etc/sysconfig/network

We also see the following lines;

# Set the hostname.
update_boot_stage RChostname
action $"Setting hostname ${HOSTNAME}: " hostname ${HOSTNAME}

So, we have found the command that sets the hostname.
It basically gets the hostname from the /etc/sysconfig/network file and sets the hostname accordingly.
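
The same read step can be tried safely without touching the running system. The sketch below only reads the file; configured_hostname is a hypothetical helper and it never calls hostname to change anything.

```shell
#!/bin/sh
# Hypothetical helper: print the HOSTNAME value defined in a
# sysconfig-style file, the way rc.sysinit picks it up by sourcing it.
configured_hostname() {
    conf_file=${1:-/etc/sysconfig/network}
    [ -f "$conf_file" ] || return 1
    # Source the file in a subshell so its variables do not leak into
    # the calling shell.
    ( HOSTNAME=; . "$conf_file"; printf '%s\n' "$HOSTNAME" )
}

# Compare the configured value with what the kernel currently holds.
if cfg=$(configured_hostname); then
    echo "configured: $cfg, running: $(uname -n)"
fi
```

If the two values differ, the file was probably edited after the last boot.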

So far so good. We know what we need to know about setting and getting the hostname in Linux.

Let's summarize the gathered info, make our comments and describe the best practice for setting hostnames in Linux:
  • HOSTNAME in /etc/sysconfig/network should be the machine name, not the FQDN. 'hostname' should ideally simply return the actual hostname.
  • /etc/resolv.conf must be properly configured for searching the domain.
  • /etc/hosts must be properly configured to contain both the FQDN and the machine name.
DEMO:

Wrong setting:
[root@ermanhost ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=ermanhost.ermandomain.com
[root@ermanhost ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain
10.34.50.104 ermanhost.ermandomain.com ermanhost
[root@ermanhost ~]# hostname -s
ermanhost
[root@ermanhost ~]# hostname -f
ermanhost.ermandomain.com
[root@ermanhost ~]# hostname
ermanhost.ermandomain.com --> this should not be the FQDN

Good setting:

[root@ermanhost ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=ermanhost
[root@ermanhost ~]# hostname
ermanhost                    -->>   GOOD
[root@ermanhost ~]# hostname -f
ermanhost.ermandomain.com
[root@ermanhost ~]# hostname -s
ermanhost
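
The values used in the good setting can be derived from the FQDN with a little shell. This is a sketch with two hypothetical helpers, short_name and hosts_entry; it only prints text and does not edit any file.

```shell
#!/bin/sh
# Hypothetical helper: strip everything from the first dot onward,
# which gives the machine name to use as HOSTNAME.
short_name() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: build an /etc/hosts line that lists the FQDN
# first and the short machine name as an alias.
hosts_entry() {
    ip=$1
    fqdn=$2
    printf '%s %s %s\n' "$ip" "$fqdn" "$(short_name "$fqdn")"
}

# The addresses and names from the demo above.
echo "HOSTNAME=$(short_name ermanhost.ermandomain.com)"
hosts_entry 10.34.50.104 ermanhost.ermandomain.com
```

The first line is what goes into /etc/sysconfig/network, and the second is the matching /etc/hosts entry.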

Note that if you change /etc/sysconfig/network without rebooting the server, the system will still use the old hostname. To change the hostname in Linux, you need to issue the hostname "newname" command, and then you must update the /etc/sysconfig/network file with the new hostname. We change /etc/sysconfig/network to make the change permanent after a reboot.


Now let's see the effect of a wrong hostname setting in EBS.
Note that this is applicable for 11i, R12 and 12.2.

It is documented in the following note:

Concurrent Managers Fail To Start After New Install of Release 12 (Doc ID 413164.1)
Basically, when we try to start the concurrent managers on a machine with a long hostname, we end up with the following error and the concurrent managers cannot start.

ERROR
APP-FND-01564: ORACLE error 12899 in insert_icm_record
Cause: insert_icm_record failed due to ORA-12899: value too large for column "APPLSYS"."FND_CONCURRENT_PROCESSES"."NODE_NAME" (actual: 31, maximum: 30)

This is because the NODE_NAME column in the FND_CONCURRENT_PROCESSES table is VARCHAR2(30).
So we need to use a hostname that is at most 30 characters long.
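
This limit is easy to check before starting the managers. Here is a minimal sketch; fits_node_name is a hypothetical helper, and the limit of 30 is taken from the error message above.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the name fits into NODE_NAME, which
# the ORA-12899 message above reports as 30 characters at most.
fits_node_name() {
    [ "${#1}" -le 30 ]
}

# Check the name the kernel reports, since that is what FNDLIBR reads.
name=$(uname -n)
if fits_node_name "$name"; then
    echo "$name fits (${#name} characters)"
else
    echo "$name is too long (${#name} characters, maximum 30)"
fi
```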

But what if we need to have a long fully qualified domain name?
Then we need to apply the approach I described above (the good setting).

That's all I need to say about this topic.

Okay. In this article, we have learned the logic behind the hostname setting in Linux. We have worked with strace, written a C program to get the hostname from the kernel structure, reviewed the boot process of Linux, stated the proper setting for the hostname and lastly seen this information in action on an Oracle EBS problem.

I hope you'll find it useful.