Wednesday, September 3, 2014

EBS 12.2 -- TCP Socket (KGAS) wait

TCP Socket (KGAS) waits are very popular in these days.. At least I have seen them around.. Last week, these waits appeared in a Glassfish environment... This week, they appeared in a E-Business Suite 12.2 environment. So now it is worth to write about them :)

First of all, TCP Socket(KGAS) is such a wait  that we cant tune in the Oracle Site.
In spite of the name(socket), the session waiting in this event may not only wait for the data arriving from the socket, but it may wait for the data that will come from the place where the session goes using TCP.(For example: Application Server, Mail Server etc)..  I mean the Application Server performance comes into play.
Note that: The event is not related with client-server architecture of the Oracle database. 
So this means -> when we see this wait event ,we need to check the target Application Server, too.
The most important thing about this event is it arises when a session goes to another server using UTL_TCP... UTL_TCP is used by  the packages like UTL_HTTP or UTL_SMTP..
So when we see this wait event, we need to check the use of this packages basically..

Anyways, as our focus is EBS, lets talk about these TCP Socket (KGAS) in EBS 12.2 by walking through a real life example..

It all started with a performance issue :)
Customer reported a performance problem in their custom Create Accounting process.
This custom create accounting was a heavy process which is undertaken by several concurrent programs..
The performance problem was so obvious.. Concurrent Programs were running for 1 day, and they just could not be completed..

To investigate, I found the request ids of these running programs and found their sids using their request ids, as always..
Once I found the sids, I checked their wait events..
There were 3 concurrent programs related with Create Accounting, and 2 of them were waiting in the PL/SQL lock timer wait. So as PL/SQL lock time is a purposeful wait , these 2 concurrent programs were eliminated.. In other words; It was obvious that; these 2 were waiting for the other one to finish.
Okay, so far so good..
When I analyzed the third one, I saw that it was waiting in the TCP Socket (KGAS) wait and its sql statement was a simple insert..
Initially, I stopped and restarted the whole process , but after a while , it stucked  in the TCP Socket (KGAS) again. This time it was doing an update..
It was very strange.. I mean , what could be the relation between an insert/update and TCP Socket (KGAS)?

I started to think that there were another processes which could not be seen at the first sight, but ofcourse I needed prove it.
To analyze further, I dumped the Error Stack of the running process..
To do this, I found the OS process id of the relevant db session and used oradebug tool as follows;

SQL> oradebug setospid 21641
Oracle pid: 163, Unix process pid: 21641, image: oracle@erptest
SQL> oradebug unlimit
Statement processed.
SQL> oradebug dump errorstack 3
Statement processed.


The purpose was to see the exact code ( C , PLSQL etc..) that was running around the TCP Socket (KGAS) wait..

When I analyzed the error stack, I saw the code in the stack.. The problem was there clearly..

0x72947638       135  package body SYS.UTL_TCP
0x72947638       338  package body SYS.UTL_TCP
0x72947638       408  package body SYS.UTL_TCP
0x72948498       107  package body SYS.UTL_SMTP
0x72948498       126  package body SYS.UTL_SMTP
0x72948498       636  package body SYS.UTL_SMTP
0x72948498       647  package body SYS.UTL_SMTP
0x72949a28        93  package body APPS.BLALA_SEND_MAIL --> The problem 
0x6480a630        31  APPS.BLABLA_GL_JE_HEADERS_AIUT


So as seen above, there was a call to a custom send_mail package and this send_mail package was using utl_smtp and as a matter of fact -> utl_tcp to send emails..

Furthermore; this custom send_mail was triggered by another custom code named BLABLA_GL_JE_HEADERS_AIUT and when I analyzed it, I have seen that it was a trigger :) 
 A trigger which was working line by line and trying to send emails ..

This was a sufficient for being the cause for these high TCP Socket (KGAS) waits.
At the end of the day, I requested this trigger or design to be fixed and until it is fixed, I requested the developers to make this trigger disabled..

In conclusion; 
I can say that whenever you see TCP KGAS, you can be sure that somewhere in your codepath there is a call to the UTL_* package listed above.. 
At the same time, you may not able to see these package calls at the first glance.. 
If this is the case, use high level tracing/debugging technics of Oracle to see them.. 
For example Use error stack like I did in this example ..

3 comments :

  1. What was the code fix that was performed as I have the same issue now. Th code hasbeen running for 9 years without a problem.

    ReplyDelete
  2. There was a trigger which was trying to send emails in every insert or update operation. So, when there is a high count of inserts, it was causing the issue. In our case, the trigger was disabled and optimized in a way to decrease the TCP kgas waits.

    ReplyDelete