Sunday, April 10, 2022

EBS -- poor performance in Autoconfig -- due to Java and unnecessary waits on Syscalls -- FUTEX_WAIT & FUTEX_WAIT_PRIVATE

We recently upgraded an EBS database from 11.2.0.4 to 19c. The source apps tier was EBS 12.1.3 and the database was Oracle Enterprise Edition 11.2.0.4. Both the applications and the database were running on Oracle Linux 7 (OEL 7.9).

Due to the low application version, we took lots of patching actions before doing the actual database upgrade and the multitenant conversion.

The source environment was a multi-node E-Business Suite, consisting of 2 SSO-enabled apps nodes (non-shared application filesystem) and a 2-node RAC database.

The production run was our third iteration in this project (note that we do at least 3 iterations..).

So we had already perfected our operation. We had a detailed SOP and a runbook. We had a detailed upgrade schedule, and we estimated the duration of the upgrade tasks carefully.

But! The production 19c upgrade still took longer than we thought.

The problem was in the execution of adpatch actions and autoconfig.

We had lots of patches to apply to the apps tier nodes, and we had to run autoconfig several times just to make the apps tier ready for the 19c upgrade. (You know, these things change from env to env. In this env, there were several apps tier patches that were required and needed to be in place before upgrading the database to 19c and converting it to multitenant -- 1 CDB / 1 PDB.)

The real cause that made autoconfig and adpatch run slower was an inactive wait. We kept following the autoconfig logs during those executions, and we saw that autoconfig was waiting even during the JDBC connections. There was no issue with the database connections themselves, but autoconfig was just waiting on these kinds of basic operations, and that's why its total run time kept extending. A single autoconfig run was taking almost 15 minutes, and that was an issue that could not be ignored. Adpatch was not doing any better than autoconfig, by the way.

Anyways, we used strace to capture the system calls of those running java processes. You know... in apps, we start with sh, we continue with perl, and most of the time we end up with java :) So we traced the java processes while they were being executed by autoconfig (or adpatch).
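For reference, a trace like ours can be taken roughly as follows. The PID-attach form is what you would use against a live autoconfig java; the command below traces a freshly started process instead, just so the sketch is runnable anywhere:

```shell
# -f follows every thread/child, -T times each call,
# -e trace=futex keeps only the futex system calls.
# Against a live autoconfig java you would attach with:
#   strace -f -T -e trace=futex -o /tmp/futex.out -p <java_pid>
strace -f -T -e trace=futex -o /tmp/futex.out sh -c 'ls >/dev/null'
# Which futex operations showed up, and how often? (A trivial command
# like the one above may show none; a busy JVM shows plenty.)
grep -o 'FUTEX_[A-Z_]*' /tmp/futex.out | sort | uniq -c | sort -rn
```

The `-T` timings are what matter here: long times inside FUTEX_WAIT are exactly the inactive waits we were chasing.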

We saw that the java processes were waiting on FUTEX_WAIT system calls exactly when we were seeing those unnecessary waits.

Futexes (fast userspace mutexes) are locking mechanisms used for basic locking, or as building blocks for higher-level locking abstractions such as semaphores, POSIX mutexes, and condition variables. This is why we thought that there was probably contention on a memory location, protected by one of those mutexes (operated by calling futexes :)

Following is from the Man page:

long syscall(SYS_futex, uint32_t *uaddr, int futex_op, uint32_t val, const struct timespec *timeout, /* or: uint32_t val2 */ uint32_t *uaddr2, uint32_t val3);

The futex() system call provides a method for waiting until a certain condition becomes true. It is typically used as a blocking construct in the context of shared-memory synchronization. When using futexes, the majority of the synchronization operations are performed in user space. A user-space program employs the futex() system call only when it is likely that the program has to block for a longer time until the condition becomes true. Other futex() operations can be used to wake any processes or threads waiting for a particular condition.

The uaddr that we saw in the FUTEX_WAIT system calls was always the same; it was not changing.

So the same processes were waiting on FUTEX_WAIT, and the uaddr was always the same (this might be normal & expected, because of the implementation of virtual memory..)

Note that I have only mentioned FUTEX_WAIT so far, but FUTEX_WAIT_PRIVATE is not so different. It seems FUTEX_WAIT_PRIVATE is just the product of an optimization to make futexes faster when they are not shared between processes -- so I just wanted to shed some light on that one. Let's continue;

So at first we thought that those FUTEX_WAITs were caused by contention. In order to see the contention and find the blocker, we installed "stap" (SystemTap) and used the script that was provided by Red Hat. (Note that the problem was on Oracle Linux, but they are almost the same, right :)

The installation of stap was a little troublesome, but we installed it and used the procedure given in the Red Hat article, which is publicly available under the following title >
--IDENTIFYING CONTENDED USER-SPACE LOCKS
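On OL7/RHEL7 that procedure boils down to something like the following (the package names and the futexes.stp example script come with the systemtap distribution; the debuginfo packages must match the running kernel, and you need root):

```shell
# systemtap needs the kernel debuginfo to resolve symbols.
yum install -y systemtap systemtap-runtime
debuginfo-install -y kernel
# futexes.stp ships with systemtap and reports contended futexes system-wide.
# Let it watch the box while autoconfig runs, then Ctrl-C to get the report.
stap -v /usr/share/systemtap/examples/process/futexes.stp
```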

Surprisingly, there was no contention reported, even while autoconfig was waiting to get an established JDBC connection.

Then, we made a deep dive, checked some futex examples, wrote some code to implement futex waits, and guess what we found: contention was not the only possible cause of futex waits. Poorly written code or a blocking task (slow I/O, CPU shortage, high load) could also indirectly cause this kind of issue.

Okay now; just suppose I am the main thread and you are the child thread. Now suppose I (the main) was written in such a way that I acquire a lock (semaphore, mutex, futex, you name it), then create "you" / the child threads, then do some other quick stuff (things that happen to require I/O), and only then release that lock.
Suppose you, the child threads, are written in such a way that you need to get that lock before starting your actual work. Okay, not a deadlock, but it is a lock!

Now suppose that quick stuff I just mentioned gets blocked. So I am waiting for it to return, and that's why I don't release the lock. Well, you will "wait"..
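The story above can be sketched with two plain shell processes and flock(1). It is a file lock rather than a futex, but the shape is identical: no deadlock, no contention storm, just a lock held across a slow operation:

```shell
# "Main": grab the lock, then get stuck doing slow work before releasing it.
( flock 9; echo "main: lock held, doing the slow stuff"; sleep 2
  echo "main: releasing" ) 9>/tmp/demo.lock &
sleep 0.2   # make sure the holder wins the race for the lock
# "Child": it only needs the lock to start its real work -- so it waits.
( flock 9; echo "child: finally got the lock" ) 9>/tmp/demo.lock
wait
```

The child blocks for roughly the whole `sleep 2` even though nothing is "contending" in the usual sense -- which is exactly why the stap contention report can come back clean while the waits are very real.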

Okay, I need to stop this storytelling, because it just started getting weird :) And I felt like we should speed up a bit, so I am now heading towards the conclusion.

So, we thought that the OS or the JVM might be the real cause that was making us wait, but we also thought that poorly written code might be the cause -- in which case we might need a patch for it (an autoconfig patch, a tech stack patch, or something like that..)

The Java thread dump didn't help much either. By the way, we obtained it by following the MOS note below;

*How to Obtain a Thread Dump (Stack Traces) from a Java Process or from a Core File of a Java Process on Linux (Doc ID 1282871.1)

Then, we tried dropping the Linux fs caches. Some example programs, which were written to read some files and import their contents into the database, were running very fast in their very first executions but started waiting on FUTEX_WAIT in their subsequent executions. So we thought this might be caused by a misbehaviour of Linux FS caching, and we just dropped the fs caches and retested.
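Dropping the caches itself is a two-liner (root needed; it is harmless, but it will cost you some re-reads afterwards):

```shell
sync                               # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches  # 1=pagecache, 2=dentries/inodes, 3=both
```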

It was a nice try but it didn't solve the issue :)

Okay, I m speeding up!

Well, finally we found the cause..

It was /dev/random & I had been there, done that before! :)

Read -> https://ermanarslan.blogspot.com/2020/05/entropy-linux-kernel-csprngs-devurandom.html -- this is one of my favorites: "Entropy, Linux Kernel - CSPRNGs, /dev/urandom vs /dev/random and all that"

Some background info about /dev/random and /dev/urandom:

In Linux, we have /dev/random and /dev/urandom for getting random bytes. These are character devices and they look like files. We read them like we read files, and when we read, say, 100 bytes from them, they actually run a CSPRNG over the entropy pool and give us the random bytes we need.

These devices provide us uniform random bytes whenever we need them. Moreover, the pool they are fed from is populated by unpredictable events.

But we have two devices, right? /dev/random and /dev/urandom. So which one should be used in which case? This is definitely the question one may ask.

Well, let's first describe the difference between these two, so that we can make a decision depending on those differences.

The main difference between /dev/random and /dev/urandom is that /dev/random tracks the entropy that we have in the entropy pool and it blocks when the entropy is low (remember the entropy discussed in the post linked above). It is basically implemented in a way to block itself when it thinks that the unpredictability is low. /dev/urandom, on the other hand, is fed from the same pool but never blocks.
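You can watch the kernel's view of the pool directly. On a quiet server with little interrupt activity (typical for a VM running headless batch work like autoconfig), this number stays low -- and that is exactly when readers of /dev/random block on OL7-era kernels:

```shell
# Current estimate of the entropy pool, in bits (max 4096 on these kernels).
cat /proc/sys/kernel/random/entropy_avail
```

Watching this value while autoconfig runs makes the diagnosis obvious: the JDBC waits line up with the pool running dry.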

Reference for the above: Erman Arslan's Oracle Blog :)

Entropy and Claude Shannon again! :)


We tested it by providing /dev/urandom via a command line argument to those java programs. -Djava.security.egd=file:/dev/./urandom did the job, and we saw that this setting cleared the waits.

However, we had to generalize it and somehow make it system-wide.
We used the following steps for that ->
  • Open the java.security file of the related JDK/JRE (ex: JAVA_HOME/jre/lib/security/java.security).
  • Change the line "securerandom.source=file:/dev/random" to "securerandom.source=file:/dev/./urandom".
  • Note that we need to change the line to /dev/./urandom; otherwise java ignores it. For instance, java ignores plain /dev/urandom (the one without /./ is ignored!).
  • Save the changes.

Okay. That's it :) This was for all my followers, and those who work in Oracle Application Technology & Oracle Linux support.

I hope it will be useful.. 


If you have a question, please don't ask it in the comments.

For your questions, please create an issue in my forum.

Forum Link: http://ermanarslan.blogspot.com.tr/p/forum.html

Register and create an issue in the related category.
I will support you from there.