Wednesday, September 30, 2015

Oracle Linux 6.6 - "kernel 3.8.13-98.1.1.el6uek.x86_64" "not using all available CPU cores"

We recently saw a problem in Oracle Linux 6.6: it was not using all the CPU cores available on the server. It was running on VMware, but the problem was not actually in the VM layer.

The configuration was as follows:

socket 1 = cpu0, cpu1, cpu2, cpu3
socket 2 = cpu4, cpu5, cpu6, cpu7

The problem was in the utilization.
That is, with the 3.8.13-98.1.1.el6uek.x86_64 kernel, Oracle Linux 6.6 was using only 4 CPU cores. We analyzed the CPU utilization carefully, and the scheduler simply did not place any work on the last 4 CPU cores.
On the other hand, Oracle Linux 6.6 could see all 8 CPUs.

We used the taskset utility to force a process onto one of the CPU cores that Oracle Linux normally did not utilize. The process started running on that core without any problems, and the utilization of that core went to 100%, as expected.

[root@somehost opt]# taskset -c -p 6 2313
pid 2313's current affinity list: 0-7
pid 2313's new affinity list: 6

[root@somehost~]# top
top - 19:10:14 up 4 days, 6:02, 6 users, load average: 1.06, 0.63, 0.32
Tasks: 432 total, 2 running, 430 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.7%us, 0.7%sy, 0.0%ni, 98.0%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
..
..
Cpu6 : 99.7%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
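
The same pinning experiment can also be scripted. The following is only a minimal sketch, assuming a Linux host and Python 3; it uses the standard os.sched_getaffinity and os.sched_setaffinity calls, and the helper name pin_process is hypothetical (it is not part of taskset or of the original test).

```python
import os
from typing import Set, Tuple


def pin_process(pid: int, cpu: int) -> Tuple[Set[int], Set[int]]:
    """Pin `pid` to a single CPU and return the (old, new) affinity sets.

    pid 0 means the calling process. Linux only. Hypothetical helper that
    mirrors what `taskset -c -p <cpu> <pid>` reports.
    """
    old = set(os.sched_getaffinity(pid))
    os.sched_setaffinity(pid, {cpu})
    new = set(os.sched_getaffinity(pid))
    return old, new


if __name__ == "__main__":
    # Pin this script to the highest-numbered CPU it is currently allowed to use
    target = max(os.sched_getaffinity(0))
    old, new = pin_process(0, target)
    print("current affinity list:", sorted(old))
    print("new affinity list:", sorted(new))
```

Pinning another process needs its PID and sufficient privileges, just like with taskset.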


So, when forced, Oracle Linux 6.6 with the 3.8.13-98.1.1.el6uek.x86_64 kernel could use all the cores, but left to itself the scheduler did not utilize the 4 cores of the second CPU socket, even under heavy load, as seen below (cpu4, cpu5, cpu6 and cpu7 are not utilized).

top - 12:51:32 up 3 days, 23:43, 3 users, load average: 16.74, 9.82, 5.30
Tasks: 454 total, 18 running, 436 sleeping, 0 stopped, 0 zombie
Cpu0 : 92.2%us, 6.6%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.2%si, 0.0%st
Cpu1 : 94.2%us, 4.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.1%si, 0.0%st
Cpu2 : 93.4%us, 4.8%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.8%si, 0.0%st
Cpu3 : 92.7%us, 5.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.9%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 32687204k total, 32521340k used, 165864k free, 104284k buffers
Swap: 33554428k total, 32992k used, 33521436k free, 21020212k cached
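
To catch this kind of imbalance without watching top, the per-CPU counters in /proc/stat can be sampled twice and compared. This is a rough sketch under stated assumptions: Linux with the usual /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal), and hypothetical function names and a hypothetical 1% threshold that are not from the original analysis.

```python
import time
from typing import Dict, List, Tuple


def parse_proc_stat(text: str) -> Dict[str, Tuple[int, int]]:
    """Return {cpu_name: (idle_ticks, total_ticks)} for each per-CPU line."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and everything else
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(v) for v in fields[1:9]]  # user .. steal
        idle = ticks[3] + ticks[4]  # idle + iowait
        counters[fields[0]] = (idle, sum(ticks))
    return counters


def busy_percent(before: str, after: str) -> Dict[str, float]:
    """Busy percentage per CPU between two /proc/stat snapshots."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    result = {}
    for cpu, (idle2, total2) in second.items():
        idle1, total1 = first.get(cpu, (0, 0))
        elapsed = total2 - total1
        if elapsed <= 0:
            continue  # no ticks elapsed for this CPU, nothing to report
        result[cpu] = 100.0 * (elapsed - (idle2 - idle1)) / elapsed
    return result


def idle_cpus(usage: Dict[str, float], threshold: float = 1.0) -> List[str]:
    """CPUs whose busy percentage is below `threshold` (assumed cut-off)."""
    names = [cpu for cpu, pct in usage.items() if pct < threshold]
    return sorted(names, key=lambda name: int(name[3:]))


if __name__ == "__main__":
    with open("/proc/stat") as f:
        before = f.read()
    time.sleep(5)
    with open("/proc/stat") as f:
        after = f.read()
    print("idle CPUs:", idle_cpus(busy_percent(before, after)))
```

If the load average is well above the number of busy cores and the same CPUs keep showing up as idle, that matches the symptom in the output above.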


The strange thing was that the issue could not be reproduced with the 3.8.13-44 el6uek kernel.
When booted with the 3.8.13-44 el6uek kernel, Oracle Linux 6.6 saw and utilized all the CPU cores without any problems, perfectly in balance.

So, the problem was basically "Oracle Linux 6.6 with the 3.8.13-98.1.1.el6uek.x86_64 kernel".

The problem looked the same as the one discussed in a thread that I had created in Oracle Community. Avi Miller from Oracle replied to that similar problem and stated that this is a known issue in 3.8.13-98.2.1 (tracked by internal bug 21662). So, the workaround was to downgrade to the previous UEK3 release or to use the Red Hat compatible kernel for the time being.

Actually, a similar problem was present in 3.8.13-98.1.1 as well.
So, for now we are continuing with the older 3.8.13-44 el6uek kernel, and we will probably upgrade after internal bug 21662 is resolved.

RDBMS -- ORA-28750: unknown error in SSL web service call

If you are trying to do something like the following and end up with an ORA-28750 error, then the server that you are trying to connect to is probably using a SHA2 SSL certificate.

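DECLARE
  lo_req  UTL_HTTP.req;
  lo_resp UTL_HTTP.resp;
BEGIN
  -- Point UTL_HTTP at the wallet that holds the trusted certificates
  UTL_HTTP.SET_WALLET('file:/wallet/', 'welcome1');
  -- Open the HTTPS request; the connection to the server is made here
  lo_req  := UTL_HTTP.begin_request('https://ip_address/erman?wsdl');
  lo_resp := UTL_HTTP.get_response(lo_req);
  -- Print the HTTP status code and release the response
  dbms_output.put_line(lo_resp.status_code);
  UTL_HTTP.end_response(lo_resp);
END;
/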

DECLARE
*
ERROR at line 1:

ORA-29273: HTTP request failed
ORA-06512: at "SYS.UTL_HTTP", line 1029
ORA-28750: unknown error
ORA-06512: at line

You may think that your wallet is problematic, but it is not.
If that's the case, then you are probably using an older version of Oracle Database :)
Something like 11.1.0.7 maybe. SHA2 is certified in Oracle Database 11.2.0.3 and above.
(To check the certificate, you can just save it to your laptop, rename it to .cer, double-click on it and look at its signature algorithm.)
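
If you prefer to check the certificate from a script, here is a rough sketch in Python. It is a heuristic, not a real ASN.1 parser: it fetches the server certificate with the standard ssl module and scans the DER bytes for a few well-known signature-algorithm OIDs. The function names are hypothetical, and the host you pass in is whatever server you are calling. The sha256, sha384 and sha512 variants are the SHA2 ones.

```python
import ssl
from typing import Optional

# DER encodings of a few common RSA signature-algorithm OIDs (1.2.840.113549.1.1.x)
SIGNATURE_OIDS = {
    bytes.fromhex("06092a864886f70d010104"): "md5WithRSAEncryption",
    bytes.fromhex("06092a864886f70d010105"): "sha1WithRSAEncryption",
    bytes.fromhex("06092a864886f70d01010b"): "sha256WithRSAEncryption",
    bytes.fromhex("06092a864886f70d01010c"): "sha384WithRSAEncryption",
    bytes.fromhex("06092a864886f70d01010d"): "sha512WithRSAEncryption",
}


def signature_algorithm(der: bytes) -> Optional[str]:
    """Name of the first known signature OID found in a DER certificate.

    Heuristic byte scan; returns None if none of the listed OIDs is present
    (for example, for ECDSA-signed certificates).
    """
    hits = [(der.find(oid), name) for oid, name in SIGNATURE_OIDS.items() if oid in der]
    return min(hits)[1] if hits else None


def server_signature_algorithm(host: str, port: int = 443) -> Optional[str]:
    """Fetch the server certificate (without validating it) and inspect it."""
    pem = ssl.get_server_certificate((host, port))
    return signature_algorithm(ssl.PEM_cert_to_DER_cert(pem))
```

For example, calling server_signature_algorithm with the host name of the web service would return "sha256WithRSAEncryption" for a typical SHA2 certificate.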

So if that is the case, then you have two options:

1) Upgrade your database.
2) If the server that provides the web service is under your control, change the server-side certificate to 'GeoTrustSSLCA-G3'.

We chose option 1 :), as it is time. It is even time for a 12c upgrade.

EBS 12.2 -- what if we apply a patch with hotpatch option, if that patch is not suitable to be applied as hotpatch?

Unless an Oracle Support document or another Oracle document (including patch readmes) says otherwise, we simply create an online patching cycle and apply EBS 12.2 application patches online.
So what if we apply a patch with the hotpatch option when that patch is not suitable for it?

First of all, such patches are considered unsafe. They are not tested by Oracle for this mode and are probably not suitable for being applied directly to the run filesystem while the application services are running.

The answer to the question, on the other hand, is simple. We just take the risk and need to be prepared for the following:
  • invalid objects, missing code dependencies 
  • mismatches between code level of file system and database 
  • missing column data, or other data integrity problems 
  • out-of-date indexes and materialized view definitions 
  • invalid data in runtime caches
:)

So, hotpatches were maybe more stable in EBS 12.1, or we thought that they were stable because we didn't know about the problems that users were experiencing while we were patching the system with a hotpatch.
But in EBS 12.2, the effects are system-wide, as you can see in the list above.

Well, the bottom line is this: it is not safe to apply a patch with the hotpatch option in EBS 12.2, unless Oracle Support documents or the patch readme files say that it is.

EBS 12.2 -- why a downtime patch is faster than hotpatch?

Recently I was wondering how a downtime patch can be faster than a hotpatch.
It was stated in all the Oracle documents, but the reasons were not given.
I was trying to find a logical reason for this, even though I couldn't see any change in adop's behaviour while applying patches with the downtime option. adpatch was again used in the background, the online patching cycle was not used, and from those points of view it was just like applying a hotpatch.
Thus, I concluded that applying a patch with the downtime option is considered faster just because the server resources are not being used by EBS users or application services.
Anyway, today I got an answer to an SR that I recently created for this question. The reply from Oracle Support simply confirmed my idea.
So the bottom line is that applying a patch in downtime mode is faster than applying it with the hotpatch option, and that is just because the system resources are not being used by EBS users or application services while the patch is applied in downtime mode.

Tuesday, September 15, 2015

EBS 12.2 -- The new Simplified Home Page

EBS 12.2 has a UI feature that looks really nice. I discovered it recently. Don't ask why so late :) It is probably because we are too busy in the backend, so we can't find time to navigate the application screens :)

Anyway, the simplified Home page introduced in EBS 12.2 makes you feel like you are in an iPad application.

Here is how the home page looks after enabling the simplified home page in EBS 12.2:


Pretty cool, right? :) It just does not look like an EBS web page :)

Well, here is the key to enabling it:

In order to use this feature, the "Self Service Personal Home Page Mode" profile option must be set to "Framework Simplified".
After setting this profile option, you can log in again and start using the simplified Home page.

Note that the profile option can be set at multiple levels by the system administrator, including at the individual user level.

Friday, September 11, 2015

ASM / Grid 11g -- Resync vs Rebalance

I wrote about ASM resilvering vs ASM rebalance in one of my previous posts:
http://ermanarslan.blogspot.com.tr/2015/05/exadata-asm-resilvering-vs-asm-rebalance.html

In this post, I will briefly explain the difference between ASM resync and ASM rebalance.
Nowadays, I can't publish as many blog posts as I did in the previous month, because I am working on a book which will be released in Jan 2016.
So that's why I can't write many blog articles and, when I am able to write them, I need to keep them short these days :)

Anyway, "the difference between ASM resyncing and rebalancing" was a question posted by one of the followers in the Exadata Facebook group, and here is the answer:

Their purposes are different, and they can be compared by examining the fast disk resync and fast rebalance operations introduced in 11g.
Resync means synchronizing a disk with the data that should reside on it (it can be used after a transient failure). See "ASM 11g New Features - How ASM Disk Resync Works" (Doc ID 466326.1).
Rebalancing means spreading the data evenly across all the disks in a disk group (it can be used in a disk replacement operation).