Monday, January 31, 2022

RDBMS -- ORA-04030: out of process memory when trying to allocate X bytes (QERGH hash-agg, qeshHDAlloc.3)

This one is for giving you some insights into an ORA-04030 error encountered while executing an expensive query that had lots of HASH GROUP BY operations in its execution plan..

ORA-04030: 67108888 bytes... (QERGH hash-agg, qeshHDAlloc.3)

This seems like a PGA memory allocation problem, but the problem is actually in the SQL. (assuming your PGA memory is sized properly)

Tuning the problematic SQL and making it use less memory can be the cure..

If that's not possible, setting the following underscore parameter can be considered...

alter session set "_gby_hash_aggregation_enabled" = false;

Note that, if this parameter is set to FALSE (default TRUE), then HASH GROUP BY operations are disabled. (setting it at the session level disables the hash group by operation only for that session, of course..)

So, when you set this at the session level, you will see a direct change in the execution plan, and that may be a quick workaround for you..

Before: (execution plan screenshot -- the plan includes a HASH GROUP BY operation)

After: (execution plan screenshot -- the HASH GROUP BY is replaced by a SORT GROUP BY)
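
To make that plan change concrete, here is a minimal sketch you can use to observe the switch from HASH GROUP BY to SORT GROUP BY.. (some_big_table and some_col are made-up names, just for illustration)

-- disable hash aggregation for this session only
alter session set "_gby_hash_aggregation_enabled" = false;

-- some_big_table / some_col are hypothetical names
explain plan for
select some_col, count(*)
from some_big_table
group by some_col;

-- the plan should now show SORT GROUP BY instead of HASH GROUP BY
select * from table(dbms_xplan.display);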

We have one more option that can serve as a workaround for this issue. That is, setting _pga_max_size at the session level to a lower value.

By setting _PGA_MAX_SIZE to a smaller value, we actually force Oracle's hash-join internally to use less memory.. (yeah.. there are bugs in this area..)
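
If you want to try that second workaround, here is a sketch of the idea; the value below is just an illustrative number, and whether the parameter is modifiable at the session level can depend on your version, so again, check with Oracle Support first..

-- illustrative value only (~200 MB, specified in bytes); get Oracle Support's blessing first
alter session set "_pga_max_size" = 209715200;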

Ref:

ORA-4030: Out Of Process Memory (QERHJ hash-joi,QERHJ list array) Raised When Using _PGA_MAX_SIZE (Doc ID 1471103.1)

But! as you see, both of the workarounds are based on underscore parameters, so it is better to get approval from Oracle Support before using them..

Saturday, January 29, 2022

Exadata X8M-2 & PCA X8-2 -- Part 4 -- Introducing Private Cloud Appliance (actually going a little further than that)

Remember my "Exadata X8M-2 & PCA X8-2 Migration" series? This is Part 4 and here I am, introducing PCA X8-2 to you. Yeah, I know PCA X9-2 was just released, but! this machine is still new, as you don't see it every day in the data centers.. at least around here you don't :)

As you may guess, we are continuing our journey on the Exadata and PCA migration, and here it is -> the Private Cloud Appliance! Note that we have already migrated all the databases to Exadata X8M-2, so from now on (and at least for a while), I will write mostly about the application migration part of the story :)

This machine provides a highly available, very fast, fully engineered and high-tech platform for applications. It is an easy-to-deploy virtualized apps environment we are talking about here.. It is good for boosting the performance of the apps, getting rid of single points of failure, and scaling out.. This machine is the key to having a state-of-the-art platform in our data center for hosting our apps reliably.

Even at first sight, it looks like a perfect platform for hosting the E-Business Suite Application Tier, any kind of FMW product (including Forms and Reports), the Oracle Business Intelligence Application Tier and clustered Weblogic environments! What do you say?

In this post, I will give a quick overview of the machine and the platform. Of course I will try to give some insights when appropriate.. Note that I may also play the role of devil's advocate when necessary :)

Alright.. Let's go!

The installation is simple.. You just fill out a form (specifying the IP addresses, hostnames, etc.), set a date, and a hardware engineer from Oracle installs the system, doing a general health check and completing the easy installation process.

After the installation, you can directly log in to the PCA Admin GUI using the standard URL (<IP_address>:7002/dashboard).

PCA delivers a ready to use Oracle VM Server environment. 

I feel like I'm hearing those voices "Still Oracle VM Server"? Yes.. It is still okay! :)

But! in PCA X9, we have KVM.

PCA X9 announced -> https://blogs.oracle.com/infrastructure/post/oracle-private-cloud-appliance-x9-simplifies-developing-and-deploying-diverse-workloads

Check the Private Cloud Appliance Web Page as well -> https://www.oracle.com/engineered-systems/private-cloud-appliance/

The platform is fully virtualized and, from an admin point of view, it is not any different than a standard Oracle VM Server environment.. Oracle VM Manager is installed and configured automatically during the installation.. So you (admins) just start using the virtualized environment by connecting to Oracle VM Manager. (<IP_address>:7002/ovm/console)

The PCA Admin GUI and Oracle VM Manager run on the active admin node, so the IP address I mentioned above is directed to the active admin node.. (Note that we have multiple admin/management nodes in this machine)

Note: OVM Manager works better with Firefox.. In our case, the built-in OVM Manager version was 3.4.6.

We have also ZFS storage in this machine!

2 ZFS controllers, 1 shelf, 2 management nodes and 2 compute nodes in the base RACK!

Let's take a look at the virtualized OVM environment;

We have 1 server pool configured. We have 2 servers (compute nodes) in the pool and both of them have 96 cores and 382 GB memory.. (again for the Base Rack)

We have 3 repositories. One of them is named Rack1-Repository.. It is a shared repository, sized 3TB. The other two are local repositories, each of them is 1TB in size.. So these are the defaults.

Rack1-Repository is built on top of ZFS. The local repositories are built on top of the local disks of the compute nodes.

Rack1-Repository is presented to both of the nodes at the same time, as it is a shared repository. Each local repository is presented to the related compute node.

We must not touch these repositories, so we leave them as is..

Note that, we may resize the Rack1-Repository if needed.. (check Oracle Support for the document that gives the instructions for this.)

We don't have any ISOs or templates in the repositories by default.. So if we want to use some ISOs or templates, we need to follow the proper procedure that we generally follow in OVM environments.. (pay attention to the VM types.. HVM, PVM... in order to have PVM -- paravirtualized virtual machines -- the procedure is a little different..)

The server management network is also the live migration network.. (we have live migration option in OVM, as you may already know)

We also have a vm network for cluster heartbeat and storage related communications.

We have 2 more networks named default external and default internal.

We have bond interfaces in the compute nodes and bond1 is designed to be used for the VM traffic.

In order to manage the ZFS, we need to access the ZFS admin URL (example: https://<IP_Address>:215). We may also use the ZFS CLI by accessing the ZFS storage via SSH.

The Rack1-Repository that I mentioned above is not sufficient for placing all the VM disks.. So as the first thing, we create a new LUN in the ZFS, present it to our Oracle VM Server environment and make use of it.

Note that, we created 2 repositories on 2 different LUNS on ZFS. One for the TEST environments, and one for the PROD.

[ PCA ] How to Increase Repository Capacity in the Oracle Private Cloud Appliance (Doc ID 1938558.1)

Once we finish our work with the ZFS, we use Oracle VM Manager to define a repository on the new LUNs and then upload our ISO images and start/boot our virtual machines.

Note that, the boot disks of the virtual machines should be in the first place in the boot order.. (otherwise, the virtual machine cannot boot)

We move a bit randomly I know : ) but we will do okay:) 

Okay.. An important recommendation after all I have said so far, is the following;

*I recommend installing EM 13.4 (or higher version) for managing the PCA as a whole. EM can manage both the PCA itself, and the virtualized environment delivered with it. So we integrate PCA and the Oracle VM Server environment to EM. With this move, the administration becomes centralized and the quality of the administration increases.

https://www.oracle.com/a/ocom/docs/engineered-systems/pca-enterprisemanager-3399271.pdf 

* I don't recommend fully utilizing the ZFS storage.. I mean using 70% of the total space is okay, but we should not exceed it unless it is really necessary..

Okay.. What else?

ZFS is there for VM images and apps data. ZFS is in HA mode.. Multiple controllers are connected to a HA shelf. In the standard config, we have 100TB usable space. This space can be extended to 3.3 Petabytes if necessary (with additional cost of course).. For this extension, we can even choose to use flash arrays.

There are 2 compute nodes (which can be extended to 25 compute nodes) as I mentioned earlier, and they are in virtualized mode.. These are the OVM hosts where our VMs run. So, we have Oracle VM Server as the OS of these nodes.

In each compute node we have;

2 Intel CPUs (Xeon 8260, 24 cores each) and minimum 384 GB memory.. Note that, these are for the Base PCA X8.

Memory in each node can be extended to 1.5TB..

We have a multi-tenant option for isolating the virtualization tier, isolating it by having different server pools, different storage, network, compute, etc.. (tenant groups)

The physical network is redundant. We have 2 networks and we have Cisco spine and leaf switches in each of them. In addition to that, we have another Cisco switch for the management network.

The internal communication is based on 100 Gbit Ethernet. Things like VM heartbeat, storage communication and live migration can use these channels.

We also have ILOMs and PDUs as you may guess. We have a separate management & maintenance network for connecting these utilities.

That 100 Gbit network goes through the leaf and spine switches.

Each compute node connects to 2 leaf switches and each leaf connects to the spines in a redundant manner. Each spine connects to the ports that go to the storage, management nodes and customer network.

In the spine switches, we have 4 ports for the customer.. (ports 1-4) These are reserved for the customer's own network requirements. Port 5 (5/1 and 5/2) is the uplink port. Ports 1-4 can be configured as 1x 100Gbit, 1x 40Gbit, 4x 25Gbit or 4x 10Gbit. Port 5 is 10Gbit.

PCA connects to the outside world using the TOR (Top of Rack) switches.

PCA's internal network (application management, storage access, inter-VM communication) uses predefined IP addresses. But, as these IP addresses are not related to the external networks, they do not interfere.

While each rack component has 1 IP address, the storage, management and compute nodes each have 1 additional IP address for ILOM.

Compute nodes connect to both the internal network and the customer's data center network. By default, compute nodes don't have IP addresses assigned for connecting to the customer network (for isolation), but we can configure the compute nodes and assign them IP addresses from the customer network if we want to. (for instance, if we want to mount an NFS share on the compute nodes.. an NFS share hosted by a machine in the customer network..)

We need to be sure that the subnets given below are not used in the customer data center network, in order to prevent conflicts;

192.168.4.0/24 – internal machine administration network: connects ILOMs and physical hosts

192.168.32.0/21 – internal management network: traffic between management and compute nodes

192.168.64.0/21 – underlay network for east/west traffic within the appliance environment

192.168.72.0/21 – underlay network for north/south traffic, enabling external connectivity

192.168.40.0/21 – storage network: traffic between the servers and the ZFS storage appliance

We also need to pay attention to the VLAN settings. It is recommended to avoid using the following VLANs in the customer network;

1 – the Cisco default VLAN

3040 – the default service VLAN

3041-3072 – a range of 31 VLANs reserved for customer VM and host networks

3073-3099 – a range reserved for system-level connectivity

VLANs 3090-3093 are already in use for tagged traffic over the /21 subnets listed above.

3968-4095 – a range reserved for Cisco internal device allocation

*Virtual machines can be configured to connect to the outside network via the default_external network that comes built-in with the PCA's OVM.. VLANs can also be defined on this network. (actually VLANs can be configured on the internal network as well..) The default_internal network can be used as the private interconnect network between the VMs hosted on PCA's OVM.

Well.. We have different networks in PCA;

- Administration network: Communication is via the Cisco Nexus 9348GC-FXP Switch. This is for management. The IP addresses are on 192.168.4.0/24. All the nodes have management IP addresses here. In addition, the ILOM IP addresses are also in this range.

The PCA Admin GUI and VM Manager also serve from this network. The spine switches are used when accessing them from the outside. (ports 5/1 and 5/2)

- Data network: All data flows through the Cisco Nexus 9336C-FX2 Switches. They are in a leaf-spine design. Leaves act like bridges between the rack components; spines, on the other hand, act as backbones and do the routing work.

We have bond devices named bond1 on the compute nodes. These  are connected to the Leaf switches. They are configured in link aggregation mode.

What else? Ohh okay -> VLAN for the management network...

If the management network should be on a VLAN, that can also be set in the PCA Admin GUI -> "The management node external network settings are configurable through the Network Settings tab in the Oracle Private Cloud Appliance Dashboard. If this network is a VLAN, its ID or tag must be configured in the Network Setup tab of the Dashboard."

But! There is no VLAN for VM traffic or PCA management by default! -> "The default configuration of the PCA places both the default networks for VM traffic and PCA management traffic on the physical ports 5/1 and 5/2 of the two spine switches, with no overlying VLANs to segregate the traffic.."

Here is the list of recommendations for different network configurations of PCA ->

1. Ports 5/1 and 5/2 are used for unsegregated VM and management traffic. No VLANs are assigned to either. -> THIS ONE IS NOT RECOMMENDED.

2. Ports 5/1 and 5/2 are used for segregated VM and management traffic by assigning a single VLAN for management traffic through the dashboard. 

3. Ports 5/1 and 5/2 are used for segregated VM and management traffic by assigning one or more VLANs for VM client traffic on top of the default_external network through OVM Manager.

4. Ports 5/1 and 5/2 are used for segregated VM and management traffic by assigning both a single VLAN for management traffic through the dashboard and assigning one or more VLANs for VM client traffic on top of the default_external network through OVM Manager. --> THIS ONE IS THE BEST PRACTICE.

5. Ports 5/1 and 5/2 are used for management traffic only, without a VLAN. All customer network traffic goes through a new custom network, with or without VLANs, on a port group residing on one or more of the physical ports 1-4. The default_external network is unused.

6. Ports 5/1 and 5/2 are used for management traffic only, with a VLAN assigned through the dashboard. All customer network traffic goes through a new custom network, with or without VLANs, on a port group residing on one or more of the physical ports 1-4. The default_external network is unused. THIS LAST ONE IS ALSO ONE OF THE BEST PRACTICES.

--So as you see in the 6th recommendation, customer networks can also be created, and customer network traffic can also be configured to be on that network.. (Note that we are OKAY with the 4th recommendation...)

There is too much to say, but I think that reading the following document is a must ->


Networking the Oracle Private Cloud Appliance and Oracle Private Cloud at Customer: A guide to connecting Oracle Private Cloud X8-2 solutions into the datacenter.

The tricky part is the network.. The network admins should configure their switches according to what we expect..
In the document above, there are some example configurations and commands for the network side.. These examples are given for Cisco and Juniper switches.. I mean for the customer-side switches..

I'm sharing the one for Cisco as follows;

interface port-channel221
 switchport mode trunk
 switchport trunk native vlan 313
 spanning-tree port type normal
 no negotiate auto

interface Ethernet1/19
 description "Rack22 9336 RU22 P5"
 channel-group 221 mode active

interface Ethernet1/20
 description "Rack22 9336 RU22 P5"
 channel-group 221 mode active

interface Ethernet1/21
 description "Rack22 9336 RU23 P5"
 channel-group 221 mode active

interface Ethernet1/22
 description "Rack22 9336 RU23 P5"
 channel-group 221 mode active

Well, this is for Cisco, but your network admins will see this and check their side accordingly.. We have tested the same config with Huawei, and Huawei switches can work with PCA as well..

We defined VLANs and made the virtual machines use those VLANs without any problems.. But! the network admins had some troubles configuring their side of the switches, and these example commands were the cure.

The network admin checked this and corrected his side accordingly..
The topologies given on page 11, in the Physical Connection to the Data Center Network section of this whitepaper, are also important. In those topologies you see the switches, ports and LACP config in a general picture.. So these should also be reviewed..


Note that, we don't need to do anything extra on the PCA switches.. We don't need to connect to them and do customizations to use VLANs. Everything needs to be done on the customer switches. So keep that in mind!

Almost forgot! We have a cloud at customer version of PCA, which we call PCC, as well..
Customers that want to feel closer to the cloud (or that have some regulations) can use this one..
ExaCC / ECC customers already know the advantages of the cloud at customer model :)

What else.. Aha! I have actually gone through all the things that I wanted to share. So, this is a perfect place to end.

Thanks for reading :)

Wednesday, January 19, 2022

Weblogic /Hyperion Planning -- JPS-01013: The credential store DN cn=CredentialStore,cn=opssSecurityStore,cn=JPSContext,cn=opssRoot is missing

Here is an interesting story that I found myself in recently.. It was a Hyperion environment and the customer started to encounter weird JPS-01013 errors while starting the Admin Server of the underlying Weblogic.

Actually, the recent change that was done in the environment was related to the IP address setting of the repository database (the database hosting the schemas created with RCU..).

In brief; the customer was going from a single instance to a 2-node RAC config, and the JDBC URLs needed to be changed..

Although such a change would normally be handled by updating the data sources and maybe some app-specific config, I think the customer had done more than that :) I don't have that part of the story, but somehow they executed RCU again and again in this environment, where we already had all the repository schemas in place..

The errors were generic, but it was clearly seen that they were all related to the repository schemas.. There was actually a Weblogic-internal LDAP-related process failing..

The first thing we needed to check was the data sources of Weblogic... As the repository hostname and IP addresses had been changed, this might be related to the data source settings..

In order to check the data sources, we needed to use the filesystem directly.. So we opened the JDBC XML files residing under <Weblogic Domain directory>/config/jdbc..

The files were there and the JDBC URLs defined inside them were correct.. (so it seemed the customer had already changed them).. So this was not a JDBC URL-related connection problem..

Then we checked the repository database.. The issue was related to OPSS, so we checked the OPSS records..

Just a little background info for OPSS:

Oracle Platform Security Services (OPSS) provides enterprise product development teams, systems integrators (SIs), and independent software vendors (ISVs) with a standards-based, portable, integrated, enterprise-grade security framework for Java Standard Edition (Java SE) and Java Enterprise Edition (Java EE) applications. OPSS is the underlying security platform that provides security to Oracle Fusion Middleware including products like WebLogic Server, SOA, WebCenter, ADF, OES to name a few. OPSS is designed from the ground up to be portable to third-party application servers.

We first checked the registered OPSS schema owner using the query below;

SELECT OWNER FROM System.SCHEMA_VERSION_REGISTRY$ where COMP_ID = 'OPSS';

We saw that there were 3 Owners/schemas registered for the OPSS component..

Then we checked each schema by querying the jps_attrs tables.. We checked the value of the attrval column and expected to see "cn=CredentialStore,cn=opssSecurityStore,cn=JPSContext,cn=opssRoot" there..

select * from OPSS_SCHEMA1.jps_attrs where attrval = 'cn=CredentialStore,cn=opssSecurityStore,cn=JPSContext,cn=opssRoot';
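
Since there were three registered owners, a quick way to spot the correct one is to run the same check against each schema. Here is a sketch of that (OPSS_SCHEMA1/2/3 are placeholder names for the owners returned by the SCHEMA_VERSION_REGISTRY$ query above):

select 'OPSS_SCHEMA1' owner, count(*) hits from OPSS_SCHEMA1.jps_attrs where attrval = 'cn=CredentialStore,cn=opssSecurityStore,cn=JPSContext,cn=opssRoot'
union all
select 'OPSS_SCHEMA2', count(*) from OPSS_SCHEMA2.jps_attrs where attrval = 'cn=CredentialStore,cn=opssSecurityStore,cn=JPSContext,cn=opssRoot'
union all
select 'OPSS_SCHEMA3', count(*) from OPSS_SCHEMA3.jps_attrs where attrval = 'cn=CredentialStore,cn=opssSecurityStore,cn=JPSContext,cn=opssRoot';

The owner with a non-zero count is the schema the domain should actually be pointing at.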

We saw that only one of these registered schemas had that attrval value. At this moment, we concluded that the data sources were misconfigured... I mean, the JDBC URL settings we saw in those data source files were correct, but the information related to the OPSS schema user and password was probably wrong.. Probably this was a result of those unnecessary RCU executions..

Well.. We opened those JDBC data source XML files using our file editor and updated the username to be the name of the OPSS schema that had the needed attrval value..

Then, we updated the password field of those JDBC data source files as well.. (the JDBC files related to OPSS.. for instance -> opss-datasource-jdbc.xml)

Note that, the password fields in the JDBC XML files needed to be updated with encrypted passwords.

So we encrypted the password of the correct OPSS schema and updated the files with the encrypted values..

For encrypting the password, we sourced setDomainEnv.sh and executed "java weblogic.security.Encrypt <our_password>".

Having the JDBC files updated, we also modified the bootstrap credentials of Weblogic using WLST..

This action was also required.. Without it, Weblogic didn't pick up the changed values.

In order to modify the bootstrap credential, we used ->

wls:/offline>

modifyBootStrapCredential(jpsConfigFile='/Oracle/Middleware/user_projects/domains/EPMSystem/config/fmwconfig/jps-config.xml',username='ERM_OPSS',password='testtest')

With that, we were able to start the Admin Server and the problem disappeared. :)

Wednesday, January 5, 2022

Erman Arslan's Oracle Forum / December 2021 - "Q & A Series"

This month again, I did not break the tradition and continued to support customers and the community despite my workload (leading my department, mission critical support activities & projects + intellectual gymnastics - physics and stuff like that)! :) Remember, you can ask questions and get remote support using my forum. Just click on the link named "Erman Arslan's Oracle Forum is available now. Click here to ask a question", which is available on the main page of Erman Arslan's Oracle Blog.




-- or just use the direct link:


Come on, let's see what we've been up to in December 2021.
(+ in the beginning of Jan 2022) 
#amaerman

row cache lock on drop table

Qeustion Form & report upgrade 

How to compile a pls and clear func admin cache

Filesystem Block size by big

Cross edition triggers by sg

Cutover and cleanup by satish

AWR is empty by big

adop cutover failled by big

Database performance monitoring tools by satish

RAC change IP by Roshan

CVE-2021-44228 Advisory for EBS by big

Apache log4j by satish

Patch 20725445: RCA: WRONG BASE AMOUNTS

Planing Migration from Existing Hardware to ODA

multi node setup EBS 12.2 non shared APPL_TOP

12.2.11 and 19c by Laurel

Applied patches to patch edition homes by satish

read only filesystem by Roshan

Quantity record with SET1 and SET2 -EBS 12.2.10

adop fails : ETCC not run in the database node

Cannot send testmail: Invalid role by big

Weblogic conflicts R12.2.5 by satish

APEX login page does not respond by big


Monday, January 3, 2022

RDBMS -- TRIGGER -- ORA-06502: PL/SQL: numeric or value error: character string buffer too small / a WEIRD CASE

This is about an interesting problem that we encountered after migrating a critical database from HP-UX to Exadata.. I must admit that it was very challenging; the solution touched the foundations of our problem-solving mechanism and probably improved that mechanism, so that we can solve these types of problems faster in the future.

Before giving you the story, I just want to remind you that;

We are just trying to be Bayesians (having our subjective beliefs) and have priors on things ..

In addition to that, there is something that I can consider a weakness. That is, we can find ourselves in a desperate situation if we get false evidence (or, to put it better, if we happen to misinterpret the evidence) that touches the foundation of the decision-making mechanism we use for finding the cause of the problem we are dealing with. This type of evidence can be an illusion caused by an insufficiently descriptive error message, by lack of information, or by lack of knowledge..

Two more words before we start;

-What you get instantly is not the whole nature of the mechanism.

-Make sure you didn't come across an overly non-representative part of the distribution.

:)

Okay, now let's get down to business and look at our problem;

We migrated a mission critical database (running in a critical environment) to a newly deployed Exadata X8 using XTTS (cross-platform transportable tablespaces, accelerated with RMAN incremental backups)..

Actually, we had already done maybe 10 Exadata migrations in this customer's environment, but in this case, we had ORA-06502 errors all over the place.. I mean, once we enabled the 6502 trace (alter session set events '6502 trace name errorstack forever, level 3') at the database level, we saw that there was more than one PL/SQL unit getting this type of error.. (probably overlooked by the testing team..)
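
As a side note, if you want the errorstack dump for every session rather than just your own, a system-level event can be used; a minimal sketch of the idea (just remember to switch it off once the diagnosis is done):

-- dump an errorstack trace whenever ORA-06502 is raised, instance-wide
alter system set events '6502 trace name errorstack level 3';

-- and once the diagnosis is done, switch it off again
alter system set events '6502 trace name errorstack off';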

Anyways; the error stack was;

ORA-06502: PL/SQL: numeric or value error: character string buffer too small

----- Current SQL Statement for this session (sql_id=1mspsdarsts3r) -----

UPDATE DENEME20 SET TEST_KODU =:B3 , DENEM_KODU =:B2 WHERE SOZLESME_NO =:B1 

----- PL/SQL Stack -----

----- PL/SQL Call Stack -----

  object      line  object

  handle    number  name

0x574be81d0        66  TEST.COMM_BEFORE_UPDATE

0x3d0285ac0       152  package body TEST.UA_DO_IT 

0x46a4c0dc0         2  anonymous block

Okay, here we go.. Let's see what we did to find the solution. Let's see the evidence we collected, and let's see the actual cause and the solution :)

1)

It was clear that, the Before Update Trigger TEST.COMM_BEFORE_UPDATE was failing at Line 66.

We checked that line 66 by using tools like Toad and views like dba_source.. (This was the actual mistake, actually.. We will come back to this topic later.) Line 66 was a "select into" clause.. So the trigger was written in a way that queries a table and puts the value it gets into a variable, which had the exact same length as the relevant table column.

2)

The target database was created with a different NLS language and territory setting than the source.. So, those parameters were changed directly. On the other hand, as you already know, what you see in NLS_DATABASE_PARAMETERS cannot be changed after database creation.. But! this should not be a problem, because those parameters were changed at the instance level..

Still, this situation had created a big question mark for us.. Because the "select into" that we identified by checking line 66 of that trigger could fail because of a character conversion problem.. Besides, we had some Oracle Support documents to support this idea.. That is, in some Oracle Support notes, we saw that a wrong NLS_LANG could indirectly trigger some bugs in the code..
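
For reference, this is the kind of quick check we ran while chasing the NLS theory; a minimal sketch comparing the creation-time values with the instance-level ones:

-- creation-time values (cannot be changed after database creation)
select parameter, value from nls_database_parameters
where parameter in ('NLS_LANGUAGE', 'NLS_TERRITORY', 'NLS_CHARACTERSET');

-- current instance-level values
select parameter, value from nls_instance_parameters
where parameter in ('NLS_LANGUAGE', 'NLS_TERRITORY');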

3)

We also suspected the NLS length semantics settings. We checked them both at the database level and at the PL/SQL object level (DBA_PLSQL_OBJECT_SETTINGS, which is set at compile time), but they were set to BYTE everywhere.. Both in the source and the target..
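
Here is a sketch of those two checks (the owner and trigger names are the ones from the error stack above, used just for illustration):

-- length semantics at the database and session level
select parameter, value from nls_database_parameters where parameter = 'NLS_LENGTH_SEMANTICS';
select parameter, value from nls_session_parameters where parameter = 'NLS_LENGTH_SEMANTICS';

-- the semantics compiled into the PL/SQL unit itself
select owner, name, type, nls_length_semantics
from dba_plsql_object_settings
where owner = 'TEST' and name = 'COMM_BEFORE_UPDATE';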

4)

Most of the failing PL/SQL code was triggered by DBMS_SCHEDULER.. So there were database jobs.. They were not DBMS_JOB jobs, but still they were scheduled using the database engine, and it was the database itself that was executing this failing code. Although the NLS settings of these scheduler jobs were correct, we suspected that maybe the database was setting those NLS parameters wrongly at execution time.. Maybe DBMS_SCHEDULER was still using NLS_DATABASE_PARAMETERS.. (there is no info about that by the way, so it was just an educated guess at that moment :)
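
The NLS environment that a scheduler job captured at creation time can be inspected directly; a small sketch of that check (the TEST owner is just the schema from this story):

-- NLS environment stored with each scheduler job at creation time
select owner, job_name, nls_env
from dba_scheduler_jobs
where owner = 'TEST';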

5)

A SQL Developer installation on another client machine (different than our client machine where we have the other database tools installed) could run those PL/SQLs without any problems! I mean, direct executions.. (not using DBMS_SCHEDULER).. However, DBMS_SCHEDULER was still failing and we were still failing -- even when we ran those PL/SQLs directly, we were getting ORA-06502.. But SQL Developer on that client could run them without any problems!

We also checked the NLS preferences of that SQL Developer and saw that they were set correctly. We also saw that, even when we changed the NLS settings of the session to a wrong NLS language and territory, SQL Developer could run those problematic PL/SQLs without getting any errors.. (Note that this was a wrong try.. Actually, SQL Developer gets its environment from its preferences.. So it was actually no surprise :)

6)

We had seen similar cases in other database environments (Core Banking environment, Reporting environments) and we had a memory about them, a memory telling us that these kinds of errors had already been solved earlier.. It was telling us -> just check the NLS settings :)  -- what an intuition :)

Well, you see how all these things brought us closer to the NLS settings, but they also led us to ineffectiveness and a dead end.

The actual solution came at this moment; we suddenly got the following info (we recognized something);

In the call stack we had actually gotten the correct line number, but our interpretation was wrong;

I mean, our error stack told us line 66 of that Before Update trigger was failing, and that was correct.

But! line 66 was not that "select into" clause.

The line info in dba_source or Toad didn't give us the correct info..

In the case of triggers, the code includes both a trigger definition and a PL/SQL block, so it is in the form of "Trigger Definition + PL/SQL block".

ORA-06502 is a PL/SQL error, so line 66 should be in the PL/SQL block.. So we need to discard those definition lines and start counting from the word DECLARE...
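
To put that counting rule into something runnable, here is a sketch (using the owner and trigger names from the error stack) that maps the PL/SQL stack line back to the dba_source line. It assumes the PL/SQL block of the trigger starts at the first DECLARE or BEGIN found in the source:

-- find where the PL/SQL block of the trigger starts in dba_source
select min(line) as block_start_line
from dba_source
where owner = 'TEST'
and name = 'COMM_BEFORE_UPDATE'
and type = 'TRIGGER'
and (upper(ltrim(text)) like 'DECLARE%' or upper(ltrim(text)) like 'BEGIN%');

-- the failing statement is then at (block_start_line - 1) + 66;
-- plug the number from the first query in here:
select line, text
from dba_source
where owner = 'TEST'
and name = 'COMM_BEFORE_UPDATE'
and type = 'TRIGGER'
and line = (&block_start_line - 1) + 66;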

In that case, we found that the failing line was a variable assignment. A variable assignment that was done using the return value of a custom function, and that custom function was the real cause, because it used a fixed-length string variable for storing the hostname.. After the Exadata migration, the hostname (FQDN) got longer, so it just couldn't fit and that's what caused the error. A substr modification or a variable size modification let us work around that bug caused by the custom code, but at the end of the day, it made us learn something important and made us update our priors.
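
Just to make the failure mode concrete, here is a minimal, hypothetical reconstruction (get_session_host is a made-up stand-in for the customer's custom function that returns the connecting host's name):

declare
  v_host varchar2(16);  -- fixed size, sized for the old, short hostname
begin
  -- after the migration, sessions running on the database server itself see
  -- the new, longer Exadata FQDN, so this assignment raises ORA-06502 here
  v_host := get_session_host;

  -- possible workarounds: enlarge v_host, or truncate the returned value:
  -- v_host := substr(get_session_host, 1, 16);
end;
/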

--Note: That SQL Developer was running on a client which had a short hostname; that's why the value could fit in the relevant variable and that's why that SQL Developer didn't get any errors during our test runs.

Okay, that is it! But still, look back at the first paragraph and think before you finish reading :)