Wednesday, April 26, 2017

Linux -- using "screen" for uninterrupted command operations (especially when you don't have VNC)

This is actually a very old topic. Linux folks especially may not pay much attention to it (they have probably known it for a long time). However, in the Oracle DBA world, it is a valuable topic.
We always use VNC for uninterrupted operations (and sometimes nohup as well).
That's what we have been told. Especially for critical and long-running OS operations (including copy, move, manual database upgrades and RMAN operations), we usually use VNC to get an X session, which is not interrupted even if we lose our connection to the server.

the "screen" command, on the other hand; can be thought as an alternative to the VNC connections in a way.
Ofcourse screen doesn't have the capability (it is not designed for this aim) to give us a X session (GUI), but it let us to have uninterrupted command operations (like the VNC does).
Besides screen program let us continue with our shell/terminal even after we disconnect from the server. That is even after disconnecting and reconnecting , we can still see all the shell outputs which have been produced since we have started working in our screen terminal.

The following filtered info is from the man page of the screen program:

Screen  is  a  full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells). 
When screen is called, it creates a single window with a shell in it (or the specified command) and then gets out of your way so that you can use the program as you normally would. 
Programs continue to run when their window is currently not visible and even when the whole screen session is detached from the user's terminal.

Well... screen is an easy tool to install and use. (On Oracle Linux, screen can be installed using the "yum install screen" command.)

Here is a demo for you ->

We connect to our server and execute the screen command as follows.
When we execute the screen command, a screen session/terminal gets created, as shown below.
(I displayed the PIDs of the sessions using echo $$, to show you that the terminal's PID changes when we execute the screen command.)
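A hypothetical transcript of this step follows; the hostname and the PID values are just examples.

[oracle@ermanhost ~]$ echo $$
3710
[oracle@ermanhost ~]$ screen
[oracle@ermanhost ~]$ echo $$
3730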


In the screen session, we first execute the ls -al command and see its output. Then, we execute our next command, which is "du -sh /". While this command is working, we close our terminal (by closing our terminal program, to mimic an unexpected disconnection).



After we close our terminal program (which is an SSH client in my case), we log in to our server again and check our ongoing screen sessions using the "screen -ls" command, as shown below;
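The output looks like the following (again, the session PID, 3729 here, is just an example; it is the PID of the SCREEN process that owns the shell we were working in):

[oracle@ermanhost ~]$ screen -ls
There is a screen on:
        3729.pts-0.ermanhost    (Detached)
1 Socket in /var/run/screen/S-oracle.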



Once we identify our screen session by its id (a PID actually, 3729 in this case), we use the "screen -r" command to attach to our ongoing screen session, as depicted below;


After we execute the screen -r command, we are attached to our screen terminal and see that our du -sh operation is still ongoing. (We even see the output of the ls -al command that we executed earlier.)
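A minimal sketch of the reattach command (with a single detached session, a plain "screen -r" is enough; the PID is the example value from above):

[oracle@ermanhost ~]$ screen -r 3729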


If we decide to kill our screen terminal, we can always use the exit command while we are at the screen terminal prompt. Alternatively, we can use the "screen -S <pid> -X kill" command to kill our screen terminal without even attaching to it.
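For example (with the same hypothetical PID):

[oracle@ermanhost ~]$ screen -S 3729 -X kill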


The last thing that I want to mention in this blog post is the difference between the nohup command and the screen command.
That is, screen is not only used for daemonizing a process; it works more like a terminal window manager. For instance, we can disconnect from the screen terminal while our command is working, and then reconnect to that terminal in case our command requires input from us. Also, we can reconnect to the terminal and check the command output that was produced while we were not connected to the server at all, and so on. So, screen and nohup are not the same thing.

Tuesday, April 25, 2017

Top 60 Oracle Blogs And Websites for Oracle DBAs ( Erman Arslan's Oracle Blog is on the list!)

I'm proud to be on the TOP-60 Oracle Blogs and Websites list.
I have worked hard on this blog since day one, and now it is a great pleasure to be on the list with the big names.
I would like to thank you, my followers and readers... Also, I would like to thank you, my forum users.
Your interest, comments and feedback have been my biggest motivation, both for writing this stuff and for all the research and lab work that I have done to supply unique content on this blog.

Thursday, April 20, 2017

EBS 12.2 -- Blank page problem viewing concurrent out and logs, Hostname with Capital Letters, FNDWRR & node_info.txt

Recently encountered a problem on a new EBS implementation.
The problem appeared when the customer wanted to see the outputs and logs of the concurrent programs.
When they tried to open a concurrent log in the browser, a blank page was displayed.

In order to solve this issue, we went through almost all the possible diagnostics, such as:
  • Enabling sqlnet trace in the apps tier listener.ora
  • Running "FS Diagnostic Test Program" with the parameter MODE=FULL 
The only error that we saw after running FS Diag Test Program was the following;

-- FNDFS did not return an error code 
-- FNDWRR did not create a debug logfile. 
-- FNDFS did not create a debug logfile. 
-- Displaying first 25 lines returned by FNDWRR.exe: 
-- BEGIN FILE ------------------------------------- 
-- END FILE --------------------------------------- 
-- ERROR: Unable to transfer file successfully! 
  • Setting the profile "Viewer:Text" to have no value (removing 'browser') and retesting viewing a log file. This was to bypass the FNDWRR code path and pass the FNDFS output directly to the built-in viewer.
This almost worked. We could display text output, but we could not display any other output format, such as PDF or XML.
  • We did the tests documented in "How to Troubleshoot FNDFS and FNDWRR (Problems Viewing Concurrent Request Output) (Doc ID 847844.1)". FNDWRR could display the output from the OS successfully.
  • Checked the Apache logs (access_log.*, error_log.*, etc.) for CGI entries (FNDWRR is a CGI program). We could only see HTTP 200 responses there: GET /OA_CGI/FNDWRR.exe?temp_id=796020226 HTTP/1.1" 200 -
  • Checked database alert log (just in case)
  • Generated a Fiddler trace and reviewed it. No errors.
  • Applied Patch 17075726:R12.FND.C (SUPPORT CONCURRENT MANAGER ON A SERVER CONFIGURED TO USE VIRTUAL HOSTING), which didn't fix the issue.
  • Disabled all the anti-virus and proxy checks on the client side, but the issue remained.
  • Created an SR (Oracle Support Service Request), but it didn't help much.
  • Did the browser configurations below, but they didn't help at all.

  • Set up Oracle E-Business Suite to run through the 'Trusted Sites' zone with a 'Medium' Security Setting.
    Tools -> Pop-up Blocker -> Pop-up Blocker Settings -> Allowed Sites
    Tools -> Internet Options -> Privacy -> Pop-up Blocker: Settings -> Allowed sites
    Selected (Checked) the following values:
    Tools -> Internet Options -> Security -> <zone> (e.g. Trusted Sites) -> Enable Protected Mode
    Tools -> Internet Options -> Advanced -> Enable 64-bit processes for Enhanced Protection Mode

THE CAUSE AND THE SOLUTION:

Well... after all this diagnostic work, guess what we realized?

"The server name was written in capital letters."

Yes, the issue was caused by that.

The hostname of an EBS server should not be written in capital letters.

This is not supported, and this issue is one of the consequences of that unsupported configuration. (Note that this issue is not documented; that's why I'm writing about it :))

Autoconfig and postclone are also affected by this problem, but there are workarounds for them.

However, you can't get rid of the FNDWRR (viewing concurrent logs and outputs) problems if you have a hostname with capital letters.

As for the solution, we updated $EBS_APPS_DEPLOYMENT_DIR/oacore/APP-INF/node_info.txt
and modified the lines which included the hostname.

We rewrote all the hostnames in lower-case letters, and the issue was resolved without even restarting anything.
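A minimal sketch of that fix is below. EBSHOST/ebshost is a hypothetical hostname; adapt it to your own environment.

cd $EBS_APPS_DEPLOYMENT_DIR/oacore/APP-INF
cp node_info.txt node_info.txt.bak           # take a backup first
grep -n "EBSHOST" node_info.txt              # locate the lines containing the capital-letter hostname
sed -i 's/EBSHOST/ebshost/g' node_info.txt   # rewrite the hostname in lower-case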

Note:

txkFNDWRR.pl reads node_info.txt and sets the environment variables accordingly. Using the $OA_HTML environment variable, txkFNDWRR.pl executes FNDWRR, which in turn makes the concurrent log or output displayed on the browser screen.

I wrote this blog post to identify the diagnostic actions that need to be taken while troubleshooting such a problem, and to show you the trouble that we may find ourselves in if we have an unsupported configuration.

Tuesday, April 18, 2017

GRC -- GRC & EBS implementation //errors and solutions

Recently installed GRC 8.6.6, PCG 7.3.3 and PEA 8.6.6 using the installation guides delivered with the products, to implement GRC with an EBS 12.1.3 instance.

The main document to follow for this type of installation was "Enterprise Governance Risk and Compliance (EGRC) Product Information (Doc ID 1084596.1)", and basically what we did was the following:
  • Installing an 11g Oracle Database for GRC
  • Installing a Weblogic 12C for GRC
  • Installing ADR 12C on Weblogic
  • Creating the database schemas for GRC using the RCU utility
  • Creating a Weblogic Domain for GRC
  • Deploying GRC application using Weblogic Console
  • Upgrading GRC
  • Installing PCG on EBS
  • Installing critical PCG Patches
  • Installing PEA on EBS
  • Fixing any errors reported by the functional team
This is not something that is frequently done in our customer environments; maybe that's why we encountered some errors and spent some time finding their solutions, which I find useful to share with you.

We basically encountered 3 errors. 2 of them were caused by the complexity of the documentation, and one of them was directly related to corrupted data.

Let's see what those errors and their solutions are;

ERROR 1 - Unable to synchronize access in GRC 8.6.6

The connection test that we did for the datasource that we created in GRC was successful, but the synchronize access job (in the GRC application) failed with the following error (reported in grc.log):

ERROR [ExecutorThread-11] DataSourceService:1343 Error while setting ETL completed
java.lang.RuntimeException: Failed to serialize the object:
Descriptor Exceptions:
---------------------------------------------------------
Exception [EclipseLink-59] (Eclipse Persistence Services - 2.3.1.v20111018-r10243): org.eclipse.persistence.exceptions.DescriptorException
Exception Description: The instance variable [thingSavedStates] is not defined in the domain class [oracle.apps.grc.domain.datasource.SourceSyncState$SourceSyncStateBuilder], or it is not accessible.
Internal Exception: java.lang.NoSuchFieldException: thingSavedStates
Mapping: org.eclipse.persistence.oxm.mappings.XMLCompositeCollectionMapping[thingSavedStates]
Descriptor: XMLDescriptor(oracle.apps.grc.domain.datasource.SourceSyncState$SourceSyncStateBuilder --> [])
Runtime Exceptions:
---------------------------------------------------------
at oracle.apps.grc.domain.datasource.SourceSyncState.toXML(SourceSyncState.java:135)
at oracle.apps.grc.dataservices.dao.impl.spring.datasource.DataSourceDaoSpr.updateSyncState(DataSourceDaoSpr.java:1249)
at oracle.apps.grc.dataservices.dao.impl.spring.datasource.DataSourceDaoSpr.setEtlCompleted(DataSourceDaoSpr.java:1946) 


Solution:

USER_MEM_ARGS should be updated correctly in setDomainEnv.sh,
as documented on page 2-14 of:
http://docs.oracle.com/cd/E51797_01/doc.8651/e52268.pdf

Action plan:

1. Stop the application server.
2. Back up the setDomainEnv.sh file.
3. Update the USER_MEM_ARGS parameter in the setDomainEnv.sh file. The modification should be made between the comment and the if statement, as shown below.

# IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values
<<<<< Changes should come here >>>>>
case "${SERVER_NAME}" in "AdminServer")
USER_MEM_ARGS="-Xms2048M -Xmx2048M" ;;
...
USER_MEM_ARGS="${USER_MEM_ARGS} -XX:PermSize=256m -XX:MaxPermSize=512m -XX:ReservedCodeCacheSize=128M -Djava.awt.headless=true -Djbo.ampool.maxpoolsize=600000 -Dfile.encoding=UTF-8 -Djavax.xml.bind.context.factory=com.sun.xml.internal.bind.v2.ContextFactory"
<<<<< Changes should come here >>>>>

if [ "${USER_MEM_ARGS}" != "" ] ; then

4. Ensure that the eclipselink-2.3.1.jar file exists in the <MW_HOME>/grc866/grc/WEB-INF/lib/ location.

5. Start the application server.

6. Retest the issue.

ERROR 2 - Unable to synchronize access in GRC 8.6.6

The connection test that we did for the datasource that we created in GRC was successful, but the synchronize access job (in the GRC application) failed with the following error (reported in grc.log):

DEBUG [EtlExtractor-1254779240] GrcLogPrintStream:73 STDOUT (oracle.core.ojdl.logging.ConsoleHandler:118) <Apr 6, 2017 11:10:11 AM EEST> <Error> <Default> <ODI-1217> <Session TCG_SCEN_Users_21 (85154) fails with return code 7000.
ODI-1226: Step TCG_INTR_Users_21 fails after 1 attempt(s).
Caused by: ODI-1240: Flow TCG_INTR_Users_21 fails while performing a Integration operation. This flow loads target table Users.
Caused by: org.apache.bsf.BSFException: exception from Jython:
Traceback (most recent call last):
File "<string>", line 50, in <module>


Solution:

PRE_CLASSPATH was not properly set in setDomainEnv.sh.

As documented on page 2-2 of http://docs.oracle.com/cd/E51797_01/doc.8651/e52268.pdf,
the PRE_CLASSPATH should be updated properly.

Action Plan:

1. Locate the following lines in the file: 

if [ "${PRE_CLASSPATH}" != "" ] ; then CLASSPATH="${PRE_CLASSPATH}${CLASSPATHSEP}${CLASSPATH}" 
export CLASSPATH 
fi 

2. Add the following before the above lines: 

PRE_CLASSPATH="/grc865/grc/WEB-INF/lib/jython- 2.5.1.jar:${PRE_CLASSPATH}" 
export PRE_CLASSPATH

Note: Replace <MW_HOME> with the actual path to your middleware home.
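After the edit, the relevant section of setDomainEnv.sh should look roughly like this (a sketch; the grc865 path and <MW_HOME> are environment-specific):

PRE_CLASSPATH="<MW_HOME>/grc865/grc/WEB-INF/lib/jython-2.5.1.jar:${PRE_CLASSPATH}"
export PRE_CLASSPATH

if [ "${PRE_CLASSPATH}" != "" ] ; then
  CLASSPATH="${PRE_CLASSPATH}${CLASSPATHSEP}${CLASSPATH}"
  export CLASSPATH
fi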

ERROR 3 - Unable to synchronize access in GRC 8.6.6

The connection test that we did for the datasource that we created in GRC was successful, but the synchronize access job (in the GRC application) failed with the following error (reported in grc.log):

DEBUG [EtlExtractor-976389891] DataSourceDaoSpr:1234 updateSyncState(oracle.apps.grc.domain.datasource.SourceSyncState@22489656) 
ERROR [EtlExtractor-976389891] AvailableResource:173 myBlocks null 
ERROR [EtlExtractor-976389891] LocalEtlTcgWriter:412 A problem occurred in LocalEtlTcgWriter.writeData: 
java.lang.NullPointerException: myBlocks null 
at oracle.apps.odin.reasonerio.file.page.AvailableResource.askBlock(AvailableResource.java:174) 
at oracle.apps.odin.reasonerio.file.page.PageManager.createNewPage(PageManager.java:92) 
at oracle.apps.odin.reasonerio.file.page.PagingGrccChannelManager.expandChannel(PagingGrccChannelManager.java:498) 
at oracle.apps.odin.reasonerio.file.page.PageChannelController.<init>(PageChannelController.java:79) 
at oracle.apps.odin.reasonerio.file.page.PagingGrccChannelManager.createChannel(PagingGrccChannelManager.java:456) 
at oracle.apps.grc.reasonerio.graph.blockbytype.writer.BBTGraphWriter.getAttributeWriter(BBTGraphWriter.java:148) 
at oracle.apps.grc.appservices.connector.LocalEtlTcgWriter.writeNode(LocalEtlTcgWriter.java:669) 
at oracle.apps.grc.appservices.connector.LocalEtlTcgWriter.persistResults(LocalEtlTcgWriter.java:395) 
at oracle.apps.grc.appservices.connector.LocalEtlExtractor.persistResults(LocalEtlExtractor.java:540) 
at oracle.apps.grc.appservices.connector.LocalEtlExtractor.retrievePersistResultsLocal(LocalEtlExtractor.java:349) 
at oracle.apps.grc.appservices.connector.LocalEtlExtractor.retrievePersistResults(LocalEtlExtractor.java:239) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at org.python.core.PyReflectedFunction.__call__(PyReflectedFunction.java:175) 
at org.python.core.PyObject.__call__(PyObject.java:355) 
at org.python.core.PyMethod.__call__(PyMethod.java:215) 
at org.python.core.PyMethod.instancemethod___call__(PyMethod.java:221) 
at org.python.core.PyMethod.__call__(PyMethod.java:206) 
at org.python.core.PyObject.__call__(PyObject.java:381) 
at org.python.core.PyObject.__call__(PyObject.java:385) 
at org.python.pycode._pyx0.f$0(<string>:51) 
at org.python.pycode._pyx0.call_function(<string>) 
at org.python.core.PyTableCode.call(PyTableCode.java:165) 
at org.python.core.PyCode.call(PyCode.java:18) 
at org.python.core.Py.runCode(Py.java:1204) 
at org.python.core.Py.exec(Py.java:1248) 
at org.python.util.PythonInterpreter.exec(PythonInterpreter.java:172) 
at org.apache.bsf.engines.jython.JythonEngine.exec(JythonEngine.java:144) 
at com.sunopsis.dwg.codeinterpretor.SnpScriptingInterpretor.execInBSFEngine(SnpScriptingInterpretor.java:322) 
at com.sunopsis.dwg.codeinterpretor.SnpScriptingInterpretor.exec(SnpScriptingInterpretor.java:170) 
at com.sunopsis.dwg.dbobj.SnpSessTaskSql.scripting(SnpSessTaskSql.java:2472) 
at oracle.odi.runtime.agent.execution.cmd.ScriptingExecutor.execute(ScriptingExecutor.java:47) 
at oracle.odi.runtime.agent.execution.cmd.ScriptingExecutor.execute(ScriptingExecutor.java:1) 
at oracle.odi.runtime.agent.execution.TaskExecutionHandler.handleTask(TaskExecutionHandler.java:50) 
at com.sunopsis.dwg.dbobj.SnpSessTaskSql.processTask(SnpSessTaskSql.java:2913) 
at com.sunopsis.dwg.dbobj.SnpSessTaskSql.treatTask(SnpSessTaskSql.java:2625) 
at com.sunopsis.dwg.dbobj.SnpSessStep.treatAttachedTasks(SnpSessStep.java:577) 
at com.sunopsis.dwg.dbobj.SnpSessStep.treatSessStep(SnpSessStep.java:468) 
at com.sunopsis.dwg.dbobj.SnpSession.treatSession(SnpSession.java:2128) 
at com.sunopsis.dwg.dbobj.SnpSession.treatSession(SnpSession.java:1930)

Solution:

While working to solve the issues (ERROR 1 and ERROR 2), we tried to run the synchronize access job again and again. After solving those issues, we realized that those failed tries had corrupted the ETL data.

Action Plan:

1. Take a backup of the GRC database, the WLS filesystem, the ETL repository filesystem and the GRC Reports filesystem.
Note: your ETL repository and GRC Reports locations can be found in the GRC application
(Navigator > Setup and Administration > Manage Application Configuration).

Example ETL repository: /Oracle/EGRCC/grcc_etl 
Example Report location: /Oracle/EGRCC/grcc_rep 

2. Stop the GRC application server.

3. Kill any pending Java processes (if they are still active).

4. Clear the cache:

cd $MW_HOME/user_projects/domains/<domain_name>
rm -rf servers/AdminServer/tmp/*
rm -rf servers/AdminServer/logs/*
rm -rf servers/AdminServer/cache/*

5. Stop and start the GRC database.

6. Drop the SNP_* tables (the ODI repository).
-- Drop all the tables whose names start with SNP_ (a helper query is sketched after this plan).
Check that the following query returns 0 after dropping the tables:
select count(*) from all_tables where table_name like 'SNP_%';

7. Delete the "temp.repository" and "raw" directories and their subdirectories under the ETL repository home (example ETL repository: /Oracle/EGRCC/grcc_etl).
-- Do NOT touch the "persistence" directory.
-- The application should be shut down before deleting these directories.

8. Start the GRC application server.

9. Run the Access and Transaction synchronization as the first step after the GRC application is started.
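Here is the kind of helper query mentioned in step 6; a minimal sketch that assumes you are connected as a user who can see (and drop) the ODI repository tables, so adapt the owner filter to your environment:

-- generate DROP statements for all ODI repository tables (names starting with SNP_)
select 'drop table ' || owner || '.' || table_name || ' cascade constraints;'
from all_tables
where table_name like 'SNP_%';

-- after running the generated statements, this should return 0
select count(*) from all_tables where table_name like 'SNP_%';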

Wednesday, March 29, 2017

RDBMS -- nostalgia : What happens when we update a table and commit

Recently, one of the juniors on my team asked me the following question:

"What happens when I update a table and commit the change after a few seconds?"
The question was actually copied and pasted from an OCP preparation book, and we needed to put the following events in the correct sequence to answer it:

A. Oracle reads the blocks from data file to buffer cache and updates the blocks.
B. Changed blocks from the buffer cache are written to data files.
C. The user commits the change.
D. LGWR writes the changed blocks to the redo log buffer.
E. The server process writes the change vectors to the redo log buffer.
F. LGWR flushes the redo log buffer to redo log files.
G. A checkpoint occurs.

I liked the question and wanted to answer it by providing a paragraph describing the things that are supposed to happen, all the way from updating the table in memory to writing the modified block into the datafile.

This was nostalgic for me. It reminded me of my old times, when I was a core DBA.

Here is what I wrote;

1) The update statement that we issue from our client application (Toad, SQL Developer, sqlplus, etc.) is sent to the server process (the shadow process, LOCAL=NO, etc.) -- client + shadow process, or client task + server task; there are always 2 tasks (as the name TWO TASK implies).

2) The server process checks the shared pool to see if our statement is already there, checks the object permissions, and so on (I'm skipping the details). At this stage, the server process parses the statement (a hard parse, if it is not already present in the shared pool).

3) The server process loads the data blocks that we want to update into the buffer cache, or finds them in the buffer cache if they are already there.

4) The server process (on behalf of our client) updates the data in the buffer cache. (The server process is our reflection inside Oracle.)

5) As we commit our update almost instantly, LGWR writes the relevant redo records from the log buffer (a memory area in the SGA) to the redo log file. This is practically a synchronous write, as we commit immediately. At this point, the data is not yet updated in the datafile, but the change is recorded in the redo log file (write-ahead logging). If we could somehow read the block directly from the datafile at this point, we would see that the value stored in the block is still the old, pre-update value.

6) DBWR writes the changed blocks to disk (the datafiles) in the background, in its own time, with efficient timing. This is an asynchronous write, for performance reasons: if it were synchronous, we would have to wait for the seek times in the foreground. LGWR, on the other hand, requires minimal seek time even though it writes synchronously; it writes with the OS block size and it writes only the redo vectors (not database I/O, not a block write).

7) When a checkpoint happens, DBWR writes to the datafiles. (Note that DBWR may write to the datafiles without a checkpoint as well: for example, when the count of dirty blocks in the buffer cache increases and the area usable for free buffers shrinks.)
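To make the sequence concrete, here is a minimal sketch (the table t is hypothetical), with the steps above mapped as comments:

-- assume a hypothetical table t
update t set val = 'NEW' where id = 1;
-- the server process parses the statement (step 2),
-- reads the blocks into the buffer cache if needed (step 3),
-- updates the blocks in the buffer cache (step 4),
-- and writes the change vectors into the redo log buffer.

commit;
-- LGWR flushes the redo log buffer to the redo log files (step 5);
-- DBWR writes the dirty blocks to the datafiles later, asynchronously (steps 6 and 7).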

The important question is this one:
When does DBWR write these dirty (changed) blocks to the datafiles?

DBWR is normally a sleepy process. It sleeps until a triggering event happens. One of these triggering events is generated by CKPT: CKPT awakens DBWR and makes it flush the dirty blocks from the buffer cache.

If we come back to our question... I think there is an error in it.

"LGWR writes the changed blocks to the redo log buffer" --> This is wrong information. LGWR writes from the redo log buffer to the redo log files.

Aha, sorry; ignore the sentence above. There is a note in the original question.
It says: "ignore the actions that are not relevant."

So, we should ignore this wrong option. Well... then the question can be answered properly.

On the other hand, the question tests a very specific thing.

It is a question inside the question.
It wants us to answer the following ->
Which one happens first?
The update in memory, or the writing of the change vectors to the log buffer (which is done by the server process)?

Why do I find this very specific and detailed?
Because these two events happen almost at the same time (according to several resources), but the update in the buffer cache seems to happen one tick earlier.

The question actually gives us a clue here. The clue is "Oracle reads the blocks from data file to buffer cache and updates the blocks."
So, it gives "read and update" in the same sentence, and as the read must come first, there is no need to agonize over the question above. The update should definitely come earlier than the write that takes place in the log buffer. (Note that the shadow process writes to the log buffer.)

The commit comes after both of these events, because the question says we commit the change "after a few seconds", so at least 1 or 2 seconds pass.

That's why the correct answer is ->

"A, E, C, F, G, B"

Feel free to comment ...

Wednesday, March 15, 2017

RDBMS Licensing -- CPU / Core limits for Named User Plus licenses

As you may already know, database licensing (Standard Edition 2 and Enterprise Edition) can be done in two ways:

1) CPU-based licensing. 2) NUP (Named User Plus) based licensing.

I will not go into the details of these licensing methods, because that is not my job or my interest.
However, I want to shed some light on a specific topic, which can be a little confusing.

Although the information that you will find below is on a specific licensing topic, it gives general information about database licensing, as well.

Note that the information given below is about Enterprise Edition, as we mostly use Enterprise Edition (e.g. Oracle EBS databases are Enterprise Edition).

The topic that I want to inform you about is the CPU limit for Named User Plus licensing.

That is, although Named User Plus licensing is done on the basis of the database user count, there is a CPU/core limit as well.

In other words, you can't just buy 25 Named User Plus licenses and run your database on a server which has 24 (enabled) CPU cores.
(Note that Enterprise Edition requires a minimum of 25 Named User Plus licenses per Processor license, or the total number of actual users, whichever is greater.)

Let's take a closer look at this;

The CPU-based licensing for Oracle Database Enterprise Edition is actually done on a per-core basis.
We count the cores of our database server, then multiply this total physical core count by a core factor (0.5 for Intel CPUs) to calculate the number of CPU/processor licenses needed for our database environment.

This is also applicable for deriving the maximum CPU core count that we can have for X Named User Plus licenses.

Let's take a look at the following example to reinforce what I just explained.

Suppose you have 50 Named User Plus licenses and want to know the maximum CPU/core count that you can have with these licenses.

50 Named User Plus licenses can support up to 2 processor licenses (50 / 25 = 2).
These 2 are Oracle processor licenses, which should be divided by the processor core factor to derive the maximum physical core count that we can have (the core factor for Intel is 0.5).

So, 2 / 0.5 = 4 CPU cores. Thus, we can say that we can have 4 cores enabled if we have 50 Named User Plus licenses.

For instance, if we have 50 Named User Plus licenses and an ODA X6-2S, then we should enable only 4 of its cores.

Similarly, if we want to enable all the cores of the ODA X6-2S (10 cores in total), then we need to do the following calculation to find the extra licenses that we will need ->

10 - 4 = 6 -> 6 * 0.5 = 3 extra processor licenses, or 3 * 25 = 75 extra Named User Plus licenses.

Note that all fractions are to be rounded up to the next whole number. For instance, if we get 1.5 as the result of such a calculation, we need to round it up to 2.
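The same math, as a small shell sketch (it assumes an Intel core factor of 0.5 and the 25-NUP-per-processor minimum; the numbers are from the example above):

nup=50
proc_licenses=$((nup / 25))                                  # 50 NUP -> 2 processor licenses
max_cores=$(echo "$proc_licenses / 0.5" | bc)                # 2 / 0.5 = 4 physical Intel cores
extra_procs=$(echo "scale=1; (10 - $max_cores) * 0.5" | bc)  # enabling all 10 ODA X6-2S cores -> 3 extra processor licenses
echo "max cores: $max_cores, extra processor licenses for 10 cores: $extra_procs"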

References:

http://www.oracle.com/us/corporate/pricing/databaselicensing-070584.pdf

Database Licensing - Oracle

Product Minimums for Named User Plus licenses (where the minimums are per processor) are calculated after the number of processors to be licensed is determined, using the "PROCESSOR DEFINITION".

PROCESSOR DEFINITION:

The number of required licenses shall be determined by multiplying the total number of cores of the processor by a core processor licensing factor specified on the Oracle Processor Core Factor Table.


I wrote this article because DBAs and Apps DBAs should know these things, or at least have some idea about them; customers ask about them frequently (this is for DBA consultants), and we need to keep our companies on the safe side (this is for all DBAs).

Lastly, I am sharing the processor core factor table...
Processor Core Factor table (the current one -- it may be updated in the future)

Thursday, March 9, 2017

OAM -- EBS Home Page, login error, unexpected error, troubleshooting with HTTP Trace

This is not the first unexpected problem that I have encountered during EBS and OAM implementations.
Yes... this blog post will be about an issue that I encountered after integrating an EBS 12.1 instance with an OAM+OID environment.

I'm writing it because I want you to be aware of the diagnostics that can be done in such situations.

Let's start our real-life story...

I integrated EBS 12.1 with OAM successfully, and I was able to link our EBS users to OID accounts using the EBS auto-link feature.
However, after authenticating our users, I ended up with an unexpected error just after the OA Home Page redirection.

The error that I encountered on the EBS Home Page was as follows;


Yes, the error I was getting was on a confirmation popup, and yes, it was in Turkish (actually, I later realized that it could not be changed, as it was statically written in Turkish).

The error I was getting could not be found anywhere (not in Oracle Support, nor anywhere else on the internet).
There were no errors in the Webgate, AccessGate, EBS oacore, OAM managed server or OID managed server logs.
I was stuck. No clues in the server logs, no problems reported anywhere...
At that point, I decided to get an HTTP trace on the client.
I downloaded and installed Fiddler (https://www.telerik.com/download/fiddler) and started tracing. I reproduced the error, and look what I found in Fiddler's trace file.


Well... I clicked on the page URL listed in Fiddler, then checked the TextView tab and saw the same error message written there: the error message that I was getting on the EBS Home Page...

The error message was written inside a script, and that script was clearly a custom one which had been added to the standard code.

The script was written to check window.name and raise an error accordingly.

The first thing that came to my mind was personalizations. Some developer must have added this script to the EBS login page earlier, and that script must not have been compatible with the OAM login.

To be sure, I disabled all the personalizations by setting the "Disable Self-Service Personal" profile to Yes and retried the login.
That was it! I could log in without any problem. I could even log out without any problems :)
At the end of the day, I forwarded this problematic personalization to the development team, as it needed to be modified.

Well...
You see what a little customization can do?
You see how a simple HTTP trace (or let's say HTTP web debugging) can save the day?
You see what being an Apps DBA requires? (I mean the ability to narrow down an issue, to choose the right tool at the right time and place, and to learn and use any tool when it comes to EBS diagnostics.)

That's it for now. See you in my next articles.

Friday, March 3, 2017

VNCR (Valid Node Checking for Registration), as an alternative for COST, CVE-2012-1675, a real life story.

Recently recommended VNCR (Valid Node Checking for Registration) for a customer RAC environment which was affected by the Oracle Security Alert CVE-2012-1675.

The vulnerability was identified as TNS listener poisoning, and Oracle's suggestion was to use Class of Secure Transport (COST) to restrict instance registration.

Reference: 
  • Using Class of Secure Transport (COST) to Restrict Instance Registration in Oracle RAC (Doc ID 1340831.1)
However, we wanted a quick solution, and at that moment I recommended using VNCR to restrict the nodes which can register with the RAC listeners (the local and SCAN listeners).

This way, the listeners can prevent remote instances and remote code from registering; thus, we can protect the system indirectly, to a certain level, without implementing COST.

References:
  • How to Enable VNCR on RAC Database to Register only Local Instances (Doc ID 1914282.1) 
  • Valid Node Checking For Registration (VNCR) (Doc ID 1600630.1)
The implementation of VNCR was simple. We just added the following lines to the listener.ora files. (In this RAC environment, both the SCAN and local listeners were using the same listener.ora files, which were located in the GRID Home, as recommended for RAC >= 11gR2.)

VALID_NODE_CHECKING_REGISTRATION_LISTENER=ON
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=ON
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN2=ON
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN3=ON
REGISTRATION_INVITED_NODES_LISTENER_SCAN1=(<node1_public_hostname>,<node2_public_hostname>)
REGISTRATION_INVITED_NODES_LISTENER_SCAN2=(<node1_public_hostname>,<node2_public_hostname>)
REGISTRATION_INVITED_NODES_LISTENER_SCAN3=(<node1_public_hostname>,<node2_public_hostname>)

Note that, in RAC, the remote (SCAN) listeners should accept registrations from all the RAC nodes, but the local listeners should accept registrations only from their local node.
So we didn't declare any invited nodes for the local listener, as we wanted the local listeners to accept registrations only from the local nodes.
(Setting VALID_NODE_CHECKING_REGISTRATION_LISTENER=ON is enough for that!)

After adding the lines (seen above) to the listener.ora files, we restarted the SCAN and local listeners, and that's it. (We could actually have just reloaded them, as shown below.)
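A sketch of the reload alternative; lsnrctl reload re-reads listener.ora without dropping established connections (the listener names follow this post's example):

[oracle@erm01 admin]$ lsnrctl reload LISTENER
[oracle@erm01 admin]$ lsnrctl reload LISTENER_SCAN2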

The following is proof that VNCR is working.

Here, I'm implementing VNCR on the remote (SCAN) listener, which is running on node 1. Note that I'm not adding node 2 to the invited nodes list. As a result, only node 1 can register with the SCAN listener, as you see below ->

[oracle@erm01 admin]$ lsnrctl status LISTENER_SCAN2

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 03-MAR-2017 08:38:38
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
------------------------
Alias LISTENER_SCAN2
Version TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date 03-MAR-2017 08:38:01
Uptime 0 days 0 hr. 0 min. 37 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=11.11.11.211)(PORT=1521)))
Services Summary...
Service "ERM" has 1 instance(s).
Instance "ERM1", status READY, has 1 handler(s) for this service...
Service "ermXDB" has 1 instance(s).
Instance "ERM1", status READY, has 1 handler(s) for this service...
The command completed successfully

Here, I set the invited nodes parameter to add node 2 to the invited nodes list, and now I see that the instance on node 2 is registered with LISTENER_SCAN2 as well ->

[oracle@erm01 admin]$ lsnrctl status LISTENER_SCAN2
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 03-MAR-2017 08:37:33
Copyright (c) 1991, 2013, Oracle.  All rights reserved.
 Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN2
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                03-MAR-2017 08:36:38
Uptime                    0 days 0 hr. 0 min. 54 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=11.11.11.211)(PORT=1521)))
Services Summary...
Service "ERM" has 2 instance(s).
  Instance "ERM1", status READY, has 1 handler(s) for this service...
  Instance "ERM2", status READY, has 1 handler(s) for this service...
Service "ermXDB" has 2 instance(s).
  Instance "ERM1", status READY, has 1 handler(s) for this service...
  Instance "ERM2", status READY, has 1 handler(s) for this service...
The command completed successfully

Well, this is the story of the day, guys :) I did this configuration just 2 hours ago, and here I am writing about it :) I hope you will find it useful.

Monday, February 27, 2017

EBS R12 -- error in OACORE, Login page cannot be displayed - java.lang.NoClassDefFoundError javax.crypto.SunJCE_b

Here is a quick problem & resolution style article from me. I encountered this problem in an EBS 12.1 instance.
I started the week with this ugly problem, and it actually required significant effort. It first looked like there was a problem with the database connection, then it made me think that it was caused by the JDK (JDK corruption); finally, it was solved by increasing the heap sizes (the PermSize, actually).
It took a lot of effort, because there wasn't any heap-size-related error recorded anywhere, in any log or output file.
The fact that "javax.crypto.SunJCE_b" is a standard class, together with the NoClassDefFoundError, made me think that the problem might be related to the PermSize.

PermSize controls the permanent space (PermGen), which is used for storing classes, methods, etc.
-XX:PermSize: initial value
-XX:MaxPermSize: max value


Anyway, I will keep it short this time. I need to write this down and continue working on it (I'm still working on the underlying reason).

PROBLEM: The login page cannot be displayed (Internet Explorer message: "This page cannot be displayed").
oacore, oafm, forms and everything else could be started with status 0, but the application's login page cannot be displayed. Apache is there and redirecting to oacore, but oacore could not do its job.

ERRORS IN OACORE (actually in LOG_HOME -> application.log):
Exception in thread "main" oracle.apps.fnd.common.AppsException: oracle.apps.fnd.common.PoolException: Exception creating new Poolable object.
at oracle.apps.fnd.profiles.Profiles.getProfileOption(Profiles.java:1509)
at oracle.apps.fnd.profiles.Profiles.getProfile(Profiles.java:362)

...
..
.

Caused by: java.lang.NoClassDefFoundError: Could not initialize class javax.crypto.SunJCE_b
at javax.crypto.Cipher.getInstance(DashoA13*..)
at oracle.security.o5logon.O5LoginClientHelper.decryptAES(Unknown Source)


SOLUTION: 

Increasing the heap sizes (Xms, Xmx and XX:PermSize) and restarting oacore.
(I increased the heap sizes directly in the opmn.xml located under $INST_TOP; it could be done via autoconfig as well, I mean by updating the context file and running autoconfig. See the sketch below.)
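A minimal sketch of the kind of change, from the oacore section of opmn.xml (the structure is abbreviated and the values are just examples, not recommendations; tune them for your environment):

<process-type id="oacore" module-id="OC4J" status="enabled">
  <module-data>
    <category id="start-parameters">
      <data id="java-options" value="... -Xms1024M -Xmx2048M -XX:PermSize=256M -XX:MaxPermSize=512M ..."/>
    </category>
  </module-data>
</process-type>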

Saturday, February 25, 2017

ODA X6-2 -- using ACFS snapshots for database backup and restore

ODA X6 uses ACFS as the filesystem for database files (ACFS is the default, but ASM can be used as well).
So, in this blog post, I will demonstrate an ACFS snapshot based database backup & restore operation.
Note that creating Oracle database files on ACFS is allowed in RDBMS 11.2.0.4 and later releases.
On ODA X6-2, we have 12C or 11.2.0.4 databases, so we can benefit from ACFS features such as the snapshot functionality.

Here I'm doing a demo ->


Checking the ACFS volumes:


ASMCMD> volinfo --all
Diskgroup Name: DATA

         Volume Name: COMMONSTORE
         Volume Device: /dev/asm/commonstore-74
         State: ENABLED
         Size (MB): 5120
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /opt/oracle/dcs/commonstore

         Volume Name: DATDEMO5
         Volume Device: /dev/asm/datdemo5-74
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u02/app/oracle/oradata/DEMO5

         Volume Name: DATDEMODB1
         Volume Device: /dev/asm/datdemodb1-74
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u02/app/oracle/oradata/DEMODB1

         Volume Name: DATDEMODB2
         Volume Device: /dev/asm/datdemodb2-74
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u02/app/oracle/oradata/DEMODB2

         Volume Name: DATDEMODB3
         Volume Device: /dev/asm/datdemodb3-74
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u02/app/oracle/oradata/DEMODB3

         Volume Name: DATDEMODB4
         Volume Device: /dev/asm/datdemodb4-74
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u02/app/oracle/oradata/DEMODB4

Diskgroup Name: RECO

         Volume Name: RECO
         Volume Device: /dev/asm/reco-296
         State: ENABLED
         Size (MB): 610304
         Resize Unit (MB): 64
         Redundancy: MIRROR
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u03/app/oracle/
  • I have lots of ACFS volumes, including the one created for the redo logs (RECO). I chose to work with the ACFS volume named DATDEMO5, which is used by the DEMO5 database. The datafiles of the DEMO5 database are located in DATDEMO5.
  • In order to test my ACFS snapshot based restore, I create a table and drop it afterwards. But, just before dropping the table, I create an ACFS snapshot, to be able to restore/roll back my database to a point in time before the table is dropped.


CREATING an ACFS SNAPSHOT:


[oracle@odademo ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 12.1.0.2.0 Production on Tue Feb 21 15:56:31 2017
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create table erman as select * from dual;

Table created.

SQL> select * from erman;

D
-
X
  • At this point, I create my ACFS snapshot. So my snapshot contains the table named erman.
[grid@odademo ~]$ acfsutil snap info /u02/app/oracle/oradata/DEMO5
    number of snapshots:  0
    snapshot space usage: 0  ( 0.00 )
[grid@odademo ~]$ acfsutil snap create ERMANSNAP  /u02/app/oracle/oradata/DEMO5
acfsutil snap create: Snapshot operation is complete.
[grid@odademo ~]$ acfsutil snap info /u02/app/oracle/oradata/DEMO5
snapshot name:               ERMANSNAP
snapshot location:           /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/ERMANSNAP
RO snapshot or RW snapshot:  RO
parent name:                 /u02/app/oracle/oradata/DEMO5
snapshot creation time:      Tue Feb 21 15:59:10 2017

    number of snapshots:  1
    snapshot space usage: 32768  (  32.00 KB )
  • After taking my snapshot, I drop the table named erman.
[oracle@odademo ~]$ . oraenv
ORACLE_SID = [oracle] ? DEMO5
The Oracle base has been set to /u01/app/oracle
[oracle@odademo ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 12.1.0.2.0 Production on Tue Feb 21 16:01:18 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> drop table erman;

Table dropped.

SQL> select * from erman;
select * from erman
              *
ERROR at line 1:
ORA-00942: table or view does not exist
[grid@odademo ~]$ . oraenv
ORACLE_SID = [+ASM1] ? +ASM1
The Oracle base has been set to /u01/app/grid
  • At this point, I have no table named erman in my DEMO5 database.

RESTORING THE DATABASE FROM THE SNAPSHOT:

  • As seen above, the table named erman is dropped and now I restore my database using the ACFS snapshot that I took before dropping the table.
  • I first create a read-write snapshot from the read-only snapshot that I took earlier.
[grid@odademo ~]$ acfsutil snap create -w -p ERMANSNAP RW_ERMANSNAP /u02/app/oracle/oradata/DEMO5
acfsutil snap create: Snapshot operation is complete.
[grid@odademo ~]$ acfsutil snap info /u02/app/oracle/oradata/DEMO5
snapshot name:               ERMANSNAP
snapshot location:           /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/ERMANSNAP
RO snapshot or RW snapshot:  RO
parent name:                 /u02/app/oracle/oradata/DEMO5
snapshot creation time:      Tue Feb 21 15:59:10 2017

snapshot name:               RW_ERMANSNAP
snapshot location:           /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/RW_ERMANSNAP
RO snapshot or RW snapshot:  RW
parent name:                 ERMANSNAP
snapshot creation time:      Tue Feb 21 16:02:33 2017

    number of snapshots:  2
    snapshot space usage: 475136  ( 464.00 KB )
  • Then, I relink the DEMO5 directory residing in my related ACFS volume to the relevant snapshot path.
[root@odademo DEMO5]# cd /u02/app/oracle/oradata/DEMO5/DEMO5
[root@odademo DEMO5]# mv datafile datafile_old
[root@odademo DEMO5]# ln -s /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/RW_ERMANSNAP/DEMO5/datafile /u02/app/oracle/oradata/DEMO5/DEMO5

[root@odademo DEMO5]# ls -al
total 40
drwxr-x--- 3 oracle asmadmin 8192 Feb 21 16:10 .
drwxrwxr-x 6 oracle oinstall 4096 Feb 21 16:08 ..
lrwxrwxrwx 1 root   root       66 Feb 21 16:10 datafile -> /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/RW_ERMANSNAP/DEMO5/datafile
drwxr-x--- 2 oracle asmadmin 8192 Feb 20 16:48 datafile_old

[root@odademo DEMO5]# ls -al datafile
lrwxrwxrwx 1 root root 66 Feb 21 16:10 datafile -> /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/RW_ERMANSNAP/DEMO5/datafile

[root@odademo DEMO5]# ls -al /u02/app/oracle/oradata/DEMO5/.ACFS/snaps/RW_ERMANSNAP/DEMO5/datafile
total 1773176
drwxr-x--- 2 oracle asmadmin      8192 Feb 20 16:48 .
drwxr-x--- 3 oracle asmadmin      8192 Feb 20 16:43 ..
-rw-r----- 1 oracle asmadmin 734011392 Feb 21 15:45 o1_mf_sysaux_dbowmlrw_.dbf
-rw-r----- 1 oracle asmadmin 754982912 Feb 21 15:52 o1_mf_system_dbownowh_.dbf
-rw-r----- 1 oracle asmadmin  72359936 Feb 20 22:48 o1_mf_temp_dbowpgm7_.tmp
-rw-r----- 1 oracle asmadmin 319823872 Feb 21 15:54 o1_mf_undotbs1_dbowoh07_.dbf
-rw-r----- 1 oracle asmadmin   5251072 Feb 20 16:48 o1_mf_users_dbowy2g2_.dbf
  • As seen in the ls command outputs above, the directory in which my database files are supposed to reside is now linked to the ACFS snapshot that I took earlier.
  • So, now I start my database to see if it can be started, and to see if the table that I dropped is available again.
[oracle@odademo ~]$ . oraenv
ORACLE_SID = [oracle] ? DEMO5
The Oracle base has been set to /u01/app/oracle
[oracle@odademo ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 12.1.0.2.0 Production on Tue Feb 21 16:15:21 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 4294967296 bytes
Fixed Size                  2932632 bytes
Variable Size             889192552 bytes
Database Buffers         3372220416 bytes
Redo Buffers               30621696 bytes
Database mounted.
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1:
'/u02/app/oracle/oradata/DEMO5/DEMO5/datafile/o1_mf_system_dbownowh_.dbf'
  • My database requires recovery!? Why? Because I only created an ACFS snapshot of the datafile volume; I didn't create an ACFS snapshot of the RECO volume, in which my redo logs reside.
  • Anyway, let's recover the database to show you that it can be opened and that media recovery is working :)
SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.
  • So, at this point, I have restored and recovered my database. I can say that an Oracle database can be restored from an ACFS snapshot.
SQL> select * from erman;
select * from erman
              *
ERROR at line 1:
ORA-00942: table or view does not exist
SQL> exit

  • However, the table named erman is not there. As explained earlier, this is expected, because we did not restore the redo log files. I mean, the redo logs were in a different ACFS volume, so our snapshot restore didn't restore them, and during the recovery our table was dropped again :) Interesting, isn't it?

CONCLUSION:


Anyway, this blog post proves not only 1 thing, but 2, actually 3 :) ->
1) The ACFS snapshot based restore mechanism works as expected.
2) Oracle's media recovery mechanism works as expected.
3) We can benefit from ACFS snapshot features even for databases running on ODA platforms.