Wednesday, April 26, 2017

Linux -- using screen for uninterrupted command operations (especially when you don't have VNC)

This is actually a very old topic. Linux folks in particular will not pay much attention to it, because they have probably known it for a long time. In the Oracle DBA world, however, it is a valuable topic.
We always use VNC for uninterrupted operations (and sometimes nohup as well).
That is what we were told to do. Especially for critical and long-running OS operations (including copy, move, manual database upgrades and RMAN operations), we use VNC most of the time and get an X session, which is not interrupted even if we lose the connection to our servers.

The "screen" command, on the other hand, can be thought of as an alternative to VNC connections in a way.
Of course, screen cannot give us an X session (GUI), as it is not designed for that, but it lets us run uninterrupted command operations (like VNC does).
Besides, screen lets us continue with our shell/terminal even after we disconnect from the server. That is, even after disconnecting and reconnecting, we can still see all the shell output that has been produced since we started working in our screen terminal.

The following excerpt is from the man page of the screen program:

Screen is a full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells).
When screen is called, it creates a single window with a shell in it (or the specified command) and then gets out of your way so that you can use the program as you normally would.
Programs continue to run when their window is currently not visible and even when the whole screen session is detached from the user's terminal.

Well... screen is easy to install and use. (On Oracle Linux, screen can be installed using the "yum install screen" command.)

Here is a demo for you ->

We connect to our server and execute the screen command as follows;
When we execute the screen command, a screen session/terminal gets created as shown below;
(I printed the pids of the shells using echo $$ so that you can see that the terminal's pid changes when we execute the screen command.)


In the screen session, we first execute the ls -al command and see its output. Then we execute our next command, which is "du -sh /". While the command is running, we close our terminal (by closing our terminal program, to mimic an unexpected disconnection).



After we close our terminal program (an SSH client in my case), we log in to our server again and check our ongoing screen sessions using the "screen -ls" command, as shown below;



Once we identify our screen session by its id (its pid actually, which is 3729 in this case), we use the "screen -r" command to reattach to our ongoing screen session, as depicted below;


After we execute the screen -r command, we are attached to our screen terminal again and see that our du -sh operation is still running. (We even see the output of the ls -al command that we executed earlier.)


If we decide to kill our screen terminal, we can always use the exit command at the screen terminal prompt. Alternatively, we can use the "screen -S <pid> -X kill" command to kill our screen terminal without even attaching to it.
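
The whole demo can also be driven without an interactive start. Here is a minimal sketch, assuming screen is installed; the function names and the session name "longjob" are hypothetical and only for illustration. It uses "screen -dmS" to start a named session that is already detached, and "-X quit" to end the whole session (as opposed to "-X kill", which destroys the current window).

```shell
# Sketch only: helper names and the session name "longjob" are hypothetical.

# Start a named screen session in detached mode and run a command in it.
start_session() {
    name=$1
    shift
    screen -dmS "$name" "$@"
}

# List the screen sessions of the current user (same as in the demo).
list_sessions() {
    screen -ls
}

# End a whole session by name without attaching to it.
stop_session() {
    screen -S "$1" -X quit
}

# Example usage (not executed here):
#   start_session longjob sh -c 'du -sh /; exec sh'
#   list_sessions
#   screen -r longjob      # reattach; detach again with Ctrl-a d
#   stop_session longjob
```

Giving the session a name with -S means we do not have to look up the pid with "screen -ls" before reattaching.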


The last thing I want to mention in this blog post is the difference between the nohup command and the screen command.
That is, screen is not only used for daemonizing a process. It works more like a terminal window manager. For instance, we can disconnect from the screen terminal while our command is running and then reconnect to that terminal in case our command requires input from us. We can also reconnect to the terminal and check the command output that was produced while we were not connected to the server at all, and so on. So screen and nohup are not the same thing.
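
For comparison, here is a minimal sketch of the nohup way of doing it. The helper name run_detached and the log path in the usage line are hypothetical. The command survives the disconnect, but there is no terminal to come back to: all we can do afterwards is read the log file, and the command cannot ask us for input.

```shell
# Sketch only: run_detached is a hypothetical helper.
# Run a command immune to hangups, send all of its output to a log file
# and print the pid of the background process.
run_detached() {
    log=$1
    shift
    nohup "$@" > "$log" 2>&1 &
    echo $!
}

# Example usage (not executed here):
#   pid=$(run_detached /tmp/du.log du -sh /)
#   tail -f /tmp/du.log
```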

Tuesday, April 25, 2017

Top 60 Oracle Blogs And Websites for Oracle DBAs ( Erman Arslan's Oracle Blog is on the list!)

I'm proud to be on the TOP-60 Oracle Blogs and Websites list.
I have worked hard on this blog from the first day, and now it is a great pleasure to be on the list with the big names.
I would like to thank you, my followers and readers... I would also like to thank you, my forum users.
Your interest, comments and feedback have been my biggest motivation, both for writing this stuff and for all the research and lab work that I have done to supply unique content in this blog.

Thursday, April 20, 2017

EBS 12.2 -- Blank page problem viewing concurrent out and logs, Hostname with Capital Letters, FNDWRR & node_info.txt

We recently encountered a problem on a new EBS implementation.
The problem appeared when the customer wanted to see the outputs and logs of the concurrent programs.
When they tried to open a concurrent log in the browser, a blank page was displayed.

In order to solve this issue, we did almost all the diagnostics, such as;
  • Enabling sqlnet trace in the apps tier listener.ora
  • Running "FS Diagnostic Test Program" with the parameter MODE=FULL 
The only error that we saw after running FS Diag Test Program was the following;

-- FNDFS did not return an error code 
-- FNDWRR did not create a debug logfile. 
-- FNDFS did not create a debug logfile. 
-- Displaying first 25 lines returned by FNDWRR.exe: 
-- BEGIN FILE ------------------------------------- 
-- END FILE --------------------------------------- 
-- ERROR: Unable to transfer file successfully! 
  • Setting the profile "Viewer:Text" to have no value (removing 'browser') and retesting viewing a log file. This was to bypass the FNDWRR code path and pass the FNDFS output directly to the built-in viewer.
This was almost working. We could display the text output, but we could not display any other output formats such as pdf or xml.
  • We did the tests documented in "How to Troubleshoot FNDFS and FNDWRR (Problems Viewing Concurrent Request Output) (Doc ID 847844.1)". FNDWRR could display output from the OS successfully.
  • Checked the Apache logs (access_log.*, error_log.*, etc.) for CGI entries (FNDWRR is a CGI program). We could only see HTTP 200 there: "GET /OA_CGI/FNDWRR.exe?temp_id=796020226 HTTP/1.1" 200 -
  • Checked the database alert log (just in case)
  • Generated a Fiddler trace and reviewed it. No errors.
  • Applied "Patch 17075726:R12.FND.C - SUPPORT CONCURRENT MANAGER ON A SERVER CONFIGURED TO USE VIRTUAL HOSTING", but it didn't fix the issue.
  • We disabled all the anti-virus and proxy checks on the client side, but the issue remained.
  • We created an SR (Oracle Support Service Request), but it didn't help much.
  • Applied the browser configurations below, but they didn't help at all.

  • Set up Oracle E-Business Suite to run through the 'Trusted Sites' zone with a 'Medium' Security Setting.
    Tools -> Pop-up Blocker -> Pop-up Blocker Settings -> Allowed Sites
    Tools -> Internet Options -> Privacy -> Pop-up Blocker: Settings -> Allowed sites
    Selected (Checked) the following values:
    Tools -> Internet Options -> Security -> <zone> (e.g. Trusted Sites) -> Enable Protected Mode
    Tools -> Internet Options -> Advanced -> Enable 64-bit processes for Enhanced Protection Mode

THE CAUSE AND THE SOLUTION:

Well.. After all this diagnostic work, guess what we realized?

"The server name was written in capital-letters."  

Yes, the issue was caused by that.. 

The hostname of the EBS servers should not be written in capital letters. 

This is not supported, and this issue is one of the reasons for that lack of support. (Note that this issue is not documented; that's why I'm writing about it :) )

Autoconfig and postclone are also affected by this problem, but there are workarounds for them.

However, you can't get rid of the FNDWRR (viewing concurrent log and out) problems if you have a hostname with capital letters.
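
A quick check for this condition could look like the sketch below. The function name has_uppercase is hypothetical, and using uname -n to read the node name is my assumption; check whatever name your EBS configuration actually uses.

```shell
# Sketch only: has_uppercase is a hypothetical helper.
# Return success (0) when the given name contains an upper-case letter.
has_uppercase() {
    case $1 in
        *[[:upper:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn if the node name of this server contains capital letters.
if has_uppercase "$(uname -n)"; then
    echo "WARNING: the hostname contains capital letters"
fi
```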

As for the solution, we updated $EBS_APPS_DEPLOYMENT_DIR/oacore/APP-INF/node_info.txt
and modified the lines that included the hostname.

We changed all hostnames to lower-case letters, and the issue was resolved without even restarting anything.
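
We did the change by hand, but the same edit could be scripted roughly as follows. This is only a sketch: the function name lowercase_host_in_file and the host name in the usage line are hypothetical, and it makes no assumption about the layout of node_info.txt. It simply replaces one exact host name string with its lower-case form and keeps a timestamped backup of the file.

```shell
# Sketch only: lowercase_host_in_file is a hypothetical helper.
# Replace every occurrence of the given host name in a file with its
# lower-case form, keeping a timestamped backup next to the file.
lowercase_host_in_file() {
    file=$1
    host=$2
    lower=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
    # Escape the dots so that sed treats them literally.
    pattern=$(printf '%s' "$host" | sed 's/\./\\./g')
    cp -p "$file" "$file.bak.$(date +%Y%m%d%H%M%S)" || return 1
    # Write through a temporary file, then copy the content back so that
    # the original file keeps its owner and permissions.
    sed "s/$pattern/$lower/g" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&
        rm -f "$file.tmp"
}

# Example usage (not executed here; the host name is made up):
#   lowercase_host_in_file \
#       "$EBS_APPS_DEPLOYMENT_DIR/oacore/APP-INF/node_info.txt" \
#       "EBSAPP01.EXAMPLE.COM"
```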

Note:

txkFNDWRR.pl reads node_info.txt and sets the environment variables accordingly. Using the $OA_HTML environment variable, txkFNDWRR.pl executes FNDWRR, which in turn displays the concurrent log or output in the browser.

I wrote this blog post to list the diagnostic actions that need to be taken while diagnosing such a problem, and to show you the trouble that we may find ourselves in if we have an unsupported configuration.

Tuesday, April 18, 2017

GRC -- GRC & EBS implementation //errors and solutions

We recently installed GRC 8.6.6, PCG 7.3.3 and PEA 8.6.6 using the installation guides delivered with the products, in order to implement GRC with an EBS 12.1.3 instance.

The main document to follow for this type of installation was Enterprise Governance Risk and Compliance (EGRC) Product Information (Doc ID 1084596.1), and basically what we did was the following;
  • Installing an 11g Oracle Database for GRC
  • Installing a Weblogic 12C for GRC
  • Installing ADR 12C on Weblogic
  • Creating the database schemas for GRC using the RCU utility
  • Creating a Weblogic Domain for GRC
  • Deploying GRC application using Weblogic Console
  • Upgrading GRC
  • Installing PCG on EBS
  • Installing critical PCG Patches
  • Installing PAE on EBS
  • Fixing any errors reported by the functional team
This is not something that is done frequently in our customer environments, so maybe that's why we encountered some errors and spent some time finding their solutions, which I find useful to share with you.

We basically encountered 3 errors. Two of them were caused by the complexity of the documentation, and one of them was directly related to corrupted data.

Let's see what those errors and their solutions are;

ERROR 1 - unable to synchronize access in grc 8.6.6

The connection test that we did for the datasource that we created on GRC was successful, but the synchronize access job (in the GRC application) failed with the following error. (The error was reported in grc.log.)

ERROR [ExecutorThread-11] DataSourceService:1343 Error while setting ETL completed
java.lang.RuntimeException: Failed to serialize the object:
Descriptor Exceptions:
---------------------------------------------------------
Exception [EclipseLink-59] (Eclipse Persistence Services - 2.3.1.v20111018-r10243): org.eclipse.persistence.exceptions.DescriptorException
Exception Description: The instance variable [thingSavedStates] is not defined in the domain class [oracle.apps.grc.domain.datasource.SourceSyncState$SourceSyncStateBuilder], or it is not accessible.
Internal Exception: java.lang.NoSuchFieldException: thingSavedStates
Mapping: org.eclipse.persistence.oxm.mappings.XMLCompositeCollectionMapping[thingSavedStates]
Descriptor: XMLDescriptor(oracle.apps.grc.domain.datasource.SourceSyncState$SourceSyncStateBuilder --> [])
Runtime Exceptions:
---------------------------------------------------------
at oracle.apps.grc.domain.datasource.SourceSyncState.toXML(SourceSyncState.java:135)
at oracle.apps.grc.dataservices.dao.impl.spring.datasource.DataSourceDaoSpr.updateSyncState(DataSourceDaoSpr.java:1249)
at oracle.apps.grc.dataservices.dao.impl.spring.datasource.DataSourceDaoSpr.setEtlCompleted(DataSourceDaoSpr.java:1946) 


Solution:

USER_MEM_ARGS should be updated correctly in setDomainEnv.sh.
As documented in:
http://docs.oracle.com/cd/E51797_01/doc.8651/e52268.pdf
page : 2-14

Action plan:

1. Stop application Server
2. Backup setDomainEnv.sh file
3. Update the USER_MEM_ARGS parameter in the setDomainEnv.sh file. The modification should be made between the comment and the if statement.

# IF USER_MEM_ARGS the environment variable is set, use it to override ALL MEM_ARGS values 
<<<<<Changes should come here>>>>>>>
case "${SERVER_NAME}" in "AdminServer")
USER_MEM_ARGS="-Xms2048M -Xmx2048M" ;;
...
....
USER_MEM_ARGS="${USER_MEM_ARGS} -XX:PermSize=256m -XX:MaxPermSize=512m -XX:ReservedCodeCacheSize=128M -Djava.awt.headless=true -Djbo.ampool.maxpoolsize=600000 -Dfile.encoding=UTF-8 -Djavax.xml.bind.context.factory=com.sun.xml.internal.bind.v2.ContextFactory" 
 <<<<<Changes should come here>>>>>>>
if [ "${USER_MEM_ARGS}" != "" ] ; then

4. Ensure that the eclipselink-2.3.1.jar file exists in the following location: <MW_HOME>/grc866/grc/WEB-INF/lib/

5. Start the application server

6. Retest the issue
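
One thing to watch for when copying JVM options out of a PDF is that a hyphen can arrive as a typographic dash, which the JVM does not accept as an option prefix. Here is a minimal sketch for spotting such characters; the function name find_non_ascii is hypothetical, and the check simply reports every line that contains a byte outside printable ASCII.

```shell
# Sketch only: find_non_ascii is a hypothetical helper.
# Print (with line numbers) the lines that contain bytes outside printable
# ASCII, such as an en dash pasted in place of a hyphen.
# Exit status is 0 when such lines exist and 1 when the file is clean.
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}

# Example usage (not executed here):
#   find_non_ascii setDomainEnv.sh && echo "fix the lines listed above"
```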

ERROR 2 - unable to synchronize access in grc 8.6.6

The connection test that we did for the datasource that we created on GRC was successful, but the synchronize access job (in the GRC application) failed with the following error. (The error was reported in grc.log.)

DEBUG [EtlExtractor-1254779240] GrcLogPrintStream:73 STDOUT (oracle.core.ojdl.logging.ConsoleHandler:118) <Apr 6, 2017 11:10:11 AM EEST> <Error> <Default> <ODI-1217> <Session TCG_SCEN_Users_21 (85154) fails with return code 7000.
ODI-1226: Step TCG_INTR_Users_21 fails after 1 attempt(s).
Caused by: ODI-1240: Flow TCG_INTR_Users_21 fails while performing a Integration operation. This flow loads target table Users.
Caused by: org.apache.bsf.BSFException: exception from Jython:
Traceback (most recent call last):
File "<string>", line 50, in <module>


Solution:

PRE_CLASSPATH was not properly set in setDomainEnv.sh.

As documented in http://docs.oracle.com/cd/E51797_01/doc.8651/e52268.pdf (page 2-2), PRE_CLASSPATH should be updated properly.

Action Plan:

1. Locate the following lines in the file: 

if [ "${PRE_CLASSPATH}" != "" ] ; then
	CLASSPATH="${PRE_CLASSPATH}${CLASSPATHSEP}${CLASSPATH}"
	export CLASSPATH
fi

2. Add the following before the above lines: 

PRE_CLASSPATH="/grc865/grc/WEB-INF/lib/jython-2.5.1.jar:${PRE_CLASSPATH}"
export PRE_CLASSPATH

Note: Prefix this path with the actual path to your middleware home.
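
Since a wrong path here fails silently until the job runs, it can be worth checking that every entry of the class path really exists. A minimal sketch, with a hypothetical function name check_classpath:

```shell
# Sketch only: check_classpath is a hypothetical helper.
# Print each entry of a colon-separated class path that does not exist on
# disk. Returns 1 when at least one entry is missing, 0 otherwise.
check_classpath() {
    missing=0
    old_ifs=$IFS
    IFS=:
    set -f    # do not expand wildcards while splitting the entries
    for entry in $1; do
        [ -z "$entry" ] && continue
        if [ ! -e "$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    set +f
    IFS=$old_ifs
    return $missing
}

# Example usage (not executed here):
#   check_classpath "$PRE_CLASSPATH" || echo "fix PRE_CLASSPATH first"
```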

ERROR 3 - unable to synchronize access in grc 8.6.6 

The connection test that we did for the datasource that we created on GRC was successful, but the synchronize access job (in the GRC application) failed with the following error. (The error was reported in grc.log.)

DEBUG [EtlExtractor-976389891] DataSourceDaoSpr:1234 updateSyncState(oracle.apps.grc.domain.datasource.SourceSyncState@22489656) 
ERROR [EtlExtractor-976389891] AvailableResource:173 myBlocks null 
ERROR [EtlExtractor-976389891] LocalEtlTcgWriter:412 A problem occurred in LocalEtlTcgWriter.writeData: 
java.lang.NullPointerException: myBlocks null 
at oracle.apps.odin.reasonerio.file.page.AvailableResource.askBlock(AvailableResource.java:174) 
at oracle.apps.odin.reasonerio.file.page.PageManager.createNewPage(PageManager.java:92) 
at oracle.apps.odin.reasonerio.file.page.PagingGrccChannelManager.expandChannel(PagingGrccChannelManager.java:498) 
at oracle.apps.odin.reasonerio.file.page.PageChannelController.<init>(PageChannelController.java:79) 
at oracle.apps.odin.reasonerio.file.page.PagingGrccChannelManager.createChannel(PagingGrccChannelManager.java:456) 
at oracle.apps.grc.reasonerio.graph.blockbytype.writer.BBTGraphWriter.getAttributeWriter(BBTGraphWriter.java:148) 
at oracle.apps.grc.appservices.connector.LocalEtlTcgWriter.writeNode(LocalEtlTcgWriter.java:669) 
at oracle.apps.grc.appservices.connector.LocalEtlTcgWriter.persistResults(LocalEtlTcgWriter.java:395) 
at oracle.apps.grc.appservices.connector.LocalEtlExtractor.persistResults(LocalEtlExtractor.java:540) 
at oracle.apps.grc.appservices.connector.LocalEtlExtractor.retrievePersistResultsLocal(LocalEtlExtractor.java:349) 
at oracle.apps.grc.appservices.connector.LocalEtlExtractor.retrievePersistResults(LocalEtlExtractor.java:239) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at org.python.core.PyReflectedFunction.__call__(PyReflectedFunction.java:175) 
at org.python.core.PyObject.__call__(PyObject.java:355) 
at org.python.core.PyMethod.__call__(PyMethod.java:215) 
at org.python.core.PyMethod.instancemethod___call__(PyMethod.java:221) 
at org.python.core.PyMethod.__call__(PyMethod.java:206) 
at org.python.core.PyObject.__call__(PyObject.java:381) 
at org.python.core.PyObject.__call__(PyObject.java:385) 
at org.python.pycode._pyx0.f$0(<string>:51) 
at org.python.pycode._pyx0.call_function(<string>) 
at org.python.core.PyTableCode.call(PyTableCode.java:165) 
at org.python.core.PyCode.call(PyCode.java:18) 
at org.python.core.Py.runCode(Py.java:1204) 
at org.python.core.Py.exec(Py.java:1248) 
at org.python.util.PythonInterpreter.exec(PythonInterpreter.java:172) 
at org.apache.bsf.engines.jython.JythonEngine.exec(JythonEngine.java:144) 
at com.sunopsis.dwg.codeinterpretor.SnpScriptingInterpretor.execInBSFEngine(SnpScriptingInterpretor.java:322) 
at com.sunopsis.dwg.codeinterpretor.SnpScriptingInterpretor.exec(SnpScriptingInterpretor.java:170) 
at com.sunopsis.dwg.dbobj.SnpSessTaskSql.scripting(SnpSessTaskSql.java:2472) 
at oracle.odi.runtime.agent.execution.cmd.ScriptingExecutor.execute(ScriptingExecutor.java:47) 
at oracle.odi.runtime.agent.execution.cmd.ScriptingExecutor.execute(ScriptingExecutor.java:1) 
at oracle.odi.runtime.agent.execution.TaskExecutionHandler.handleTask(TaskExecutionHandler.java:50) 
at com.sunopsis.dwg.dbobj.SnpSessTaskSql.processTask(SnpSessTaskSql.java:2913) 
at com.sunopsis.dwg.dbobj.SnpSessTaskSql.treatTask(SnpSessTaskSql.java:2625) 
at com.sunopsis.dwg.dbobj.SnpSessStep.treatAttachedTasks(SnpSessStep.java:577) 
at com.sunopsis.dwg.dbobj.SnpSessStep.treatSessStep(SnpSessStep.java:468) 
at com.sunopsis.dwg.dbobj.SnpSession.treatSession(SnpSession.java:2128) 
at com.sunopsis.dwg.dbobj.SnpSession.treatSession(SnpSession.java:1930)

Solution:

While working to solve the issues (ERROR 1 and ERROR 2), we ran the synchronize access job again and again. After solving those issues, we realized that those failed attempts had corrupted the ETL data.

Action Plan:

1. Take a backup of the GRC environment: database, WLS filesystem, ETL repository filesystem and GRC Reports filesystem
Note: your ETL Repository and GRC Reports locations can be found in the GRC application 
(Navigator > Setup and Administration > Manage Application Configuration) 

Example ETL repository: /Oracle/EGRCC/grcc_etl 
Example Report location: /Oracle/EGRCC/grcc_rep 

2. Stop GRC application server 

3. Kill any pending java processes (if they are still active)

4. Clear cache

cd $MW_HOME/user_projects/domains//bin 
rm -rf ..../servers/AdminServer/tmp/* 
rm -rf ..../servers/AdminServer/logs/* 
rm -rf ..../servers/AdminServer/cache/* 

5. Stop and start GRC Database 

6. Drop the SNP_* tables (ODI repo) 
--Drop all the tables whose names start with SNP_.
Check that the following query returns a count of 0 after dropping the tables;
select count(*) from all_tables where table_name like 'SNP\_%' escape '\';

7. Delete the "temp.repository" and "raw" directories and their subdirectories under home/grc_etl (Example ETL repository: /Oracle/EGRCC/grcc_etl)
--DO NOT touch "persistence"
--The application should be shut down before deleting these directories.

9. Start the GRC application server 

10. Run Access and Transaction synchronization as the first step after the GRC application is started.
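
Step 7 is the one where a typo hurts the most, so here is a minimal sketch of it. The function name clean_etl_repository is hypothetical. It removes only the two named directories under the ETL repository that you pass in, and never touches "persistence". Take the backups from step 1 and stop the application before running anything like this.

```shell
# Sketch only: clean_etl_repository is a hypothetical helper.
# Remove only the "temp.repository" and "raw" directories under the given
# ETL repository directory and leave everything else (including
# "persistence") untouched.
clean_etl_repository() {
    etl_dir=$1
    if [ -z "$etl_dir" ] || [ ! -d "$etl_dir" ]; then
        echo "usage: clean_etl_repository <existing ETL repository dir>" >&2
        return 1
    fi
    for sub in temp.repository raw; do
        if [ -d "$etl_dir/$sub" ]; then
            rm -rf -- "$etl_dir/$sub"
            echo "removed $etl_dir/$sub"
        fi
    done
}

# Example usage (not executed here), with the example path from step 1:
#   clean_etl_repository /Oracle/EGRCC/grcc_etl
```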