Wednesday, July 29, 2020

EBS -- Integrated SOA Gateway / ISG - 'SOA Provider Not Accessible' After Autoconfig

We saw an interesting issue in an EBS 12.1.3 instance.
ISG was enabled, and suddenly custom web services started failing with 'SOA Provider not accessible' errors.

Actually, these errors were seen when the customer tried to regenerate/redeploy the problematic web services using the related EBS screens.

We knew that the APPS password had been changed in this environment recently.
As a post-action of this change, autoconfig had also been run.

Actually, they first changed the APPS password in a TEST environment. After changing it there, they checked the whole EBS system and didn't find any problems.

The web services could be used successfully. However, when they changed the APPS password in PROD, they encountered this issue.
In order to find the cause and fix the problem, we first checked the SOA configuration in PROD and wanted to ensure that it was configured properly.
We suspected that an issue with the ASADMIN password might have caused this. We also saw that the file that contains the ASADMIN password ($ORA_CONFIG_HOME/10.1.3/j2ee/oafm/config/system-jazn-data.xml) had been modified at almost the same time as the autoconfig run.

The value written in the credentials tag was encrypted, and we decided to reset the ASADMIN password by changing that value;

<credentials>!password</credentials>
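
For context, the relevant part of system-jazn-data.xml looks roughly like the fragment below. This is just an illustrative sketch (the surrounding elements may differ slightly in your file, and YourNewPassword is a placeholder); the point is that only the credentials value of the ASADMIN user is edited, and the leading '!' marks it as a cleartext value to be encrypted on the next startup.

<user>
  <name>ASADMIN</name>
  <credentials>!YourNewPassword</credentials>
</user>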

After this change, we restarted the application and the issue disappeared.
Note that when OAFM is started, the password gets encrypted.

At the end of the day, this showed us that running autoconfig in EBS 12.1.3 may change the ASADMIN password stored in the system-jazn-data.xml file. So in ISG-enabled EBS 12.1.3 environments, it is a good idea to back up that file before running autoconfig on the Apps Tier.
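
A minimal backup could be something like the following (a sketch; the path comes from the file mentioned above, and the backup naming is just an example):

cp $ORA_CONFIG_HOME/10.1.3/j2ee/oafm/config/system-jazn-data.xml $ORA_CONFIG_HOME/10.1.3/j2ee/oafm/config/system-jazn-data.xml.bak_$(date +%Y%m%d)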

As we didn't see the same problem in the TEST environment, we concluded that autoconfig changes the value of the ASADMIN credentials to welcome (in encrypted form). We think so because the ASADMIN password of the TEST environment was already set to welcome, and the issue didn't appear there, although the same autoconfig (Apps Tier) was run there as well.

So, in our case, our findings showed that if the ASADMIN password is set to welcome, we don't encounter this problem :)

This is the tip of the day. Stay tuned :)

Friday, July 24, 2020

Database Forensics - Incomplete/Interrupted Tablespace Drop operation / INTERNAL DDL / cause & effect

Being fast is an advantage. However, it always involves some risks.

Yes! It is priceless to complete the required work quickly. However, there is always a big risk when we try to be fast and try to save some time during important work. For instance, saving time by not doing the checks we have to do...

As I always say, as long as a mistake can be corrected, mistakes can always be made.
The real problem is making a mistake that cannot be corrected.
The same rule applies in the database world. Recoverable, interruptible and restartable operations are safer. On the other hand, we need to approach unrecoverable and uninterruptible operations more seriously.


After this important advice, I want to tell you an interesting story.
Don't worry :) We will not switch our context; as I mentioned in the title, this topic is in the database forensics field. Maybe that's why it sounds interesting to me..

Did you ever encounter a situation where some tables in the database were dropped automatically and internally by Oracle itself? Well, I did...

Did you ever see a situation where your datafiles are present but some of the tables in the relevant tablespace are missing? I saw this too :)

In this story, we will be dealing with some dropped tables. Some important ones..

Suppose the business users or some functional super users called you and told you that they are getting some errors in their application. Normally, what you would do is check the error and find a solution, right? What you would generally see is an exception or an error stack that tells you the reasons, and with those reasons you do some searches on the internet or on the vendor's support website and consider applying the recommended solutions.

However, what if you conclude that the problem is much more critical? What if the error you see in that error stack tells you that some of the tables are missing?
Consider a scenario where thousands of tables are dropped in a second and you realize it only some time later..

Well, in this kind of situation, you will have 2 important tasks.

The first one is that you will have to restore those tables and recover them.
This is logical, right? :) However, there are some challenges here.

For instance, you can't flash back a table if it was dropped with the PURGE option.
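
Just to make this concrete, here is a minimal sketch (the schema and table names are hypothetical). Flashback to before drop only works while the dropped table is still in the recycle bin, i.e. it was not dropped with PURGE:

select owner, object_name, original_name, droptime from dba_recyclebin where original_name = 'MY_TABLE';
flashback table my_schema.my_table to before drop;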

If the application continued to run even after some tables were dropped, then Flashback Database is not a good option either. I mean, Flashback Database will flash back the whole database, but then you will lose the changes made to the other tables (the ones that remained available) during the period between the time the tables were dropped and the time you realized it.

RMAN restore and database recovery are your options, but you will have to be fast. In addition, you should already have built a restore and recovery mechanism and a supporting infrastructure, according to your RPO and RTO, to be able to use RMAN restore and recovery in this kind of situation.

Using LogMiner is another option. You can flash back the database to just before the point in time the tables were dropped, and then mine the change operations that occurred after the drop and re-apply them as DML statements.

So no need to panic :) -- Of course if you are an expert in these things and know what you are doing.

The second task is finding the reason behind these dropped tables. Who or what dropped them? What kind of statement caused this to happen?

Well, in order to complete this task, you will check the following;

First, you will check the database audit records (if auditing is enabled).

If auditing is not enabled or not configured correctly, you will check the redo logs/archived logs generated around the time the tables were dropped. This way, at least you can see the statement that caused the issue, and if you are lucky, you can also see the database user that executed the statement.

If the operation was a SYS operation, I mean if it was done by connecting AS SYSDBA or AS SYSOPER, and if the AUDIT_SYS_OPERATIONS parameter was set to TRUE, you can also see the statement and the user that executed it by checking the OS audit files.
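
A couple of quick checks along these lines (a sketch, assuming classic auditing; run them from SQL*Plus):

select username, os_username, action_name, obj_name, timestamp
from dba_audit_trail
where action_name like 'DROP%'
order by timestamp;

show parameter audit_sys_operations
show parameter audit_file_dest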

There are other methods to find the cause, but let's focus on our story.

So we had some dropped tables, and we checked the audit trail, but it was empty (database auditing was disabled).
Then we checked the redo logs generated during that period using LogMiner and saw that those tables were dropped with DROP TABLE ... PURGE statements, and that these statements/operations were recorded as INTERNAL DDL operations in the redo. INTERNAL DDL means that the operation was done by Oracle itself. Weird, but it was clearly seen that way in v$logmnr_contents..
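
For those who want to try the same kind of analysis, here is a minimal LogMiner sketch (the archived log names are made up; add every log that covers the suspected period):

exec dbms_logmnr.add_logfile(logfilename => '/arch/1_1234_1000000001.dbf', options => dbms_logmnr.new);
exec dbms_logmnr.add_logfile(logfilename => '/arch/1_1235_1000000001.dbf', options => dbms_logmnr.addfile);
exec dbms_logmnr.start_logmnr(options => dbms_logmnr.dict_from_online_catalog);

select timestamp, operation, seg_owner, seg_name, info, sql_redo
from v$logmnr_contents
where operation = 'DDL' and lower(sql_redo) like 'drop table%';

exec dbms_logmnr.end_logmnr;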

The alert log was not giving us any clues, so we checked the SYS .aud files.
In one of the SYS .aud files, we saw a DROP TABLESPACE ... INCLUDING CONTENTS AND DATAFILES statement. The tablespace name used in this operation was the one the dropped tables belonged to. However, all the data files of that tablespace were still there, they were online, and the tablespace looked healthy.

Probably, there was an attempt to drop the tablespace, maybe a mistaken attempt. There was an attempt, but it was clear that it wasn't successful, because we had all the data files in place.

Still, we focused on that. We checked the internal mechanism of Oracle's tablespace drop and found that, when we execute a DROP TABLESPACE statement with the INCLUDING CONTENTS AND DATAFILES clause, first all the contents (tables etc.) are dropped. Oracle executes these drop commands internally (this was the cause of those INTERNAL DDL operations recorded in the redo logs).
Then the OS files (database files) are deleted. Lastly, we get the 'Tablespace dropped.' message.

So if we interrupt this drop tablespace process (for instance using CTRL+C), we may end up in an unstable situation where our datafiles are present but some of the tables in the relevant tablespace are missing.

Well, I think we have already come to the end of this story :) 

A drop tablespace operation was executed and then it was interrupted using CTRL+C, or it was aborted using some other method. So Oracle started dropping the tables belonging to that tablespace, and if it had been able to finish dropping them, it would have dropped all the datafiles and removed the tablespace from the configuration...
However, it was stopped and could only do a small part of its work. But still, even that small part caused great problems, as those dropped tables were critical..

What about the remaining tablespace and all those data files? As the content-dropping part of the drop tablespace operation did not complete, those data files and the tablespace itself were probably healthy. Still, we had to make sure that all the data files were online, that they didn't have any errors and that they didn't need any recovery.
A query which can be used for that:

select file#, substr(name, 1, 50), status, error, recover
from v$datafile_header
where TABLESPACE_NAME='tablespace_name';

Yes, we have come to the end of our article... As we have seen in our story, the faster we detect and realize, the better. This is true in almost every case, especially for problems that require restore and recovery. The later we find the problem (the later we realize it), the harder our work gets. Needless to say, we should not skip important checks just to save time.
So these are my last recommendations in this post.. I hope this helps :)

Sunday, July 19, 2020

The advantage of being an Apps DBA -- Quick conf. for DMZ Reverse Proxy Configuration for OBIEE & Analytics

As required by my position in recent years, my primary focus has been on Exadata, core databases, ODA, virtual machines, Hadoop clusters, OCI, GCP and Linux. However, those who know me may remember that I spent almost all my time with EBS and Middleware in the past.

In this blog post, I will show you a benefit of being an Apps DBA. I will explain this through an example and show you how easily an Apps DBA can implement advanced middleware and web tier configurations when there is a need, especially when a DBA is expected to do such configurations...

This is, of course, a benefit of dealing with EBS. As you may already know, EBS is one of the most complex software systems in the Oracle world. EBS contains many technologies, and while administering EBS environments, the nature of EBS automatically makes you an expert in these technologies.

This is a key benefit that every DBA should have, especially in these days when we consider cloud migrations. In my opinion, DBAs with apps, middleware, systems and network knowledge will have an advantage in cloud operations and management.

Anyways, recently an OBIEE system was required to be accessible from the internet. Not only OBIEE but also some other URLs hosted by the OBIEE node needed to be accessible from the internet. DMZ was the first thing that crossed my mind.

This was actually pure middleware admin, web server admin, and in some ways system/Linux admin work, but my team and I volunteered to do it.

We volunteered because we were already providing Managed Services for critical EBS environments, and we already knew what a DMZ is.

We knew how to implement a DMZ configuration for EBS. We knew Apache and Oracle HTTP Server. OBIEE was running on WebLogic, as EBS does :) We knew the required configuration. We also knew the recommended configuration, as we were living in the EBS world :)

Anyways we got the job, and as a trusted advisor I recommended a list of configurations to our customer.

As for reference and lookup, again the EBS documents were the key :)

Oracle E-Business Suite R12 Configuration in a DMZ (Doc ID 380490.1)
Oracle E-Business Suite Release 12.2 Configuration in a DMZ (Doc ID 1375670.1)

Of course, there are several configurations, and they depend on the security level you are targeting: reverse proxies, external tiers, internal tiers, data firewalls and so on.

However, in this case, we had to find a balance between security and ease of management.

Our purpose was to configure a DMZ for OBIEE in order to make it available for internet access.
OBIEE was working with full-path TLS (this is almost a must when you consider internet access).

We had 2 firewalls: one of them was in front of the DMZ and the other was located behind the DMZ. This is also the minimum firewall requirement in these types of cases.

The customer didn't want to have an extra node. The customer didn't want to increase the management effort, either.

So we decided to implement a reverse proxy configuration.

Note that a reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. You can think of it as a proxy which operates in the reverse direction :)
Do you remember the difference between forward cross-edition triggers and reverse cross-edition triggers, which we use in Online Patching in EBS 12.2? The logic is similar :) Joking :) It is not that..
Anyways..

After this long introduction, let's focus on the implementation and the configuration that was done for the reverse proxy.

A DMZ (demilitarized zone), as the name suggests, is an isolated zone in the network. It is a sacrificial area, and it is a common requirement in security audits.
Basically, we configure a zone between the external firewall and the internal firewall and use it as an extra layer for the applications which are required to be accessible from the public network.

We can use a DMZ configuration to give public or external network access to services such as an FTP server, a mail server or even EBS services.

When we configure it correctly and securely, we can make the required modules or services of our applications and solutions reachable from the outside network without compromising the security of our internal network.

In this context, we implemented a reverse proxy-only DMZ configuration for OBIEE and some other services. We implemented a reverse proxy-only configuration because of the reasons I mentioned before.

The following diagram shows the configuration that we implemented for this specific case:


We used Apache here. Of course, we could have used Oracle HTTP Server as well. The installation was simple. The configuration was also simple, but it was a little tricky :)

1. First of all, we installed httpd (Apache) on the server (Oracle Linux 7) which was to be used as the reverse proxy server in the DMZ.

yum -y install httpd

2. We disabled SELinux and the firewall. (Note that these are just the quick and easy steps to give you the idea. You can leave your firewall and SELinux running; however, if that's the case, you should configure your firewall and SELinux to permit the communication needed by this configuration -- see the illustrative alternative right after the commands below.)

vi /etc/sysconfig/selinux
SELINUX=disabled
systemctl stop firewalld
systemctl disable firewalld
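
If you prefer to keep firewalld and SELinux enabled, the following is an illustrative sketch of the equivalent permissive setup on Oracle Linux 7 (adjust the service/port to whatever your Apache actually listens on; the SELinux boolean lets httpd open outbound connections to the internal OBIEE node):

firewall-cmd --permanent --add-service=http
firewall-cmd --reload
setsebool -P httpd_can_network_connect 1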

3. On the reverse proxy server, we edited the httpd.conf file located in /etc/httpd/conf.

We configured our Apache, which was in the role of a reverse proxy and was responsible for forwarding all HTTP requests made to http://dmz_rev_proxy_host/analytics. Our reverse proxy had to forward those requests to the internal host -> http://internal_host_analytics:9402/analytics.

We used multiple Location tags for forwarding multiple paths and addresses to the corresponding internal URLs. The ProxyPass directive was the main proxy configuration directive; it was used to forward the requests to the server on the internal network.
The ProxyPassReverse directive was another important directive; it rewrites the redirect headers in the responses so that clients do not bypass the reverse proxy.
Of course, there are other parameters that need to be set according to your needs, but I will not dive that deep in this blog post.
Nevertheless, you can always write to me if you have any questions. (For your questions, please use my forum -> http://ermanarslan.blogspot.com/p/forum.html)

So, our goal was to forward the following DMZ node URL requests to the corresponding internal OBIEE node URLs ->

DMZ urls:

http://dmz_rev_proxy_host:9402/analytics
http://dmz_rev_proxy_host:9402/va
http://dmz_rev_proxy_host:9402/dv
http://dmz_rev_proxy_host:9402/bi-security-login

Internal OBIEE node urls:

http://internal_host_analytics:9402/analytics
http://internal_host_analytics:9402/va
http://internal_host_analytics:9402/dv
http://internal_host_analytics:9402/bi-security-login

Note that I am giving you some example URLs, and as you see, they are just HTTP URLs.
Again, ask me if you have any questions about reverse proxy configuration for SSL/TLS URLs.

Now that we know our URLs, let's look at the directives we added to httpd.conf to achieve our goal.

vi /etc/httpd/conf/httpd.conf

<VirtualHost *:80>
<Location "/analytics">
ProxyPass http://internal_host_analytics:9402/analytics
ProxyPassReverse http://internal_host_analytics:9402/analytics
Order allow,deny
Allow from all
</Location>

<Location "/va">
ProxyPass http://internal_host_analytics:9402/va
ProxyPassReverse http://internal_host_analytics:9402/va
Order allow,deny
Allow from all
</Location>

<Location "/dv">
ProxyPass http://internal_host_analytics:9402/dv 
ProxyPassReverse http://internal_host_analytics:9402/dv
Order allow,deny
Allow from all
</Location>

<Location "/bi-security-login">
ProxyPass http://internal_host_analytics:9402/bi-security-login 
ProxyPassReverse http://internal_host_analytics:9402/bi-security-login
Order allow,deny
Allow from all
</Location>
Redirect Permanent /bi-security-login http://internal_host_analytics:9402/bi-security-login
ProxyPreserveHost On
</VirtualHost>

Note that the Redirect Permanent directive provides a permanent redirection to the bi-security-login page. We needed it because the /va application was redirecting to http://internal_host_analytics:9402/bi-security-login; the address in the browser's address bar was changing to the internal server's URL, and we didn't want that.

4. After making our changes in httpd.conf, we restarted the httpd service.

systemctl stop httpd
systemctl start httpd

5. After these steps were completed, and of course after the customer completed the required firewall and network configurations for internet access to the DMZ node and port, we were able to access the analytics interfaces hosted by the analytics server in the internal network from outside the network, through the DMZ server.
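
A quick sanity check from a client that can reach the DMZ node could look like this (the host name is the placeholder used above; adjust the port to the one your Apache actually listens on):

curl -I http://dmz_rev_proxy_host/analytics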

As I finish my writing, I want to add 2 more things.

Be careful while dealing with reverse proxy configurations. If you change your URLs, I mean if you use a configuration that rewrites the URLs on the fly, the styles and other links that you use in your applications may fail to load. So your applications should be compatible with these types of reverse proxy configurations. If they are not, you may still find a solution by using some extra directives in httpd.conf -- of course, if you are lucky :). So in some cases, you may need to review and improve the code, I mean the web-related code. That's why you should always work with your apps teams while implementing this type of configuration.
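
If you hit that kind of problem and the application code cannot be changed, one option (provided the mod_proxy_html module is available and loaded on your platform) is to rewrite the links inside the HTML responses at the proxy. This is a rough, illustrative sketch, not part of the configuration we actually used:

<Location "/analytics">
ProxyPass http://internal_host_analytics:9402/analytics
ProxyPassReverse http://internal_host_analytics:9402/analytics
ProxyHTMLEnable On
ProxyHTMLURLMap http://internal_host_analytics:9402/analytics /analytics
</Location>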

Lastly, I have given you a basic reverse proxy-only configuration in this blog post. But of course, it is not that secure. Review the EBS notes that I shared with you earlier in this blog post and find the one that suits your needs. I recommend you tip the balance between security and ease of management, of course, in favor of security :)

Friday, July 10, 2020

Tips for Exadata POCs -- Exadata X8-2M , 19C Database Version, Oracle RAT and Data Masking

If you are planning to do a POC on Exadata these days, there are some things to keep in mind.
In this blog post, I will quickly shed some light on these things, and I think you might find them interesting.

As you may already know, the latest generation of Exadata is the 8th generation: X8-2 and X8-2M. X stands for Intel, the first number is the generation, and the second number is the number of sockets on the database servers. You know this story, right? :)

Anyways, in the 8th generation, we actually have 2 main models. One of them is Exadata X8-2, and the other one is X8-2M, which has a brighter future.

So if you decide to use X8-2M in your POC, you'd better do your POC with Oracle Database 19c.
This is because only 19.3 or greater has full support for the persistent memory (PMEM) commit accelerator. For instance, 18c will not use PMEM directly. It will send a message to CellSrv, which will find the data in PMEM, so PMEM acts as an extra level of cache. On the other hand, 19c can access PMEM directly via RDMA.
So I think you know what I mean. This is not a must, but in fact it is necessary to unleash the performance of Exadata X8-2M.

One other thing to keep in mind is related to Oracle Real Application Testing (RAT).
That is, in order to be able to use Database Replay, your source and target databases should be on the same version. In other words, after a captured workload is pre-processed, it can be replayed repeatedly on a replay system that is running the same version of Oracle Database.
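
Just to give an idea of the flow, here is a minimal Database Replay sketch (the capture/replay names and the directory objects are placeholders I made up; the wrc replay clients are started at the OS level before the replay):

-- On the source database:
exec dbms_workload_capture.start_capture(name => 'POC_CAPTURE', dir => 'CAPTURE_DIR');
-- run the workload, then:
exec dbms_workload_capture.finish_capture;

-- On the target database (same Oracle Database version), after copying the capture files:
exec dbms_workload_replay.process_capture(capture_dir => 'REPLAY_DIR');
exec dbms_workload_replay.initialize_replay(replay_name => 'POC_REPLAY', replay_dir => 'REPLAY_DIR');
exec dbms_workload_replay.prepare_replay;
-- start the wrc clients from the OS (wrc mode=replay ...), then:
exec dbms_workload_replay.start_replay;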

But what about Data Masking? Actually, I have never done that, but it can also be done on the fly while doing RAT.
Sensitive data in the database, SQL Tuning Sets and Database Replay workload capture files can be masked uniformly according to the business and regulatory requirements specified in the Data Masking definitions. So you can perform a capture from production and then replay it securely in a masked non-production environment.

That is it for now :) This one was a quick tip. Hope this helps..