Friday, August 9, 2019

Weblogic -- Disaster Recovery implementations

Whether or not the application code is delivered by Oracle, we see Weblogic in our implementation projects.

FMW products such as OAM, OID and SOA use it as a built-in application server, and other important Oracle applications like EBS make use of the enhanced capabilities of Weblogic Server in their application tier.

In addition to these packaged Oracle applications, we also see custom Java code running on Weblogic, and new projects being deployed on it.

Most of the applications residing on Weblogic and the apps tier also have a database layer for storing and querying data.

When it comes to deciding on the DR implementation, we easily settle on the database disaster recovery method, don't we?

For instance, we go straight to Data Guard if the databases in the source and target environments are both Oracle. Only if the database is not an Enterprise Edition Oracle Database (which means Data Guard cannot be used) do we think about alternative solutions.

So far so good.

However, building and deciding on a correct disaster recovery solution for the Weblogic/apps tier is usually a little bit complex for us, the DBAs.

In this post, I will shed some light on this subject by giving you the required method and prerequisites for it.

First of all, a Weblogic DR implementation basically comes down to replicating the Weblogic filesystem.

We can do it either by storage replication (like Netapp's Snapmirror) or by using a 3rd party tool (like rsync).

If we don't have a storage environment that is capable of replicating the Weblogic filesystem across storages, then we can still implement our DR using a tool like rsync.

The replication that feeds the DR environment should be an as-is replication.
Such a replication should be done automatically (for instance, using a scheduler like cron), and it is recommended to replicate the Weblogic filesystem at least once a day.
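
To make this a bit more concrete, here is a minimal sketch that builds the rsync command for an as-is copy and the crontab line that would run it once a day. The paths, the DR hostname and the 02:00 schedule are assumptions for illustration only, and so is the choice of the `-az --delete` options; adjust them to your own environment.

```python
import shlex

def build_rsync_command(source_dir, dr_host, target_dir, delete=True):
    """Return the rsync argument list for an as-is copy of source_dir to dr_host."""
    # -a keeps permissions, ownership, timestamps and symlinks; -z compresses in transit.
    command = ["rsync", "-az"]
    if delete:
        # --delete removes files on the DR side that no longer exist on the primary,
        # so the DR copy stays identical to the source.
        command.append("--delete")
    # A trailing slash on the source copies the directory contents, not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(f"{dr_host}:{target_dir.rstrip('/')}/")
    return command

def build_cron_line(command, hour=2, minute=0):
    """Return a crontab line that runs the command once a day at hour:minute."""
    return f"{minute} {hour} * * * {shlex.join(command)}"

# Hypothetical path and DR hostname, used only for this example.
cmd = build_rsync_command("/u01/oracle/middleware", "wls-dr.example.com", "/u01/oracle/middleware")
print(build_cron_line(cmd))
```

The same command list can be passed to `subprocess.run` when the replication has to be triggered by hand, for example right after patching.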

This replication can be done while the Weblogic application server is running.

Patching activities should be done on the primary first.

After a successful patching operation, the replication routine should be triggered manually to reflect the changes directly to the Weblogic DR filesystem. (Assuming the DB is already being replicated in the backend, a manual database synchronization can also be triggered after these patching activities.)

As for the switchover and failover operations:

If the hostname of the primary Weblogic Server and the hostname of the DR site Weblogic Server are the same, then we can start the services directly without doing anything extra in case of a failover or switchover. (Of course, we need to change the direction of the replication.)

However, if the hostname of the primary Weblogic Server and the hostname of the DR site Weblogic Server are different, then we need to configure a virtual hostname for this Weblogic environment. We need to configure it for both the Admin Server and the Managed Servers.

That is, the listen address of the Admin and Managed Servers should be based on the virtual hostname, and that virtual hostname must be resolvable from both the Primary and the DR site. Thus, even if the physical hostnames are different, we can still do a failover by just starting the services on the DR site, without having to do anything extra.
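
The "resolvable from both sites" prerequisite is easy to check up front. Below is a minimal sketch that asks the local resolver for each hostname; the idea is to run it on a host in the Primary site and on a host in the DR site with your own virtual hostnames. The demo call uses `localhost` only so that it runs anywhere.

```python
import socket

def resolve_hostname(hostname):
    """Return the IPv4 address the local resolver gives for hostname, or None."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        # The name is not known to DNS or /etc/hosts on this machine.
        return None

def check_listen_hosts(hostnames):
    """Map every hostname to its resolved address (None when unresolvable)."""
    return {name: resolve_hostname(name) for name in hostnames}

# Replace "localhost" with the virtual hostnames of the Admin and Managed Servers.
for name, address in check_listen_hosts(["localhost"]).items():
    print(f"{name}: {address or 'NOT RESOLVABLE'}")
```

Any hostname reported as not resolvable on either site has to be fixed (DNS or the hosts file) before relying on the virtual hostname for a failover.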

If we have different hostnames for the Primary and DR sites and we don't have a virtual hostname configured, we need to make some config changes in case of a failover or switchover (in files such as config.xml).
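
To give an idea of what such a config change looks like, here is a small sketch that swaps the hostname inside the `<listen-address>` elements of a config.xml-like text. The XML fragment and both hostnames are made-up examples, a real config.xml is much larger, and other files may reference the old hostname as well, so treat this as an illustration rather than a complete procedure. Always back up the domain configuration before touching it.

```python
import re

def replace_listen_address(config_text, old_host, new_host):
    """Return (new_text, count) with old_host replaced inside <listen-address> elements."""
    pattern = re.compile(
        r"(<listen-address>\s*)" + re.escape(old_host) + r"(\s*</listen-address>)"
    )
    # A function is used as the replacement so new_host is inserted literally.
    return pattern.subn(lambda m: m.group(1) + new_host + m.group(2), config_text)

# Simplified, hypothetical fragment in the style of a Weblogic config.xml.
sample = """<domain>
  <server>
    <name>AdminServer</name>
    <listen-address>prodwls01.example.com</listen-address>
  </server>
  <server>
    <name>ms1</name>
    <listen-address>prodwls01.example.com</listen-address>
  </server>
</domain>"""

updated, count = replace_listen_address(sample, "prodwls01.example.com", "drwls01.example.com")
print(f"{count} listen addresses updated")
print(updated)
```

A text-level replacement is used here on purpose: it leaves the rest of the file untouched, byte for byte. With a virtual hostname in place, none of this is needed.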

Of course, the same logical requirements apply to a database failover as well (if that database is used by a Weblogic Server).

When we switch over or fail over the database tier of a Weblogic installation, we need to change the database configuration of our Weblogic environment. (We have Data Sources, right?)

Again, if we use a virtual hostname for the db tier, or if we use a load balancer and configure our database and Weblogic to use the hostname managed by the load balancer, then we can do our database layer failover by just starting/activating the DR site database, without having to do anything extra.
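
When there is no virtual hostname or load balancer in front of the database, the Data Source change basically means repointing the JDBC URL to the DR database host. Here is a minimal sketch of that, assuming the URL is in the `jdbc:oracle:thin:@//host:port/service` form; the hostnames and the service name are hypothetical, and URLs written in other forms (for example a full connect descriptor) are not handled.

```python
import re

# Matches only the jdbc:oracle:thin:@//host:port/service form.
JDBC_URL_PATTERN = re.compile(r"^(jdbc:oracle:thin:@//)([^:/]+)(:\d+/.+)$")

def repoint_jdbc_url(url, new_host):
    """Return the JDBC URL with its host part replaced by new_host."""
    match = JDBC_URL_PATTERN.match(url)
    if match is None:
        raise ValueError(f"Unsupported JDBC URL format: {url}")
    return match.group(1) + new_host + match.group(3)

# Hypothetical primary and DR database hosts.
primary_url = "jdbc:oracle:thin:@//proddb01.example.com:1521/ORCL"
print(repoint_jdbc_url(primary_url, "drdb01.example.com"))
```

If the URL already contains a virtual hostname, it stays exactly the same after the failover, and that is the whole point of the virtual hostname approach.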

I hope you get the idea.

Lastly, here is the list of actions that can be taken to do a failover or a switchover operation on a recommended Weblogic-Database configuration:

To perform a failover or switchover from the production site to the standby site when you use rsync:

*Shut down any processes running on the production site (if applicable).
*Stop rsync jobs between the production site hosts and standby site peer hosts.
*Use Oracle Data Guard to fail over the production site databases to the standby site.
*On the standby site, manually start the processes for the Oracle Fusion Middleware Server instances.
*Route all user requests to the standby site by performing a global DNS push or something similar, such as updating the global load balancer.
*Use a browser client to perform post-failover or post-switchover testing to confirm that requests are being resolved at the standby site (current production site).
*At this point, the standby site is the new production site and the production site is the new standby site.
*Reestablish rsync between the two sites, but configure it so that replication now goes in the opposite direction (from the current production site to the current standby site).
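
The ordering of these actions matters, so it can help to keep them as a small runbook script. The sketch below only models the order and the stop-on-first-failure behaviour; every action is a hypothetical placeholder that you would replace with your own commands (stop scripts, Data Guard commands, start scripts and so on).

```python
def run_failover(steps):
    """Run (description, action) pairs in order; stop at the first step that fails."""
    completed = []
    for description, action in steps:
        print(f"-> {description}")
        if not action():
            print(f"FAILED: {description}")
            break
        completed.append(description)
    return completed

def placeholder():
    # Hypothetical stand-in for the real command of a step; always reports success.
    return True

FAILOVER_STEPS = [
    ("Shut down processes on the production site", placeholder),
    ("Stop rsync jobs between production and standby hosts", placeholder),
    ("Fail over the databases with Oracle Data Guard", placeholder),
    ("Start the Fusion Middleware server instances on the standby site", placeholder),
    ("Route user requests to the standby site (DNS or global load balancer)", placeholder),
    ("Run post-failover tests from a browser client", placeholder),
    ("Reestablish rsync in the opposite direction", placeholder),
]

done = run_failover(FAILOVER_STEPS)
print(f"{len(done)} of {len(FAILOVER_STEPS)} steps completed")
```

Stopping at the first failed step is a deliberate choice here: starting the middleware before the database failover has finished, for instance, would only add confusion.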

