Friday, July 9, 2021

Recoverability Roadmaps & Remediation Options - Oracle, Systems, Apps Technology, Virtualization and Engineered Systems

In this post, I want to share my thoughts on Recoverability, or more precisely the approach I use in Recoverability Assessments. These assessments are comprehensive; they even cover DR solutions, training, recovery processes and continuous availability.

I start with readiness, in 3 different areas: People, Process and Technology. I review and rank the readiness of the key areas that enable availability, resiliency and recoverability by assessing the customer's current IT capabilities.

Once I generate the readiness documents, I do my analysis, determine the gaps and then present my recommendations. I support the customer in execution as well (if they need me there).

So it is pretty straightforward, but it still requires a lot of effort :)

The assessment starts with information gathering. I gather detailed information and analyze a number of attributes in the following areas:

Operational Staff, Response Plans, Recovery Testing, Program Maintenance, Business Expectations, Production & DR Facilities, Application Infrastructure, Data Restoration and Recovery Network.

During this first phase, we usually meet with the customer. I write down the people, process and technology findings. Then, we populate tool-based discovery reports (DB, server and SAN health checks, server grabs & logs, etc.).

In the second phase, I create a recommendation list. Next, I build the remediation roadmap, finalize the recoverability assessment document and lastly give the final recoverability assessment presentation (an executive presentation, actually).

While analyzing the people and process findings, I check whether there are any gaps in the following areas: Business Expectations, Production & DR Facilities, Application Infrastructure, Data Restoration, Recovery Network, Operational Staff, Response Plans, Recovery Testing and Program Maintenance.

Following is an example of the gaps that may be found in the Recovery Network area:

No formal DR program
DR requirements unknown
Lack of formal documented policies or processes

Following is another example of the gaps that may be found in the Data Restoration area:

Lack of service levels with the business
No formal tiering structure
Recovery RTOs / RPOs have not been defined
Lack of recovery expectations

These are big gaps :) They are here just to give you some examples, but I guess you understand the scope of the work already.

In the technology analysis phase, I analyze the following layers against the following criteria:

Presentation layer, Login/application, Database, Compute, Storage, Network --> Production HA, DR, Backup, Archiving.

Some examples of the technology findings in this phase:

A single point of failure (SPOF) exists which would cause a complete outage for the application.
Server configuration is not aligned with the intended high-availability design (cluster is misconfigured).
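
The layer-by-criteria review described above can be pictured as a simple grid, where every layer is checked against every criterion and the empty cells become findings. The following is only a minimal sketch of that idea; the `coverage` data and the `find_uncovered` helper are made up for illustration and are not part of any real assessment tool.

```python
from itertools import product

# Layers and criteria as listed in the post.
LAYERS = ["Presentation", "Login/Application", "Database",
          "Compute", "Storage", "Network"]
CRITERIA = ["Production HA", "DR", "Backup", "Archiving"]


def find_uncovered(coverage):
    """Return the (layer, criterion) pairs that have no recorded capability."""
    return [
        (layer, criterion)
        for layer, criterion in product(LAYERS, CRITERIA)
        if not coverage.get((layer, criterion), False)
    ]


# Hypothetical discovery results: everything is covered except two cells.
coverage = {cell: True for cell in product(LAYERS, CRITERIA)}
coverage[("Database", "DR")] = False
coverage[("Storage", "Archiving")] = False

gaps = find_uncovered(coverage)
for layer, criterion in gaps:
    print(f"Gap: {layer} layer has no {criterion} capability")
```

In a real assessment each cell would hold far more than a yes/no flag, but the grid shape is what keeps the review from skipping a layer.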

After the findings, I create a recommendation matrix and summarize the recommendations.
I analyze each recommendation from the implementation effort and business impact perspectives, then place it in a matrix that shows its risk level / business impact against its implementation effort.


Business impact goes from low to high as you move up the y axis, and effort goes from high to low as you move right along the x axis. So, the action items/recommendations in the top right quadrant are given high implementation priority, due to low effort & high impact. You get the idea.
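
The quadrant logic above can be sketched in a few lines of Python. This is just an illustration under my own assumptions: the 1-4 scoring scale, the thresholds and the sample recommendations are hypothetical, and in practice the scores come out of the analysis with the customer.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    name: str
    impact: int  # business impact, 1 (low) to 4 (high); hypothetical scale
    effort: int  # implementation effort, 1 (low) to 4 (high); hypothetical scale


def quadrant(rec: Recommendation) -> str:
    """Place a recommendation on the matrix.

    y axis: impact rises going up. x axis: effort falls going right.
    So "top-right" means high impact and low effort.
    """
    vertical = "top" if rec.impact >= 3 else "bottom"
    horizontal = "right" if rec.effort <= 2 else "left"
    return f"{vertical}-{horizontal}"


# Hypothetical recommendations, for illustration only.
recommendations = [
    Recommendation("Build a secondary DR site", impact=4, effort=4),
    Recommendation("Define RTOs / RPOs per application tier", impact=4, effort=1),
    Recommendation("Archive old server logs", impact=1, effort=1),
    Recommendation("Fix the misconfigured cluster", impact=4, effort=2),
]

# Top-right quadrant: the items that get high implementation priority.
high_priority = [r.name for r in recommendations if quadrant(r) == "top-right"]

for rec in recommendations:
    print(f"{quadrant(rec):<13} {rec.name}")
```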

As for the remediation options, I document the as-is architecture, then propose target solutions by analyzing the gaps. More than one solution may be proposed as part of the gap analysis against the recoverability business requirements.

Finally, I create the recoverability roadmap and that's it :)

In the recoverability roadmap, I start with the areas of opportunity and build an 18-month plan (maybe longer). I list the actions that should be done in the near term, in 6-12 months and in 12-18 months to reach the target state, where we usually have the following:

Increased ROI
Standardized Environment
Ensured Recoverability 
Recoverability and continuous availability services aligned with the business needs
Operational Excellence
Organizational stability
Culture of Ensured availability & DR Services.
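
The roadmap phasing (near term, 6-12 months, 12-18 months) can also be sketched as a small grouping routine. Note the assumptions here: I treat "near term" as the first 6 months, count boundary months (6 and 12) in the earlier phase, and the action names and target months are invented for the example.

```python
from collections import defaultdict

PHASES = ["near term", "6-12 months", "12-18 months"]


def phase_for(target_month: int) -> str:
    """Map a target month (0-18) to a roadmap phase.

    Assumption: "near term" covers months 0-6, and a boundary month
    belongs to the earlier phase.
    """
    if not 0 <= target_month <= 18:
        raise ValueError("target month must be within the 18-month plan")
    if target_month <= 6:
        return PHASES[0]
    if target_month <= 12:
        return PHASES[1]
    return PHASES[2]


# Hypothetical actions with target months, for illustration only.
actions = [
    ("Define RTOs / RPOs with the business", 2),
    ("Document formal DR policies", 5),
    ("Run the first recovery test", 9),
    ("Standardize server builds", 15),
]

roadmap = defaultdict(list)
for name, month in actions:
    roadmap[phase_for(month)].append(name)

for phase in PHASES:
    print(f"{phase}: {roadmap[phase]}")
```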

That's the end of this post. I hope you find it useful.
If you need any advice or consultancy, feel free to contact me.

