Tuesday, June 4, 2024

Converged Database - Oracle Database 23AI Vector Search - Combine traditional search on business data with AI vector powered similarity search.

 

Oracle Database 23AI Vector Search: A Game Changer for Enterprise Search

Hey everyone, Erman Arslan here. Today, I'm diving into a revolutionary new feature in Oracle Database 23AI: Vector Search. This technology promises to completely transform how you search for information, especially within your enterprise data.

Understanding Semantics Through Vectors

Imagine searching for data based on meaning, not just keywords. That's the power of Vector Search. It uses machine learning embedding models, like ResNet for images and BERT for text, to convert your data into vectors. These vectors represent the semantic essence of your information. Similar entities will have vectors that sit close together in this multidimensional space.
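To make "close together" a bit more concrete, here is a minimal Python sketch. The three-dimensional vectors are hand-made, hypothetical values (not the output of ResNet, BERT, or any real model), and cosine distance is just one common way to measure closeness.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; values near 0 mean 'pointing the same way'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical, hand-written "embeddings" used only for illustration.
embeddings = {
    "laptop":   [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.1],
    "banana":   [0.0, 0.1, 0.9],
}

# Rank the other entries by their distance to "laptop".
query = embeddings["laptop"]
ranked = sorted(
    (name for name in embeddings if name != "laptop"),
    key=lambda name: cosine_distance(query, embeddings[name]),
)
print(ranked)  # ['notebook', 'banana']
```

With these toy values, "notebook" lands much closer to "laptop" than "banana" does, even though the three words share no keywords.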

The Power of Combining Traditional and AI-powered Search

The beauty of Oracle Database 23AI is that it seamlessly integrates traditional search with AI-powered vector similarity search. This eliminates the need to deploy a separate vector database, which can lead to data staleness, increased complexity, hard-to-maintain consistency, and security risks.

23AI: The Enterprise-grade Advantage

Here's where Oracle shines. Oracle Database 23AI is a converged platform that eliminates the complexities of managing separate systems. It also tackles a major challenge of Large Language Models (LLMs): hallucination. By combining LLM knowledge with relevant results from vector searches over your own data, 23AI helps ground the LLM's responses and makes them more accurate and reliable.

LLM + AI Vector Search: A Powerful Knowledge Base

Imagine this: you have a vast knowledge base that combines real-time enterprise data with a broad range of information from the internet. That's the magic of LLM and AI Vector Search working together. Users submit queries, which are encoded as vectors and searched against the database. The closest matches are then fed to the LLM, empowering it to deliver comprehensive and informative responses.

In other words, "LLM + AI Vector Search" means: the broad range of internet data the LLM was trained on (a snapshot from a point in time) + private enterprise business data (which keeps the knowledge base updated in real time).
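The flow described above (encode the query, find the closest matches, hand them to the LLM) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: toy_embed is a hypothetical word-count stand-in for a real embedding model, the documents are made up, and no LLM is actually called; the sketch stops at building the prompt.

```python
import math

# Hypothetical vocabulary for the stand-in embedding below.
VOCAB = ["refund", "policy", "shipping", "days", "warranty", "laptop"]

def toy_embed(text):
    """Stand-in for a real embedding model: counts of vocabulary words."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norm == 0 else 1.0 - dot / norm

# Made-up "enterprise" documents, embedded once up front.
documents = [
    "refund policy allows returns within 30 days",
    "shipping takes 5 business days",
    "laptop warranty covers 2 years",
]
index = [(doc, toy_embed(doc)) for doc in documents]

def retrieve(question, k=2):
    """Return the k documents whose vectors are closest to the question."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: cosine_distance(q, item[1]))
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, k=2):
    """Assemble the text that would be sent to an LLM (no LLM call here)."""
    context = "\n".join("- " + doc for doc in retrieve(question, k))
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + question

print(build_prompt("what is the refund policy", k=1))
```

In a real setup the retrieval step would be a vector search inside the database, and the prompt would go to whichever LLM you use.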

Unveiling the New SQL for Vector Power

23AI introduces a range of new SQL features to unleash the power of vector searches:

  • New SQL for Vector Generation: Easily generate vectors from your data.
  • New Vector Data Type: Store vector embeddings efficiently using the new VECTOR data type.
  • New Vector Search Syntax: Perform efficient similarity searches with the VECTOR_DISTANCE function and optional distance metrics.
  • New Approximate Search Indexes: Achieve high performance with approximate search indexes for large datasets.
  • New PL/SQL Packages and Integrations: Extend the functionality with PL/SQL packages and integrate with third-party frameworks for building robust AI pipelines.
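The choice of distance metric matters more than it may seem. The Python sketch below is not Oracle's implementation; vector_distance here is a hypothetical helper that only borrows the name, and the vectors are made up. It shows that cosine distance ignores vector length while Euclidean distance does not, so the "nearest" row can change with the metric.

```python
import math

def vector_distance(a, b, metric="COSINE"):
    """Illustrative distance helper; the default metric is this sketch's choice."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    if metric == "EUCLIDEAN":
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if metric == "MANHATTAN":
        return sum(abs(x - y) for x, y in zip(a, b))
    if metric == "COSINE":
        dot = sum(x * y for x, y in zip(a, b))
        norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / norms
    raise ValueError("unsupported metric: " + metric)

query = [1.0, 1.0]
rows = {
    "a": [10.0, 10.0],  # same direction as the query, but much longer
    "b": [1.0, 0.0],    # nearby in space, but pointing elsewhere
}

def nearest(metric):
    return min(rows, key=lambda name: vector_distance(query, rows[name], metric))

print(nearest("COSINE"))     # a
print(nearest("EUCLIDEAN"))  # b
```

As a rule of thumb, it is usually safest to search with the metric the embedding model was designed for.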

Crafting Powerful Vector Search Queries

Here's an example query that demonstrates the power of vector search:

SQL

SELECT ...
FROM JOB_Postings
WHERE city IN (SELECT PREFERRED_CITIES FROM Applications...)
ORDER BY vector_distance(job_desc_vectors, :resume_vector)
FETCH APPROXIMATE FIRST 10 ROWS ONLY WITH TARGET ACCURACY 90;

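This query combines both worlds: the WHERE clause filters job postings by the applicant's preferred cities (traditional business data), and the ORDER BY returns the 10 postings whose job descriptions are most similar to the provided resume vector.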
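For readers who think better in code than in SQL, the same "filter on business data, then rank by similarity" idea looks roughly like this in Python. The postings, cities, and two-dimensional vectors are all made up, and this is an exact in-memory search, not what the database does internally.

```python
import heapq

def squared_distance(a, b):
    """Squared Euclidean distance; good enough for ranking."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical job postings with a city column and a tiny description vector.
postings = [
    {"id": 1, "city": "Istanbul", "vec": [0.90, 0.10]},
    {"id": 2, "city": "Ankara",   "vec": [0.88, 0.12]},
    {"id": 3, "city": "Istanbul", "vec": [0.20, 0.80]},
    {"id": 4, "city": "Izmir",    "vec": [0.70, 0.30]},
]
preferred_cities = {"Istanbul", "Izmir"}  # plays the role of the subquery
resume_vector = [0.85, 0.15]

# Step 1: relational filter. Step 2: similarity ranking, keeping the top 2.
candidates = [p for p in postings if p["city"] in preferred_cities]
top = heapq.nsmallest(
    2, candidates, key=lambda p: squared_distance(p["vec"], resume_vector)
)
print([p["id"] for p in top])  # [1, 4]
```

Note that posting 2 is actually the closest vector overall, but it is filtered out because Ankara is not a preferred city. That is exactly the kind of combination a separate vector-only store makes awkward.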

Choosing the Right Vector Index

23AI offers two types of vector indexes for optimal performance:

  • Graph Vector Index (In-Memory Neighbor Graph, HNSW-based): In-memory index for fast and highly accurate searches on smaller datasets.
  • Neighbor Partition Vector Index (IVF-based): Scalable index for massive datasets that can't fit in memory. It delivers fast results with a high chance of finding relevant matches.

Here is the index creation syntax:

DDL

CREATE VECTOR INDEX photo_idx ON Customer(photo_vector)
ORGANIZATION [INMEMORY NEIGHBOR GRAPH | NEIGHBOR PARTITIONS]
DISTANCE [COSINE | EUCLIDEAN | MANHATTAN | ...]
WITH TARGET ACCURACY 90;

The WITH TARGET ACCURACY clause is where we specify the accuracy we want from the index.

Note that we use the APPROXIMATE keyword to tell the optimizer to use the relevant vector index. But even if we specify it, Oracle's Cost-Based Optimizer can still do an exact search if it finds the index access too costly. Ex: FETCH APPROXIMATE FIRST 5 ROWS ONLY.
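To give a feel for the exact vs. approximate trade-off, here is a small Python sketch of the general partition-based idea: group vectors around centroids and only scan the partitions nearest to the query. It is a simplification under stated assumptions, not Oracle's implementation; the centroids are fixed by hand, the points are random, and nprobe (how many partitions to scan) is a name chosen for this sketch. Accuracy is measured here as the fraction of the true top-k that the approximate search finds.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(200)]

# Hand-picked centroids (a real index would learn these from the data).
centroids = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]

# Build step: assign every point to the partition of its nearest centroid.
partitions = {i: [] for i in range(len(centroids))}
for p in points:
    nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
    partitions[nearest].append(p)

def exact_search(query, k):
    """Scan every point: always correct, cost grows with the data size."""
    return sorted(points, key=lambda p: dist(query, p))[:k]

def approximate_search(query, k, nprobe=1):
    """Scan only the nprobe partitions whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = [p for i in probe_order[:nprobe] for p in partitions[i]]
    return sorted(candidates, key=lambda p: dist(query, p))[:k]

def recall(query, k, nprobe):
    """Fraction of the true top-k that the approximate search returned."""
    exact = set(exact_search(query, k))
    approx = set(approximate_search(query, k, nprobe))
    return len(exact & approx) / k

query = (0.5, 0.4)
for nprobe in (1, 2, 4):
    print(nprobe, recall(query, 10, nprobe))
```

Scanning more partitions costs more work but cannot lower the recall, and scanning all of them is simply an exact search. That is the same dial a target accuracy setting turns, and it also hints at why an optimizer may sometimes prefer the exact path.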

The Importance of Enterprise-grade CBO

Optimizing vector search queries, especially when combined with normalized enterprise data, requires an enterprise-grade Cost-Based Optimizer (CBO). 23AI delivers on this front, unlike purpose-built vector databases that lack this crucial functionality.

Beyond Single Vectors: Multi-Vector Queries

23AI empowers you to perform multi-vector queries, allowing you to search based on a combination of different vectors.

Key Differentiators: Why Choose Oracle Database 23AI

  • Transactional Consistency: Neighbor Partition Vector Indexes guarantee transactional consistency, making them ideal for high-speed, consistent operations.
  • Scale-out Architecture: Distribute vector search workloads across RAC nodes for exceptional scalability.
  • Exadata Offloading: Offload vector search tasks to Exadata Storage for even greater performance.
  • Seamless Integration: Oracle Sharding, parallel execution, partitioning, security, and so on all work seamlessly with AI Vector Search.

AI Vector Search: The Engine of GEN AI Pipelines

23AI goes beyond search. It serves as the foundation for powerful GEN AI Pipelines. These pipelines seamlessly integrate document loading, transformation, embedding models, vector search, and LLM reasoning – all within the robust Oracle Database 23AI platform.

This is just a glimpse into the exciting world of Oracle Database 23AI Vector Search. Stay tuned for future posts where we'll delve deeper into specific use cases and explore the key features (like True Cache and Distributed-Database related enhancements...) of the new Oracle Database Release.
