Oracle Database 23AI Vector Search: A Game Changer for Enterprise Search
Hey everyone, Erman Arslan here. Today, I'm diving into a revolutionary new feature in Oracle Database 23AI: Vector Search. This technology promises to completely transform how you search for information, especially within your enterprise data.
Understanding Semantics Through Vectors
Imagine searching for data based on meaning, not just keywords. That's the power of Vector Search. It uses machine learning embedding models, like ResNet for images and BERT for text, to convert your data into vectors. These vectors represent the semantic essence of your information, so similar entities end up with vectors that sit close together in a multidimensional space.
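To make "close together" a bit more concrete, here is a minimal Python sketch. The three-dimensional vectors are made up by me for illustration (real embedding models produce hundreds or thousands of dimensions), and the helper names are hypothetical, not part of any Oracle API.

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 0 means same direction, larger means less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings, invented for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

def nearest(word):
    """Return the other word whose vector is closest to the vector of `word`."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    return min(others, key=lambda w: cosine_distance(query, embeddings[w]))

print(nearest("cat"))
```

With these toy numbers, "cat" lands next to "kitten" and far from "invoice", even though the words share no keywords.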
The Power of Combining Traditional and AI-powered Search
The beauty of Oracle Database 23AI is that it seamlessly integrates traditional search with AI-powered vector similarity search. This eliminates the need to deploy a separate vector database, which can lead to data staleness, increased complexity, hard-to-maintain consistency, and security risks.
23AI: The Enterprise-grade Advantage
Here's where Oracle shines. Oracle Database 23AI is a converged platform that eliminates the complexities of managing separate systems. It also tackles a major challenge of Large Language Models (LLMs): hallucination. By combining LLM knowledge with relevant results retrieved through vector search, 23AI helps the LLM ground its answers in your actual data, which makes responses more accurate and reliable.
LLM + AI Vector Search: A Powerful Knowledge Base
Imagine this: you have a vast knowledge base that combines real-time enterprise data with a broad range of information from the internet. That's the magic of LLM and AI Vector Search working together. Users submit queries, which are encoded as vectors and searched against the database. The closest matches are then fed to the LLM, empowering it to deliver comprehensive and informative responses.
In other words, "LLM + AI Vector Search" brings together two sources: the LLM's broad knowledge, which is a snapshot of internet data from a point in time, and your private enterprise business data, which keeps the knowledge base updated in real time.
Unveiling the New SQL for Vector Power
23AI introduces a range of new SQL features to unleash the power of vector searches:
- New SQL for Vector Generation: Easily generate vectors from your data.
- New Vector Data Type: Store vector embeddings efficiently using the new VECTOR data type.
- New Vector Search Syntax: Perform efficient similarity searches with the VECTOR_DISTANCE function and optional distance metrics.
- New Approximate Search Indexes: Achieve high performance with approximate search indexes for large datasets.
- New PL/SQL Packages and Integrations: Extend the functionality with PL/SQL packages and integrate with third-party frameworks for building robust AI pipelines.
Crafting Powerful Vector Search Queries
Here's an example query that demonstrates the power of vector search:
SELECT ... FROM JOB_Postings
WHERE city IN (SELECT PREFERRED_CITIES FROM Applications...)
ORDER BY vector_distance(job_desc_vectors, :resume_vector)
FETCH APPROXIMATE FIRST 10 ROWS ONLY WITH TARGET ACCURACY 90;
This query first restricts job postings to the cities the applicant prefers, then ranks the remaining postings by how close their job description vectors are to the provided resume vector, and returns the ten closest matches.
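The same "filter on business data, then rank by vector distance" logic can be sketched in a few lines of Python. This is only an illustration of the idea, not how the database executes it; the job titles, cities, and two-dimensional vectors are invented, and I use Euclidean distance for simplicity.

```python
import math

# Hypothetical job postings with made-up 2-dimensional description vectors.
job_postings = [
    {"title": "DBA", "city": "Istanbul", "vec": [0.9, 0.1]},
    {"title": "Data Engineer", "city": "Ankara", "vec": [0.7, 0.3]},
    {"title": "Chef", "city": "Istanbul", "vec": [0.1, 0.9]},
    {"title": "Oracle Architect", "city": "London", "vec": [0.95, 0.05]},
]

def top_matches(resume_vec, preferred_cities, n=2):
    """Relational filter first (WHERE city IN ...), then rank by distance, then keep n rows."""
    candidates = [j for j in job_postings if j["city"] in preferred_cities]
    candidates.sort(key=lambda j: math.dist(j["vec"], resume_vec))
    return [j["title"] for j in candidates[:n]]

print(top_matches([1.0, 0.0], {"Istanbul", "Ankara"}))
```

Note that the London posting has the closest vector overall, but it never shows up because the city filter removes it before ranking. That is the point of combining both kinds of predicates in one query.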
Choosing the Right Vector Index
23AI offers two types of vector indexes for optimal performance:
- In-Memory Neighbor Graph Vector Index: A graph-based, in-memory index for fast and highly accurate searches on smaller datasets.
- Neighbor Partition Vector Index: A partition-based, scalable index for massive datasets that can't fit in memory. It delivers fast results with a high chance of finding relevant matches.
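To show why a partition-based index is fast but only "highly likely" to find the best matches, here is a simplified Python sketch of the general idea: vectors are grouped around centroids, and a query only scans the partitions closest to it. This is my own toy illustration, not Oracle's implementation; the centroids are fixed by hand (a real index would learn them from the data), and the function names and the `probes` parameter are hypothetical.

```python
import math

# Hand-picked centroids and vectors, invented for this example.
centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(0.5, 0.2), (1.0, 1.5), (4.0, 4.0), (9.0, 9.5), (10.5, 10.0), (6.0, 6.0)]

def build_partitions(vectors, centroids):
    """Assign every vector to the partition of its nearest centroid."""
    partitions = {i: [] for i in range(len(centroids))}
    for v in vectors:
        closest = min(range(len(centroids)), key=lambda i: math.dist(v, centroids[i]))
        partitions[closest].append(v)
    return partitions

def partition_search(query, partitions, centroids, k=2, probes=1):
    """Approximate search: scan only the `probes` partitions nearest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [v for i in order[:probes] for v in partitions[i]]
    return sorted(candidates, key=lambda v: math.dist(query, v))[:k]

def exact_search(query, vectors, k=2):
    """Exact search: scan every vector."""
    return sorted(vectors, key=lambda v: math.dist(query, v))[:k]

partitions = build_partitions(vectors, centroids)
query = (5.2, 5.2)
print(partition_search(query, partitions, centroids, probes=1))
print(exact_search(query, vectors))
```

With one probe, the search scans only half of the data and misses a true neighbor that sits just across the partition boundary. Probing both partitions recovers the exact answer at the cost of scanning more rows, which is the speed versus accuracy trade-off in a nutshell.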
Here is the index creation syntax (DDL):
CREATE VECTOR INDEX photo_idx ON Customer(photo_vector)
ORGANIZATION [INMEMORY NEIGHBOR GRAPH | NEIGHBOR PARTITIONS]
DISTANCE COSINE | EUCLIDEAN | MANHATTAN | ...
WITH TARGET ACCURACY 90
The WITH TARGET ACCURACY clause is where we specify the accuracy we want from the index.
Note that we use the APPROXIMATE keyword (for example, FETCH APPROXIMATE FIRST 5 ROWS ONLY) to tell the optimizer to use the relevant vector index. Even when we specify it, Oracle's cost-based optimizer can still perform an exact search if it finds the index access too costly.
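A quick way to reason about what a target accuracy of 90 means: my understanding is that, for a top-10 query, on average about 9 of the 10 rows returned by the approximate search are also in the exact top 10. The Python sketch below computes that overlap (often called recall) for two invented result lists; the row IDs and the function name are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k results that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)

# Made-up row IDs: the approximate search found 9 of the 10 true neighbors.
exact_top10 = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110]
approx_top10 = [101, 102, 103, 104, 105, 106, 107, 108, 109, 999]

accuracy_percent = 100 * recall_at_k(approx_top10, exact_top10)
print(round(accuracy_percent))
```

If you have a table small enough to run both an exact and an approximate query, comparing the two result sets this way is a simple sanity check on the accuracy you asked for.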
The Importance of Enterprise-grade CBO
Optimizing vector search queries, especially when combined with normalized enterprise data, requires an enterprise-grade Cost-Based Optimizer (CBO). 23AI delivers on this front, unlike purpose-built vector databases that lack this crucial functionality.
Beyond Single Vectors: Multi-Vector Queries
23AI empowers you to perform multi-vector queries, allowing you to search based on a combination of different vectors.
Key Differentiators: Why Choose Oracle Database 23AI
- Transactional Consistency: Neighbor Partition Vector Indexes guarantee transactional consistency, making them ideal for high-speed, consistent operations.
- Scale-out Architecture: Distribute vector search workloads across RAC nodes for exceptional scalability.
- Exadata Offloading: Offload vector search tasks to Exadata Storage for even greater performance.
- Seamless Integration: Oracle Sharding, parallel execution, partitioning, security, and more all work seamlessly with AI Vector Search.
AI Vector Search: The Engine of GEN AI Pipelines
23AI goes beyond search. It serves as the foundation for powerful GEN AI Pipelines. These pipelines seamlessly integrate document loading, transformation, embedding models, vector search, and LLM reasoning – all within the robust Oracle Database 23AI platform.
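To show how those pipeline stages fit together, here is a deliberately tiny Python sketch: load documents, split them into chunks, embed each chunk, find the chunks closest to the question, and build the prompt that would be sent to the LLM. Everything here is an assumption for illustration: the `embed` function is a toy keyword counter standing in for a real embedding model, the documents are invented, and the final LLM call is left out. It says nothing about Oracle's own PL/SQL packages.

```python
import math

VOCAB = ["vacation", "policy", "invoice", "payment"]  # toy vocabulary, invented

def chunk(text, size=8):
    """Transformation step: split a document into chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy stand-in for an embedding model: keyword counts over VOCAB."""
    words = [w.strip(".,:;?!").lower() for w in text.split()]
    return [float(words.count(term)) for term in VOCAB]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def build_store(documents):
    """Load, transform, embed: keep (chunk, vector) pairs, like rows with a vector column."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]

def retrieve(store, question, k=1):
    """Vector search step: return the k chunks closest to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda row: cosine_distance(q, row[1]))
    return [text for text, _ in ranked[:k]]

def build_prompt(store, question, k=1):
    """Assemble the grounded prompt that would be handed to the LLM."""
    context = "\n".join(retrieve(store, question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

documents = [
    "Vacation policy: employees accrue vacation days monthly.",
    "Invoice payment is due within thirty days.",
]
store = build_store(documents)
print(build_prompt(store, "What is the vacation policy?"))
```

The interesting design point is that the LLM only ever sees the retrieved chunks, so keeping the stored vectors in the same database as the business data means the context is as fresh as the data itself.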
This is just a glimpse into the exciting world of Oracle Database 23AI Vector Search. Stay tuned for future posts where we'll delve deeper into specific use cases and explore the key features (like True Cache and Distributed-Database related enhancements...) of the new Oracle Database Release.