Opened 6 weeks ago
Last modified 6 weeks ago
#36517 closed New feature
Add Native Vector Support for Oracle: VectorField, VectorIndex, and VectorDistance — at Initial Version
Reported by: | SAVAN SONI | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | SAVAN SONI | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This feature adds native support for Oracle’s VECTOR data type (https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/overview-ai-vector-search.html), introduced in Oracle Database 23ai (previously named 23c). It lets AI and ML applications store and query high-dimensional data directly in the database through a new VectorField model field, VectorIndex support for similarity search, and ORM expressions for vector operations.
Features Included:
VectorField model field:
- Accepts optional dimensions, storage_format, and storage_type arguments.
- Supports Dense and Sparse vector storage.
- Auto-converts lists, NumPy arrays, and oracledb.SparseVector for insert/update.
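The automatic conversion in the last bullet could look roughly like the sketch below. It is not the code from the patch: the helper name `to_driver_vector` and the format-to-typecode table are assumptions made for illustration. It relies only on the fact that python-oracledb binds dense vectors as `array.array` values (typecode `f` for FLOAT32, `d` for FLOAT64, `b` for INT8). Sparse vectors are left out.

```python
import array

# Assumed mapping from a storage format name to an array.array typecode.
_TYPECODES = {"FLOAT32": "f", "FLOAT64": "d", "INT8": "b"}


def to_driver_vector(value, storage_format="FLOAT32"):
    """Hypothetical helper: coerce a list, tuple, or NumPy-like array
    into the array.array form that the driver binds as a dense vector."""
    typecode = _TYPECODES[storage_format]
    # Already in the right form: pass it through untouched.
    if isinstance(value, array.array) and value.typecode == typecode:
        return value
    # NumPy arrays (and array.array) expose tolist(); plain sequences do not need it.
    if hasattr(value, "tolist"):
        value = value.tolist()
    return array.array(typecode, value)
```

A field's get_db_prep_value() would be one natural place to call such a helper, so that application code can keep passing plain lists.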
Vector Index support:
- VectorIndex class using Meta.indexes.
- Support for HNSW and IVF index types.
- Optional parameters: distance, accuracy, parallel, etc.
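To make the index options concrete, here is a minimal sketch of the DDL such an index might produce. The function `vector_index_sql` and the table name `shop_product` are hypothetical and not part of the patch; the clause order follows Oracle's documented CREATE VECTOR INDEX syntax as far as we understand it, and a real schema editor would also quote identifiers.

```python
def vector_index_sql(name, table, column, index_type="HNSW",
                     distance=None, accuracy=None, parallel=None):
    """Hypothetical helper: build a CREATE VECTOR INDEX statement."""
    # HNSW indexes are in-memory neighbor graphs; IVF indexes are
    # neighbor partitions.
    organization = {
        "HNSW": "ORGANIZATION INMEMORY NEIGHBOR GRAPH",
        "IVF": "ORGANIZATION NEIGHBOR PARTITIONS",
    }[index_type]
    parts = [f"CREATE VECTOR INDEX {name} ON {table} ({column})", organization]
    # Each optional parameter adds one clause only when it is set.
    if distance is not None:
        parts.append(f"DISTANCE {distance}")
    if accuracy is not None:
        parts.append(f"WITH TARGET ACCURACY {int(accuracy)}")
    if parallel is not None:
        parts.append(f"PARALLEL {int(parallel)}")
    return " ".join(parts)


print(vector_index_sql("vec_idx_product", "shop_product", "embedding",
                       distance="COSINE", accuracy=95))
```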
Vector distance expressions and lookups:
- Custom Func class VectorDistance for VECTOR_DISTANCE(lhs, rhs, metric).
- Lookups such as CosineDistance, EuclideanDistance, and NegativeDotProduct.
- Query syntax via filter() and order_by() for similarity search.
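For readers unfamiliar with the metrics, the following pure-Python sketch shows what the three named distances compute. It is a reference illustration only, not the database implementation, and the function names are chosen here for clarity. In every case a smaller value means a closer match, which is why the dot product is negated and why an ascending order_by() returns the nearest rows first.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot_product(a, b):
    """Negated dot product, so that larger overlap sorts first."""
    return -sum(x * y for x, y in zip(a, b))


# Rank candidate vectors by cosine distance to a query, nearest first.
query = [1.0, 2.0, 3.0]
candidates = {"same_direction": [2.0, 4.0, 6.0], "different": [3.0, 0.0, -1.0]}
ranked = sorted(candidates, key=lambda k: cosine_distance(candidates[k], query))
```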
Testing:
- Dense and Sparse vector insert/query tests added.
- Stress test scripts for repeated inserts/queries included.
Example:
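
    from django.db import models

    # Aliases for the proposed API. All of these names are assumed to be
    # exposed on django.db.models by the patch.
    VectorIndex = models.VectorIndex
    VectorDistanceType = models.VectorDistanceType
    VectorIndexType = models.VectorIndexType
    VectorStorageFormat = models.VectorStorageFormat
    VectorStorageType = models.VectorStorageType


    class Product(models.Model):
        name = models.CharField(max_length=100)
        # A dense, three-dimensional FLOAT32 vector column.
        embedding = models.VectorField(
            dim=3,
            storage_format=VectorStorageFormat.FLOAT32,
            storage_type=VectorStorageType.DENSE,
        )

        class Meta:
            indexes = [
                # HNSW index tuned for cosine-distance searches.
                VectorIndex(
                    fields=["embedding"],
                    name="vec_idx_product",
                    index_type=VectorIndexType.HNSW,
                    distance=VectorDistanceType.COSINE,
                )
            ]
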
A similarity search can then be performed:
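
    import array

    # Assumed to be exposed on django.db.models, like the aliases above.
    VectorDistance = models.VectorDistance

    query_vector = array.array("f", [1.0, 2.0, 3.0])

    # Annotate each product with its distance to the query vector and
    # return the five closest matches.
    products = Product.objects.annotate(
        score=VectorDistance(
            "embedding",
            query_vector,
            metric=VectorDistanceType.COSINE,
        )
    ).order_by("score")[:5]
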
Implementation Status
We have already implemented:
- Custom VectorField with support for DENSE and SPARSE formats
- Automatic SQL generation for model/table creation
- VectorIndex support with customizable parameters and distance metrics
- ORM expressions and lookups for vector distance queries (e.g., CosineDistance, EuclideanDistance)
- Basic tests for dense vector creation, insertion, indexing, and querying
- Integration with Oracle’s Python driver (oracledb) for runtime behavior
PR Readiness
The major components of this feature are complete. We are ready to open a public pull request once the community has given feedback on, or approved, this proposal.