Opened 6 months ago
Last modified 6 months ago
#36517 closed New feature
Add Native Vector Support for Oracle: VectorField, VectorIndex, and VectorDistance — at Initial Version
| Reported by: | SAVAN SONI | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | |
| Cc: | SAVAN SONI | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
This feature adds native support for Oracle's VECTOR data type (https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/overview-ai-vector-search.html), introduced in Oracle 23c. It lets AI and ML applications store and query high-dimensional data directly in the database through a new VectorField model field, VectorIndex support for similarity search, and ORM expressions for vector operations.
Features Included:
VectorField model field:
- Accepts optional dimensions, storage_format, and storage_type arguments.
- Supports Dense and Sparse vector storage.
- Auto-converts lists, NumPy arrays, and oracledb.SparseVector for insert/update.
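The auto-conversion described above could look roughly like the following minimal sketch. It is not the patch's code: the helper name `to_dense_vector` and the format-to-typecode mapping are hypothetical. It targets `array.array` because that is the type python-oracledb uses to bind dense vectors, and it handles NumPy arrays by duck typing (`tolist()`) so that NumPy itself is not required.

```python
import array

# Hypothetical mapping from a storage format name to an array typecode.
# python-oracledb binds array.array values with these typecodes to VECTOR columns.
_TYPECODES = {"FLOAT32": "f", "FLOAT64": "d", "INT8": "b"}


def to_dense_vector(value, storage_format="FLOAT32"):
    """Convert a list, tuple, array.array, or NumPy-like array to array.array."""
    typecode = _TYPECODES[storage_format]
    # Already in the right representation: pass it through untouched.
    if isinstance(value, array.array) and value.typecode == typecode:
        return value
    # NumPy arrays and array.array both expose tolist().
    if hasattr(value, "tolist"):
        value = value.tolist()
    return array.array(typecode, value)
```

A field's get_db_prep_value() is one plausible place to call such a helper, so that values are normalized once before they reach the driver.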
Vector Index support:
- VectorIndex class using Meta.indexes.
- Support for HNSW and IVF index types.
- Optional parameters: distance, accuracy, parallel, etc.
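To make the index options concrete, here is a rough sketch of the DDL that a VectorIndex might translate to. The function name and its parameters are hypothetical, and the SQL shape reflects my reading of Oracle's documented CREATE VECTOR INDEX syntax, so it should be checked against the Oracle documentation. Identifier quoting is deliberately omitted for brevity.

```python
def vector_index_sql(name, table, column, index_type="HNSW",
                     distance="COSINE", accuracy=None, parallel=None):
    """Build an Oracle CREATE VECTOR INDEX statement (illustrative only)."""
    # HNSW is an in-memory neighbor graph; IVF uses neighbor partitions.
    organization = {
        "HNSW": "INMEMORY NEIGHBOR GRAPH",
        "IVF": "NEIGHBOR PARTITIONS",
    }[index_type]
    parts = [
        f"CREATE VECTOR INDEX {name} ON {table} ({column})",
        f"ORGANIZATION {organization}",
        f"DISTANCE {distance}",
    ]
    # Optional clauses are emitted only when the caller supplies them.
    if accuracy is not None:
        parts.append(f"WITH TARGET ACCURACY {int(accuracy)}")
    if parallel is not None:
        parts.append(f"PARALLEL {int(parallel)}")
    return " ".join(parts)
```

For example, vector_index_sql("vec_idx_product", "product", "embedding", accuracy=95) yields a single-line statement ending in WITH TARGET ACCURACY 95.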
Vector distance expressions and lookups:
- Custom Func class VectorDistance for VECTOR_DISTANCE(lhs, rhs, metric)
- CosineDistance, EuclideanDistance, NegativeDotProduct, and similar expressions available as lookups.
- Query syntax via filter() and order_by() for similarity search.
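For readers unfamiliar with the metrics named above, the following pure-Python sketch shows what each one computes. These are reference definitions for illustration, not the ORM expressions themselves; in the proposal the computation happens in the database via VECTOR_DISTANCE. In all three cases a smaller value means a closer match, which is why the dot product is negated.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of a and b (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot_product(a, b):
    """Negated dot product, so that larger overlap sorts first."""
    return -sum(x * y for x, y in zip(a, b))
```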
Testing:
- Dense and Sparse vector insert/query tests added.
- Stress test scripts for repeated inserts/queries included.
Example:
from django.db import models

# Names from the proposed API. The two storage enums are assumed to be
# exposed on django.db.models in the same way as the others.
VectorIndex = models.VectorIndex
VectorDistanceType = models.VectorDistanceType
VectorIndexType = models.VectorIndexType
VectorStorageFormat = models.VectorStorageFormat
VectorStorageType = models.VectorStorageType


class Product(models.Model):
    name = models.CharField(max_length=100)
    embedding = models.VectorField(
        dim=3,
        storage_format=VectorStorageFormat.FLOAT32,
        storage_type=VectorStorageType.DENSE,
    )

    class Meta:
        indexes = [
            VectorIndex(
                fields=["embedding"],
                name="vec_idx_product",
                index_type=VectorIndexType.HNSW,
                distance=VectorDistanceType.COSINE,
            )
        ]
A similarity search can then be performed:
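import array

# VectorDistance is assumed to be exposed on django.db.models like the names above.
VectorDistance = models.VectorDistance

query_vector = array.array("f", [1.0, 2.0, 3.0])

# Annotate each product with its cosine distance to the query vector, then take the five nearest.
products = Product.objects.annotate(
    score=VectorDistance(
        "embedding",
        query_vector,
        metric=VectorDistanceType.COSINE,
    )
).order_by("score")[:5]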
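When testing a query like the one above, it can help to have a brute-force reference to compare results against. The sketch below is such a reference, written under stated assumptions: `top_k` is a hypothetical helper, rows are plain (name, vector) tuples rather than model instances, and `math.dist` (Euclidean distance) is only a stand-in default, so a cosine function would need to be passed in to mirror the example exactly.

```python
import heapq
import math


def top_k(rows, query, k=5, distance=math.dist):
    """Return the k (name, score) pairs closest to query, nearest first.

    rows is an iterable of (name, vector) pairs; distance is any callable
    taking two vectors and returning a number where smaller means closer.
    """
    scored = ((name, distance(vec, query)) for name, vec in rows)
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])
```

Because it scans every row, this is only suitable for small test fixtures; the point of a vector index is to avoid exactly this full scan.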
Implementation Status
We have already implemented:
- Custom VectorField with support for DENSE and SPARSE formats
- Automatic SQL generation for model/table creation
- VectorIndex support with customizable parameters and distance metrics
- ORM expressions and lookups for vector distance queries (e.g., CosineDistance, EuclideanDistance)
- Basic tests for dense vector creation, insertion, indexing, and querying
- Integration with Oracle’s Python driver (oracledb) for runtime behavior
PR Readiness
We have finalized the major components of this feature and are ready to open a public pull request once this proposal has received community feedback or approval.