#34107 closed Bug (invalid)
Django ORM queries do not pick up indexes made on a key in JSONField in Postgres
Reported by: | Tadek Teleżyński | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | json, jsonfield, postgres, index |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Hey folks!
Been developing in Django for the past 5 years, excited to post my first ticket :)
Thank you from the bottom of my heart for developing and maintaining this amazing framework.
If you feel like the below is more of a feature request than a bug, feel free to recategorize it.
Consider a following setup (assuming Postgres database, version 13.8):
class Example(models.Model): data = models.JSONField(null=False, blank=True, default=dict)
And the following migration adding two indexes (note the difference in "->" and "->>") on data__category
key:
class Migration(migrations.Migration): dependencies = [ (...), ] atomic = False operations = [ migrations.RunSQL( "CREATE INDEX CONCURRENTLY CategoryIndex_default_cast ON app_example USING BTREE((data->>'category'));", reverse_sql="DROP INDEX IF EXISTS CategoryIndex_default_cast;" ), migrations.RunSQL( "CREATE INDEX CONCURRENTLY CategoryIndex ON app_example USING BTREE (((data->'category')::TEXT));", reverse_sql="DROP INDEX IF EXISTS CategoryIndex;" ), ]
Now if I use a regular filtering syntax on a query none of the indexes is used:
>>> print(Example.objects.filter(data__category='dog').explain()) Seq Scan on app_example (cost=0.00..29.05 rows=6 width=36) Filter: ((data -> 'category'::text) = '"dog"'::jsonb)
In order to pick up the index I need to write a bit more complex query:
>>> print(Example.objects.annotate(_category=KeyTextTransform('category', 'data')).filter(_category=Value('dog')).explain()) Bitmap Heap Scan on app_example (cost=4.20..13.70 rows=6 width=68) Recheck Cond: ((data ->> 'category'::text) = 'dog'::text) -> Bitmap Index Scan on categoryindex_default_cast (cost=0.00..4.20 rows=6 width=0) Index Cond: ((data ->> 'category'::text) = 'dog'::text)
To be honest I wasn't able to figure out a query that would pick up the second index ("CategoryIndex") that's using explicit type casting.
It would be very neat if the ORM could figure the type casting automatically.
Change History (5)
comment:1 by , 2 years ago
Description: | modified (diff) |
---|
comment:2 by , 2 years ago
Description: | modified (diff) |
---|
follow-up: 4 comment:3 by , 2 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
follow-up: 5 comment:4 by , 2 years ago
Replying to Mariusz Felisiak:
By the way, have you tried to create these indexes via
Meta.indexes
? You can create them concurrently without writing raw SQLs.
I couldn't figure out a syntax for the fields
key when I want to do an index on a key inside the json structure.
BTreeIndex(fields=['data__category'])
won't do...
comment:5 by , 2 years ago
I couldn't figure out a syntax for the
fields
key when I want to do an index on a key inside the json structure.
BTreeIndex(fields=['data__category'])won't do...
BTreeIndex
supports expressions, so you can use BTreeIndex(F("data__category"), name="...")
.
I don't think there is anything that Django could do better here. In most of cases creating indexes via
Meta.indexes
helps to avoid small differences between defined indexes and used filters.You can try to
Cast(F("data__category"), output_field=TextField())
.By the way, have you tried to create these indexes via
Meta.indexes
? You can create them concurrently without writing raw SQLs.