id summary reporter owner description type status component version severity resolution keywords cc stage has_patch needs_docs needs_tests needs_better_patch easy ui_ux 28128 Fulltext search very slow with annotate(SearchVector) Gavin Wahl "I have this model: {{{#!python class Bug(models.Model): # .. fields ... # This field is managed by Django, adding stuff to it that should be # included in the fulltext search. fulltext = models.TextField() }}} The `Bug` table has many rows, so I need to add an index on the fulltext column for fulltext search: {{{#!sql CREATE INDEX ON buggy_bug USING GIN (to_tsvector('english'::regconfig, COALESCE(""fulltext"", ''))); }}} Note that postgres requires the `english` config argument to `to_tsvector` for it be allowed to be used in an index. Now, at query time, the expression Django uses must exactly match the expression in the functional index. By default, `filter(fulltext__search='foo')` leaves out the config argument to `to_tsvector`, so my index isn't used. The documentation shows us how to fix this, by annotating a search field: {{{#!sql Bug.objects.annotate( search=SearchVector('fulltext', config='english') ).filter( search='foo' ) }}} Now the expressions match and the index is used. However, the query is still crazy slow. This is the query Django does: {{{#!sql SELECT ""buggy_bug"".""id"", to_tsvector('english'::regconfig, COALESCE(""buggy_bug"".""fulltext"", '')) AS ""search"" FROM ""buggy_bug"" WHERE to_tsvector('english'::regconfig, COALESCE(""buggy_bug"".""fulltext"", '')) @@ (plainto_tsquery('english'::regconfig, 'foo')) = true; }}} Which takes 2500 milliseconds on my data, and EXPLAIN confirms it is using the index. With some playing around with it, I find that removing the annotated search column gives a huge speedup: {{{#!sql SELECT ""buggy_bug"".""id"" FROM ""buggy_bug"" WHERE to_tsvector('english'::regconfig, COALESCE(""buggy_bug"".""fulltext"", '')) @@ (plainto_tsquery('english'::regconfig, 'foo')) = true; }}} This query only takes 11 milliseconds. What's the point of selecting the result of to_tsvector? We don't actually use it in Django, we just had to do it to specify the config to use for to_tsvector. There needs to be a way to specify the config for to_tsvector in order to use an index without a huge slowdown. " Uncategorized closed contrib.postgres 1.11 Normal duplicate Unreviewed 0 0 0 0 0 0