﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
28128	Fulltext search very slow with annotate(SearchVector)	Gavin Wahl		"I have this model:

{{{#!python
class Bug(models.Model):
    # .. fields ...

    # This field is managed by Django, adding stuff to it that should be
    # included in the fulltext search.
    fulltext = models.TextField()
}}}

The `Bug` table has many rows, so I need to add an index on the fulltext column for fulltext search:

{{{#!sql
CREATE INDEX ON buggy_bug USING GIN (to_tsvector('english'::regconfig, COALESCE(""fulltext"", '')));
}}}

Note that postgres requires the `english` config argument to `to_tsvector` for it be allowed to be used in an index.

Now, at query time, the expression Django uses must exactly match the expression in the functional index. By default, `filter(fulltext__search='foo')` leaves out the config argument to `to_tsvector`, so my index isn't used. The documentation shows us how to fix this, by annotating a search field:

{{{#!sql
Bug.objects.annotate(
    search=SearchVector('fulltext', config='english')
).filter(
    search='foo'
)
}}}

Now the expressions match and the index is used. However, the query is still crazy slow. This is the query Django does:

{{{#!sql
SELECT ""buggy_bug"".""id"",
       to_tsvector('english'::regconfig, COALESCE(""buggy_bug"".""fulltext"", '')) AS ""search""
FROM ""buggy_bug""
WHERE to_tsvector('english'::regconfig, COALESCE(""buggy_bug"".""fulltext"", '')) @@ (plainto_tsquery('english'::regconfig, 'foo')) = true;
}}}

Which takes 2500 milliseconds on my data, and EXPLAIN confirms it is using the index.  With some playing around with it, I find that removing the annotated search column gives a huge speedup:

{{{#!sql
SELECT ""buggy_bug"".""id""
FROM ""buggy_bug""
WHERE to_tsvector('english'::regconfig, COALESCE(""buggy_bug"".""fulltext"", '')) @@ (plainto_tsquery('english'::regconfig, 'foo')) = true;
}}}

This query only takes 11 milliseconds.

What's the point of selecting the result of to_tsvector? We don't actually use it in Django, we just had to do it to specify the config to use for to_tsvector.

There needs to be a way to specify the config for to_tsvector in order to use an index without a huge slowdown.
"	Uncategorized	closed	contrib.postgres	1.11	Normal	duplicate			Unreviewed	0	0	0	0	0	0
