﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
30637	SearchVectorField and SearchVector should be combinable.	Dani Hodovic	Ahmed Nassar	"When using `django.contrib.postgres` to perform full text search it's not possible to combine SearchVectors and SearchVectorfields. Doing so impacts the performance of the query.

Suppose we have a model with a small ""role"" field and a much larger ""text"" field. The text field is large enough that it warrants indexing in a separate column as a tsvector with a Gin index to ensure our queries are fast.

{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!python
  class JobPosting(models.Model):
      role = models.CharField(max_length=170, null=True)
      text = models.TextField(max_length=8000, default="""")
      # Large field optimized for full text search
      text_search = SearchVectorField(null=True)

      class Meta:
          indexes = [GinIndex(fields=[""text_search""])]
  }}}
}}}

If we need to perform a search on all columns we need to combine them into a common tsvector. The problem is that Django casts the large and search optimized text_search field to text and then back into a tsvector. This results in Postgres not using the existing Gin index and having to cast between types which makes for a very slow query.

{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!python
  JobPosting.objects.annotate(full_text=(SearchVector(""role"") + SearchVector(""text_search""))).filter(full_text=SearchQuery(""foo""))
  }}}
}}}


{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!sql
  EXPLAIN ANALYZE SELECT ""jobs_jobposting"".""id"",
         ""jobs_jobposting"".""role"",
         ""jobs_jobposting"".""text"",
         (to_tsvector(COALESCE(""jobs_jobposting"".""role"", '')) || to_tsvector(COALESCE((""jobs_jobposting"".""text_search"")::text, ''))) AS ""full_text""
    FROM ""jobs_jobposting""
   WHERE (to_tsvector(COALESCE(""jobs_jobposting"".""role"", '')) || to_tsvector(COALESCE((""jobs_jobposting"".""text_search"")::text, ''))) @@ (plainto_tsquery('foo')) = true
   LIMIT 21;
  ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   Limit  (cost=0.00..35.54 rows=1 width=809) (actual time=40.085..40.085 rows=0 loops=1)
     ->  Seq Scan on jobs_jobposting  (cost=0.00..35.54 rows=1 width=809) (actual time=40.082..40.082 rows=0 loops=1)
           Filter: ((to_tsvector((COALESCE(role, ''::character varying))::text) || to_tsvector(COALESCE((text_search)::text, ''::text))) @@ plainto_tsquery('foo'::text))
           Rows Removed by Filter: 42
   Planning Time: 3.140 ms
   Execution Time: 40.273 ms
  }}}
}}}


If you compare this to using the text_search field directly we can see that the query is much faster, presumably due to the use of index and lack of casting to `::text`

{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!python
  JobPosting.objects.filter(text_search=SearchQuery(""foo""))
  }}}
}}}

{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!sql
  EXPLAIN ANALYZE SELECT ""jobs_jobposting"".""id"",
         ""jobs_jobposting"".""text"",
         ""jobs_jobposting"".""role""
    FROM ""jobs_jobposting""
   WHERE ""jobs_jobposting"".""text_search"" @@ (plainto_tsquery('foo')) = true
   LIMIT 21;
  ------------------------------------------------------------------------------------------------------------------
   Limit  (cost=0.00..15.24 rows=1 width=371) (actual time=1.165..1.166 rows=0 loops=1)
     ->  Seq Scan on jobs_jobposting  (cost=0.00..15.24 rows=1 width=371) (actual time=1.163..1.163 rows=0 loops=1)
           Filter: (text_search @@ plainto_tsquery('foo'::text))
           Rows Removed by Filter: 42
   Planning Time: 0.699 ms
   Execution Time: 1.209 ms
  }}}
}}}


Compare the execution times: from **40.3ms to 1.2ms**.

You could technically concatenate the role and the text field into one SearchVectorField, but then you would be unable to search rank different fields differently. Perhaps we would like to rank the information in the role column as A, but the text column as C.

I have tried to use F expressions to concatenate the columns, but then Django complains.

{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!python
  JobPosting.objects.annotate(full_text=(SearchVector(""role"") + F(""text_search""))).filter(full_text=SearchQuery(""foo""))
  TypeError: SearchVector can only be combined with other SearchVectors
  }}}
}}}

I have also tried casting the ""text_search"" field into a SearchVector, but that also fails.
{{{
#!div style=""font-size: 80%""
Code highlighting:
  {{{#!python
  JobPosting.objects.annotate(full_text=(SearchVector(""role"") + Cast(""text_search"", SearchVector()))).filter(full_text=SearchQuery(""foo""))
  TypeError: SearchVector can only be combined with other SearchVectors
  }}}
}}}"	Bug	assigned	contrib.postgres	dev	Normal		db, postgres, full-text, search	Ülgen Sarıkavak Dmytro Litvinov	Accepted	0	0	0	0	0	0
