Opened 2 years ago

Last modified 2 years ago

#33459 closed Cleanup/optimization

Explain how to optimize full text search with SearchVectorField and GinIndex — at Version 2

Reported by: Thomas Aglassinger Owned by: nobody
Component: Documentation Version: 4.0
Severity: Normal Keywords: postgres
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Thomas Aglassinger)

The current documentation section on SearchVectorField at https://docs.djangoproject.com/en/dev/ref/contrib/postgres/search/#searchvectorfield does not explain how to use GinIndex or GistIndex to speed up the search. It currently only describes how to add a SearchVectorField. To my understanding, this improves performance only by a linear factor, because it removes the need to parse the searched fields with PostgreSQL's full text search parser on every query. Indexing this field as well would typically improve performance by an order of magnitude.

I eventually managed to piece this together from an article found at http://logan.tw/posts/2017/12/30/full-text-search-with-django-and-postgresql/ but I believe this fairly standard use case should be covered in the Django documentation itself.

So I propose to add a few paragraphs that show how to add a SearchVectorField to a model with a GinIndex, compute a search vector from multiple fields and then perform a ranked search on it.
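
To make the proposal concrete, here is a minimal sketch of the kind of example those paragraphs could contain. It is not taken from the pull request. The model name SearchableEntry, the helper functions update_search_vectors() and ranked_search(), and the weights "A" and "B" are assumptions chosen for illustration; the fields headline and body_text mirror the Entry model used in the existing documentation. The code assumes it lives in the models.py of an installed app in a project using the PostgreSQL backend.

```python
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import (
    SearchQuery,
    SearchRank,
    SearchVector,
    SearchVectorField,
)
from django.db import models
from django.db.models import F


class SearchableEntry(models.Model):
    # Hypothetical model, named to avoid confusion with Entry.
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    # Stores the precomputed tsvector; stays NULL until it is refreshed.
    search_vector = SearchVectorField(null=True)

    class Meta:
        # The GIN index lets PostgreSQL avoid scanning every row.
        indexes = [GinIndex(fields=["search_vector"])]


def update_search_vectors():
    """Recompute the stored search vector from several fields."""
    SearchableEntry.objects.update(
        search_vector=(
            # Weights are illustrative: headline matches count more.
            SearchVector("headline", weight="A")
            + SearchVector("body_text", weight="B")
        )
    )


def ranked_search(terms):
    """Return entries matching ``terms``, best match first."""
    query = SearchQuery(terms)
    return (
        SearchableEntry.objects.filter(search_vector=query)
        .annotate(rank=SearchRank(F("search_vector"), query))
        .order_by("-rank")
    )
```

In this sketch the stored vector is only as fresh as the last call to update_search_vectors(), so it has to be called again (or replaced by a database trigger) whenever headline or body_text changes.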

For the related pull request, see <https://github.com/django/django/pull/15350>
I don't consider the current patch to be final. Things to discuss:

  • Should the section on "SearchVectorField" be renamed to "SearchVectorField and indexing"?
  • Should the section on "Performance" be merged into the section on "SearchVectorField"? Currently it describes the problem well, but I found the solution of pointing to the PostgreSQL documentation unhelpful. If GinIndex is mentioned later anyway, the pointer to the PostgreSQL documentation could be added afterwards for further reading.
  • Is it alright to extend the Entry model from the previous chapter, or should I add a separate model like SearchableEntry? The first approach might confuse readers if they skim over the part where Entry gets redefined and think it's the same model as in other chapters.

It might also be helpful to include a "full text search how-to", for example one describing how to efficiently search a database of news articles in multiple languages. While the current reference documentation explains search configurations well enough, the later examples (rightfully) omit them to keep the explanations focused. This, however, limits their usefulness for skimming and copying the examples.

If you are interested, I could write such a how-to.

Change History (2)

comment:1 by Thomas Aglassinger, 2 years ago

Description: modified (diff)

comment:2 by Thomas Aglassinger, 2 years ago

Description: modified (diff)