Opened 6 years ago

Closed 5 years ago

Last modified 3 months ago

#31340 closed New feature (fixed)

Improve expression support for __search lookup and SearchQuery

Reported by: Baptiste Mispelon Owned by: Baptiste Mispelon
Component: contrib.postgres Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

(not sure whether to categorize this as a bug or a new feature)

I've been trying to implement some kind of reverse full text search where I store keywords in the database and I want to query models whose keywords would match a given piece of text.

Here's a simplified model of what I'm working with:

class SavedSearch(models.Model):
    keywords = models.TextField()

    def __str__(self):
        return self.keywords

I've managed to achieve what I want in the case of the default search configuration using annotation and wrapping things with Value or Cast:

# This works
search_query = Cast('keywords', output_field=SearchQueryField())
search_vector = SearchVector(Value("lorem ipsum ...", output_field=TextField()))
qs = SavedSearch.objects.annotate(search=search_vector).filter(search=search_query)

But if I want to use a custom search configuration, things don't work anymore:

# This doesn't work (can't adapt type 'F')
search_query = SearchQuery(F('keywords'), config='english', search_type='plain')
search_vector = SearchVector(Value("lorem ipsum ...", output_field=TextField()))
qs = SavedSearch.objects.annotate(search=search_vector).filter(search=search_query)

I'm not very familiar with the inner workings of Lookup objects but I did some digging and I think I came up with a fix which involved fixing two separate issues:

1) SearchQuery doesn't currently support anyting other than plain values (str). Fixing this required changing both resolve_expression() and as_sql().
2) the __search lookup doesn't support things like F objects because of it assumes that any value with a resolve_expression method must be a SearchQuery object.

Change History (8)

comment:1 by Baptiste Mispelon, 6 years ago

comment:2 by Simon Charette, 6 years ago

Triage Stage: UnreviewedAccepted

comment:3 by Mariusz Felisiak, 6 years ago

Owner: set to Baptiste Mispelon
Status: newassigned

comment:4 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In dd704c6:

Refs #31340 -- Simplified SearchQuery by making it subclass Func.

comment:5 by Mariusz Felisiak, 5 years ago

Triage Stage: AcceptedReady for checkin
Version: 3.0master

comment:6 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

Resolution: fixed
Status: assignedclosed

In 3baf92cf:

Fixed #31340 -- Allowed query expressions in SearchQuery.value and search lookup.

comment:7 by David, 3 months ago

Does the current implementation works well with custom SQL functions which returns tsquery and does not use any field which is already a tsquery?

In one of my project I have created a custom SQL function which takes a list of integers (the IDs of some row in a specific table) and builds a tsquery. I tought that using the following it would work well while it raises an exception

class TheModel(models.Model):
   searchable = SearchVectorField()


TheModel.objects.filter(searchable=models.Func(123, 345, function='my_tsquery_func', output_field=SearchQueryField())
# django.db.utils.ProgrammingError: function plainto_tsquery(tsquery) does not exist

Because the SQL produced is and this produces an exception.

SELECT id, searchable
FROM myapp_themodel
WHERE searchable @@ plainto_tsquery(my_tsquery_func(123, 345));

To add more context the my_tsquery_func looks like the following (the goal is to combine tsvectors into a tsquery to build a similarity algorithm):

CREATE OR REPLACE FUNCTION my_tsquery_func(first_id bigint, second_id bigint) RETURNS tsquery AS $$
SELECT
    to_tsquery(
        array_to_string(
            tsvector_to_array(
                (SELECT  searchable FROM myapp_themodel WHERE id = first_id LIMIT 1)
               || (SELECT  searchable FROM myapp_themodel WHERE id = second_id LIMIT 1)
            ),
            ' | '
        )
    );
$$ LANGUAGE sql PARALLEL SAFE STABLE;

comment:8 by Simon Charette, 3 months ago

I think the patch should have checked output_field instead of special casing SearchQuery and CombinedSearchQuery in this form

  • django/contrib/postgres/search.py

    diff --git a/django/contrib/postgres/search.py b/django/contrib/postgres/search.py
    index 2135c9bb88..636123a3ed 100644
    a b class SearchVectorExact(Lookup):  
    1616    lookup_name = "exact"
    1717
    1818    def process_rhs(self, qn, connection):
    19         if not isinstance(self.rhs, (SearchQuery, CombinedSearchQuery)):
     19        if not isinstance(
     20            getattr(self.rhs, "_output_field_or_none", None),
     21            SearchQueryField,
     22        ):
    2023            config = getattr(self.lhs, "config", None)
    2124            self.rhs = SearchQuery(self.rhs, config=config)
    2225        rhs, rhs_params = super().process_rhs(qn, connection)
    def __str__(self):  
    240243        return "(%s)" % super().__str__()
    241244
    242245
     246register_combinable_fields(
     247    SearchQueryField, SearchQueryCombinable.BITAND, SearchQueryField, SearchQueryField
     248)
     249
     250register_combinable_fields(
     251    SearchQueryField, SearchQueryCombinable.BITOR, SearchQueryField, SearchQueryField
     252)
     253
     254
    243255class SearchRank(Func):
    244256    function = "ts_rank"
    245257    output_field = FloatField()

as in its current form it disallows usage of Func(output_field=SearchQueryField()).

Happy to accept a new ticket with the above if you're willing to write tests, in the mean time your best bet is likely to subclass SearchQuery and override __init__ to assign self.function = "my_tsquery_func" instead and support a different signature.

Last edited 3 months ago by Simon Charette (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top