#31340 closed New feature (fixed)
Improve expression support for __search lookup and SearchQuery
| Reported by: | Baptiste Mispelon | Owned by: | Baptiste Mispelon |
|---|---|---|---|
| Component: | contrib.postgres | Version: | dev |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
(not sure whether to categorize this as a bug or a new feature)
I've been trying to implement some kind of reverse full text search where I store keywords in the database and I want to query models whose keywords would match a given piece of text.
Here's a simplified model of what I'm working with:
class SavedSearch(models.Model): keywords = models.TextField() def __str__(self): return self.keywords
I've managed to achieve what I want in the case of the default search configuration using annotation and wrapping things with Value or Cast:
# This works search_query = Cast('keywords', output_field=SearchQueryField()) search_vector = SearchVector(Value("lorem ipsum ...", output_field=TextField())) qs = SavedSearch.objects.annotate(search=search_vector).filter(search=search_query)
But if I want to use a custom search configuration, things don't work anymore:
# This doesn't work (can't adapt type 'F') search_query = SearchQuery(F('keywords'), config='english', search_type='plain') search_vector = SearchVector(Value("lorem ipsum ...", output_field=TextField())) qs = SavedSearch.objects.annotate(search=search_vector).filter(search=search_query)
I'm not very familiar with the inner workings of Lookup objects but I did some digging and I think I came up with a fix which involved fixing two separate issues:
1) SearchQuery doesn't currently support anyting other than plain values (str). Fixing this required changing both resolve_expression() and as_sql().
2) the __search lookup doesn't support things like F objects because of it assumes that any value with a resolve_expression method must be a SearchQuery object.
Change History (8)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:3 by , 6 years ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:5 by , 6 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|---|
| Version: | 3.0 → master |
comment:7 by , 6 months ago
Does the current implementation works well with custom SQL functions which returns tsquery and does not use any field which is already a tsquery?
In one of my project I have created a custom SQL function which takes a list of integers (the IDs of some row in a specific table) and builds a tsquery. I tought that using the following it would work well while it raises an exception
class TheModel(models.Model): searchable = SearchVectorField() TheModel.objects.filter(searchable=models.Func(123, 345, function='my_tsquery_func', output_field=SearchQueryField()) # django.db.utils.ProgrammingError: function plainto_tsquery(tsquery) does not exist
Because the SQL produced is and this produces an exception.
SELECT id, searchable FROM myapp_themodel WHERE searchable @@ plainto_tsquery(my_tsquery_func(123, 345));
To add more context the my_tsquery_func looks like the following (the goal is to combine tsvectors into a tsquery to build a similarity algorithm):
CREATE OR REPLACE FUNCTION my_tsquery_func(first_id bigint, second_id bigint) RETURNS tsquery AS $$ SELECT to_tsquery( array_to_string( tsvector_to_array( (SELECT searchable FROM myapp_themodel WHERE id = first_id LIMIT 1) || (SELECT searchable FROM myapp_themodel WHERE id = second_id LIMIT 1) ), ' | ' ) ); $$ LANGUAGE sql PARALLEL SAFE STABLE;
comment:8 by , 6 months ago
I think the patch should have checked output_field instead of special casing SearchQuery and CombinedSearchQuery in this form
-
django/contrib/postgres/search.py
diff --git a/django/contrib/postgres/search.py b/django/contrib/postgres/search.py index 2135c9bb88..636123a3ed 100644
a b class SearchVectorExact(Lookup): 16 16 lookup_name = "exact" 17 17 18 18 def process_rhs(self, qn, connection): 19 if not isinstance(self.rhs, (SearchQuery, CombinedSearchQuery)): 19 if not isinstance( 20 getattr(self.rhs, "_output_field_or_none", None), 21 SearchQueryField, 22 ): 20 23 config = getattr(self.lhs, "config", None) 21 24 self.rhs = SearchQuery(self.rhs, config=config) 22 25 rhs, rhs_params = super().process_rhs(qn, connection) … … def __str__(self): 240 243 return "(%s)" % super().__str__() 241 244 242 245 246 register_combinable_fields( 247 SearchQueryField, SearchQueryCombinable.BITAND, SearchQueryField, SearchQueryField 248 ) 249 250 register_combinable_fields( 251 SearchQueryField, SearchQueryCombinable.BITOR, SearchQueryField, SearchQueryField 252 ) 253 254 243 255 class SearchRank(Func): 244 256 function = "ts_rank" 245 257 output_field = FloatField()
as in its current form it disallows usage of Func(output_field=SearchQueryField()).
Happy to accept a new ticket with the above if you're willing to write tests, in the mean time your best bet is likely to subclass SearchQuery and override __init__ to assign my_tsquery_func instead and support a different signature.
Pull request here: https://github.com/django/django/pull/12525