#31340 closed New feature (fixed)
Improve expression support for __search lookup and SearchQuery
Reported by: | Baptiste Mispelon | Owned by: | Baptiste Mispelon |
---|---|---|---|
Component: | contrib.postgres | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
(not sure whether to categorize this as a bug or a new feature)
I've been trying to implement some kind of reverse full text search where I store keywords in the database and I want to query models whose keywords would match a given piece of text.
Here's a simplified model of what I'm working with:

    from django.db import models

    class SavedSearch(models.Model):
        keywords = models.TextField()

        def __str__(self):
            return self.keywords

I've managed to achieve what I want in the case of the default search configuration using annotation and wrapping things with Value or Cast:
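
    from django.contrib.postgres.search import SearchQueryField, SearchVector
    from django.db.models import TextField, Value
    from django.db.models.functions import Cast

    # This works
    search_query = Cast('keywords', output_field=SearchQueryField())
    search_vector = SearchVector(Value("lorem ipsum ...", output_field=TextField()))
    qs = SavedSearch.objects.annotate(search=search_vector).filter(search=search_query)
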
But if I want to use a custom search configuration, things don't work anymore:

    from django.contrib.postgres.search import SearchQuery, SearchVector
    from django.db.models import F, TextField, Value

    # This doesn't work (can't adapt type 'F')
    search_query = SearchQuery(F('keywords'), config='english', search_type='plain')
    search_vector = SearchVector(Value("lorem ipsum ...", output_field=TextField()))
    qs = SavedSearch.objects.annotate(search=search_vector).filter(search=search_query)

I'm not very familiar with the inner workings of Lookup objects but I did some digging and I think I came up with a fix which involved fixing two separate issues:
1) SearchQuery doesn't currently support anything other than plain values (str). Fixing this required changing both resolve_expression() and as_sql().
2) The __search lookup doesn't support things like F objects because it assumes that any value with a resolve_expression method must be a SearchQuery object.
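
The two issues can be sketched with a toy model that needs neither Django nor a database. Everything below is an illustrative assumption: the classes F and SearchQuery are simplified stand-ins, and compile_operand, as_sql_before, as_sql_after, process_rhs_before and process_rhs_after are hypothetical names, not Django's actual code.

```python
class F:
    """Toy stand-in for a column reference (not django.db.models.F)."""

    def __init__(self, name):
        self.name = name

    def resolve_expression(self):
        return self


class SearchQuery:
    """Toy stand-in for a search query wrapping a single operand."""

    def __init__(self, value):
        self.value = value

    def resolve_expression(self):
        return self


def compile_operand(value):
    """Return (sql, params): column references become SQL, plain values become parameters."""
    if isinstance(value, F):
        return '"%s"' % value.name, []
    return "%s", [value]


def as_sql_before(query):
    # Issue 1: the operand is always bound as a parameter, so an F object
    # would be handed to the database driver as-is.
    return "plainto_tsquery(%s)", [query.value]


def as_sql_after(query):
    # Sketch of the fix: compile the operand first, then wrap the result.
    sql, params = compile_operand(query.value)
    return "plainto_tsquery(" + sql + ")", params


def process_rhs_before(rhs):
    # Issue 2: anything exposing resolve_expression() is assumed to be a
    # SearchQuery already, so an F object is never wrapped.
    if not hasattr(rhs, "resolve_expression"):
        rhs = SearchQuery(rhs)
    return rhs


def process_rhs_after(rhs):
    # Sketch of the fix: only genuine SearchQuery objects skip the wrapping.
    if not isinstance(rhs, SearchQuery):
        rhs = SearchQuery(rhs)
    return rhs


keywords = F("keywords")
print(type(process_rhs_before(keywords)).__name__)  # F
print(type(process_rhs_after(keywords)).__name__)   # SearchQuery
print(as_sql_after(process_rhs_after(keywords)))    # ('plainto_tsquery("keywords")', [])
```

In this sketch, the "before" functions reproduce both symptoms (the F object is left unwrapped by the lookup and ends up in the parameter list), while the "after" functions compile the column reference into the SQL text.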
Change History (8)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
comment:3 by , 6 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:5 by , 5 years ago
Triage Stage: | Accepted → Ready for checkin |
---|---|
Version: | 3.0 → master |
comment:7 by , 3 months ago
Does the current implementation work well with custom SQL functions that return a tsquery and do not use any field which is already a tsquery?
In one of my projects I have created a custom SQL function which takes a list of integers (the IDs of some rows in a specific table) and builds a tsquery. I thought that the following would work, but it raises an exception:

    from django.contrib.postgres.search import SearchQueryField, SearchVectorField
    from django.db import models

    class TheModel(models.Model):
        searchable = SearchVectorField()

    TheModel.objects.filter(
        searchable=models.Func(
            123,
            345,
            function='my_tsquery_func',
            output_field=SearchQueryField(),
        )
    )
    # django.db.utils.ProgrammingError: function plainto_tsquery(tsquery) does not exist

This is because the SQL produced is the following, which raises the exception:

    SELECT id, searchable
    FROM myapp_themodel
    WHERE searchable @@ plainto_tsquery(my_tsquery_func(123, 345));

To add more context, my_tsquery_func looks like the following (the goal is to combine tsvectors into a tsquery to build a similarity algorithm):

    CREATE OR REPLACE FUNCTION my_tsquery_func(first_id bigint, second_id bigint)
    RETURNS tsquery AS $$
        SELECT to_tsquery(
            array_to_string(
                tsvector_to_array(
                    (SELECT searchable FROM myapp_themodel WHERE id = first_id LIMIT 1)
                    || (SELECT searchable FROM myapp_themodel WHERE id = second_id LIMIT 1)
                ),
                ' | '
            )
        );
    $$ LANGUAGE sql PARALLEL SAFE STABLE;

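
As a rough illustration of what the SQL function above computes, the sketch below builds the text that would be handed to to_tsquery from two lists of lexemes. It is a simplification under stated assumptions: each tsvector is modelled as a plain list of lexeme strings (positions and weights are ignored), or_query_text is a hypothetical name, and the lexemes are sorted only to make the output deterministic.

```python
def or_query_text(first_lexemes, second_lexemes):
    """Join the distinct lexemes of two vectors with the tsquery OR operator."""
    # Concatenating two tsvectors merges their lexemes; duplicates collapse.
    merged = sorted(set(first_lexemes) | set(second_lexemes))
    # Mirrors array_to_string(..., ' | ') in the SQL function.
    return " | ".join(merged)


print(or_query_text(["cat", "fat"], ["fat", "rat"]))  # cat | fat | rat
```

The resulting query matches any row that shares at least one lexeme with either source row, which is what makes it usable as a crude similarity test.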
comment:8 by , 3 months ago
I think the patch should have checked output_field instead of special casing SearchQuery and CombinedSearchQuery in this form

    diff --git a/django/contrib/postgres/search.py b/django/contrib/postgres/search.py
    index 2135c9bb88..636123a3ed 100644
    --- a/django/contrib/postgres/search.py
    +++ b/django/contrib/postgres/search.py
    @@ -16,7 +16,10 @@ class SearchVectorExact(Lookup):
         lookup_name = "exact"
     
         def process_rhs(self, qn, connection):
    -        if not isinstance(self.rhs, (SearchQuery, CombinedSearchQuery)):
    +        if not isinstance(
    +            getattr(self.rhs, "_output_field_or_none", None),
    +            SearchQueryField,
    +        ):
                 config = getattr(self.lhs, "config", None)
                 self.rhs = SearchQuery(self.rhs, config=config)
             rhs, rhs_params = super().process_rhs(qn, connection)
    @@ -240,6 +243,15 @@ def __str__(self):
             return "(%s)" % super().__str__()
     
     
    +register_combinable_fields(
    +    SearchQueryField, SearchQueryCombinable.BITAND, SearchQueryField, SearchQueryField
    +)
    +
    +register_combinable_fields(
    +    SearchQueryField, SearchQueryCombinable.BITOR, SearchQueryField, SearchQueryField
    +)
    +
    +
     class SearchRank(Func):
         function = "ts_rank"
         output_field = FloatField()

as in its current form it disallows usage of Func(output_field=SearchQueryField()).
Happy to accept a new ticket with the above if you're willing to write tests; in the meantime, your best bet is likely to subclass SearchQuery and override __init__ to assign self.function = "my_tsquery_func" instead and support a different signature.
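
The difference between the two checks discussed in comment 8, and the suggested subclass workaround, can be sketched with a toy model. The classes below are simplified stand-ins rather than Django's real ones: literals are inlined into the SQL text for readability, a plain output_field attribute stands in for _output_field_or_none, CombinedSearchQuery is left out, and render, rhs_current, rhs_proposed and MyTsQuery are hypothetical names.

```python
class SearchQueryField:
    """Toy marker: the expression already evaluates to a tsquery."""


def render(arg):
    """Render an argument as SQL text; literals are inlined for readability."""
    return arg.as_sql() if hasattr(arg, "as_sql") else repr(arg)


class Func:
    def __init__(self, *args, function, output_field=None):
        self.args = args
        self.function = function
        self.output_field = output_field

    def as_sql(self):
        return "%s(%s)" % (self.function, ", ".join(render(a) for a in self.args))


class SearchQuery(Func):
    def __init__(self, value):
        super().__init__(
            value, function="plainto_tsquery", output_field=SearchQueryField()
        )


class MyTsQuery(SearchQuery):
    """Workaround sketch: a SearchQuery subclass with its own function and signature."""

    def __init__(self, first_id, second_id):
        Func.__init__(
            self,
            first_id,
            second_id,
            function="my_tsquery_func",
            output_field=SearchQueryField(),
        )


def rhs_current(rhs):
    # Current form: wrap everything that is not a SearchQuery instance.
    if not isinstance(rhs, SearchQuery):
        rhs = SearchQuery(rhs)
    return rhs.as_sql()


def rhs_proposed(rhs):
    # Proposed form: wrap only values that do not already produce a tsquery.
    if not isinstance(getattr(rhs, "output_field", None), SearchQueryField):
        rhs = SearchQuery(rhs)
    return rhs.as_sql()


custom = Func(123, 345, function="my_tsquery_func", output_field=SearchQueryField())
print(rhs_current(custom))               # plainto_tsquery(my_tsquery_func(123, 345))
print(rhs_proposed(custom))              # my_tsquery_func(123, 345)
print(rhs_current(MyTsQuery(123, 345)))  # my_tsquery_func(123, 345)
```

In this toy, the instance check double-wraps the custom function exactly as in the failing SQL from comment 7, whereas the output-field check and the subclass both leave it untouched; plain strings are still wrapped in either case.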
Pull request here: https://github.com/django/django/pull/12525