Opened 8 years ago
Closed 8 years ago
#27227 closed Bug (needsinfo)
Full text search by UUIDField returns DataError
Reported by: | Dan Claudiu Pop | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.10 |
Severity: | Normal | Keywords: | UUIDField fts postgres |
Cc: | danclaudiupop@… | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Consider the following model:
class Show(models.Model): cid = models.UUIDField( default=uuid.uuid4, editable=False, verbose_name="Content ID", help_text="Unique Identifier" ) title_short = models.CharField( max_length=60, blank=True, verbose_name="Short Title", help_text="Short title (60 chars)" )
The following snippet returns DataError, see below traceback:
>>> from django.contrib.postgres.search import SearchVector >>> Show.objects.annotate(search=SearchVector('cid')).filter(search='foo') Traceback (most recent call last): File "<console>", line 1, in <module> File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/models/query.py", line 232, in __repr__ data = list(self[:REPR_OUTPUT_SIZE + 1]) File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/models/query.py", line 256, in __iter__ self._fetch_all() File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/models/query.py", line 1087, in _fetch_all self._result_cache = list(self.iterator()) File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/models/query.py", line 54, in __iter__ results = compiler.execute_sql() File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/models/sql/compiler.py", line 835, in execute_sql cursor.execute(sql, params) File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/backends/utils.py", line 79, in execute return super(CursorDebugWrapper, self).execute(sql, params) File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/backends/utils.py", line 64, in execute return self.cursor.execute(sql, params) File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/utils.py", line 94, in __exit__ six.reraise(dj_exc_type, dj_exc_value, traceback) File "/home/vagrant/ve/lib/python3.4/site-packages/django/utils/six.py", line 685, in reraise raise value.with_traceback(tb) File "/home/vagrant/ve/lib/python3.4/site-packages/django/db/backends/utils.py", line 64, in execute return self.cursor.execute(sql, params) django.db.utils.DataError: invalid input syntax for uuid: "" LINE 1: ...tag", to_tsvector(COALESCE("entities_show"."cid", '')) AS "s...
Tried with PostgreSQL 9.3.14 and PostgreSQL 9.5.3
Change History (10)
comment:1 by , 8 years ago
Cc: | added |
---|
comment:2 by , 8 years ago
Type: | Uncategorized → Bug |
---|
comment:3 by , 8 years ago
comment:5 by , 8 years ago
We have a system in which we identify a resource by uuid field. We also expose a search endpoint where we give back a list of different resource types. For commodity we also search in the uuid field to give back the proper resource on the search endpoint.
Instead of checking "/api/book/<uuid>/" or "/api/author/<uuid>/" for an <uuid> we can go directly to "api/search/?q=<uuid>" to fetch the resource.
comment:6 by , 8 years ago
We could be extra safe and always cast the expression to text in a CAST(expression, output_field=models.TextField())
but I don't understand why you're using FTS here, using LIKE
through the __contains
lookup would be more appropriate IMHO. In the meant time you can take care of casting the expression yourself before annotating your queryset to work around your issue.
comment:7 by , 8 years ago
So unfortunately, even when supporting the fallback value, this fails because to_tsvector(uuid) fails for reasons pointed out by charettes: uuid isn't searchable. However, having a way to provide the fallback is useful, if one wants to have NULL returned. From the Coalesce docs:
The COALESCE function returns the first of its arguments that is not null. Null is returned only if all arguments are null.
But, I don't have a real-world use case for this. If there's no interest in supporting the fallback value, I'll throw away my local branch.
comment:8 by , 8 years ago
The use case that we encounter is this: having an administration interface which lists all the shows, i want to search shows by cid
, title_short
and other fields (such as description
, etc). So I wanted to leverage FTS, because __contains
is really slow over many records and fields.
By charettes suggestion, I used CAST
and the queryset looks like this:
Show.objects.annotate(cidtxt=Cast('cid', TextField())).annotate(search=SearchVector('cidtxt', 'title_short')).filter(search='foo')
comment:9 by , 8 years ago
I'm not sure. Maybe you can send a pull request to give an idea of what the patch allows?
comment:10 by , 8 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
Analysis:
In
django/contrib/postgres/search.py
we find in init() around line 55 (stable/1.10.x):These are propagated to Coalesce() and generate the expression:
For most cases, the empty string is a valid input value, yet for the native PostgreSQL UUID type, this is not the case. The API doesn't provide a way to specify the fallback value.