Opened 8 months ago

Last modified 7 months ago

#27824 assigned New feature

Cannot chain unaccent transform with key lookup in Postgres JSON field

Reported by: Aymeric Augustin Owned by: Simon Charette
Component: contrib.postgres Version: 1.10
Severity: Normal Keywords:
Cc: Simon Charette, olau@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Aymeric Augustin)

With the following model:

from django.contrib.postgres.fields import JSONField
from django.db import models


class Subscription(models.Model):
    inputs = JSONField()

I'd like this query to work (assuming a inputs look like {'first_name': '...'}:

Subscription.objects.filter(inputs__first_name__unaccent__icontains="...")

But it fails with:

django.db.utils.ProgrammingError: operator does not exist: jsonb #> text
LINE 1: ...%' OR UPPER("subscriptions_subscription"."inputs" #> ARRAY['...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

cecef94275118dc49a1b0d89d3ca9734e2ec9776 registered a bunch of lookups for chaining after a key transform, but unaccent wasn't included. I suggest to include it as well.

Change History (9)

comment:1 Changed 8 months ago by Aymeric Augustin

Description: modified (diff)

comment:2 Changed 8 months ago by Aymeric Augustin

Workaround:

Subscription.objects.annotate(
    _first_name=RawSQL("(inputs -> 'first_name')::text", [], output_field=TextField()),
).filter(
    _first_name__unaccent__icontains=query,
)

comment:3 Changed 8 months ago by Tim Graham

Triage Stage: UnreviewedAccepted

Django 1.11 (since the commit you mentioned) isn't crashing anymore but I don't think the query is correct. This test addition to postgres_tests/test_json.py isn't working: self.assertTrue(JSONModel.objects.filter(field__foo__unaccent__icontains='bár').exists()). It generates this SQL:

SELECT "postgres_tests_jsonmodel"."id", "postgres_tests_jsonmodel"."field", "postgres_tests_jsonmodel"."field_custom"
FROM "postgres_tests_jsonmodel"
WHERE UPPER(("postgres_tests_jsonmodel"."field" #>> ['foo', 'unaccent'])) LIKE UPPER(%bár%)

By the way, the workaround as translated to Django's test suite isn't working for me (tested on 1.11 and 1.10):

>>> JSONModel.objects.annotate(
    _first_name=RawSQL("(field -> 'foo')::text", [], output_field=TextField()),
).filter(
    _first_name__unaccent__icontains='bár',
)
...
FieldError: Unsupported lookup 'unaccent' for TextField or join on the field not permitted.

comment:4 Changed 8 months ago by Aymeric Augustin

Owner: set to Aymeric Augustin
Status: newassigned

I developed the workaround on Django 1.10. I'm not sure why it isn't working in the context of the test suite.

Florian gave me some pointers on how to fix this, but my naive attempts didn't go very far. I'll try to figure it out.

comment:5 Changed 8 months ago by Aymeric Augustin

Simply adding from .transforms import Unaccent in django.contrib.postgres.fields.jsonb causes tests to fail with the following exception:

======================================================================
ERROR: setUpClass (postgres_tests.test_json.TestQuerying)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/myk/Documents/dev/django/django/db/backends/utils.py", line 62, in execute
    return self.cursor.execute(sql, params)
psycopg2.ProgrammingError: column "field" is of type jsonb but expression is of type boolean
LINE 1: ...ests_jsonmodel" ("field", "field_custom") VALUES (true, NULL...
                                                             ^
HINT:  You will need to rewrite or cast the expression.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/myk/Documents/dev/django/django/test/testcases.py", line 999, in setUpClass
    cls.setUpTestData()
  File "/Users/myk/Documents/dev/django/tests/postgres_tests/test_json.py", line 101, in setUpTestData
    JSONModel.objects.create(field=True),
  File "/Users/myk/Documents/dev/django/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/myk/Documents/dev/django/django/db/models/query.py", line 389, in create
    obj.save(force_insert=True, using=self.db)
  File "/Users/myk/Documents/dev/django/django/db/models/base.py", line 718, in save
    force_update=force_update, update_fields=update_fields)
  File "/Users/myk/Documents/dev/django/django/db/models/base.py", line 748, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/Users/myk/Documents/dev/django/django/db/models/base.py", line 834, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/Users/myk/Documents/dev/django/django/db/models/base.py", line 873, in _do_insert
    using=using, raw=raw)
  File "/Users/myk/Documents/dev/django/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/myk/Documents/dev/django/django/db/models/query.py", line 1045, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/Users/myk/Documents/dev/django/django/db/models/sql/compiler.py", line 1092, in execute_sql
    cursor.execute(sql, params)
  File "/Users/myk/Documents/dev/django/django/db/backends/utils.py", line 62, in execute
    return self.cursor.execute(sql, params)
  File "/Users/myk/Documents/dev/django/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback)
  File "/Users/myk/Documents/dev/django/django/db/backends/utils.py", line 62, in execute
    return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column "field" is of type jsonb but expression is of type boolean
LINE 1: ...ests_jsonmodel" ("field", "field_custom") VALUES (true, NULL...
                                                             ^
HINT:  You will need to rewrite or cast the expression.


----------------------------------------------------------------------

This is disturbing.

comment:6 Changed 8 months ago by Aymeric Augustin

Has patch: set
Owner: Aymeric Augustin deleted
Patch needs improvement: set
Status: assignednew

I have no idea how to fix the TODO in the pull request. Deassigning myself.

comment:7 Changed 8 months ago by Simon Charette

Cc: Simon Charette added
Owner: set to Simon Charette
Status: newassigned

FWIW you want to use RawSQL("(inputs ->> 'first_name')", [], output_field=TextField()) as a workaround.

Note the use ->> instead of -> and ::text as the latter will give you the text representation of the first_name JSON key (that would be '"Charette"' instead of 'Charette').

comment:8 Changed 8 months ago by Simon Charette

I'm afraid we'll have to make this a wontfix as exposing a new lookup would break backward compatiblity for existing users of JSONField who have data containing a key named unaccent.

#27257 didn't register any new lookups, it only made existing text lookups that were available since the introduction of JSONField use ->> instead of (lhs -> key)::text.

This could also be closed as duplicate of #26511 which proposes to document KeyTextTransform as a public API so you could use KeyTextTransform('first_name', F('inputs')) instead of the RawSQL workaround you are using right now.

comment:9 Changed 7 months ago by Ole Laursen

Cc: olau@… added

As someone who's been using the JSONField to store user-defined fields and now need to query those fields, I think there needs be some kind of decision of what the future of querying JSONField is, perhaps followed by some backwards-incompatible changes to get to that state.

As far as I can tell, from the start querying has not produced correct SQL outside a limited set of examples. For things that are obviously ambigious, the documentation basically refers to something designed for the much simpler HStoreField. So I don't think this has ever been fully baked.

Here's my proposal:

1) Add a couple of convenient chaining functions for accessing arbitrarly named stuff inside the JSON - no __ ambigious non-sense, this should let me specify exactly what I'm looking for/translate directly into SQL

2a) Let Django interpret foo__bar as if it were Django models - so in the case of unaccent, if you named your field unaccent, you'll have to rename it on Django upgrade or go to step 1)

or

2b) Let Django interpret foo__bar as if it were Django models, but don't support anything else than array/object lookups into the JSON, everything else is step 1)

or

2c) Stop interpreting stuff inside JSONField, so no __ support

The documentation should explain 1) but say that 2) is an option for simple cases as long as you're in full control of the JSON/query parameters.

(I realize this is probably a discussion for django-devel, but I don't have time to raise it there myself right now, so just want to get something out there.)

Note: See TracTickets for help on using tickets.
Back to Top