Opened 8 years ago

Closed 18 months ago

#26511 closed Cleanup/optimization (wontfix)

Document KeyTextTransform() and KeyTransform().

Reported by: Eoin Murray Owned by: AllenJonathan
Component: Documentation Version: dev
Severity: Normal Keywords: postgres, jsonb
Cc: eoin@… Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have a model called File with a JSONField. I create a new file and inside the JSONField is a key value pair

{"title": "the cow jumped over the moon"}

I want to be able to search for objects where the title contains moon...

I can do it will the following raw SQL statement in Django

File.objects.raw('SELECT * FROM web_file WHERE (metadata ->> %s)::text LIKE %s;', ['title', '%moon%'])[0]

This works, but doesn't return a queryset, and doesn't play nice with the rest of my code. I was hoping I could do this query with the Django ORM. From the docs, I thought the following code would work:

File.objects.filter(metadata__title__contains='moon')

But gives the an error (traceback here https://gist.github.com/eoinmurray/04843b77e8d71fad8cfa4a990c00740a)

DataError: invalid input syntax for type json
LINE 1: ...b_file" WHERE "web_file"."metadata" -> 'title' @> 'moon' ORD...

DETAIL:  Token "moon" is invalid.
CONTEXT:  JSON data, line 1: moon

Perhaps this is something that cannot be accomplished with the ORM, and therefore is not a bug - but I'm not really sure from the documentation...

Here is all the code needed to recreate the error, Im using Django 1.9, Postgres 9.4 and psycopg2 2.6.1.

models.py

from django.db import models
from django.contrib.postgres.fields import JSONField

class File(models.Model):
    owner = models.ForeignKey('auth.User', related_name='file_user')
    created_on = models.DateTimeField(auto_now_add=True, blank=True)
    name = models.CharField(max_length=100)
    metadata = JSONField(null=True, blank=True, default=dict())

query.py

from web.models import File
from django.contrib.auth.models import User

user = User.objects.get(pk=1)

File.objects.create(name="testfile.txt", owner=user, metadata={'title': 'the cow jumped over the moon'})

# raw query that works
# File.objects.raw('SELECT * FROM web_file WHERE (metadata ->> %s)::text LIKE %s;', ['title', '%moon%'])[0]

File.objects.filter(metadata__title__contains='moon')

>>> outputs traceback https://gist.github.com/eoinmurray/04843b77e8d71fad8cfa4a990c00740a
>>> DataError: invalid input syntax for type json
>>> LINE 1: ...b_file" WHERE "web_file"."metadata" -> 'title' @> 'moon' ORD...
                                                             ^
>>> DETAIL:  Token "moon" is invalid.
>>> CONTEXT:  JSON data, line 1: moon

Change History (17)

comment:1 by Claude Paroz, 8 years ago

Resolution: worksforme
Status: newclosed
Last edited 8 years ago by Claude Paroz (previous) (diff)

comment:2 by Eoin Murray, 8 years ago

That will only return objects where the metadata contains the key 'title' with the exact value 'moon', it wont match with objects where 'title' is 'the cow jumped over the moon'...

I have just tested it there and using your suggested syntax doesnt work for me

File.objects.create(name="testfile.txt", owner=some_user, metadata={'title': 'the cow jumped over the moon'})

print File.objects.filter(metadata__contains={'title': 'moon'})

>>> []

The filter returns no items

I just checked and it seems to me that

print File.objects.filter(metadata__title= 'moon')

and

print File.objects.filter(metadata__contains={'title': 'moon'})

are equivalent, and they do exact searches for title=moon, and not a substring containment search which is what Im looking for

Last edited 8 years ago by Eoin Murray (previous) (diff)

comment:3 by Simon Charette, 8 years ago

As Claude pointed out the textfield__contains and jsonfield__contains lookups do not behave the same.

In order to use the textfield__contains operator on metadata -> 'title' you'll have to cast your JSON field to text (just like you do in your raw() query) before hand.

The following should work against the master branch (which introduced the Cast function):

from django.db import models
from django.db.models.functions import Cast

File.objects.annotate(
    title=Cast('metadata__title', models.TextField()),
).filter(title__contains='moon')
Last edited 8 years ago by Simon Charette (previous) (diff)

comment:4 by Claude Paroz, 8 years ago

Component: contrib.postgresDocumentation
Owner: set to nobody
Resolution: worksforme
Status: closednew
Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization
Version: 1.9master

Thanks Simon for the details. Then it might be worth adding that hint somewhere in the documentation.

comment:5 by Eoin Murray, 8 years ago

Is there any way to do this in Django <=1.9.5?

comment:6 by Tim Graham, 8 years ago

You can probably copy the Cast function from Django master (03b6947728466e4e907487f30dd4dfec94a8eb2f) into your project.

comment:7 by Christian Karrié, 8 years ago

I've done something like:

like_arg = "%%%s%%" % query_string
qs.extra(where=["json_values ->> %s LIKE %s"], params=[lookup_key, like_arg])

json_values is my JSONField, lookup_key is "title" in your case, like_arg is "%moon%" in your case.

Thus returns a queryset

comment:8 by Jens Neuhaus, 8 years ago

Has patch: set

Hi. I submitted a patch to include it in the docs: https://github.com/django/django/pull/6965

comment:9 by Tim Graham, 8 years ago

Patch needs improvement: set
Summary: Django 1.9 and Postgres 9.4 jsonb string containment inside JSONFieldDocument how to do a substring search in JSONField

Marking as "Patch needs improvement" per Simon's comment, "Searching a whole JSON blob seems like a bad practice to me." The technique described in comment:3 doesn't work, so it's possible some code changes might also be needed to allow the desired behavior.

comment:10 by Simon Charette, 8 years ago

It looks like what we really need here is support for the ->> operator as (lhs -> key)::text is the JSON representation of the string (('"foo"'::jsonb)::text yields '"foo"').

I believe this should be done by introducing a new expression for this purpose and documenting it.

In the meantime we could document using Func('data', template="%(expressions)s ->> 'hobby'", output_field=models.TextField()).

comment:11 by Simon Charette, 8 years ago

Here's a POC of what I had in mind.

comment:12 by Tim Graham, 8 years ago

See #27205 for the same issue as it applies to ArrayField.

comment:13 by Mariusz Felisiak, 4 years ago

Summary: Document how to do a substring search in JSONFieldDocument KeyTextTransform() and KeyTransform().

I agree with Simon (see comment). Both KeyTextTransform() and KeyTransform() should be documented as a public API, folks already use them.

comment:14 by Mariusz Felisiak, 19 months ago

Owner: changed from nobody to AllenJonathan
Patch needs improvement: unset
Status: newassigned

comment:15 by Mariusz Felisiak, 19 months ago

Patch needs improvement: set

comment:16 by GitHub <noreply@…>, 19 months ago

In e9fd2b57:

Refs #26511 -- Fixed json.KeyTextTransform() on MySQL/MariaDB.

comment:17 by Mariusz Felisiak, 18 months ago

Has patch: unset
Patch needs improvement: unset
Resolution: wontfix
Status: assignedclosed
Triage Stage: AcceptedUnreviewed

Closing in favor of #33966.

There is no need to document KeyTransform() as the same can be achieved with F() so there is no need to expose an extra API. Secondly, #33966 proposed adding KT() which is more handy and will be documented instead of KeyTextTransform().

Note: See TracTickets for help on using tickets.
Back to Top