Opened 5 years ago
Last modified 2 weeks ago
#32213 assigned Bug
QuerySet.values()/.values_list() on KeyTransforms return wrong values for double-quoted strings on SQLite and Oracle
| Reported by: | Sage Abdullah | Owned by: | VIZZARD-X |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.1 |
| Severity: | Normal | Keywords: | sqlite oracle json |
| Cc: | Carlton Gibson, bcail | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Suppose we have a JSONField with the value {'key': '"value"'}. If .values()/.values_list() are called on a jsonfield__key lookup, they return 'value' on SQLite and Oracle. On other backends, they return '"value"', which is the correct behavior.
Change History (13)
comment:1 by , 5 years ago
| Cc: | added |
|---|---|
| Keywords: | sqlite oracle json added |
| Severity: | Release blocker → Normal |
| Summary: | QuerySet .values()/.values_list() on KeyTransforms return wrong values for double-quoted strings on SQLite and Oracle → QuerySet.values()/.values_list() on KeyTransforms return wrong values for double-quoted strings on SQLite and Oracle |
| Triage Stage: | Unreviewed → Accepted |
comment:2 by , 4 years ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
comment:3 by , 3 years ago
I'm also not sure this is event fixable since JSONField is schema less so the ORM has not way to know it should use ->> over -> for the specified key.
I think this ticket is a good candidate for wontfix consideration once KeyTextTransform is documented (#15956) as this issue can be solved by doing values_list(KeyTextTransform("key", "json_field")).
follow-up: 5 comment:4 by , 3 years ago
| Cc: | added |
|---|
I'm working on a test for this case, and having trouble reproducing it.
diff --git a/tests/model_fields/test_jsonfield.py b/tests/model_fields/test_jsonfield.py
index 2c32d8a4ea..ecc2b9d0d5 100644
--- a/tests/model_fields/test_jsonfield.py
+++ b/tests/model_fields/test_jsonfield.py
@@ -611,6 +611,11 @@ class TestQuerying(TestCase):
[obj],
)
+ def test_value(self):
+ obj = NullableJSONModel.objects.create(value={"key": '"value"'})
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='"value"').values_list("value", flat=True), [{'key': '"value"'}])
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(Q(value__has_key="key")).values_list("value", flat=True), [{'key': '"value"'}])
+
@skipUnlessDBFeature("supports_json_field_contains")
def test_contains(self):
tests = [
This test passes on both SQLite and Postgresql for me. What do I need to change to trigger the wrong behavior?
comment:5 by , 3 years ago
Replying to bcail:
This test passes on both SQLite and Postgresql for me. What do I need to change to trigger the wrong behavior?
This ticket is about using key transforms in the values()/values_list():
# SQLite
>>> NullableJSONModel.objects.filter(value__key='"value"').values_list("value__key", flat=True)
['value']
# PostgreSQL
>>> NullableJSONModel.objects.filter(value__key='"value"').values_list("value__key", flat=True)
['"value"']
comment:7 by , 3 years ago
Here are some tests that show the issues with various strings in a JSONField for sqlite (they pass on postgresql):
+ def test_str_with_quotes(self):
+ value = {"key": "test_value", "key2": '"value"'}
+ obj = NullableJSONModel.objects.create(value=value)
+
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values("value__key2"), [{"value__key2": '"value"'}])
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values_list("value__key2"), [('"value"',)])
+
+ def test_str_with_null_false_true(self):
+ value = {"key": "test_value", "key2": "null", "key3": "false", "key4": "true"}
+ obj = NullableJSONModel.objects.create(value=value)
+
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values_list("value__key2"), [('null',)])
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values("value__key3"), [{"value__key3": 'false'}])
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values("value__key4"), [{"value__key4": 'true'}])
+
+ def test_str_with_list_dict(self):
+ value = {"key": "test_value", "key2": "[]", "key3": "{}"}
+ obj = NullableJSONModel.objects.create(value=value)
+
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values_list("value__key2"), [("[]",)])
+ self.assertSequenceEqual(NullableJSONModel.objects.filter(value__key='test_value').values("value__key3"), [{"value__key3": "{}"}])
This code change gets the str_with_quotes test passing, but not the other two:
diff --git a/django/db/models/fields/json.py b/django/db/models/fields/json.py
index 7296fe42bc..6277ce9935 100644
--- a/django/db/models/fields/json.py
+++ b/django/db/models/fields/json.py
@@ -82,8 +82,11 @@ class JSONField(CheckFieldDefaultMixin, Field):
return value
# Some backends (SQLite at least) extract non-string values in their
# SQL datatypes.
- if isinstance(expression, KeyTransform) and not isinstance(value, str):
- return value
+ if connection.vendor == "sqlite" and isinstance(expression, KeyTransform):
+ if not isinstance(value, str):
+ return value
+ if value not in ['true', 'false', 'null'] and not value.startswith('[') and not value.startswith('{'):
+ return value
try:
return json.loads(value, cls=self.decoder)
except json.JSONDecodeError:
I think that once sqlite 3.38.0 is in widespread use, and ticket #33548 is implemented (using -> and ->> in sqlite), that may take care of this issue.
comment:8 by , 5 weeks ago
| Has patch: | set |
|---|
comment:9 by , 5 weeks ago
I have submitted a Pull Request with the fix and regression tests:
https://github.com/django/django/pull/20487
The fix addresses the issue where SQLite KeyTransform lookups returned unquoted raw strings, causing incorrect json.loads parsing.
comment:10 by , 5 weeks ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:11 by , 5 weeks ago
| Needs tests: | set |
|---|---|
| Patch needs improvement: | set |
Left some comments for improvements on the PR mainly
- The solution should use
KeyTransform.select_formatand not overrideJSONField.from_db_valuewithKeyTransformspecific logic - Existing testing setup (models, test classes) should be used instead of a per-ticket model and test module and test case class
comment:12 by , 3 weeks ago
I've updated the PR to use KeyTransform.select_format as suggested, ensuring the fix is isolated to the SQLite backend without modifying JSONField.from_db_value.
Also added regression tests in tests/model_fields/test_jsonfield.py rather than using per-ticket model which was the case previously
comment:13 by , 2 weeks ago
| Needs tests: | unset |
|---|---|
| Patch needs improvement: | unset |
I've focused this patch specifically on the SQLite regression. Unchecking Needs tests and Patch needs improvement as the SQLite resolution is looked into. The Oracle aspect mentioned in the ticket likely requires a different implementation and wondering whether it could be addressed separately in a separate ticket?
Thanks, I'm not sure about severity because it's niche I don't believe that many users (if anyone) are affected by this issue. We can bump it when evaluating a patch.