Opened 2 weeks ago
Closed 2 weeks ago
#35885 closed Uncategorized (invalid)
JSONField does accept strings that look like dicts and incorrectly saves them as strings, breaking JSON filtering
Reported by: | DataGreed | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 5.0 |
Severity: | Normal | Keywords: | |
Cc: | DataGreed | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Steps to reproduce:
- Create a model with JSONField, create a migration and migrate
- Create an instance of this model, set the value of JSON field to a string generated by
json.dumps()
from any dictionary, e.g. "{"foo":"bar"} - Save the model instance
- Instance save successfully
Expected result:
Instance is saved with the JSONField set to a dictionary {"foo": "bar"}, and JSONField deep filtering works properly
Actual result:
Instance is saved with a JSONField value saved as a string. Filtering does not work, and when you query this instance from a database, the JSONField value will be a string, not a dict.
I tried it on PostgreSQL and sqlite - both working incorrectly.
Note that if I set JSONField value to an actual dict it works correctly, but it is too easy to make a mistake and then waste hours of debugging the reasons the JSON filtering does not work, since developers usually sanitize the data at least by dumping the dict to json to make sure it is even possible to dump.
It seems like the fix would be to parse the string value in JSONField before saving.
My current workaround is to do the following on my models with JSONFields:
class SomeModel(models.Model): json_field = models.JSONField(blank=True, null=True) def clean(self): try: if isinstance(self.json_field, str): # make sure we are not trying to save string to JSONField, django allows this for some reason # and it breaks filtering on json self.arguments = json.loads(self.json_field) except ValueError as e: # reraise as validation error. We do this to get more actionable description for the error raise ValidationError(str(e)) def save(self, *args, **kwargs): # ensure clean runs. Well, kind of - we can still directly update the fields, # and it will somewhat break data integrity, so just don't do that maybe? thanks :) self.full_clean() super().save(*args, **kwargs)
Change History (1)
comment:1 by , 2 weeks ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Resolution: | → invalid |
Status: | new → closed |
JSON accepts top level string as valid input; that is including JSON serialized as a string.
I don't think this is a normal behavior. Usually data transits through through serialization layers that are in charge of parsing JSON strings as Python objects just like for datetimes, decimal, float, and other types.
Having
JSONField
auto-magically perform ajson.loads
when provided a string input is backward incompatible and prevents storing JSON de-serializable strings as top level values (e.g."1", "false", "null"
).