#35946 closed Bug (wontfix)
Postgres ArrayField should convert values when given a list
| Reported by: | Zerq | Owned by: | Tanish Yelgoe |
|---|---|---|---|
| Component: | contrib.postgres | Version: | |
| Severity: | Normal | Keywords: | ArrayField to_python |
| Cc: | Mike Lissner | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | yes | UI/UX: | no |
Description
Currently ArrayField.to_python only calls base_field.to_python when given string. In my opinion it should call it always. Currently using this field is inconsistent. Simple example:
f = MyModel._meta.get_field("int_array")
f.clean('[1, "2", 3]', None) # ok
f.clean([1, "2", 3], None) # error
Proposed ArrayField.to_python, witch always calls base_field.to_python(val):
def to_python(self, value):
if isinstance(value, str):
value = json.loads(value)
return [self.base_field.to_python(val) for val in value]
Change History (6)
comment:1 by , 14 months ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
follow-up: 3 comment:2 by , 14 months ago
| Resolution: | → invalid |
|---|---|
| Status: | assigned → closed |
comment:3 by , 14 months ago
Replying to Sarah Boyce:
Hi Zerq, thank you for the ticket!
The string case is commented with
# Assume we're deserializingand that's why theto_pythonis called (hence the inconsistency, there's different contexts)
Your suggested change is making us accept more values that we would previously reject (which could be a regression if folks rely on this behavior)
In short, I believe this by design and not a bug
Thanks for your replay.
I agree that it would be breaking change. But currently it is inconsistent with how fields operate and that makes it a trap. This is why I think it should be patched in new release as breaking change or at least warning added to documentation. to_python is not only called during deserializing, but as a normal cleaning process (https://docs.djangoproject.com/en/5.1/howto/custom-model-fields/#converting-values-to-python-objects). Users can expect that values will be coerced to proper type when assigned to model and cleaned. Custom fields can also preprocess values in this method (example: https://github.com/django/django-localflavor/blob/085fb2be02cae1c95594c2a358b8df8d62fa5577/localflavor/es/models.py#L43). This field will work fine alone, but not in ArrayField.
Example with different base_field:
class MyModel(models.Model):
boolean = models.BooleanField()
bool_array = ArrayField(models.BooleanField())
def test_my_model():
m = MyModel()
value = "false" # value read from network
m.boolean = value # ok
m.bool_array = [value] # error
m.full_clean()
I see no design reason why ArrayField should decide what values are acceptable or skip preprocess on its own.
I do not try to force this change, but ask to give it second thought. For me it is a bug, breaking how fields work within ArrayField.
comment:4 by , 14 months ago
| Resolution: | invalid |
|---|---|
| Status: | closed → new |
comment:5 by , 14 months ago
| Resolution: | → wontfix |
|---|---|
| Status: | new → closed |
Looking at the linked docs: (https://docs.djangoproject.com/en/5.1/howto/custom-model-fields/#converting-values-to-python-objects
to_python() is called by deserialization and during the clean() method used from forms.
As a general rule, to_python() should deal gracefully with any of the following arguments:
- An instance of the correct type (e.g., Hand in our ongoing example).
- A string
Does it handle an instance of the correct type?
Yes, [1, 2, 3] would be correct in terms of a ArrayField with an IntegerField base_field, and [1, "2", 3] is not a valid array field with integer base
Does it handle a deserialization (the string case) correctly?
Yes, when we serialize an ArrayField with [1, 2, 3] we get '["1", "2", "3"]' which is handled to be valid.
To me, the odd case is that '[1, "2", 3]' valid. That's because we get '[1, "2", 3]' apply json.load and get [1, "2", 3] and then the to_python of the base fields handle both strings (as the deserializer case) but also the case that it just received a value of the correct instance. We _possibly_ could handle this. I am not sure the cost benefit.
So I am adjusting this to "wontfix". This means you can start a new conversation on the Django Forum, where you'll reach a broader audience and receive additional feedback, if there is consensus that we should change this. You can reopen the ticket.
comment:6 by , 3 days ago
| Cc: | added |
|---|
There are similar issues with HStoreField.to_python() and CompositePrimaryKey.to_python().
By contrast, RangeField.to_python() *does* run to_python() on individual components when given a list.
More interestingly, various form fields, including for ArrayField itself (postgres.forms.SimpleArrayField), run to_python() on individual components when given a list.
I think this is ripe for a standardization. I'll try to get some findings into a forum post to support reframing this one.
I ended up here during review of #36865--which is going to start leveraging to_python() more during admin searches--and investigating the behavior difference between model to_python() and form to_python().
Hi Zerq, thank you for the ticket!
The string case is commented with
# Assume we're deserializingand that's why theto_pythonis called (hence the inconsistency, there's different contexts)Your suggested change is making us accept more values that we would previously reject (which could be a regression if folks rely on this behavior)
In short, I believe this by design and not a bug