Opened 3 weeks ago
Last modified 43 hours ago
#36986 assigned Cleanup/optimization
Add a hook to Serializer._value_from_field() for complex fields like CompositePrimaryKey
| Reported by: | Tim Graham | Owned by: | Tim Graham |
|---|---|---|---|
| Component: | Core (Serialization) | Version: | dev |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
The Python serializer's _value_from_field() method using isinstance for special-case handling of CompositePrimaryKey.
Custom model fields can't add similar logic without subclassing the serializer, a cumbersome solution which doesn't scale well in the presence of multiple custom fields.
My motivation for a generic hook is to add serialization support for MongoDB's EmbeddedModelField, EmbeddedModelArrayField, PolymorphicEmbeddedModelField, and PolymorphicEmbeddedModelArrayField.
An early prototype:
-
django/core/serializers/python.py
diff --git a/django/core/serializers/python.py b/django/core/serializers/python.py index 2929874b01..1a4f5a1336 100644
a b class Serializer(base.Serializer): 39 39 data["fields"] = self._current 40 40 return data 41 41 42 def get_fields_from_model(self, obj, *, polymorphic=False): 43 data = { 44 field.name: self._value_from_field(obj, field) 45 for field in obj._meta.local_fields 46 } 47 if polymorphic: 48 data["_label"] = obj._meta.label 49 return data 50 42 51 def _value_from_field(self, obj, field): 43 52 if isinstance(field, CompositePrimaryKey): 44 53 return [self._value_from_field(obj, f) for f in field] 54 if hasattr(field, "embedded_model"): 55 sub_obj = getattr(obj, field.attname) 56 if sub_obj is None: 57 return None 58 if isinstance(sub_obj, list): 59 return [self.get_fields_from_model(sub) for sub in sub_obj] 60 return self.get_fields_from_model(sub_obj) 61 if hasattr(field, "embedded_models"): 62 sub_obj = getattr(obj, field.attname) 63 if sub_obj is None: 64 return None 65 if isinstance(sub_obj, list): 66 return [ 67 self.get_fields_from_model(sub, polymorphic=True) for sub in sub_obj 68 ] 69 return self.get_fields_from_model(sub_obj, polymorphic=True) 45 70 value = field.value_from_object(obj) 46 71 # Protected types (i.e., primitives like None, numbers, dates, 47 72 # and Decimals) are passed through as is. All other values are
I don't have an immediate proposal for the API change to accomplish this.
Change History (12)
comment:1 by , 3 weeks ago
comment:2 by , 3 weeks ago
That's insufficient. Notice that both CompositePrimaryKey and my example in the description require the ability to call self._value_from_field recursively. I'm considering the possibility of a registry that would allow registering a field's function with the serializer, similar to how lookups are registered to model fields.
comment:3 by , 3 weeks ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:4 by , 3 weeks ago
| Triage Stage: | Unreviewed → Accepted |
|---|
Thanks, Tim. Would your field be able to implement __iter__(), like CompositePrimaryKey does?
comment:5 by , 2 weeks ago
Not for the polymorphic fields, where each field value may be a model instance of a different class with different fields. The implementation in the description uses obj._meta.local_fields rather than getting the list of subfields from field.
comment:6 by , 2 weeks ago
For another serializer API concern, see 47651eadb8ca7aacddad41da4df64fd2af11faae which fixed handling of JSONField in the XML serializer by special casing it. I didn't intend to address XML serialization in this ticket, but it could be useful if the new API were generic enough to clean that up as well.
And while I only intended to address serialization in this ticket, I found that a deserialization hook is also likely needed. Not only does the JSONField/XML patch demonstrate a need, but for the various embedded model fields, the field.to_python() handling in the Python Deserializer._handle_object() almost works, except that embedded subfields are serialized to the database using Field.column instead of Field.name as you would expect to find in fixtures. Thus, the correct deserialization for embedded model fields is EmbeddedModelField.to_python() but with field.to_python(value[field.name]) instead of field.to_python(value[field.column]).
comment:7 by , 13 days ago
We can Introduce a format-aware serialization registry on django.db.models.Field mapping serializer formats ('python', 'xml', etc.) to handler callables. To support recursive execution, the callable signature must accept the active serializer instance as context: handler(field, obj, serializer).
This allows the removal of hardcoded isinstance checks within the core serializers. For example, CompositePrimaryKey will encapsulate its serialization compilation via the registry:
CompositePrimaryKey.register_serializer_handler('python', composite_python_serializer)
Similarly, with the JSON field JSONField.register_serializer_handler('xml', json_xml_serializer)
comment:8 by , 12 days ago
The ideas seem plausible. My idea was to do the registrations on the serializer rather than on the field, but either path could be viable. Since the json, jsonl, and pyyaml serializers all subclass the Python serializer, it's unclear to me that hooks are needed for them. Are these your original ideas or are they from an LLM?
comment:9 by , 11 days ago
This, I followed up on the registry (lookup) approach as you mentioned above, at implementing very similar pattern for serializer registration with recursion support. Although, it is redundant hooks that's needed for subclasses. Would making the registration on the serializers eliminate it?
comment:10 by , 11 days ago
I guess it would! We could simply register the Python serializer and override serialization for specific fields in relevant serializers. Which approach you think is more appropriate and extensible?
comment:11 by , 8 days ago
The registration approach might be overcomplicated. In my latest approach, I added Field.serialize_to_python() and deserialize_from_python() hooks. That's enough for MongoDB's embedded model fields and plays nicely with composite fields as well. You could see if similar hooks for the XML serializer would allow cleaning up the JSONField-specific code there.
comment:12 by , 43 hours ago
| Owner: | changed from to |
|---|
I added Field.serialize_to_xml() and deserialize_from_xml() to my exploratory branch and it seems to work well. I'll refine it and make a PR.
Could we do a protocol-based hook similar to this?
django/core/serializers/python.py
I guess, this gives custom fields full control while keeping it generic.