#33333 closed Bug (fixed)
Models with a BinaryField fail to deepcopy in setUpTestData() on PostgreSQL.
| Reported by: | Adam Zimmerman | Owned by: | Mariusz Felisiak |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.2 |
| Severity: | Release blocker | Keywords: | |
| Cc: | Simon Charette | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
While using Postgres, models with a BinaryField fail to deepcopy, due to Django returning these field values as memoryview objects. This causes issues when creating instances in setUpTestData() during tests.
The specific errors seem to be fairly inconsistent and hard to reproduce. But often the exception complains about a missing _state field, or that _state has no attribute db.
I've currently worked around this by implementing a mixin for models where the __getstate__() method converts any memoryviews to bytes. But I don't know if that's the correct solution for Django generally.
Change History (16)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
| Cc: | added |
|---|---|
| Severity: | Normal → Release blocker |
| Summary: | Models with a BinaryField fail to deepcopy → Models with a BinaryField fail to deepcopy in setUpTestData() on PostgreSQL. |
Thanks for the report! Regression in 3cf80d3fcf7446afdde16a2be515c423f720e54d. I was able to reproduce this issue on PostgreSQL with the following test:
class MyTests(TestCase): @classmethod def setUpTestData(cls): MyModel.objects.create(binary=b'y') cls.my_binary = MyModel.objects.get() def test_1(self): self.my_binary.binary = b'z' self.my_binary.save()
comment:3 by , 4 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
Feels like the issue should be solved at the model level because while it break setupTestData (3cf80d3fcf7446afdde16a2be515c423f720e54d) it also breaks any form of QuerySet or Model pickling
MyModel.objects.create(binary=b'y') pickle.dumps(MyModel.objects.all())
This can be reproduced by adding a BinaryField to models in the test queryset_pickle apps.
comment:4 by , 4 years ago
My first instinct would be to have BinaryField.from_db_value use force_bytes to store bytes instead of memoryview in _state or introduce a hook to allow Field subclass to define a hook using during serialization and have Model.__getstate__ call it.
To me that's more of a long standing issue with BinaryField and less of a regression caused by 3cf80d3fcf7446afdde16a2be515c423f720e54d.
comment:5 by , 4 years ago
Simon, your diagnostic might be right, but I think it's late in the 4.0 release cycle to make such changes now, also considering changing memoryview to bytes might have serious memory implications not plainly visible in tests.
What about writing a workaround for setuptestData in 4.0, and make a more proper and thorough fix in 4.1?
comment:6 by , 4 years ago
Given we're pretty close to the 4.0 release date I'm not against including a limited hack for setUpTestData that special cases Model and QuerySet instances.
It does feel weird that we're treating this as a release blocker now though given it has been broken since BinaryField was introduced and setUpTestData has been emitting deprecation warnings about it since Django 3.2.
comment:7 by , 4 years ago
We have also a ticket about BinaryField inconsistency, see #27813.
What about writing a workaround for setuptestData in 4.0 ...
For me it is a 3.2 release blocker.
comment:8 by , 4 years ago
considering changing memoryview to bytes might have serious memory implications not plainly visible in tests.
Fixing #27813 would be somewhat of a breaking change on PostgreSQL as a different type would be returned but I doubt it would have memory implications in the sense of elevated memory usage.
In the end memoryview is a pointer to a sequence of bytes but these bytes have to live somewhere when returned from the database backend and made available to the client. Returning them proxied by a memoryview instance or not should not make a significant difference in memory usage beyond the size of the memoryview instance itself.
---
Mariusz, how do you want to proceed here? I have to admit I'm not sure how to navigate the current state of things.
Should we submit a patch meant to be backported to 3.2 and 4.0 in setUpTestData that solely deals with this particular edge case of models and queryset with a BinaryField on Postgres to circumvent #27813? What if we get other reports of users who ignored the deprecation warning for other non-pickleable objects assigned during setUpTestData?
comment:9 by , 4 years ago
Should we submit a patch meant to be backported to 3.2 and 4.0 in setUpTestData that solely deals with this particular edge case of models and queryset with a BinaryField on Postgres to circumvent #27813?
Yes, it should be feasible by juggling with __getstate__()/__reduce__() hooks 🤔 I think we should treat this as a release blocker because BinaryField is built-in into Django and 3cf80d3fcf7446afdde16a2be515c423f720e54d broke previously working tests.
What if we get other reports of users who ignored the deprecation warning for other non-pickleable objects assigned during setUpTestData?
That would not be a release blocker, IMO, unless any other built-in is non-pickleable.
For anyone who finds this ticket and wants to work around it before it's fixed in Django, here's the mixin I'm currently using: