Opened 8 years ago
Closed 8 years ago
#28856 closed Bug (fixed)
GenericForeignKey attributes create new instances on every access
| Reported by: | Morgan Wahl | Owned by: | nobody |
|---|---|---|---|
| Component: | contrib.contenttypes | Version: | 1.10 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Given these models:
class OtherSuper(models.Model):
pass
class OtherSub(OtherSuper):
pass
class Ref(models.Model):
obj_type = models.ForeignKey(ContentType, on_delete=models.PROTECT)
obj_id = models.CharField(max_length=255)
obj = GenericForeignKey('obj_type', 'obj_id')
I get this behavior:
In [1]: ref = Ref.objects.create(obj=OtherSub.objects.create()) In [2]: id(ref.obj) == id(ref.obj) Out[2]: True In [3]: ref.refresh_from_db() In [4]: id(ref.obj) == id(ref.obj) Out[4]: False
Each time ref.obj is accessed, a new instance is created for its value. This is a problem, since doing something like ref.obj.field = 1; ref.obj.save() won't actually update the field in the database. This only happens when the referenced object is an instance of a model that subclasses another model. (So, it wouldn't happen if referencing OtherSuper in the models above.) The refresh_from_db() call is also necessary to reproduce this in a simple test like the above; it happens with any instance created from an existing DB record.
I've written a regression test against stable/1.10.x . I'll attach a patch.
I discovered this because I have code that does the above (changes a field on the related model and calls save). I call this a regression because it works correctly on 1.9.
I'm not sure what the underly bug is; I looked at the diff in contenttypes between 1.9 and 1.10, and there are more than a few changes. Hopefully someone who understands the GenericForeignKey implementation can figure this out.
Attachments (1)
Change History (19)
by , 8 years ago
| Attachment: | 1.10-regression-test.patch added |
|---|
comment:1 by , 8 years ago
| Description: | modified (diff) |
|---|
comment:2 by , 8 years ago
Bisected to b9f8635f58ad743995cad2081b3dc395e55761e5. Do you spot a bug in that commit?
comment:3 by , 8 years ago
I'm pretty sure this is the expected behavior from refresh_from_db(), refreshing any relationship should create a new instance as ref.refresh_from_db() should roughly translate to ref = Ref.objects.get(pk=ref.pk) while preserving ref identity.
e.g. foreign key relationships have always worked this way
>> obj_type = ref.obj_type >> ref.refresh_from_db() >> ref.obj_type is obj_type False
b9f8635f58ad743995cad2081b3dc395e55761e5 just made this consistent for generic foreign key which wouldn't be refreshed before it landed. This was a problem because if either ref.obj_type or ref.obj_id had changed when refresh_from_db() was called then ref.obj would point to a different object that the one referenced by ref.obj_type and ref.obj_id.
Django's ORM doesn't maintain an identity mapping of object, as opposed to SQL Alchemy for example, so you shouldn't expect refreshed instances to share their identity.
comment:4 by , 8 years ago
Thanks for looking at this Simon.
I should've been a bit clearer about the problem: on the instance returned from refresh_from_db, every access of the GFK attribute produces a different instance of the related object. (I tried to show that with the id(ref.obj) == id(ref.obj) line).
I called refresh_from_db just to simulate a real-world situation where the row backing the ref instance here is created at some point, and at some later point a new model instance is instantiated for that row.
comment:5 by , 8 years ago
| Has patch: | set |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
Thanks Morgan, I missed that! This is quite a complex scenario: GFK, PK type mismatch, and MTI so I'm not surprised it broke given the small number of tests for PK type coercion.
The culprit seems to be that related fields don't delegate to_python to the field they reference so the OtherSub primary key, a OneToOneField, uses the default Field.to_python which is an identity function. This causes this line to be falsey because rel_obj._meta.pk.to_python('1') returns '1' instead of 1.
I feel like the correct way to solve that would to implement ForeignKey.to_python (which OneToOneField subclasses) to use self.foreign_related_fields[0].to_python but even if this causes a single failure in the test suite I feel like this would be to intrusive for a backport. What I'd favor doing instead is implementing this approach in master and adapt GenericForeignKey.__get__ to work around the issue if a backport is deemed eligible.
Here's a PR for master: https://github.com/django/django/pull/9395
And one for 1.11.x if you judge it should be backported Tim: https://github.com/django/django/pull/9396
comment:6 by , 8 years ago
Could the backport fix be applied to 1.10 as well? Is is still supported, and this is a potential data loss scenario.
comment:7 by , 8 years ago
I don't think this merits another Django 1.10 release considering that supports ends December 1. If they care about security, most users should have upgraded to Django 1.11 by now.
comment:11 by , 8 years ago
| Resolution: | fixed |
|---|---|
| Status: | closed → new |
This fix applied to 1.11 fails when the referenced object is a subclass of a subclass.
I have regression test and fix for this situation here:
https://github.com/django/django/compare/stable/1.11.x...addgene:mw/gfk-recursive-fix
Should I open a PR?
comment:12 by , 8 years ago
I went ahead and made the PR so I could see test results.
comment:18 by , 8 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
patch with regression test