Opened 7 years ago
Closed 7 years ago
#28856 closed Bug (fixed)
GenericForeignKey attributes create new instances on every access
Reported by: | Morgan Wahl | Owned by: | nobody |
---|---|---|---|
Component: | contrib.contenttypes | Version: | 1.10 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Given these models:
class OtherSuper(models.Model): pass class OtherSub(OtherSuper): pass class Ref(models.Model): obj_type = models.ForeignKey(ContentType, on_delete=models.PROTECT) obj_id = models.CharField(max_length=255) obj = GenericForeignKey('obj_type', 'obj_id')
I get this behavior:
In [1]: ref = Ref.objects.create(obj=OtherSub.objects.create()) In [2]: id(ref.obj) == id(ref.obj) Out[2]: True In [3]: ref.refresh_from_db() In [4]: id(ref.obj) == id(ref.obj) Out[4]: False
Each time ref.obj
is accessed, a new instance is created for its value. This is a problem, since doing something like ref.obj.field = 1; ref.obj.save()
won't actually update the field in the database. This only happens when the referenced object is an instance of a model that subclasses another model. (So, it wouldn't happen if referencing OtherSuper
in the models above.) The refresh_from_db()
call is also necessary to reproduce this in a simple test like the above; it happens with any instance created from an existing DB record.
I've written a regression test against stable/1.10.x . I'll attach a patch.
I discovered this because I have code that does the above (changes a field on the related model and calls save). I call this a regression because it works correctly on 1.9.
I'm not sure what the underly bug is; I looked at the diff in contenttypes
between 1.9 and 1.10, and there are more than a few changes. Hopefully someone who understands the GenericForeignKey
implementation can figure this out.
Attachments (1)
Change History (19)
by , 7 years ago
Attachment: | 1.10-regression-test.patch added |
---|
comment:1 by , 7 years ago
Description: | modified (diff) |
---|
comment:2 by , 7 years ago
Bisected to b9f8635f58ad743995cad2081b3dc395e55761e5. Do you spot a bug in that commit?
comment:3 by , 7 years ago
I'm pretty sure this is the expected behavior from refresh_from_db()
, refreshing any relationship should create a new instance as ref.refresh_from_db()
should roughly translate to ref = Ref.objects.get(pk=ref.pk)
while preserving ref
identity.
e.g. foreign key relationships have always worked this way
>> obj_type = ref.obj_type >> ref.refresh_from_db() >> ref.obj_type is obj_type False
b9f8635f58ad743995cad2081b3dc395e55761e5 just made this consistent for generic foreign key which wouldn't be refreshed before it landed. This was a problem because if either ref.obj_type
or ref.obj_id
had changed when refresh_from_db()
was called then ref.obj
would point to a different object that the one referenced by ref.obj_type
and ref.obj_id
.
Django's ORM doesn't maintain an identity mapping of object, as opposed to SQL Alchemy for example, so you shouldn't expect refreshed instances to share their identity.
comment:4 by , 7 years ago
Thanks for looking at this Simon.
I should've been a bit clearer about the problem: on the instance returned from refresh_from_db
, every access of the GFK attribute produces a different instance of the related object. (I tried to show that with the id(ref.obj) == id(ref.obj)
line).
I called refresh_from_db
just to simulate a real-world situation where the row backing the ref
instance here is created at some point, and at some later point a new model instance is instantiated for that row.
comment:5 by , 7 years ago
Has patch: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
Thanks Morgan, I missed that! This is quite a complex scenario: GFK, PK type mismatch, and MTI so I'm not surprised it broke given the small number of tests for PK type coercion.
The culprit seems to be that related fields don't delegate to_python
to the field they reference so the OtherSub
primary key, a OneToOneField
, uses the default Field.to_python
which is an identity function. This causes this line to be falsey because rel_obj._meta.pk.to_python('1')
returns '1'
instead of 1
.
I feel like the correct way to solve that would to implement ForeignKey.to_python
(which OneToOneField
subclasses) to use self.foreign_related_fields[0].to_python
but even if this causes a single failure in the test suite I feel like this would be to intrusive for a backport. What I'd favor doing instead is implementing this approach in master and adapt GenericForeignKey.__get__
to work around the issue if a backport is deemed eligible.
Here's a PR for master: https://github.com/django/django/pull/9395
And one for 1.11.x if you judge it should be backported Tim: https://github.com/django/django/pull/9396
comment:6 by , 7 years ago
Could the backport fix be applied to 1.10 as well? Is is still supported, and this is a potential data loss scenario.
comment:7 by , 7 years ago
I don't think this merits another Django 1.10 release considering that supports ends December 1. If they care about security, most users should have upgraded to Django 1.11 by now.
comment:11 by , 7 years ago
Resolution: | fixed |
---|---|
Status: | closed → new |
This fix applied to 1.11 fails when the referenced object is a subclass of a subclass.
I have regression test and fix for this situation here:
https://github.com/django/django/compare/stable/1.11.x...addgene:mw/gfk-recursive-fix
Should I open a PR?
comment:12 by , 7 years ago
I went ahead and made the PR so I could see test results.
comment:18 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
patch with regression test