#25693 closed Bug (fixed)
Data loss if a ManyToManyField is shadowed by Prefetch
Reported by: | Ian Foote | Owned by: | Ian Foote |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.7 |
Severity: | Release blocker | Keywords: | |
Cc: | tom@…, Simon Charette | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
With two models:

    from django.db.models import CharField, ManyToManyField, Model, Prefetch

    class Book(Model):
        name = CharField(max_length=100)

    class Author(Model):
        books = ManyToManyField(Book)

it is possible to create a prefetch query that loses data from Author.books:

    poems = Book.objects.filter(name='Poems')
    Author.objects.prefetch_related(
        Prefetch('books', queryset=poems, to_attr='books'),
    )

When this queryset is evaluated, each Author's books is overridden by the poems queryset.
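The mechanism can be illustrated without Django. The following is only a toy sketch, assuming that the prefetch stores its result with a plain setattr() and that the relation attribute is a descriptor whose assignment writes through to storage. DATABASE, WritableRelation and toy_prefetch are hypothetical names invented for the illustration, not Django APIs.

```python
# Toy stand-in for a relation table; this is NOT Django code.
DATABASE = {"author-1": {"Poems", "Novels", "Essays"}}


class WritableRelation:
    """Hypothetical descriptor: assigning to it rewrites the stored relation."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return set(DATABASE[instance.key])

    def __set__(self, instance, value):
        # The write path is what turns a name clash into data loss.
        DATABASE[instance.key] = set(value)


class Author:
    books = WritableRelation()

    def __init__(self, key):
        self.key = key


def toy_prefetch(instances, attr, keep, to_attr):
    """Store a filtered copy of `attr` on each instance under `to_attr`."""
    for obj in instances:
        filtered = {item for item in getattr(obj, attr) if keep(item)}
        setattr(obj, to_attr, filtered)


def is_poems(name):
    return name == "Poems"


author = Author("author-1")

# Safe: `poems` is not a descriptor, so this sets a plain instance attribute.
toy_prefetch([author], "books", is_poems, to_attr="poems")
untouched = set(DATABASE["author-1"])

# Unsafe: `books` is the writable descriptor, so setattr() calls __set__.
toy_prefetch([author], "books", is_poems, to_attr="books")
overwritten = set(DATABASE["author-1"])
```

In this toy model the first call leaves the stored relation intact, while the second replaces it with the filtered subset, which is the shape of the data loss reported here.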
Change History (24)
comment:1 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:2 by , 9 years ago
Description: | modified (diff) |
---|
comment:3 by , 9 years ago
Cc: | added |
---|
comment:5 by , 9 years ago
Severity: | Normal → Release blocker |
---|---|
Triage Stage: | Unreviewed → Accepted |
I think these are different issues.
#21584 demonstrates getting an object, prefetching related objects, creating an additional related object: the list of prefetched related objects isn't refreshed automatically. (The description makes it look more complicated than it actually is.)
That's because parent.children_set is just a queryset. It's consistent with how querysets have always worked. There is exactly the same behavior without prefetch_related, with any queryset for which the results have been fetched.
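As a rough illustration of that caching behavior, here is a minimal sketch of a lazy query that caches its rows on first evaluation. CachedQuery and the children list are hypothetical stand-ins, not Django's QuerySet implementation.

```python
class CachedQuery:
    """Hypothetical lazy query: runs once, then serves the cached rows."""

    def __init__(self, table):
        self._table = table
        self._result_cache = None

    def __iter__(self):
        if self._result_cache is None:
            # The single "database hit": snapshot the rows.
            self._result_cache = list(self._table)
        return iter(self._result_cache)


children = ["first"]           # stand-in for a table of related rows
query = CachedQuery(children)

before = list(query)           # evaluates the query and caches the rows
children.append("second")      # a new related row is created afterwards
after = list(query)            # same object: still serves the cached rows
fresh = list(CachedQuery(children))  # a new query sees the new row
```

Under these assumptions the already-evaluated object keeps returning the old rows, and only a new query picks up the added one.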
The DRF issue Tom mentions is probably a consequence of this. I believe DRF should refetch data that may have been invalidated by the updates before serializing its response.
Now, this bug appears to be much worse because incorrect data gets written back to the database (if I understand correctly).
Ian, do you know since which version of Django it occurs? I'm wondering if we can put controls in place to avoid assigning prefetched querysets to writable descriptors or to prevent the write from occurring...
comment:8 by , 9 years ago
Cc: | added |
---|
This issue also recently surfaced on the django-users mailing list.
> Now, this bug appears to be much worse because incorrect data gets written back to the database (if I understand correctly).
You do.
> Ian, do you know since which version of Django it occurs? I'm wondering if we can put controls in place to avoid assigning prefetched querysets to writable descriptors or to prevent the write from occurring...
This bug has existed since the introduction of the Prefetch object (Django 1.7).
> This seems related to #25550. Would a backport fix this bug?
The ticket is related but the issue will remain until the assignment is actually removed. It's currently only pending deprecation on master.
Since we'll have to live with this assignment issue until 2.0, the only viable solution I can think of at the moment would be to raise a ValueError if getattr(queryset.model, to_attr) is an instance of one of the problematic descriptors.
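A minimal sketch of such a guard follows. It is not the patch that was eventually committed: WritableRelationDescriptor, PROBLEMATIC_DESCRIPTORS and check_to_attr are hypothetical names, and which model class gets inspected is an assumption of the sketch.

```python
class WritableRelationDescriptor:
    """Hypothetical stand-in for a descriptor whose __set__ writes to storage."""

    def __set__(self, instance, value):
        raise AssertionError("the guard should have prevented this write")


class Author:
    books = WritableRelationDescriptor()


# Assumption: the descriptor classes considered unsafe as a to_attr target.
PROBLEMATIC_DESCRIPTORS = (WritableRelationDescriptor,)


def check_to_attr(model, to_attr):
    """Raise ValueError if `to_attr` would shadow a writable relation."""
    existing = getattr(model, to_attr, None)
    if isinstance(existing, PROBLEMATIC_DESCRIPTORS):
        raise ValueError(
            f"to_attr={to_attr!r} conflicts with a relation on {model.__name__}."
        )


check_to_attr(Author, "poems")  # no clash: returns without raising

try:
    check_to_attr(Author, "books")
except ValueError as exc:
    message = str(exc)
```

Running the check before any assignment means the clash is reported as an error and the destructive write never happens.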
comment:9 by , 9 years ago
Note that the issue is even worse with GenericRelation on Django < 1.9, which used to issue a manager.clear() instead of relying on the revised logic added by #21169.
comment:10 by , 9 years ago
Has patch: | set |
---|
comment:11 by , 9 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Version: | master → 1.7 |
The patch is looking good. It's only missing a release note for the 1.8 and 1.7 series.
comment:12 by , 9 years ago
Needs documentation: | unset |
---|---|
Patch needs improvement: | unset |
Related downstream issue: https://github.com/tomchristie/django-rest-framework/issues/2442
I assume this ticket is a duplicate of the (closed, wontfix) ticket here: https://code.djangoproject.com/ticket/21584