Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#25693 closed Bug (fixed)

Data loss if a ManyToManyField is shadowed by Prefetch

Reported by: Ian Foote Owned by: Ian Foote
Component: Database layer (models, ORM) Version: 1.7
Severity: Release blocker Keywords:
Cc: tom@…, Simon Charette Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Ian Foote)

With two models:

class Book(Model):
    name = CharField(max_length=100)

class Author(Model):
    books = ManyToManyField(Book)

it is possible to create a prefetch query that loses data from Author.books:

poems = Book.objects.filter(name='Poems')
Author.objects.prefetch_related(
    Prefetch('books', queryset=poems, to_attr='books'),
)

When this queryset is evaluated, each Author's books is overridden by the poems queryset.

Change History (24)

comment:1 Changed 4 years ago by Ian Foote

Owner: changed from nobody to Ian Foote
Status: newassigned

comment:2 Changed 4 years ago by Ian Foote

Description: modified (diff)

comment:3 Changed 4 years ago by Tom Christie

Cc: tom@… added

comment:4 Changed 4 years ago by Tom Christie

Related downstream issue: https://github.com/tomchristie/django-rest-framework/issues/2442

I assume this ticket is a duplicate of the (closed, wontfix) ticket here: https://code.djangoproject.com/ticket/21584

Last edited 4 years ago by Tom Christie (previous) (diff)

comment:5 Changed 4 years ago by Aymeric Augustin

Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

I think these are different issues.


#21584 demonstrates getting an object, prefetching related objects, creating an additional related object: the list of prefetched related objects isn't refreshed automatically. (The description makes it look more complicated than it actually is.)

That's because parent.children_set is just a queryset. It's consistent with how querysets have always been working. There is exactly the same behavior without prefetech_related, with any queryset for which the results have been fetched.

The DRF issue Tom mentions is probably a consequence of this. I believe DRF should refetch data that may have been invalidated by the updates before serializing its response.


Now, this bug appears to be much worse becase incorrect data gets written back to the database (if I understand correctly).

Ian, do you know since which version of Django it occurs? I'm wondering if we can but controls in place to avoid assigning prefetched querysets to writable descriptors or preventing the write from occurring...

comment:6 Changed 4 years ago by Marten Kenbeek

This seems related to #25550. Would a backport fix this bug?

comment:7 Changed 4 years ago by Ian Foote

I've verified it on master and on 1.8. I assume it also applies to 1.9.

comment:8 Changed 4 years ago by Simon Charette

Cc: Simon Charette added

This issue also recently surfaced on the django-users mailing list.

Now, this bug appears to be much worse becase incorrect data gets written back to the database (if I understand correctly).

You do.

Ian, do you know since which version of Django it occurs? I'm wondering if we can but controls in place to avoid assigning prefetched querysets to writable descriptors or preventing the write from occurring...

This bug exists since the introduction of the Prefetch object (Django 1.7).

This seems related to #25550. Would a backport fix this bug?

The ticket is related but the issue will remain until the assignment is actually removed. It's currently only pending deprecation on master.

Since we'll have to live with this assignment issue until 2.0 the only viable solution I can think of at the moment would be to raise a ValueError if getattr(queryset.model, to_attr) is an instance of one of the problematic descriptors.

comment:9 Changed 4 years ago by Simon Charette

Note that the issue is even worse with GenericRelation on Django < 1.9 which used to issue a manager.clear() instead of relying on the revised logic added by #21169.

comment:10 Changed 4 years ago by Ian Foote

Has patch: set

comment:11 Changed 4 years ago by Simon Charette

Needs documentation: set
Patch needs improvement: set
Version: master1.7

The patch is looking good. It's only missing a release note for the 1.8 and 1.7 series.

comment:12 Changed 4 years ago by Ian Foote

Needs documentation: unset
Patch needs improvement: unset

comment:13 Changed 4 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assignedclosed

In 46085737:

Fixed #25693 -- Prevented data loss with Prefetch and ManyToManyField.

Thanks to Jamie Matthews for finding and explaining the bug.

comment:14 Changed 4 years ago by Tim Graham <timograham@…>

In f9a08eb:

[1.9.x] Fixed #25693 -- Prevented data loss with Prefetch and ManyToManyField.

Thanks to Jamie Matthews for finding and explaining the bug.

Backport of 4608573788c04fc047da42b4b7b48fdee8136ad3 from master

comment:15 Changed 4 years ago by Tim Graham <timograham@…>

In 5fc9a1b8:

[1.8.x] Fixed #25693 -- Prevented data loss with Prefetch and ManyToManyField.

Thanks to Jamie Matthews for finding and explaining the bug.

Backport of 4608573788c04fc047da42b4b7b48fdee8136ad3 from master

comment:16 Changed 4 years ago by Simon Charette <charette.s@…>

In 6184cb9b:

[1.7.x] Fixed #25693 -- Prevented data loss with Prefetch and ManyToManyField.

Thanks to Jamie Matthews for finding and explaining the bug.

Backport of 4608573788c04fc047da42b4b7b48fdee8136ad3 from master

comment:17 Changed 4 years ago by Simon Charette <charette.s@…>

In 4a9c32f:

Refs #25693 -- Avoided redundant calls to get_fields() in to_attr validation.

comment:18 Changed 4 years ago by Simon Charette <charette.s@…>

In cc8c02fa:

Refs #25693 -- Added a regression test for to_attr validation on forward m2m.

comment:19 Changed 4 years ago by Simon Charette <charette.s@…>

In 0d1d307:

[1.9.x] Refs #25693 -- Avoided redundant calls to get_fields() in to_attr validation.

Backport of 4a9c32f5eece9030c2b568e930cec0c1ba8f1da0 from master

comment:20 Changed 4 years ago by Simon Charette <charette.s@…>

In 164cbdac:

[1.9.x] Refs #25693 -- Added a regression test for to_attr validation on forward m2m.

Backport of cc8c02fa0fa2119704d1c39ca8509850aef84acc from master

comment:21 Changed 4 years ago by Simon Charette <charette.s@…>

In a3baee2:

[1.8.x] Refs #25693 -- Avoided redundant calls to get_fields() in to_attr validation.

Backport of 4a9c32f5eece9030c2b568e930cec0c1ba8f1da0 from master

comment:22 Changed 4 years ago by Simon Charette <charette.s@…>

In ae461380:

[1.8.x] Refs #25693 -- Added a regression test for to_attr validation on forward m2m.

Backport of cc8c02fa0fa2119704d1c39ca8509850aef84acc from master

comment:23 Changed 4 years ago by Simon Charette <charette.s@…>

In 3d037b9:

[1.7.x] Refs #25693 -- Avoided redundant calls to get_fields() in to_attr validation.

Conflicts:

django/db/models/query.py

Backport of cc8c02fa0fa2119704d1c39ca8509850aef84acc from master

comment:24 Changed 4 years ago by Simon Charette <charette.s@…>

In fd14265:

[1.7.x] Refs #25693 -- Added a regression test for to_attr validation on forward m2m.

Backport of cc8c02fa0fa2119704d1c39ca8509850aef84acc from master

Note: See TracTickets for help on using tickets.
Back to Top