Opened 7 years ago

Closed 6 months ago

Last modified 6 months ago

#27403 closed Cleanup/optimization (fixed)

Document that prefetch_related doesn't guarantee transactional consistency

Reported by: Aymeric Augustin Owned by: Tim Bell
Component: Documentation Version: dev
Severity: Normal Keywords:
Cc: Shai Berger Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Let's assume a Parent model and a Child model with a FK to Parent.

Database initially looks like this:

parent1 -- child1

User runs Parent.objects.all().prefetch_related('child_set').

Django fetches parent1.

Another transaction commits, resulting in this structure:

parent1 +- child1
        `- child2
parent2

Django fetches all children pointing to parent1, which returns child1 and child2.

The result of the query is:

parent1 +- child1
        `- child2

But at no point in the history did the database look like this.

This bug report is based on code inspection.

I verified that prefetch_related doesn't start a transaction.

However I didn't write code exhibiting the race condition because that's a bit complicated.

Change History (9)

comment:1 by Tim Graham, 7 years ago

Triage Stage: UnreviewedAccepted

comment:2 by Rivo Laks, 7 years ago

Correct me if I'm wrong, but I think just wrapping the prefetch_related() in a transaction wouldn't fix this.

By default, at least in Postgres, queries made within a transaction can still see data from other _committed_ transactions, so the same problem would occur.

In Postgres, you could select a higher isolation level, I think Repeatable Read would prevent the case you've described. But this has other consequences and should be done on per-app basis. See https://www.postgresql.org/docs/current/static/transaction-iso.html for more info.

If I'm right, maybe docs should note that there might be edge cases regards transactional consistency?

comment:3 by Shai Berger, 7 years ago

Cc: Shai Berger added

Rivo, I think you're wrong, but not in the direction you suspected: In fact, REPEATABLE READ would still not fix this, and as far as I understand, on any database that isn't PG, even SERIALIZABLE won't -- unless we first select (with no need to fetch) the join of the tables. But that would undo much of the performance benefits of prefetch_related...

Some databse engines -- notably SQL Server -- can return several result sets (=querysets) from a single command (typically a procedure call). Something like that could, at least in principle, help us here.

comment:4 by Simon Charette, 7 years ago

Component: Database layer (models, ORM)Documentation
Summary: prefetch_related doesn't guarantee transactional consistencyDocument that prefetch_related doesn't guarantee transactional consistency
Type: BugCleanup/optimization

I agree with both conclusions. Repurposing the ticket for a mention in the documentation.

comment:5 by Tim Bell, 6 months ago

Owner: changed from nobody to Tim Bell
Status: newassigned

I'm looking at this ticket during the DjangoCon sprints.

I've reproduced a version of this race condition using the models defined in the prefetch_related() docs (https://docs.djangoproject.com/en/4.2/ref/models/querysets/#prefetch-related). By adding a sleep() at the start of prefetch_related_objects(), I was able to run another database query in between the two queries required to evaluate Pizza.objects.prefetch_related('toppings').

When deleting Pizza.objects.get(name="Hawaiian") during the sleep, we get this:

>>> Pizza.objects.prefetch_related('toppings')
Sleeping...
<QuerySet [<Pizza: Hawaiian ()>, <Pizza: Seafood (prawns, smoked salmon)>]>

Just like in the example from this ticket's description, at no point in the history did the database look like this.

I'll now try to write a succinct description of this issue (only one or two sentences) to add to the docs.

comment:6 by Tim Bell, 6 months ago

Has patch: set
Last edited 6 months ago by Mariusz Felisiak (previous) (diff)

comment:7 by Mariusz Felisiak, 6 months ago

Triage Stage: AcceptedReady for checkin

comment:8 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

Resolution: fixed
Status: assignedclosed

In ee10425:

Fixed #27403 -- Doc'd that QuerySet.prefetch_related() doesn't guarantee transactional consistency.

Added a note about the potential race condition in prefetch_related()
that could produce an inconsistent result, one that does not correspond
to any point in the database history.

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 6 months ago

In 8b18e0bb:

[5.0.x] Fixed #27403 -- Doc'd that QuerySet.prefetch_related() doesn't guarantee transactional consistency.

Added a note about the potential race condition in prefetch_related()
that could produce an inconsistent result, one that does not correspond
to any point in the database history.

Backport of ee104251c403fbac83b8475163ff2ac01c567d25 from main

Note: See TracTickets for help on using tickets.
Back to Top