Opened 2 years ago

Last modified 19 months ago

#27403 new Cleanup/optimization

Document that prefetch_related doesn't guarantee transactional consistency

Reported by: Aymeric Augustin
Owned by: nobody
Component: Documentation
Version: master
Severity: Normal
Keywords:
Cc: Shai Berger
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Let's assume a Parent model and a Child model with a FK to Parent.

Database initially looks like this:

parent1 -- child1

User runs Parent.objects.all().prefetch_related('child_set').

Django fetches parent1.

Another transaction commits, resulting in this structure:

parent1 +- child1
        `- child2
parent2

Django fetches all children pointing to parent1, which returns child1 and child2.

The result of the query is:

parent1 +- child1
        `- child2

But at no point in the history did the database look like this.

This bug report is based on code inspection.

I verified that prefetch_related doesn't start a transaction.

However, I didn't write code exhibiting the race condition because that's a bit complicated.
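The interleaving described above can be sketched in a single process. This is only a simulation under stated assumptions, not a reproduction against Django: it uses the standard-library sqlite3 module in autocommit mode, hand-written SQL standing in for the two queries that prefetch_related issues, and plain INSERT statements on the same connection standing in for the other transaction's commit. The table layout and the function name simulate_prefetch_race are hypothetical.

```python
import sqlite3


def simulate_prefetch_race():
    """Simulate the two-query prefetch pattern with a write in between.

    Assumption: the two SELECTs stand in for the queries issued by
    Parent.objects.all().prefetch_related('child_set'); the INSERTs in the
    middle stand in for another transaction committing concurrently.
    """
    # isolation_level=None puts sqlite3 in autocommit mode, so no
    # transaction spans the two SELECT statements.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY, name TEXT, "
        "parent_id INTEGER REFERENCES parent(id))"
    )

    # Initial state: parent1 -- child1
    conn.execute("INSERT INTO parent VALUES (1, 'parent1')")
    conn.execute("INSERT INTO child VALUES (1, 'child1', 1)")

    # Query 1: fetch the parents (only parent1 exists at this point).
    parents = conn.execute("SELECT id, name FROM parent ORDER BY id").fetchall()

    # Simulated concurrent commit: parent2 and child2 appear.
    conn.execute("INSERT INTO parent VALUES (2, 'parent2')")
    conn.execute("INSERT INTO child VALUES (2, 'child2', 1)")

    # Query 2: fetch all children pointing to the parents from query 1.
    parent_ids = [parent_id for parent_id, _ in parents]
    placeholders = ", ".join("?" for _ in parent_ids)
    children = conn.execute(
        "SELECT name, parent_id FROM child "
        f"WHERE parent_id IN ({placeholders}) ORDER BY id",
        parent_ids,
    ).fetchall()
    conn.close()

    # Stitch the two result sets together in Python, as a prefetch would.
    names_by_id = dict(parents)
    result = {name: [] for name in names_by_id.values()}
    for child_name, parent_id in children:
        result[names_by_id[parent_id]].append(child_name)
    return result


if __name__ == "__main__":
    print(simulate_prefetch_race())
```

The combined result contains parent1 with both children but no parent2, a state the database never held at any single point in time.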

Change History (4)

comment:1 Changed 2 years ago by Tim Graham

Triage Stage: Unreviewed → Accepted

comment:2 Changed 19 months ago by Rivo Laks

Correct me if I'm wrong, but I think just wrapping the prefetch_related() in a transaction wouldn't fix this.

By default, at least in Postgres, queries made within a transaction can still see data from other _committed_ transactions, so the same problem would occur.

In Postgres, you could select a higher isolation level; I think Repeatable Read would prevent the case you've described. But this has other consequences and should be decided on a per-app basis. See https://www.postgresql.org/docs/current/static/transaction-iso.html for more info.

If I'm right, maybe the docs should note that there might be edge cases regarding transactional consistency?

comment:3 Changed 19 months ago by Shai Berger

Cc: Shai Berger added

Rivo, I think you're wrong, but not in the direction you suspected: In fact, REPEATABLE READ would still not fix this, and as far as I understand, on any database that isn't PG, even SERIALIZABLE won't -- unless we first select (with no need to fetch) the join of the tables. But that would undo much of the performance benefit of prefetch_related...

Some database engines -- notably SQL Server -- can return several result sets (=querysets) from a single command (typically a procedure call). Something like that could, at least in principle, help us here.

comment:4 Changed 19 months ago by Simon Charette

Component: Database layer (models, ORM) → Documentation
Summary: prefetch_related doesn't guarantee transactional consistency → Document that prefetch_related doesn't guarantee transactional consistency
Type: Bug → Cleanup/optimization

I agree with both conclusions. Repurposing the ticket for a mention in the documentation.
