Opened 5 years ago

Closed 5 days ago

#21171 closed Cleanup/optimization (fixed)

Skip transaction creation for single statement operations

Reported by: Anssi Kääriäinen Owned by: nobody
Component: Database layer (models, ORM) Version: 1.5
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

There are a couple of cases where Django will execute just a single data modifying statement for the operation done. Examples include single table .save(), .delete() that doesn't cascade, bulk_create() and .update(). Atomicity of single statements guarantees that these operations work without creating / releasing transaction. This could increase performance of these statements when used in autocommit mode.

The save() method should be easy to handle - it is easy to check if the table has parents or not. Bulk create is also somewhat easy, although there are cases where multiple queries need to be executed. Model deletion might be hard to handle, signals are sent inside the same transaction and it isn't that far fetched to think that signal handler will do additional data modifications (log changes for example).

It might be there are more operations that could benefit from this - get_or_create() and update_or_create() for example.

I got the idea while inspecting if cache db backend could use ORM directly. Currently the db backend doesn't wrap all data modifications in atomic blocks, but if using ORM directly that would happen and this could lead to performance problems in the backend.

Change History (5)

comment:1 Changed 5 years ago by Tim Graham

Triage Stage: UnreviewedAccepted

comment:2 Changed 5 years ago by Shai Berger

I must be missing something -- in current master, get_or_create() and update_or_create() are not single-statement operations.

Also, as far as I see, the signals argument applies to save() just as much as it applies to delete().

Perhaps add a keyword argument? subtransaction=True, so if you want to save a bit you pass in False?

comment:3 Changed 5 years ago by Anssi Kääriäinen

For save() signals are sent outside the transaction but for delete inside the transaction. So, for delete, if you skip the transaction creation, then queries ran from signals aren't inside transaction control. But for save() this isn't the case - any DML is ran outside transaction in any case.

The delete() case is a change in behaviour.

You are right about get_or_create() and update_or_create().

All in all, I'm beginning to wonder if this is really worth it...

comment:4 Changed 5 days ago by Tim Graham

Has patch: set

comment:5 Changed 5 days ago by Florian Apolloner <apollo13@…>

Resolution: fixed
Status: newclosed

In bc7dd849:

Fixed #21171 -- Avoided starting a transaction when a single (or atomic queries) are executed.

Checked the following locations:

  • Model.save(): If there are parents involved, take the safe way and use transactions since this should be an all or nothing operation.

If the model has no parents:

  • Signals are executed before and after the previous existing transaction -- they were never been part of the transaction.
  • if force_insert is set then only one query is executed -> atomic by definition and no transaction needed.
  • same applies to force_update.
  • If a primary key is set and no force_* is set Django will try an UPDATE and if that returns zero rows it tries an INSERT. The first case is completly save (single query). In the second case a transaction should not produce different results since the update query is basically a no-op then (might miss something though).
  • QuerySet.update(): no signals issued, single query -> no transaction needed.
  • Model/Collector.delete(): This one is fun due to the fact that is does many things at once.

Most importantly though: It does send signals as part of the
transaction, so for maximum backwards compatibility we need to be
conservative.

To ensure maximum compatibility the transaction here is removed only
if the following holds true:

  • A single instance is being deleted.
  • There are no signal handlers attached to that instance.
  • There are no deletions/updates to cascade.
  • There are no parents which also need deletion.
Note: See TracTickets for help on using tickets.
Back to Top