Opened 7 years ago

Closed 10 months ago

Last modified 7 months ago

#29280 closed New feature (fixed)

Add a database OPTIONS to specify the transaction mode one SQLite.

Reported by: Ove Kåven Owned by: Anže Pečar
Component: Database layer (models, ORM) Version: 2.0
Severity: Normal Keywords: sqlite, databases
Cc: Aymeric Augustin Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Ove Kåven)

I'd like to propose a change like this, which I think would fix a class of SQLite "database is locked" problems. But it's possible you want to add a config option, or an argument to transaction.atomic(), or something of the kind.

diff --git a/django/db/backends/sqlite3/base.py b/django/db/backends/sqlite3/base.py
index 3989028930..391a50789e 100644
--- a/django/db/backends/sqlite3/base.py
+++ b/django/db/backends/sqlite3/base.py
@@ -272,7 +272,7 @@ class DatabaseWrapper(BaseDatabaseWrapper):
         Staying in autocommit mode works around a bug of sqlite3 that breaks
         savepoints when autocommit is disabled.
         """
-        self.cursor().execute("BEGIN")
+        self.cursor().execute("BEGIN IMMEDIATE")
 
     def is_in_memory_db(self):
         return self.creation.is_in_memory_db(self.settings_dict['NAME'])

Explanation: Consider a type of transaction consisting of, for example, an update_or_create() call, which you run in a "with transaction.atomic()" context. Suppose two threads or processes run such a transaction at exactly the same time (but with different keys, doesn't matter, SQLite locks the whole database anyway).

  1. The current transaction.atomic() implementation results in a "BEGIN", which tells SQLite to start deferred transactions, i.e., to *not* acquire any locks before it has to.
  2. Normally, update_or_create would first do a "SELECT FOR UPDATE", but SQLite doesn't support that, so a plain "SELECT" is done. Both threads acquire a shared read lock, and so the SELECTs succeed for both threads.
  3. Next, update_or_create needs to do an INSERT or UPDATE (not important which), so the threads needs to upgrade the read locks to write locks. Unfortunately, only one thread can have the write lock.

Now, the designers of SQLite apparently realized that if you already have the read lock, then it's not a good idea to start a blocking wait on the write lock. If you wait while holding the read lock, then nobody will ever be able to acquire exclusive access to the database, and everything will just hang. On the other hand, if you drop the read lock and then wait, then you lose the 'serializable' isolation guarantee that SQLite tries to give. Hence, SQLite has only one choice.

  1. The thread that didn't get the write lock immediately gets a "database is locked" error. Its transaction is aborted.

The timeout mentioned in the Django documentation will have absolutely no effect on this, and it doesn't matter how short-lived your transactions are. It only matters that they do a read before they do a write.

I can see three possible solutions to this problem:

  1. "Don't do that". Don't use SQLite, or don't do concurrency. (Many answers on StackOverflow.com and such are essentially this.)
  2. Treat the "database is locked" as if it were a "serialization error", which it kind of is. That is, the app must retry the transaction until it commits. That works, but is somewhat unsatisfactory.
  3. Grab the write lock immediately (like "SELECT FOR UPDATE" would have, if SQLite had supported it). That's what "BEGIN IMMEDIATE", in the above patch, does. After all, if you're not holding the read lock when you try to grab the write lock, then this particular problem won't happen. (Of course, the lock timeout thing that's mentioned in the Django documentation can still happen if your transactions are too long-lived, but that's different.)

And I think that if you're using "with transaction.atomic()" in the first place, then you most likely want to write to the database, in which case there's not much reason not to always grab the write lock right away. But even if it's decided that some config option would be good, it probably shouldn't be too hard to add? Or?

Change History (13)

comment:1 by Ove Kåven, 7 years ago

Description: modified (diff)

comment:2 by Tim Graham, 7 years ago

Cc: Aymeric Augustin added

Aymeric, do you have expertise to offer on this proposal?

comment:3 by Aymeric Augustin, 7 years ago

So the proposal here is to always use immediate rather than deferred transactions.

Per https://www.sqlite.org/lang_transaction.html:

After a BEGIN IMMEDIATE, no other database connection will be able to write to the database or do a BEGIN IMMEDIATE or BEGIN EXCLUSIVE.

This means all atomic blocks will be serialized. Combined with ATOMIC_REQUESTS = True, HTTP requests will be serialized. I don't think that's an acceptable behavior.

In general the performance hit of removing the possibility of concurrent database reads — and web workflows tend to be read-heavy — seems too high to implement this in Django.

Also I'm not sure it fits well with the premise of allowing concurrency. It brings us back to "Don't use SQLite, or don't do concurrency."

FWIW you can trivially subclass (or monkey-patch) the SQLite database adapter to make this change in your project.

comment:4 by Tim Graham, 7 years ago

Resolution: invalid
Status: newclosed
Summary: Fix for a class of SQLite "database is locked" problemsFix SQLite "database is locked" problems using "BEGIN IMMEDIATE"

comment:5 by btimby, 6 years ago

Just a not for anyone who has this same problem and is looking for a solution...

I agree with the assessment that transforming all atomic blocks into immediate transactions would have a huge negative impact. However, SOME atomic blocks (those containing select_for_update or update_or_create) when on sqlite and using threads / processes _could_ be immediate transactions without too great of an impact. This can be accomplished by monkey patching the atomic decorator / context manager to support an immediate argument.

If you add the following to your project's __init__.py:

from django.db import connection, DEFAULT_DB_ALIAS
from django.db import transaction


def _monkey_patch_atomic():

    def atomic(using=None, savepoint=True, immediate=False):
        # Bare decorator: @atomic -- although the first argument is called
        # `using`, it's actually the function being decorated.
        if callable(using):
            a = transaction.Atomic(DEFAULT_DB_ALIAS, savepoint)(using)
        # Decorator: @atomic(...) or context manager: with atomic(...): ...
        else:
            a = transaction.Atomic(using, savepoint)

        a.immediate = immediate
        return a

    def __enter__(self):
        connection = transaction.get_connection(self.using)

        if not connection.in_atomic_block:
            # Reset state when entering an outermost atomic block.
            connection.commit_on_exit = True
            connection.needs_rollback = False
            if not connection.get_autocommit():
                # Pretend we're already in an atomic block to bypass the code
                # that disables autocommit to enter a transaction, and make a
                # note to deal with this case in __exit__.
                connection.in_atomic_block = True
                connection.commit_on_exit = False

        if connection.in_atomic_block:
            # We're already in a transaction; create a savepoint, unless we
            # were told not to or we're already waiting for a rollback. The
            # second condition avoids creating useless savepoints and prevents
            # overwriting needs_rollback until the rollback is performed.
            if self.savepoint and not connection.needs_rollback:
                sid = connection.savepoint()
                connection.savepoint_ids.append(sid)
            else:
                connection.savepoint_ids.append(None)
        else:
            if self.immediate:
                connection.cursor().execute('BEGIN IMMEDIATE')

            else:
                connection.set_autocommit(False, force_begin_transaction_with_broken_autocommit=True)

            connection.in_atomic_block = True

    transaction.atomic = atomic
    transaction.Atomic.immediate = False
    transaction.Atomic.__enter__ = __enter__


_monkey_patch_atomic()

You can then do:

with atomic(immediate=True):
    ModelName.objects.update_or_create(foo=foo)

And the transaction will be started with BEGIN IMMEDIATE avoiding the deadlock described here. With this change in place the erstwhile deadlock victim will wait for the victor.

Version 0, edited 6 years ago by btimby (next)

comment:6 by Simon Charette, 10 months ago

Easy pickings: unset
Has patch: unset
Resolution: invalid
Status: closednew
Summary: Fix SQLite "database is locked" problems using "BEGIN IMMEDIATE"Add a database OPTIONS to specify the transaction mode one SQLite.
Triage Stage: UnreviewedAccepted
Type: BugNew feature

Re-opening as a new feature to target adding a SQLite specific OPTIONS["transaction_mode"] option per the forum and Discord discussions.

comment:7 by Anže Pečar, 10 months ago

Owner: changed from nobody to Anže Pečar
Status: newassigned

comment:8 by Anže Pečar, 10 months ago

Triage Stage: AcceptedReady for checkin

comment:9 by Mariusz Felisiak, 10 months ago

Has patch: set
Triage Stage: Ready for checkinAccepted

PR. Please don't mark your own PRs as "Ready for checkin".

comment:10 by Mariusz Felisiak, 10 months ago

Patch needs improvement: set

comment:11 by Mariusz Felisiak, 10 months ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:12 by Mariusz Felisiak <felisiak.mariusz@…>, 10 months ago

Resolution: fixed
Status: assignedclosed

In a0204ac1:

Fixed #29280 -- Made the transactions behavior configurable on SQLite.

comment:13 by Sarah Boyce <42296566+sarahboyce@…>, 7 months ago

In 9d5c0244:

Refs #29280 -- Moved release note about transaction_mode to "Database backends" section.

Note: See TracTickets for help on using tickets.
Back to Top