Opened 16 years ago

Closed 15 years ago

Last modified 13 years ago

#9409 closed (fixed)

"OperationalError: database is locked" with Python 2.6 multiprocessing and SQLite backend

Reported by: mrts Owned by: Jacob
Component: Documentation Version: 1.0
Severity: Keywords:
Cc: gabor@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Running the attached test cases in Ubuntu Hardy with Python 2.6, saving models to database fails with

Process ConcurrentAccess-79:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 231, in _bootstrap
    self.run()
  File "./db_locked_error.py", line 18, in run
    f.save()
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/models/base.py", line 307, in save
    self.save_base(force_insert=force_insert, force_update=force_update)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/models/base.py", line 379, in save_base
    result = manager._insert(values, return_id=update_pk)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/models/manager.py", line 138, in _insert
    return insert_query(self.model, values, **kwargs)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/models/query.py", line 888, in insert_query
    return query.execute_sql(return_id)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/models/sql/subqueries.py", line 308, in execute_sql
    cursor = super(InsertQuery, self).execute_sql(None)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/models/sql/query.py", line 1700, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/backends/util.py", line 19, in execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python2.6/site-packages/Django-1.0_final-py2.6.egg/django/db/backends/sqlite3/base.py", line 167, in execute
    return Database.Cursor.execute(self, query, params)
OperationalError: database is locked

Attachments (3)

db_locked_error.py (690 bytes ) - added by mrts 16 years ago.
concurrent.tar.gz (6.3 KB ) - added by mrts 16 years ago.
The corresponding Django project
database_is_locked_docs.diff (1.5 KB ) - added by mrts 16 years ago.
Docs that explain the reason of and solution for the error (AFAICS).

Download all attachments as: .zip

Change History (21)

by mrts, 16 years ago

Attachment: db_locked_error.py added

by mrts, 16 years ago

Attachment: concurrent.tar.gz added

The corresponding Django project

comment:1 by anonymous, 16 years ago

Summary: OperationalError: database is locked with Python 2.6 multiprocessing and SQLite backend"OperationalError: database is locked" with Python 2.6 multiprocessing and SQLite backend

comment:2 by mrts, 16 years ago

This seems to be SQLite-specific, e.g. MySQL backend is not affected. If it's hard to fix (e.g. extra heavy-weight locking machinery), it should be probably wontfixed.

comment:3 by Gábor Farkas, 16 years ago

Cc: gabor@… added

comment:4 by Malcolm Tredinnick, 16 years ago

Resolution: invalid
Status: newclosed

There's no Django bug here; it's expected behaviour. The threadsafety level for sqlite3 is 1, which means connections cannot be shared between threads.

In normal operations, the backend is imported into each thread, so there's a separate DatabaseWrapper instance and hence a separate connection. Your test case is not normal, in that it's reusing the connection between threads. The bug is in the assumption that it's safe to do that.

comment:5 by Gábor Farkas, 16 years ago

if i'm not mistaken, the test-case uses the multiprocessing-module, so the code is single-threaded, but uses multiple processes.

comment:6 by mrts, 16 years ago

Component: Database layer (models, ORM)Documentation
Resolution: invalid
Status: closedreopened

OperationalError: database is locked seems to imply the code is thread-aware (assuming locked denotes locked by a thread-aware lock). I'm not aware of the internals, but setting DATABASE_OPTIONS = {'timeout': 30} as suggested by Brian Beck in http://groups.google.com/group/django-developers/browse_thread/thread/d320de970c2a4016 solved the problem. So connections cannot be shared between threads seems to be incorrect -- Django ORM seems to do it's job quite well when timeout is larger with the sample code.

I'd say this should be documented and I'm willing to do it (unfortunately I'm currently dead busy).

by mrts, 16 years ago

Docs that explain the reason of and solution for the error (AFAICS).

in reply to:  6 comment:9 by Ramiro Morales, 16 years ago

Replying to mrts:

OperationalError: database is locked seems to imply the code is thread-aware
[...]
So connections cannot be shared between threads seems to be incorrect -- Django ORM seems to do it's job quite well when timeout is larger with the sample code.

I don't know if these mailing list threads and documentation on multithreaded access to SQLite databases are relevant, as gabor mentioned, the mutiprocessing Python 2.6 module (formerly pyprocessing) you used for your tests don't use threads but processes, see this SQLite FAQ entry. It seems very logical the fact that if you are firing 100 (!) processes all of them trying to insert one record to the same SQLite database table, at least some of them will try to do so in a concurent way and you will get the failure.

comment:10 by mrts, 16 years ago

Where do you get the impression I'm not aware that processes are used instead of threads in multiprocessing? Concurrent processing idioms and problems are the same whether threading or multiprocessing is used, only GIL reduces concurrency (and hides some problems) in threading context a bit. And I'm obviously launching that many processes to assure that the problem indeed manifests, this is a test case.

The mailing list threads are relevant for having a general idea of the issues. I hope the explanation that I've attached is sufficient, if you find it is not or that I've missed something or made a mistake in the explanation, please correct it and attach a new patch.

in reply to:  10 comment:11 by Ramiro Morales, 16 years ago

Replying to mrts:

Where do you get the impression I'm not aware that processes are used instead of threads in multiprocessing? Concurrent processing idioms and problems are the same whether threading or multiprocessing is used

I wouldn't add documentation that talks about multithreading misleading people to think it's a supported configuration. If (as you know very well) Django documentation makes no explicit mention about multithreading being supported at all; the why should a warning about failures (with a hypothetical multithreaded scenario reproduced not with threads but concurrent processes) with a DB backend to a software (SQLite) that is the one that supports concurrency to the lesser extent be added?

comment:12 by mrts, 16 years ago

Your last comment is misleading as a) Django is generally threadsafe and I personally use it routinely with threaded apache and mod_wsgi with no ill effects, b) this is not a thread, but a general concurrency issue that pops up even in Google cache (see http://209.85.135.104/search?q=cache:kEMOo-HuvzgJ:www.rkblog.rk.edu.pl/w/p/django-nginx/+django+OperationalError:+database+is+locked ) for live sites running on top of mod_python. You are perhaps right that "threaded environment" and "thread" are not appropriate terms, a more generic term that encompasses both processes and threads might be required. Someone could amend the patch with such a term.

in reply to:  description comment:13 by anonymous, 16 years ago

Replying to mrts:

I've been having this same issue after switching from parallel python (pp) to multiprocessing under 2.5 using psycopg2 -- under heavy load I get the error InternalError: SET TRANSACTION ISOLATION LEVEL must be called before any query (usually, though it can vary). As near as I can tell, this is an issue caused by multiple processes using the same database connection; it seems if I import a model or db.connection anywhere and then run worker processes, the imports aren't "redone" (even if explicitly added as imports in the worker function) which leads to multiple processes using the same db connection which of course leads to them stepping on each other trying to make queries.

I'm not entirely sure they're caused by the same thing, but I figured I'd annotate it here anyway.

comment:14 by Jacob, 16 years ago

milestone: 1.1
Triage Stage: UnreviewedAccepted

comment:15 by Bob Thomas, 15 years ago

Has patch: set

comment:16 by Jacob, 15 years ago

Owner: changed from nobody to Jacob
Status: reopenedassigned

comment:17 by Jacob, 15 years ago

Resolution: fixed
Status: assignedclosed

(In [10371]) Fixed a whole bunch of small docs typos, errors, and ommissions.

Fixes #8358, #8396, #8724, #9043, #9128, #9247, #9267, #9267, #9375, #9409, #9414, #9416, #9446, #9454, #9464, #9503, #9518, #9533, #9657, #9658, #9683, #9733, #9771, #9835, #9836, #9837, #9897, #9906, #9912, #9945, #9986, #9992, #10055, #10084, #10091, #10145, #10245, #10257, #10309, #10358, #10359, #10424, #10426, #10508, #10531, #10551, #10635, #10637, #10656, #10658, #10690, #10699, #19528.

Thanks to all the respective authors of those tickets.

comment:18 by Jacob, 13 years ago

milestone: 1.1

Milestone 1.1 deleted

Note: See TracTickets for help on using tickets.
Back to Top