Django

Code

Ticket #3460 (reopened)

Opened 1 year ago

Last modified 2 days ago

postgresql_psycopg2 backend uses wrong isolation level

Reported by: Jack Moffitt <metajack@gmail.com> Assigned to: nobody
Milestone: Component: Database wrapper
Version: SVN Keywords:
Cc: sam@robots.org.uk, jarek.zgoda@gmail.com Triage Stage: Design decision needed
Has patch: 1 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 1

Description

Postgresql only has two transaction isolation levels: read committed and serializable. READ COMMITTED is the default and the one django expects to use. psycopg2 however defines three isolation levels: autocommit, read committed, and serializable. The difference between autocommit and read committed is that read committed in psycopg2 puts all statements inside a BEGIN/END block (or BEGIN/ROLLBACK or BEGIN/COMMIT). Inside the BEGIN it also executes SET TRANSACTION ISOLATION LEVEL READ COMMITTED (which is redundant as this isolation level is the default).

Django currently uses psycopg2's read committed isolation level, and this leads to every query (almost) being surrounded by spurious BEGIN/END. In a heavily loaded system, this consumes a massive amount of the total database time.

Django should be using the autocommit isolation level, which does not have this overhead. Anywhere explicit transaction blocks are needed they can be used. The attached patch switches the backend to the autocommit isolation level. I have tested this patch on our production system and it makes a noticeable speed improvement.

NOTE: Django's ORM does an existence test followed by and UPDATE or INSERT when save() is called. The read committed isolation level does not protect against this failing even within a transaction block, and therefor this change should have no side effects on Django code.

Attachments

autocommit.diff (0.8 kB) - added by Jack Moffitt <metajack@gmail.com> on 02/08/07 14:34:35.
switch psycopg2's default isolation level to autocommit
isotest.py (0.9 kB) - added by Jack Moffitt <metajack@gmail.com> on 02/08/07 16:05:04.
test script
combined.diff (1.5 kB) - added by Jack Moffitt <metajack@gmail.com> on 02/08/07 16:05:37.
updated patch; also fixes #3459
psycopg2-commit-mode-r5790.diff (1.5 kB) - added by John Shaffer <jshaffer2112@gmail.com> on 08/03/07 16:03:29.
Updated patch for [5790]. The patch causes problems with fixtures and causes some of Django's tests to fail.
3460-autocommit.diff (2.8 kB) - added by Timothée Peignier <tim@tryphon.org> on 02/29/08 09:35:27.
Change isolation level without breaking fixtures and tests
autocommit-7624.patch (4.2 kB) - added by cgrady on 06/12/08 01:08:23.
added similar fix to postgresql backend for psycopg1

Change History

02/08/07 14:34:35 changed by Jack Moffitt <metajack@gmail.com>

  • attachment autocommit.diff added.

switch psycopg2's default isolation level to autocommit

02/08/07 16:04:45 changed by Jack Moffitt <metajack@gmail.com>

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

Note that this patch may have side effects if the database's default isolation level is set to serializable. The attached patch corrects this problem and makes sure that we're always in read committed unless otherwise overridden for the entire session. See http://www.postgresql.org/docs/8.1/interactive/sql-set-transaction.html for documentation.

I'm also attaching a test script to show that this works.

Note that the self._commit() call is uncessary in autocommit isolation mode, but is needed for the others.

02/08/07 16:05:04 changed by Jack Moffitt <metajack@gmail.com>

  • attachment isotest.py added.

test script

02/08/07 16:05:37 changed by Jack Moffitt <metajack@gmail.com>

  • attachment combined.diff added.

updated patch; also fixes #3459

02/08/07 16:06:10 changed by Jack Moffitt <metajack@gmail.com>

Also, note that #3459 needs almost the same setup, so that's why I've attached the combined patch.

02/08/07 17:41:12 changed by Marc Fargas <telenieko@telenieko.com>

  • stage changed from Unreviewed to Design decision needed.

For the record: Discussion about that has been held: here

03/07/07 18:23:22 changed by Philippe May

This patch also resolves an important bug for me: i have an application with models imported from python packages, not directly from the models.py file. Without the patch, postgres raises an error (ERROR: the relation «app_rel» doesn't exist) and the transaction is still pending, which breaks the normal execution when creating the databases (with syncdb). Hope that's clear enough... I'm not a postgres guru, so i haven't figured out the details. Contact me if interested.

Guess it's not worth filling a new ticket for my bug when a patch is ready.

08/01/07 05:18:22 changed by Nis Jørgensen <nis@superlativ.dk>

I have run into this problem as well, and fixed it by doing the exact code change you did.

This caused problems when I started to use fixtures (forward references) in my tests - I believe there are currently problems with the transaction management for those which are exposed by the fix.

I am sorry that I cannot be more specific at the moment, just thought I'd log this as something to investigate before applying the patch.

08/01/07 10:26:01 changed by mir@noris.de

#5043 marked as duplicate of this ticket.

08/02/07 18:51:22 changed by anonymous

  • cc set to sam@robots.org.uk.

08/03/07 16:03:29 changed by John Shaffer <jshaffer2112@gmail.com>

  • attachment psycopg2-commit-mode-r5790.diff added.

Updated patch for [5790]. The patch causes problems with fixtures and causes some of Django's tests to fail.

11/14/07 15:29:01 changed by jarek.zgoda@gmail.com

  • cc changed from sam@robots.org.uk to sam@robots.org.uk, jarek.zgoda@gmail.com.

11/19/07 12:34:41 changed by kmishler@arinc.com

We applied this patch for our application. It fixed a problem we were having where if an insert, update or delete threw an exception, any subsequent database operation would fail until django was restarted. We fixed the problem with fixtures by adding a line to core/management/commands/loaddata.py:

transaction.connection.connection.set_isolation_level(1)

It fixed the django unit tests for our application.

11/20/07 04:01:58 changed by mir

  • needs_better_patch set to 1.

The problem is that the patch turns every single command in a transaction for itself (this happens in autocommit without an explicit COMMIT). The new behaviour is incompatible with the old one and e.g. breaks forward references in the serialization tests, as noted by John Shaffer with the most recent patch.

I checked "patch needs improvement" since it breaks tests.

12/01/07 08:45:25 changed by jacob

  • status changed from new to closed.
  • resolution set to duplicate.

This is a situation where different aplications call for different settings, so hardcoding something isn't a great idea. I think, then, this is a situation that could be solved by #6064.

12/01/07 10:47:40 changed by Jack Moffitt <metajack@gmail.com>

I see no use case for an application that needs behavior different than this. In this patch, you always know exactly when and how you are in a transaction, and you can choose whether your code executes in one or not. The old way does not allow this flexibility and always puts you in a transaction, which for 90% of the most common queries wastes a lot of database cycles. Any site running stock django with postgresql will be crippled by this when it starts to get any load.

This is not something that is solved by #6064, or at least I think doing it that way would be more of a hack. Solving it by setting the default isolation level consistently means that everyone will understand exactly how the database is getting used. It won't be broken for most people until they figure out there is some special initializer that magically fixes things by changing how all the isolation levels work. I suppose you could make the default initializer do this, but note that this is a lot more technical and low level that making sure the time zone is correct or setting how postgres returns the date, etc. There's a reason that this functionality is provided _at the wrapper layer_ and the other is provided in the SQL language.

We have at least 3 reports here of people who need this functionality, and so far, no reports or explanation on how this functionality would be bad. Please reconsider this patch.

12/01/07 10:53:41 changed by Jack Moffitt <metajack@gmail.com>

  • status changed from closed to reopened.
  • resolution deleted.

I'm reopening because reading #6064 it is clear this is not a duplicate. That functionality is about initial SQL, and this functionality can not be changed by SQL alone. We are changing the Python wrapper's functionality here, not the database's.

There is nothing changing on the postgresql side. The default python DB-API functionality is fine for easy of use, but is not what you want in a system like Django, as it is extremely wasteful of transactions in the common case. Note that this is specified by the DB-API spec, and will probably affect every database with transactions. Each wrapper is going to have its own way to turn this off I imagine.

(follow-up: ↓ 16 ) 12/01/07 15:03:17 changed by jacob

  • status changed from reopened to closed.
  • resolution set to duplicate.

Jack, it can be changed by inital SQL -- "SET TRANSACTION ISOLATION LEVEL <foo>". If that's wrong, #6064 just needs to allow custom *python* (as well as SQL) to run upon database creation. We need to fix the general problem, not individual specific ones.

12/01/07 15:26:56 changed by Jack Moffitt <metajack@gmail.com>

There is no autocommit isolation level in SQL. If you intend #6064 to have custom python initializers per backend, then it will do what we want, but this is still quite a different thing than the tasks that custom sql initializers will be doing in general. So #6064 needs python + sane defaults (like setting autocommit by default + some way that the first time you override this to change the timezone or date style you don't completely change how the whole transaction system works by accident.

02/29/08 09:35:27 changed by Timothée Peignier <tim@tryphon.org>

  • attachment 3460-autocommit.diff added.

Change isolation level without breaking fixtures and tests

(in reply to: ↑ 14 ) 06/11/08 07:42:11 changed by Jack Moffitt <metajack@gmail.com>

  • status changed from closed to reopened.
  • resolution deleted.

Replying to jacob:

Jack, it can be changed by inital SQL -- "SET TRANSACTION ISOLATION LEVEL <foo>". If that's wrong, #6064 just needs to allow custom *python* (as well as SQL) to run upon database creation. We need to fix the general problem, not individual specific ones.

So the default Django will be really inefficient, unless the user knows the magic incantation of custom python for the database adapter they are using? This seems a poor solution. There is nothing in #6064 that recommends allowing custom python, nor is there anything in #6064 that would use such a setting for fixing the enormous problems with this poor use of DB-API.

People are still adding patches to this bug, so it is clearly still an issue for others.

I cannot see a reason why you keep closing this bug. Does it break something? The patch seems to work quite well for others. Incremental improvement now is much better than waiting another year for a perfect solution.

06/12/08 01:08:23 changed by cgrady

  • attachment autocommit-7624.patch added.

added similar fix to postgresql backend for psycopg1

07/02/08 11:21:34 changed by cpinto

I'm not sure #6064 is the patch for this. It works with connections while this patch works with cursors which are not the same thing. For example, with the connection_created signal how would you set a different timezone on a new cursor?

Since this appears to be a specific issue with pgsql and #6064 cannot be used to solve this particular issue (not without refactoring other stuff, starting with the signal itself that would either needed to be changed to cursor_created and dispatched on cursor creation or adding a new signal that served this purpose), why is this even marked as duplicate?


Add/Change #3460 (postgresql_psycopg2 backend uses wrong isolation level)




Change Properties
Action