Code

Opened 6 years ago

Closed 3 years ago

Last modified 2 years ago

#9964 closed Uncategorized (fixed)

Transaction middleware closes the transaction only when it's marked as dirty

Reported by: ishirav Owned by: mtredinnick
Component: Database layer (models, ORM) Version: 1.0-beta
Severity: Normal Keywords: transactions
Cc: jdunck@…, carljm, basti, shai, forest, peterbe, xof, andrewsk, danfairs, brodie Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The transaction middleware commits only dirty transactions on success, and rolls back only dirty transactions in case of an error.

This means that any transaction that only SELECTs data gets rolled back (I'm not sure by whom - the database itself, perhaps?). So this is what I see in my PostgreSQL log:

BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
SET TIME ZONE E'America/Chicago'
SELECT ...
SELECT ...
SELECT ...
ROLLBACK

The expected behavior is a COMMIT (except in case of an error).

Typically this does not cause any problems, because it affects only non-dirty transactions.

However if you use raw SQL to make database modifications, it will be rolled back unless the dirty flag gets set by some another part of your code. This can result in loss of data, and a very hard to find bug...

The correct behavior would be to always close the transaction, regardless of the dirty flag. Meaning that TransactionMiddleware.process_response should always call transaction.commit(), and TransactionMiddleware.process_exception should always call transaction.rollback()

This bug is also the cause of tickets #9763 and #9919

Attachments (10)

bench.py (2.1 KB) - added by shai 5 years ago.
rudimentary benchmarking code
9964-against-10052.patch (27.7 KB) - added by shai 5 years ago.
not-yet-ready patch for comments
9964-against-10087.diff (35.3 KB) - added by shai 5 years ago.
Improved patch, tested (tests included)
9964-make-managed-transactions-dirty.patch (6.9 KB) - added by shai 4 years ago.
New patch for marking transactions dirty, against 14579
9964-make-managed-transactions-dirty-v2.patch (8.9 KB) - added by shai 4 years ago.
improved patch: more robust detection of use (with added test), more tests fixed, against 14617
9964-make-managed-transactions-dirty-v2-full.patch (11.2 KB) - added by shai 4 years ago.
added release-notes text and test for related bug 6669
9964-make-managed-transactions-dirty-v3-full.patch (19.0 KB) - added by shai 4 years ago.
Use deprecation warnings instead of errors for transactions left pending
9964-r14913-bugfix.patch (11.0 KB) - added by shai 4 years ago.
patch against r14913: bugfix approach
9964-r14913-deprecation.patch (18.9 KB) - added by shai 4 years ago.
patch against r14913: deprecation approach
9964-r15054-bugfix.patch (14.6 KB) - added by shai 4 years ago.
Improved patch against r15054

Download all attachments as: .zip

Change History (65)

comment:1 Changed 6 years ago by shai

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

I would like to point out two other cases where the current behavior has bad effects:

1) When data retrieval statements have side-effects. This can happen with database back-ends which support procedures (or functions) returning tabular results.

2) When the transaction isolation level is Repeatable-Read or Serializable, and the connections are pooled.

  • As long as no update is made on the specific thread, the transaction isn't closed;
  • so new HTTP requests handled by the same thread would re-use the connection and therefore, the transaction;
  • so the new requests retrieve stale data -- updated to the time when the first request was processed, because of the isolation level.

The code in SVN still contains this bug.

comment:2 Changed 6 years ago by mtredinnick

  • Owner changed from nobody to mtredinnick
  • Status changed from new to assigned

comment:3 Changed 5 years ago by jacob

  • milestone set to 1.1
  • Triage Stage changed from Unreviewed to Accepted

#9673 and #9919 are duplicates.

comment:4 Changed 5 years ago by jdunck

  • Cc jdunck@… added

comment:5 Changed 5 years ago by jdunck

The obvious question is when set_dirty should be called; the description of this ticket suggests that a transaction should be dirty even for a transaction consisting only of selects. The problem with this is that lots of transaction cycles are only selects, and committing a transaction has overhead. I think select-only transactions should rollback by default. If you're using triggers that munge data upon select, set_dirty should be manually called; this can be clearly documented.

There's some overhead and fiddliness with db.connection.cursor.execute looking at the string it receives in order to determine CUD operations. Maybe backends.DatabaseOperations should have a .is_select(sql) hook?

Otherwise it'd be a decent compromise to just clearly document more that only dirty transactions are committed, and that using cursor.execute does not mark a transaction as dirty.

Changed 5 years ago by shai

rudimentary benchmarking code

comment:6 Changed 5 years ago by shai

I suspect jdunck's description is inaccurate, and it is not committing a transaction that has an overhead, but closing the transaction. That is, for select-only transactions, whether you commit or roll back, it costs about the same; the "cheap" option is to leave the transaction be.

I ran some very basic tests on my desktop, to evaluate the costs of closing transactions. The bench.py file I just attached has functions to create connections into SQLite, MySQL, and PostgreSQL databases I had set up; to create a simple table; to fill it up with 100,000 rows; to run some (<100) selects as warm-up; and to run a big bunch of selects -- about 1450 batches of 3, each batch ending with a "transaction ending" operation. I tested for the three databases, against the three operations: no-op, rollback and commit. The results, in milliseconds, were:

Database PostgreSQL MySQL SQLite
Module psycopg2 MySQLdb sqlite3
Leave open 455-576 356-366 309-402
Rollback 569-750 395-414 297-359
Commit 573-717 397-426 298-305

These results support my suspicion, that rollback does not save much (if any) overhead, compared to commit, in select-only transactions.

Assuming this evidence is accepted, then, the only question that remains is whether Django should always close transactions. The arguments I am aware of are:

  • Leaving dangling transactions is "impolite", and in some cases can lead to data loss (lost commits as well as bad commits)
  • Closing every transaction costs. The table above shows that, for select-only transactions, this would cost about 10% performance on MySQL and about 20% on PostgreSQL (see note below for SQLite, and I have no access to Oracle or other common DBMSs at the moment). These numbers are, of course, expected to change wildly with the introduction of multi-threading, different transaction isolation levels, and realistic workloads. However, one should note that the single-threaded case tends to hide the benefits of shorter transactions; I expect the 'commit' option to look better in realistic conditions.

Notes:

  1. After preparing the databases with the functions in the file, I ran all my tests with
    $ python -m timeit -v -n 10 -r 10 -s 'from bench import *' -s 'conn=sqlite()' -s 'warm_up(conn)' 'leave_open(conn)'
    
    Replacing the connection function and the benchmarking function, of course. All the databses use the system default options; the system is 32-bit Debian sid running on a quad-core 2.4GHz cpu, with 2GHz memory, a consumer-grade 250G hard-disk, and 3 KDE sessions up (though only one is active).
  2. The SQLite results seem to indicate that having long transactions increases the performance variance; I found this surprising, and repeated the tests several times with similar results, though this is certainly not rigorous.

comment:7 Changed 5 years ago by guettli

  • Cc hv@… added

comment:8 Changed 5 years ago by carljm

  • Cc carljm added

Changed 5 years ago by shai

not-yet-ready patch for comments

comment:9 Changed 5 years ago by shai

  • Has patch set
  • Needs documentation set
  • Needs tests set
  • Patch needs improvement set

The patch I just attached is not ready for inclusion and not tested, I just want to get some feedback on the ideas:

  • Take the functions affected by 'dirty' out to separate modules, each providing a different version of them
    • fast_select.py reflects current behavior
    • safe.py always closes transactions
  • Selection between modules is done by settings.py
  • The default is fast_select (existing behavior), for backward compatibility
  • The project template is modified to select 'safe', so new projects get that by default
    • But if they need the extra performance, and are willing to pay their dues in code diligence, they can

Changed 5 years ago by shai

Improved patch, tested (tests included)

comment:10 Changed 5 years ago by shai

  • Needs tests unset
  • Patch needs improvement unset

The new patch includes tests, which it passed.

Malcolm has expressed, in the mailing list, preference for a trivial patch; however, the trivial patches considered took it upon Django to select for the users, whether they should take a performance hit (compared to 1.0), or check their code better. The attached patch leaves this choice to the users (as detailed above).

I removed the "need tests" and "need better patch" flags, because it is my opinion now that the tests and patch are good, and viable for looking into. I do not presume to take the role of a committer; I hope this is the right way to do it.

I have to add a word on testing here, but I must run to get some kids from school now.

comment:11 Changed 5 years ago by shai

Two words on testing:

A) The tests introduced are not for the transaction-policy setting per se, but rather for the transaction middleware. It had none, so I figured it could use one while I'm at it.

B) Testing this patch has to be done with two different settings files; one where the Transaction Policy was set to safe, and one where it was left as fast_select. This is because of a minor technicality(1) and a major effect.

The minor technicality doesn't really matter, because of the major effect: When you change something as fundamental as transactional behavior, it's not enough to check your own puny unit-tests; you need to verify that you haven't broken some innocent victim elsewhere. So, I had to run the whole test suite under the two policies anyway; indeed, I had to fix one of the tests(2) for it to pass.

This may look like I'm now proposing that, for all eternity, tests must be thus run twice. Not so; I hope to see, within a couple of versions, that fast_select is recommended against, then the default changed to safe, and fast_select relegated to a status of not-really-supported, much like the TRANSACTIONED_MANAGED knob is today (turning that undocumented knob in your settings file makes the test suite fail completely, as it cannot even syncdb a test database).

(1) The minor technicality is that the way the code is currently written, it modifies the transaction module's global namespace (for speed); this means it does not support changing the Transaction Policy on the fly. In practice, one does not expect this sort of setting to change in mid-operation; in a test setting, it seems to get in the way, but, as mentioned above, it doesn't really matter.

(2) That test I had to fix reveals one weak point of the 'safe' transaction policy -- if all transactions are considered dirty at all times, there's no way to know if a transaction is pending. Thus, we lose a validity check we currently have (where if a transaction is left pending, an exception is thrown). I added a rollback to the end of all transactions, as a minimal protection against such strays.

comment:12 Changed 5 years ago by mtredinnick

Impressive amount of work and analysis here. Thanks.

It'll take a couple of days to review this, although I'll start today. Just because it's a huge change that could Break Stuff(tm) if we get it wrong. But I like the approach you've taken, for the most part.

comment:13 Changed 5 years ago by jdunck

Malcolm, I know there're other things on your plate, but given your estimate here, I thought it might be good to poke this-- did you get a chance to look at it?

comment:14 Changed 5 years ago by basti

  • Cc basti added

comment:15 follow-up: Changed 5 years ago by russellm

  • milestone changed from 1.1 to 1.2

After discussing with Jacob and Malcolm, we've decided to defer this one from v1.1.

From a purely code perspective, the patches on this ticket aren't ready for trunk. In particular, having a user-space setting to determine transaction behavior is a non-starter.

From a design perspective, there are some subtle tradeoffs that need to be considered here, including weighing the cost/benefit of "always commit", and ensuring that manual transaction control will continue to work for those that need every last nanosecond of performance.

On the flipside - the existing behavior works, as long as you understand what "works" means. This existing behavior could certainly be better, though; in the 1.2 timeframe we can give this the serious discussion it requires.

comment:16 in reply to: ↑ 15 ; follow-up: Changed 5 years ago by shai

Hi,

Replying to russellm:

After discussing with Jacob and Malcolm, we've decided to defer this one from v1.1.

I understand how busy you all are; still, I hope you can find the time to respond while the topic is fresh in your heads.

having a user-space setting to determine transaction behavior is a non-starter.

Could you please elaborate on this a little? I find it confusing, as the setting activating the transaction middleware (as all middlewares) is in the same space as the setting proposed in the patch.

From a design perspective, there are some subtle tradeoffs that need to be considered here, including weighing the cost/benefit of "always commit",
and ensuring that manual transaction control will continue to work for those that need every last nanosecond of performance.

This I also find confusing, as almost everything that is non-trivial here is intended to keep the current behavior available for people who need it (and in fact, keep it through the upgrade for existing projects). Perhaps I misunderstand, but it seems like you are trying to decide on the always-close-transaction issue, when the patch jumps through hoops in order to defer this very decision -- to users for now, and to the future in general.

On the flipside - the existing behavior works, as long as you understand what "works" means.

In this case, would you consider documentation patches (as proposed in #9919)?

Thanks,

Shai.

comment:17 in reply to: ↑ 16 Changed 5 years ago by russellm

Replying to shai:

Hi,

having a user-space setting to determine transaction behavior is a non-starter.

Could you please elaborate on this a little? I find it confusing, as the setting activating the transaction middleware (as all middlewares) is in the same space as the setting proposed in the patch.

The current patch proposes to preserve the current behavior, as well as adding a new behavior that implements a new behavior. However, this isn't an area of API where the end user should be required to make a choice at all - it should Just Work (tm). The only users that will care about the different behavior are those optimizing for uber-speed, and for those users we have extensive manual controls over transactions.

From a design perspective, there are some subtle tradeoffs that need to be considered here, including weighing the cost/benefit of "always commit",
and ensuring that manual transaction control will continue to work for those that need every last nanosecond of performance.

This I also find confusing, as almost everything that is non-trivial here is intended to keep the current behavior available for people who need it (and in fact, keep it through the upgrade for existing projects). Perhaps I misunderstand, but it seems like you are trying to decide on the always-close-transaction issue, when the patch jumps through hoops in order to defer this very decision -- to users for now, and to the future in general.

Deciding to push the decision to the user isn't a decision at all. More control knobs doesn't necessarily imply better software. Part of the job of a framework is to eliminate some decisions to make sure that things always work as expected. We want to make a decision, and make sure it is the right one. Since this is a bit of an edge case, and there is a reasonable workaround, we're willing to defer that decision until we know we will get it right.

On the flipside - the existing behavior works, as long as you understand what "works" means.

In this case, would you consider documentation patches (as proposed in #9919)?

Agreed, we probably should add some docs for this. The patch on #9919 doesn't really do the job at the moment. To my mind, it makes a lot of unnecessary changes, and only gives a marginal improvement on the one area that the ticket should be addressing. I'll reopen the ticket, but a new patch will be required.

comment:18 Changed 5 years ago by shai

  • Cc shai added

Replying to russellm:

The current patch proposes to preserve the current behavior, as well as adding a new behavior that implements a new behavior. However, this isn't an area of API where the end user should be required to make a choice at all - it should Just Work (tm). The only users that will care about the different behavior are those optimizing for uber-speed, and for those users we have extensive manual controls over transactions.

I agree, and I've set the defaults accordingly. The only reason I made the choice user-visible was because I could see no other way to make it also backwards-compatible -- to avoid changing the behavior under the feet of existing projects. This is the sort of change that is bound to break some user code; in this case, you essentially argue that the user should fix their code or avoid upgrading, and I was trying to offer a smoother path.

Replying to shai:

This I also find confusing, as almost everything that is non-trivial here is intended to keep the current behavior available for people who need it (and in fact, keep it through the upgrade for existing projects). Perhaps I misunderstand, but it seems like you are trying to decide on the always-close-transaction issue, when the patch jumps through hoops in order to defer this very decision -- to users for now, and to the future in general.

Deciding to push the decision to the user isn't a decision at all. (...) We want to make a decision, and make sure it is the right one.

I mis-represented my intentions. There is a difference between "push the decision to the user" and "make a decision but allow the user to override"; I was trying to do the second. The decision I made was "no change for existing projects, always commit for new ones". Allowing override was, indeed, intended to limit the costs of a "wrong" decision (I am pretty sure I made the right choice in general, but that might not be right for some users).

Further, I believe these arguments will continue to hold after a more thorough revision. We will still want backward-compatibility, and we will still want the decision to be overridable. Granted, we might find better ways to achieve these goals; granted, it makes sense to become more confident of the decision; and granted, it can all wait to 1.2.

In this case, would you consider documentation patches (as proposed in #9919)?

Agreed, we probably should add some docs for this. (...) I'll reopen the ticket, but a new patch will be required.

I'll try to provide a patch in the next few days, if nobody beats me to it :-)

Thanks,

Shai.

comment:19 Changed 4 years ago by forest

  • Cc forest added

comment:20 Changed 4 years ago by ubernostrum

  • milestone changed from 1.2 to 1.3

A patch which wasn't ready for 1.1 and hasn't been updated since probably isn't ready for 1.2.

(also, a better solution might just be to document that if you're doing things in raw SQL you should manually inform the transaction middleware by setting the transaction to dirty)

comment:21 follow-up: Changed 4 years ago by shai

Replying to ubernostrum:

A patch which wasn't ready for 1.1 and hasn't been updated since probably isn't ready for 1.2.

A substantial part of the approach was rejected -- core developers decided that the negative
value of an added line in the default settings.py was larger than the positive value of fixing
the problematic behavior in a backwards-compatible way.

(also, a better solution might just be to document that if you're doing things in raw SQL you should manually inform the transaction middleware by setting the transaction to dirty)

This has already been done, but it only "solves" part of the problem. The part where, by default, the middleware and decorator leave the transaction pending at the end of the request is not fixed. This can still cause data loss and corruption, depending on things like transaction isolation level and support of nested transactions in the database backend. Further, those would be "phantom bugs" -- hard to reproduce, because they depend on the allocation of threads to a stream of requests.

The correct behavior, clearly, is to close all transactions, whether by commit or rollback. Current behavior is a bug which needs to be fixed.

Since transactions are deep, and changing their behavior might break user applications, I suggested (perhaps not as clearly as I should have) a three-step path towards correct behavior:

  1. Make new projects behave correctly, keep behavior for old ones, tell people they need to adapt (by adding settings.py var)
  2. Make all projects behave correctly, unless explicitly overridden (removing var from default settings.py, but still supporting both values)
  3. Cease support for the above override

In his replies above, Russel effectively says we should either jump straight to the end of the path or keep living with the bug. My reply to that went unanswered. If you'd consider a patch which just always closes transactions, just say so, and I'll provide one.

Thanks,
Shai.

comment:22 follow-up: Changed 4 years ago by seanc

I would argue that this is more than a simple cosmetic bug, and that fixing this has a huge set of benefits. A few additional notes:

1) PostgreSQL can't get rid of versions of tuples that are held inside of a read transaction (e.g. a VACUUM can't reap rows that have been read in a transaction until the transaction is closed)

2) Users of pgbouncer with their pool_mode set to 'transaction' aren't free'ing up the backend, which dramatically cuts the effectiveness of pgbouncer.

; When server connection is released back to pool:
;   session      - after client disconnects
;   transaction  - after transaction finishes
;   statement    - after statement finishes
pool_mode = transaction

3) Neither COMMIT or ROLLBACK will perform any better than the other. See the following thread for some small details.

http://www.mail-archive.com/pgsql-general@postgresql.org/msg60957.html

One important catch to keep in mind is that SELECT can call functions that modify tables. COMMITs will provide better "do what I expect" behavior, but I would argue that users of functions that modify data should be explicit in their COMMITs.

Based on what I've read in the thread above, adding three new versions of the TransactionMiddleware appears to provide the easiest path forward. "TransactionMiddlewareRollback," "TransactionMiddlewareCommit," and "TransactionMiddlewareOpen". Copy TransactionMiddleware to TransactionMiddlewareOpen and have TransactionMiddleware issue a warning on startup that the developer should switch to Rollback or Commit (or Open to preserve the current buggy functionality). In a future version, change TransactionMiddleware to use TransactionMiddlewareRollback as the default or just remove it all together. I'd rather see an explicit behavior configured as a Middleware class vs. an entry in settings.py.

comment:23 in reply to: ↑ 22 ; follow-ups: Changed 4 years ago by shai

Replying to seanc:

I would argue that this is more than a simple cosmetic bug

Has anyone argued otherwise?

3) Neither COMMIT or ROLLBACK will perform any better than the other. See the following thread for some small details.

http://www.mail-archive.com/pgsql-general@postgresql.org/msg60957.html

While this is an important data-point, PG isn't the only database engine that should be considered. Just saying.

One important catch to keep in mind is that SELECT can call functions that modify tables. COMMITs will provide better "do what I expect" behavior, but I would argue that users of functions that modify data should be explicit in their COMMITs.

I would argue that in general, people who do this deserve whatever results they get. If you're doing it for auditing (i.e. you need to register the fact that a select has been executed), then you need to have the auditing records committed whether the main transaction commits or rolls back. I can see no other valid use-case for a table-modifying select.

Based on what I've read in the thread above, adding three new versions of the TransactionMiddleware appears to provide the easiest path forward. "TransactionMiddlewareRollback," "TransactionMiddlewareCommit," and "TransactionMiddlewareOpen". Copy TransactionMiddleware to TransactionMiddlewareOpen and have TransactionMiddleware issue a warning on startup that the developer should switch to Rollback or Commit (or Open to preserve the current buggy functionality). In a future version, change TransactionMiddleware to use TransactionMiddlewareRollback as the default or just remove it all together. I'd rather see an explicit behavior configured as a Middleware class vs. an entry in settings.py.

You'll need something similar for the decorators, and once it's not a one-line-in-settings.py change, it's ugly.

I also repeat the observation that Middleware classes are activated in settings.py. How, then, is the class selection an improvement over some other setting?

comment:24 in reply to: ↑ 23 Changed 4 years ago by seanc

Replying to shai:

Replying to seanc:

I would argue that this is more than a simple cosmetic bug

Has anyone argued otherwise?

Certainly not you, but the ticket has been open for over 18mo now (opened 01/05/09). My intention was to throw out additional reasons for why this is an important bug that needs to be fixed and a decision needs to be made (one way or another) for how to handle the backwards incompatible behavior change.

Your patch looks good to me and I only have a few suggestions/comments. FWIW, I think this bug is the reason Johnny Cache is incompatible with the TransactionManager(1).

3) Neither COMMIT or ROLLBACK will perform any better than the other. See the following thread for some small details. http://www.mail-archive.com/pgsql-general@postgresql.org/msg60957.html

While this is an important data-point, PG isn't the only database engine that should be considered. Just saying.

Agreed (who uses MySQL anymore, anyway? I mean, really. The 90's called and they want their database back. j/k... kinda. *grin*).

I joke, but this design decision is rooted in history and was almost certainly performance related. Transactions are expensive to enter and exit (synchronization required to get the transaction ID for data visibility) and executing individual statements is expensive because every statement is wrapped in its own transaction. PostgreSQL, in particular, used to be excruciatingly slow at this, but PostgreSQL has improved >5x in this dept(a) and MySQL has made reasonable performance improvements as well(b), so making this change to COMMIT or ROLLBACK at the end of a web request isn't costly like it used to be. Given the popularity of pgbouncer in the PostgreSQL world, one could probably make a case rooted in data that leaving transactions open has a direct negative performance impact.

(a) http://suckit.blog.hu/2009/09/26/postgresql_history

(b) http://suckit.blog.hu/2009/10/03/mysql_history

(Performance related feature request that I'll throw out while this area of code is under examination: it would be fantastic if there was some database agnostic decorator that could be used to change PostgreSQL's synchronous_commit on a per-webrequest basis.

http://www.postgresql.org/docs/8.4/static/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT

e.g. "SET LOCAL synchronous_commit TO ON;" at the beginning of a transaction to provide strong data guarantees vs OFF, which is probably the default for most PostgreSQL installations)

Based on what I've read in the thread above, adding three new versions of the TransactionMiddleware appears to provide the easiest path forward. "TransactionMiddlewareRollback," "TransactionMiddlewareCommit," and "TransactionMiddlewareOpen". Copy TransactionMiddleware to TransactionMiddlewareOpen and have TransactionMiddleware issue a warning on startup that the developer should switch to Rollback or Commit (or Open to preserve the current buggy functionality). In a future version, change TransactionMiddleware to use TransactionMiddlewareRollback as the default or just remove it all together. I'd rather see an explicit behavior configured as a Middleware class vs. an entry in settings.py.

You'll need something similar for the decorators, and once it's not a one-line-in-settings.py change, it's ugly.

With settings.py being loaded first, I don't understand why the API for transactions would need to change. Unless there's some subtlety that I'm missing, having different flavors of TransactionManager* would still result in the same @transaction decorator. Having TransactionManager* be a facade that sets a tunable would be acceptable, but I think sticking with configuring this based on different TransactionManager* classes is the best way to go. To the extent that there's an API, it shouldn't need to change from a user's perspective.

from django.db import transaction

@transaction.commit_on_success
@transaction.commit_manually
  transaction.rollback()
  transaction.save()

A few comments on patch 9964-against-10087.diff:

In django/db/transaction/safe.py, leave_transaction_management() should be specific to the TransactionManager{Commit|Open|Rollback}. In the *Commit case, this would obviously execute the appropriate COMMIT.

I see that the patch already provides this functionality via django/db/transaction/init.py, lines 69-86.

django/conf/global_settings.py, the tunable values for TRANSACTION_POLICY seem ambiguous. Would propose the following three strings:

# Current behavior, highly discouraged (prevents data from being vacuumed in Pg and breaks pgbouncer's pool_transaction)
# Enabled by using 'TransactionManagerOpen'
TRANSACTION_POLICY = 'legacy_leave_open'

# COMMITs the transaction at the end of the HTTP request unless unhandled error (required for johnnycache)
# Enabled by using 'TransactionManagerCommit'
TRANSACTION_POLICY = 'commit'

# ROLLBACK the transaction at the end of the HTTP request
# Enabled by using 'TransactionManagerRollback'
TRANSACTION_POLICY = 'rollback'

Would it be possible to have a decorator that would change this on a per-function basis? The naming "*_on_success()" is inaccurate. Something like:

@transaction.commit_on_cleanup(True)
@transaction.rollback_on_cleanup(True)

would be a very useful addition to the TransactionManager*, IMHO.

I also repeat the observation that Middleware classes are activated in settings.py. How, then, is the class selection an improvement over some other setting?

I believe it to be the least backwards incompatible change. Johnny-cache requires COMMITs and in the event that django and the database get out of sync with regards to a dirty transaction, *I* would rather have django roll things forward vs roll things back (again, per-site policy). Users already add TransactionManager to enable this functionality and given this is a fundamental change to the data visibility/integrity for their application, keeping the point of entry the same seems prudent. Having this be configured by some random tunable doesn't seem appropriate compared to requiring users to make an explicit tweak to TransactionManager that signifies understanding of the problem and the desired outcome. There is nothing more important to an application than its data integrity.

Also, after thinking about this some more (and realizing that the reason I'm staring at this bug is because TransactionManager is incompatible with JohnnyCache(2)), having TransactionMiddleware default to TransactionManagerCommit will probably be more common than *Rollback, but again, I think this is a fundamental app-specific tweak that should be made on a site-by-site basis.

(1) http://bitbucket.org/jmoiron/johnny-cache/issue/17/johny-cache-incompatible-w

(2) Changes from LocalStore aren't pushed to QueryCacheBackend because the commit handlers were never executed.

comment:25 follow-up: Changed 4 years ago by seanc

Thanks to @jmoiron for providing the following snippet which solved my johnny-cache problem and lets me continue to use TransactionMiddleware. shai, would be interested in any comments that you have regarding this, but for the time being, I think I'm set. I think the default behavior at this point in time should be COMMIT, not ROLLBACK (both for performance reasons for MySQL and for Johnny-cache reasons). The web request was successful, therefore the database should match the HTTP state.

# Use the following in settings.py
MIDDLEWARE_CLASSES = (
    'lib.transaction_middleware.TransactionMiddlewareCommit',
)
# And transaction_manager.py
from django.middleware.transaction import TransactionMiddleware
from django.db import transaction

class TransactionMiddlewareCommit(TransactionMiddleware):
    def process_response(self, request, response):
        if transaction.is_managed():
            try: transaction.commit()
            except: pass
            transaction.leave_transaction_management()
            return response

class TransactionMiddlewareOpen(TransactionMiddleware):
    def process_response(self, request, response):
        return super(TransactionMiddleware, self).process_response(req, request, response)

class TransactionMiddlewareRollback(TransactionMiddleware):
    def process_response(self, request, response):
        if transaction.is_managed():
            try: transaction.commit()
            except: pass
            transaction.leave_transaction_management()
            return response

comment:26 in reply to: ↑ 25 Changed 4 years ago by shai

Replying to seanc:

shai, would be interested in any comments that you have regarding this, but for the time being, I think I'm set.

I will try to post some coherent thoughts later; in the meantime, I just want to note that the classes other than commit here have some problems:


class TransactionMiddlewareOpen(TransactionMiddleware):
    def process_response(self, request, response):
        return super(TransactionMiddleware, self).process_response(req, request, response)

The first parameter, req, is redundant and breaks the call.

class TransactionMiddlewareRollback(TransactionMiddleware):
    def process_response(self, request, response):
        if transaction.is_managed():
            try: transaction.commit()

should be transaction.rollback(), I think...

            except: pass
            transaction.leave_transaction_management()
            return response

comment:27 Changed 4 years ago by seanc

Correct. I included the other two for completeness based on my previous comments, but dashed those out here in the comments editor and didn't test anything but *Commit. Updated version suitable for copy/paste by someone reading this ticket.

# And transaction_manager.py
from django.middleware.transaction import TransactionMiddleware
from django.db import transaction

class TransactionMiddlewareCommit(TransactionMiddleware):
    def process_response(self, request, response):
        if transaction.is_managed():
            try: transaction.commit()
            except: pass
            transaction.leave_transaction_management()
            return response

class TransactionMiddlewareOpen(TransactionMiddleware):
    def process_response(self, request, response):
        return super(TransactionMiddleware, self).process_response(request, response)

class TransactionMiddlewareRollback(TransactionMiddleware):
    def process_response(self, request, response):
        if transaction.is_managed():
            try: transaction.rollback()
            except: pass
            transaction.leave_transaction_management()
            return response

comment:28 Changed 4 years ago by peterbe

  • Cc peterbe added

comment:29 in reply to: ↑ 23 Changed 4 years ago by Xof

Replying to shai:

I can see no other valid use-case for a table-modifying select.

It's extremely common to have table-modifying SPs invoked via a SELECT in PostgreSQL, since there is no top-level invoke-SP operation.

comment:30 Changed 4 years ago by Xof

  • Cc xof added

comment:31 follow-up: Changed 4 years ago by Xof

I'd like to offer an opinion on this issue:

The current behavior is not a documented, reliable behavior that we need to continue to support.

Right now, the current behavior when running with default settings in transaction management on PostgreSQL using psycopg2 is that a transaction gets opened the first time you touch the database. If Django is aware you've modified the database, you get a COMMIT on normal return, ROLLBACK on exception. If you either didn't modify the database, or did but didn't tell Django, you get an open transaction that lasts until some indefinite point in the future. In fact, the next request on the same thread can dive into your same transaction, which strikes me as very buggy.

Relying on this behavior is like relying on a particular pattern of uninitialized memory; I don't see any virtue in maintaining it.

Thus, I'd suggest that simply enforcing a "commit on return, rollback on exception" throughout the transaction management functionality (middleware, decorators) is the best and easiest answer.

comment:32 in reply to: ↑ 31 Changed 4 years ago by shai

Replying to Xof:

The current behavior is not a documented, reliable behavior

This is not exactly accurate; the current behavior w.r.t. "modifying without telling Django" is in fact documented. Granted, this is one piece of documentation I will be all for removing.

behavior that we need to continue to support.

The change has performance implications. Of course, performance is not a "documented behavior", but it is not behavior you're free to modify at will either. I grant you that the performance implications are unclear, and doing the right thing may actually perform better in most places where it matters (though evidence for this should be provided); I also agree that, if forced to choose, the correctness considerations outweigh the performance issues here by far.

I was just not convinced that forcing this choice is the best option.

comment:33 follow-up: Changed 4 years ago by andrewsk

  • Cc andrewsk added

comment:34 Changed 4 years ago by Tuxie

I'm not a Django user but I am a PostgreSQL DBA and I must say that I'm fascinated by how you have handled this bug. Still open after 11 months because people are actually defending leaving dangling transactions? Just wow!

These dangling transactions will prevent the autovacuum daemon from doing its job, writes will slow to a crawl after a while and tables will just grow and grow, never being able to reuse the disk space after previous deletes and updates.

Always COMMIT, ROLLBACK or disconnect after starting a transaction or never open the transaction in the first place. The current behaviour is not an optimization, it has serious performance implications for apps with any amount of regular writes, so bad it practically makes Django+PostgreSQL unusable for any long running dynamic applications of non-trivial size.

comment:35 Changed 4 years ago by danfairs

  • Cc danfairs added

comment:36 in reply to: ↑ 33 Changed 4 years ago by linuxpoet

Replying to andrewsk:

Replying to andrewsk:

Perhaps I can help. It seems people have made this a lot more complicated than it needs to be. There really is zero argument here on how this should work. The only argument is implementation.

If you open a transaction to PostgreSQL, you must close that transaction when you are done with it. Either via commit or rollback. In short, if your code returns success, commit. If the code returns an exception rollback. The only excuse to have an idle transaction is if you are processing data within your user layer to send back via the same connection. That type of idle transaction is short lived and follows the same semantics of success/commit, exception/rollback.

Anything else is broken. Period. Talk of backward compatibility for a broken implementation is ridiculous.

comment:37 in reply to: ↑ 21 Changed 4 years ago by shai

Jeez, people. What's required here is a design decision by core committers, which I called for more than 7 months ago:

Since transactions are deep, and changing their behavior might break user applications, I suggested (...) a three-step path towards correct behavior (...) If you'd consider a patch which just always closes transactions, just say so, and I'll provide one.

The expressions of disbelief and ridicule, accompanied by repetition of obviously correct statements about the right way to manage transactions, are not helpful.

comment:38 Changed 4 years ago by shai

After looking into other stuff, and then coming back to this, I realized I was sorely mistaken in the analysis above.

In fact, Django closes all the connections (at least, all the connections it manages) at the end of every request (this is done with a signal-handler that is defined at django/db/init.py circa line 85). Thus, no performance is gained by leaving transactions dangling, and there is no danger of phantom bugs with a transaction being stretched across requests. The problem starts and almost ends, with the cases that have actually been reported -- using raw sql does not mark a transaction as dirty when it should. "Almost", because all that's left is the non-issue of read-only transactions which are being rolled back rather than committed.

I'm attaching a new patch. The new approach is: Whenever anybody touches a cursor in a managed transaction, that transaction becomes dirty. It is not entirely backwards-compatible: Some user code which used manual commit will break, because leaving a read-only transaction pending is no longer considered acceptable. This was, in fact, the case with one of the fixtures_regress tests (a fix for the test is included in the patch). The patch also removes the note about needing to set_dirty with raw sql from the documentation.

Your criticism is welcome.

Changed 4 years ago by shai

New patch for marking transactions dirty, against 14579

Changed 4 years ago by shai

improved patch: more robust detection of use (with added test), more tests fixed, against 14617

Changed 4 years ago by shai

added release-notes text and test for related bug 6669

comment:39 Changed 4 years ago by shai

  • Needs documentation unset

The patch I just uploaded adds to the previous version only documentation (text for release notes) and a single test. As far as I'm concerned, and pending comments saying otherwise, this is now ready for checkin.

Changed 4 years ago by shai

Use deprecation warnings instead of errors for transactions left pending

comment:40 Changed 4 years ago by shai

The last patch (v3) includes responses to comments by Russell on the mailing list (thread, specific message). In particular,

1) Why introduce a full class wrapper with a customizable on_use()
handler when there is only one meaningful course of action -- if
managed, set dirty. Why not just embed that logic direct into Django's
cursor (which is itself a wrapper anyway)?

I responded in the list that I cannot find a "Django's cursor", just a set of independent cursor wrappers, and that's why I believe a wrapper is necessary. I intended to embed the if-managed-set-dirty logic in that wrapper, but then realized that this leads to circular import: django.db.transaction imports (connections) from django.db (__init__.py), which loads a backend, which imports BaseDatabaseWrapper from django.db.backends; BaseDatabaseWrapper now depends on the cursor wrapper, which, if the logic is embedded in it, would need to import django.db.transaction. While circular imports can work, I think avoiding one justifies the slight complication of the implementation. The result being a little more general isn't a con either, in my opinion.

We could make this change under two circumstances. Either:

a) We need to find a way to gracefully introduce this change, raising

a PendingDeprecationWarning in 1.3, a DeprecationWarning in 1.4, and
finally making the change in 1.5; but if the user opts-in, the warning
will be silenced. A setting isn't the right way to handle this,
because settings apply across an entire project, but different
reusable apps may be in different states of compliance.

The new patch does just this: It keeps the basic mechanism of transaction management using the current semantics for "read-only" transactions, and runs in parallel a new semantics where a transaction becomes dirty with any operation (I say "read-only" in quotes, because this includes database-modifying actions which Django isn't aware of, such as raw sql execution). The new semantics is used for two things:

  • The patch introduces a ReadonlyTransactionWarning, which is derived (for now) from PendingDeprecationWarning. This warning is issued when the new semantics would cause behavior different from current (except for the point below): When transaction.is_dirty() would return a different value, and when a "read-only" transaction is left pending at the end of a transaction management block.
  • TransactionMiddleware and the commit_on_success decorator/context-manager use the new semantics to decide if a transaction is dirty.

The net result is that code under commit_on_success is now guaranteed to close transactions, and code under manual management leaving "read-only" transactions pending gets a graceful deprecation. The only code this change should break is code which relies on "read-only" transactions under commit_on_success to be left pending; I argue (following Christophe Pettus) that such code is buggy.

b) We need to declare that the current behavior to be a bug. We can

break backwards compatibility to correct behavior that is clearly
wrong. I haven't fully thought through the consequences here, but I
think the combination of the footprint of affected cases, combined
with the side effects of having dangling connections and transactions
might almost be enough to convince me that can invoke the bug clause
to break backwards compatibility. If you (or anyone else) has any
opinions on this, I'd be interested in hearing them.

I still think current behavior is a bug; even if we limit attention only to the cases where a "read-only" transaction runs under completely manual control, it still allows for a "transaction leak": a function which purports to be its own transaction management block, but in fact its transactional behavior depends on what happens after it ends. The v2 patch is also simpler, because it doesn't need to modify the commit_on_success behavior, and doesn't need to run parallel semantics; there's less room for error there.

But now, committers have a choice between Russell's a) and b).

Thanks,
Shai.

comment:41 Changed 4 years ago by brodie

  • Cc brodie added

Changed 4 years ago by shai

patch against r14913: bugfix approach

Changed 4 years ago by shai

patch against r14913: deprecation approach

comment:42 Changed 4 years ago by shai

Hi,

I just updated both patches -- the bugfix approach (earlier version was named "v2") and the deprecation approach ("v3") -- to apply cleanly against r14913. In view of today's post on the Django Blog, I really want to get these fixes in this time.

Should we change the triage stage to "Design decision needed"? On one hand, "Accepted" is more advanced; on the other hand, someone needs to pick one of the patches...

Until a committer can make this choice, I urge all interested parties to review the patches. If we aren't stepping back in the triaging process, we should step forward, and this requires someone -- not a committer, but not me (as the patch author), to say the patch is ready for checkin. Both patches include tests and documentation.

Thanks,
Shai.

comment:43 Changed 4 years ago by jacob

Hi Shai --

As I said on the mailing list (http://groups.google.com/group/django-developers/msg/3fe71024db4d32f6), let's go with the "bugfix" approach. I'm willing to deal with the backwards-compatibility concerns from anyone who's for some reason relying on read-only connections being left hanging.

There are still a few problems with the current patch though:

First, I'm deeply unhappy with the cursor wrapper approach. We're already wrapping the backend cursor objects, so this additional layer means there's at least two getattr calls every time someone calls execute(). That's an unacceptable overhead both in terms of time and code complexity.

I have the feeling that this whole thing can be greatly cleaned up since we've decided to basically unconditionally commit() automatically opened transactions. Perhaps we can just do it in BaseDatabaseWrapper.close()? I haven't thought that approach through entirely, so perhaps I'm missing something.

But I *really* want a simpler approach here that doesn't introduce another layer of indirection.

Second, the tests need some comments or explanation of what they're doing. There isn't a single comment or docstring in that file, and things like Test9964.test_6669() are terrible names for a test function. Someone needs to spend some time explaining what the tests are actually testing.

I'll try to work on this stuff myself, too, but as you've been thinking about this for a lot longer you can probably do it more efficiently than me.

Thanks for your good work so far. If we can get this nailed down pretty soon I'll make sure it gets into 1.3.

comment:44 Changed 4 years ago by shai

Hi Jacob,

Thanks for your attention and comments. I'll make some fixes later today or through the weekend, but there's just one thing I'd like to have feedback for before I do: As I've noted, while the backend cursors are being wrapped, they are each being wrapped completely separately. I don't see a way for correct behavior without being notified of cursor activity. So the alternatives I see are:

  • A wrapper as in the patch
  • A (essentially duplicate) fix in every cursor-wrapper
  • A mixin (equivalently, metaclass, or class decorator, 2.4 notwithstanding) that cursor wrappers will now be required to inherit; still requires the fix in every cursor-wrapper (including those, like MSSQL, which are not in the Django code base).

Of these, I think the wrapper is the simplest in terms of code complexity and managability. In terms of performance, since we're talking about I/O, I think the overhead should be acceptable.

What am I missing?

Thanks again,

Shai.

comment:45 follow-up: Changed 4 years ago by jacob

What I'm saying is that I think you don't need to wrap the cursor at all. You're trying to introduce a general mechanism -- notification when a cursor is used -- which I don't think is necessary here. We just need a specific mechanism to commit un-committed connections upon close. I can think of a few other places you could be committing the transaction -- connection.close() would be my first try -- without the need for this additional layer of complexity.

comment:46 in reply to: ↑ 45 Changed 4 years ago by shai

Replying to jacob:

We just need a specific mechanism to commit un-committed connections upon close.

I beg to differ.

  • Under manual control, we don't want to commit un-committed transactions -- we want to throw a TransactionManagementError like we always did.
  • Under commit_on_success, we don't want to wait for the connection to close -- we want to commit or rollback as soon as we leave the function. The user may still perform other operations on this connection, and they should go in separate transactions (I'm ignoring nested transaction management blocks here for simplicity).

Unless I'm missing something, these two points mean that whatever we do has to take care of what happens whenever a transaction management block ends. Considering also the possibility of more than one transaction, I think the only way to go is to make sure transactions are marked dirty when they should be.

This still leaves an option that uses no notification: Just consider the connection to be always in "dirty" state, so action is taken whenever a transaction management block ends. I went this road in my first patches; it has two ill effects.

  • One is, again, it loses the enforcement of correct transaction management under manual control, an important development aid;
  • The second is that we must choose, at the closing of the connection, between closing the transaction and not closing it. Because of the first problem, not closing allows manually-controlled transactions to be left dangling, whereas closing means that correctly-managed transactions will be closed twice -- and that second closing is a DB round-trip.

Thus, I see no viable alternative to notifying the transaction-management mechanism whenever a cursor is used.

Changed 4 years ago by shai

Improved patch against r15054

comment:47 Changed 4 years ago by shai

I uploaded a new patch:

  • Updated against r15054
  • The tests now have comments and docstrings per Jacob's comments.

I have not made changes to the main code yet.

comment:48 Changed 4 years ago by shai

(in case my last comment wasn't clear: I haven't made changes to the main code because I hope my arguments above convince Jacob and Russell that such changes are not necessary.

comment:49 Changed 3 years ago by russellm

  • Triage Stage changed from Accepted to Ready for checkin

I've spoken with Jacob, and while we both agree that this patch isn't perfect (i.e., the extra level of abstraction bothers us), the pragmatic approach dictates that it's better to resolve the issue with less than perfect code than have this problem persist.

Marking RFC so we can get this in for 1.3.

comment:50 Changed 3 years ago by russellm

In [15492]:

Changeset r15232 refactored transactions so that all transaction state is maintained on the connection. This changeset continues that work, moving all transaction control to the connection, too. The transaction control functions in django.db.transaction are left as a generic way to easily apply a transaction control function based on a DB alias. Refs #9964.

comment:51 Changed 3 years ago by russellm

  • Resolution set to fixed
  • Status changed from assigned to closed

In [15493]:

Fixed #9964 -- Ensure that all database operations make transactions dirty, not just write operations. Many thanks to Shai Berger for his work and persistence on this issue.

This is BACKWARDS INCOMPATIBLE for anyone relying on the current behavior that allows manually managed read-only transactions to be left dangling without a manual commit or rollback.

comment:52 Changed 3 years ago by shai

Thanks a lot for the commit and kind words. Please note that, as the tests in that patch now pass -- specifically, transactions_regress.tests.test_failing_query_transaction_closed -- this also fixes #6669.

comment:53 Changed 3 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:54 Changed 3 years ago by guettli

  • Cc hv@… removed
  • Easy pickings unset
  • Severity set to Normal
  • Type set to Uncategorized
  • UI/UX unset

comment:55 Changed 2 years ago by aaugustin

In [17368]:

Fixed #6669 -- Ensured database connections are marked as dirty by CursorDebugWrapper.execute/executemany. Refs #9964. Thanks james at 10gic net for the report and Claude Paroz for the patch.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.