Code

Opened 8 years ago

Closed 16 months ago

#2227 closed New feature (fixed)

transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True)

Reported by: mderk@… Owned by: aaugustin
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: transaction, nest, scope, nested, rollback, commit
Cc: md@…, research@…, hv@…, aball@…, forest, wdoekes, etianen, vlastimil.zima@…, stevejalim, ionel.mc@…, mmitar@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

As I see in the transaction.py, commit()/rollback() call connection._commit()/connection._rollback() explicitly.
It breaks the case when a function with commit_manually or commit_on_success decorator calls another function decorated the same way (enters the transaction management).

Consider following scenario:

# a bank account class
class Account:
    @transaction.commit_on_success
    def withdraw(self, sum):
    # code to withdraw money from the account 

    @transaction.commit_on_success
    def add(self, sum):
    # code to add money to the account

# somewhere else
@transaction.commit_on_success
def makeMoneyTransaction(from_account, to_account, sum):
    # this code will call commit on success
    from_account.withdraw(sum)

    # this code may raise an exception, and if it raises, 
    # commit won't be called in Account.add() and makeMoneyTransaction()
    # But money had been already withdrawn, and the from_account saved in the database.
    # The makeMoneyTransaction's transaction is inconsistent.
    # So we loose our money
    to_account.add(sum)

IMO, what the transaction.commit should do in this case is to check first whether the transaction is "nested".
In the AdoDB PHP library, each call to StartTrans either starts new transaction or increments the counter of the nested StartTrans calls. CompleteTrans either calls commit/rollback (if its StartTrans counterpart was not called inside another StartTrans/CompleteTrans block) or decrements the counter.
I rate this deficiency as major. What do you think?

Attachments (4)

django-20060703-nested_transactions.diff (4.9 KB) - added by md@… 8 years ago.
patch adding nested transactions to @commit_on_success
transaction.py.diff (6.9 KB) - added by weaver@… 6 years ago.
tests.py (2.0 KB) - added by weaver@… 6 years ago.
transactions.zip (7.9 KB) - added by weaver@… 6 years ago.

Download all attachments as: .zip

Change History (43)

Changed 8 years ago by md@…

patch adding nested transactions to @commit_on_success

comment:1 Changed 8 years ago by Maximillian Dornseif <md@…>

  • Cc md@… added
  • Summary changed from transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True) to [patch] transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True)
  • Version set to SVN

I had the same Problem: an upper layer calling a lower layer and both used transactions. The problem is, that the databases I'm aware of do not support nested transactions. I played a little with Postgres 8.x SAVEPOINTs but then opted for a simple counting solution mderk.
Attatched patch implements that for @commit_on_success. Unfortunately I included some other cruft in the patch. Only look at the patch to django/db/transaction.py or get a fixed patch at http://c0re.23.nu/c0de/misc/django-20060703-nested_transactions.diff

At least with @commit_on_success I would consider the current (pre-patch) behaviour as broken. To use @commit_on_success you must guarantee that you never directly or indirectly calll an other function decorated by @commit_on_success - something you can not do in any reasonable complex application.

comment:2 Changed 8 years ago by mderk@…

  • Type changed from enhancement to defect

Nope, the patch doesn't resolve the issue, because not the decorators should be patched, but commit() itself, because it could be called from a nested commit_manually-decorated function or whatever. I guess the commit should look at state[thread_ident][-2] (not an elegant way, but...), something like this:

def commit():
    """
    Does the commit itself and resets the dirty flag, only when the transaction is not nested.
    """
    thread_ident = thread.get_ident()
    top = state.get(thread_ident, None)
    if top and len(top) > 1 and top[-2]:
        return
    connection._commit()
    set_clean()

The behaviuor is similar to commit_unless_managed(), but it looks to the previous 'level', if possible. Of course, the this commit() should be in enter_transaction_management/leave_transaction_management block to work as expected.

I agree, that the current behaviour is broken, so classify this as a defect.

comment:3 Changed 8 years ago by anonymous

I don't agree with you here. if I write transaction.commit() or rollback() I want a commit or a rollback to happen. No magic wanted. While when I use the higher level @commit_on_success I yust want this funktion to be transaction protected.

Making commit not issue COMMIT to the database seems thereforto me a bad design choice. But what's about commit_unless_nested() or rollback_unless_nested()?

comment:4 Changed 8 years ago by mderk@…

But what's about commit_unless_nested() or rollback_unless_nested()?

The problem here is that You can call a @commit_manually-decorated function, that insist on doing commit/rollback in the function body (it will raise an exception if you don't), or any function that omits the transaction decorator syntax, but calls a commit/rollback explicitly (e.g. when only custom SQL is used in the function). So, it means that you can not safely call any code with emplicit or explicit commit/rollback in it within another transaction block. Another reason is BC - you should replace all the manual commits/rollbacks to commit_unless_nested() or rollback_unless_nested() all over the place, and to decide where it is appropriate or not.
And I don't think that You really do want a commit or a rollback to happen if You are in the nested transaction block - It would definitely break integrity of the application. The maximum the transaction.commit() shoild do in this case is to make a savepoint if available. But if the surrounding block wants to rollback, it should be able to do so (it won't be able if its transaction was already commited in the nested block).

comment:5 Changed 8 years ago by adrian

  • Status changed from new to assigned

comment:6 Changed 7 years ago by SmileyChris

  • Triage Stage changed from Unreviewed to Accepted

I guess "assigned" means accepted

comment:7 Changed 6 years ago by bear330

I got the same problem, what status about this issue?

Thanks.

comment:8 Changed 6 years ago by anonymous

No one care about this issue?

comment:9 Changed 6 years ago by mderk

I do care, but my proposal would affect BC, and nobody came with a BC-compatible alternative proposal yet.
Max

comment:10 Changed 6 years ago by anonymous

It seems like this issue wouldn't be accepted into 1.0 release,

http://groups.google.com/group/django-developers/browse_thread/thread/5ce124e7526dad

Maybe we should do it by ourself with our own decorator?

Django seems wouldn't do anything that will break BC-compatible.

comment:11 Changed 6 years ago by weaver@…

  • Summary changed from [patch] transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True) to [patch] refactored transaction.py
  • Version changed from SVN to newforms-admin
Introduction
?

I stumbled across a problem with nested transactions and created a general fix before I found 2227. At its core, the solution I developed is to change how manage interacts with enter_transaction_management. Currently, when enter_transaction_management is called, the previous state is pushed onto the stack of states. The procedure manage can change this state at any time. This solution forces the state of a transactional context to remain managed if the outer context was managed.

Basic Solution

  • Make each item on the state stack a 2-tuple of (initialized, managed). The initialized flag is used to make sure manage is only called once per block. The managed flag is used by the is_managed procedure as before.
  • Only allow manage to be called once per transaction block. This prevents flip-flopping between a managed and unmanaged state and relieves manage of the responsibility of deciding whether to commit dirty transactions or not.
  • Once a transaction block is set to managed = True, nested transaction blocks are forced to be managed = True even if manage(False) is called. This change prevents nested "autocommit" blocks from committing changes in an outer, manual transaction.
  • connection._commit and connection._rollback are only called for top-level transactions or when there is not a current transaction. commit and rollback operations are no-ops for nested transactions. This is correct for commit, but slightly unfortunate for rollback because a nested transaction block cannot roll itself back. Such a partial rollback would require first-class support for save-points.

Other Changes

  • connection._commit and connection._rollback are only called from commit and rollback respectively.

  • The state and dirty dictionaries are replaced with a threading.local object (see Compatibility).

  • The patch from 7241 has been incorporated.

  • 2304 could be fixed by adding a check for

    DISABLE_TRANSACTION_MANAGEMENT to TransactionState.really_transact.

Compatibility

With this patch applied, Django's runtests.py passes all tests. It is backward-compatible with un-nested transactions and database operations lacking a transactional context (Django, by default, autocommits).

It may not backward-compatible with:

  1. outer transactions that roll back after an inner transaction has been committed
  2. outer transactions that commit after an inner transaction has rolled back

A comments at the top of transaction.py implies that compatibility with Python 2.3 is a goal. If this is still the case, an implementation of threading.local will need to be added to django.utils since threading.local was introduced in Python 2.4. Such an implementation could use a technique similar to the current technique (i.e. a dictionary indexed by thread-id). Alternatively, the TransactionState class could be refactored to not use threading.local as long as Python 2.3 is still supported.

Attachments

A patch is attached as well as a complete copy of the updated transaction.py. I've made some test cases. The tests can be run by unpacking the attached zip file and running ./manage.py test. I've also attached test.py which has the relevant parts of the zip file for easier browsing.

comment:12 Changed 6 years ago by anonymous

  • Summary changed from [patch] refactored transaction.py to atch] transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True)

sorry, I'm new to trac. I didn't mean to change the issue name.

Changed 6 years ago by weaver@…

Changed 6 years ago by weaver@…

Changed 6 years ago by weaver@…

comment:13 Changed 6 years ago by anonymous

  • Cc research@… added

comment:14 Changed 6 years ago by anonymous

  • Keywords transaction, nest, scope added
  • Version changed from newforms-admin to SVN

comment:15 Changed 6 years ago by guettli

  • Cc hv@… added

comment:16 Changed 6 years ago by werty_

maybe it should be better to allow a parameter to the decorator that will say if one wants the method to always use a new transaction for it or work inside the above transaction if one exists (like in .NET). There are cases that you might want one behavior and cases where you might want the other.

if someone thinks where I might want to use the first case the first example that comes to mind is when i want to log in both a file and the db whenever an error appears. This works at the moment but if the changes were to be done it would no longer work like this.
can someone modify the decorator to take one parameter and decide from there how to behave?

comment:17 Changed 5 years ago by Glenn

This seems like a step in the right direction, but not being allowed to call rollback() from within a nested transaction seems like a killer. What are you supposed to do? You'd have to remember that you aren't the top-level transaction level, raise an exception instead of rolling back, and expect whoever is the top-level transaction to rollback for you once the exception propagates out.

That's very brittle. If someone accidentally eats the exception before it gets there, then the transaction might be committed, which is the worst possible failure mode.

Instead, I'd suggest:

  • when rollback is called when nested, mark the transaction failed;
  • if commit is called when not nested and the transaction was marked failed, raise an exception, because rollback should have been called instead.

This allows calling rollback() to cancel the transaction, and guarantees that the transaction will never be committed, which I think is a critical invariant.

Then, this would work (easily dropped into a decorator):

def foo():
    transaction.enter_transaction_management()
    try:
        # code

        transaction.set_dirty()
        transaction.commit()
    except:
        transaction.rollback()
        raise
    finally:
        transaction.leave_transaction_management()

def fum():
    transaction.enter_transaction_management()
    try:
        foo()
    except:
        transaction.rollback()
        raise
    finally:
        transaction.leave_transaction_management()

comment:18 Changed 5 years ago by Glenn

I didn't notice that there was already infrastructure for savepoints, which makes it very easy to do this all correctly. I don't know why that's not being used, so I guess I'll just roll my own wrappers to handle this for now.

comment:19 Changed 4 years ago by anonymous

  • Cc aball@… added

comment:20 Changed 4 years ago by forest

  • Cc forest added

comment:21 Changed 4 years ago by wdoekes

  • Cc wdoekes added

comment:22 Changed 4 years ago by adamnelson

  • Patch needs improvement set

comment:23 Changed 4 years ago by jacmkno

  • Keywords scope, nested, rollback, commit added; scope removed

I really like Glenn's approach because the lack of nested transaction management in the back-end databases is made transparent to the programmer. I think that solving this problem with database features will be a good idea when when those features are ready to fully support nested transactions. In the current implementation good documentation of transactional functions in your app is required to mantain data consitency, and Glen's approach would reduce this need significantly.

comment:24 Changed 4 years ago by Glenn

For most intents and purposes, savepoints *are* nested transactions, just presented a bit differently. I don't think there's anything actually lacking in the back-end--you just have to switch from transactions to savepoints once you're nesting. With this tucked away in a decorator (and a context manager, etc.), the difference doesn't even have to be visible to high-level code and simply be presented as nested transactions.

I've used savepoints this way for a while now, and it works extremely well. It ensures that I can call a function that accesses the database, and if it throws an exception, the database will be left where it was before I called the function. I can catch exceptions and, if appropriate, reliably recover the existing transaction. Without this, I have to carefully audit everything that might access the database, so I know whether exceptions from a function are unrecoverable and force me to abort the entire request in order to roll it back all the way. That's a terrible loss of abstraction.

comment:25 Changed 3 years ago by lrekucki

  • Severity changed from major to Normal

comment:26 Changed 3 years ago by lrekucki

  • Summary changed from atch] transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True) to transaction.commit()/rollback() should be aware of nested calls to transaction.managed(True)
  • Type changed from defect to New feature

comment:27 follow-up: Changed 3 years ago by etianen

  • Cc etianen added

I agree that nested calls to transaction management commands should transparently start using savepoints. It seems like the obvious abstraction layer.

However, for backends which do not support savepoints, what would the fallback be? The only thing I can think of is to check for nesting and only allow transactions to commit in the outer nested block. This means that different backends will behave very differently, massively breaking the abstraction.

For backwards compatibility, the current transaction handling methods will probably have to stay, as they provide a 'broken', yet consistent way of handling transactions between backends. There are workarounds for core framework features, such as #15694.

However, a new set of decorators with a slightly different API could be provided in parallel, with a signature something like this:

@transaction.commit_block(savepoint=False)
def do_something(request):
    # Do something...

This decorator (with a better name), would work like commit_on_success, but only actually perform the commit when the outer block exits. The optional savepoint=True parameter would provide more fine-grained rollback support using savepoints, but generate a warning on backends which do not support savepoints. The core framework could use this on things like the admin views, but would avoid using the savepoint support (this is for individual projects to decide).

Last edited 3 years ago by etianen (previous) (diff)

comment:28 in reply to: ↑ 27 Changed 3 years ago by Glenn

Some serious breakage on this tracker ("Submission rejected as potential spam (BlogSpam says content is spam (badip:state/blacklist.d/127.0.0.1))"); trying again...

Replying to etianen:

However, for backends which do not support savepoints, what would the fallback be? The only thing I can think of is to check for nesting and only allow transactions to commit in the outer nested block. This means that different backends will behave very differently, massively breaking the abstraction.

What backends don't support savepoints? Postgresql, SQLite, Oracle and MySQL all do. They're an essential part of transaction support; I wouldn't consider any database that doesn't support them as production quality.

This decorator (with a better name), would work like commit_on_success, but only actually perform the commit when the outer block exits. The optional savepoint=True parameter would provide more fine-grained rollback support using savepoints, but generate a warning on backends which do not support savepoints.

There's a weird aspect of savepoints to consider:

Most of the time, using or not using savepoints makes no difference to the code within the (sub-)transaction. The difference is to the calling function. If a function throws an exception, whether savepoints were in use or not determines how the *caller* needs to respond to the exception. If savepoints were used, then the caller can catch the exception normally, like any exception, and possibly swallow it and keep going. If savepoints weren't used and the caller was already in a transaction, then the caller can't do that--the exception actually means "the surrounding transaction is invalid", and you have to roll all the way back to the origin of the transaction to roll it back.

This means that, most of the time, the function itself doesn't care whether it's within a savepoint or not. It's the *caller* who cares. In other words, specifying savepoint=True or False in the decorator doesn't really make sense to me--I can think of no case where I'd ever say False.

It's reasonable to use a new decorator for this, since doing this changes the API of functions that use it, but having it default to not using savepoints seems broken and I'm not sure there's much benefit to having a parameter to disable it at all.

comment:29 Changed 3 years ago by vzima

  • Cc vlastimil.zima@… added
  • Easy pickings unset
  • UI/UX unset

I have stumbled into this problem when I tried to make post-save signal code included in same transaction as Model.save(). Problem is that in Model.save_base() a function commit_unless_managed() is called and premature commit can only be avoided by making transaction managed.

Temporarily I solved this by adding check for transaction management into save() method to prevent commit by commit_unless_managed() call

if not is_managed():
    raise TransactionManagementError("This code must be under transaction management")

But I found it better to solve this by appropriate decorator, such as this

def commit_on_success_unless_managed(using=None):
    """
    This decorator activates commit on response unless code is under transaction management.
    """
    def inner_commit_on_success(func, db=None):
        def _commit_on_success(*args, **kw):
            # Only change from original commit_on_success is on next two lines
            if is_managed(using=db):
                return func(*args, **kw)

            try:
                enter_transaction_management(using=db)
                managed(True, using=db)
                try:
                    res = func(*args, **kw)
                except:
                    # All exceptions must be handled here (even string ones).
                    if is_dirty(using=db):
                        rollback(using=db)
                    raise
                else:
                    if is_dirty(using=db):
                        try:
                            commit(using=db)
                        except:
                            rollback(using=db)
                            raise
                return res
            finally:
                leave_transaction_management(using=db)
        return wraps(func)(_commit_on_success)

    # Note that although the first argument is *called* `using`, it
    # may actually be a function; @autocommit and @autocommit('foo')
    # are both allowed forms.
    if using is None:
        using = DEFAULT_DB_ALIAS
    if callable(using):
        return inner_commit_on_success(using, DEFAULT_DB_ALIAS)
    return lambda func: inner_commit_on_success(func, using)

This is not best solution, it is just idea how to solve things without requirement for implementation of nested transactions.

comment:30 Changed 3 years ago by stevejalim

  • Cc stevejalim added

comment:31 Changed 2 years ago by IonelMaries

  • Cc ionel.mc@… added

comment:32 Changed 2 years ago by akaariai

I would like to see two new decorators, @transactional and @atomic.

@transactional will open a new transaction if there is not already one. If this is the case, then it works like @commit_on_success. If there is already a managed transaction open, the transaction will be joined and @transactional is basically a no-op.

@atomic works like @transactional if there isn't already a transaction open. However it differs from @transactional by creating a savepoint if there is already a transaction open, and of course doing a rollback-to-savepoint on exceptions.

Lets be very careful with changing the existing behavior - even small backwards incompatible changes can lead to data corruption bugs for the users.

comment:33 Changed 22 months ago by anonymous

As a naive user I'd very much like to see the save point scheme, it produces the behavior I was naively expecting.

To be clear that behavior is:

If I exception out of a commit_on_success then everything inside that gets scrubbed regardless of whether my caller was also in a commit_on_success. And if they were in a commit_on_success then without messing it up for them.

I shouldn't need to know if a function I call uses transactions inside and I shouldn't need to know if whoever called me uses transactions. Both those lead to unpleasant surprises.

It doesn't seem like it should be a big deal to implement, on entry if there's no transaction create one, otherwise drop a savepoint. On exit if we created a transaction then commit it. On exception exit roll back to the transaction/savepoint we created.

I agree with the desire to preserve backward compatibility, creating a new decorator with this behavior and providing a global flag to enable errors when nesting the current ones would address that. I also like atomic for the name of the new decorator.

comment:34 Changed 19 months ago by akaariai

Here is a patch that adds @in_tx and @atomic to transaction module. @in_tx joins existing transaction or enter managed transaction if there isn't already one, @atomic creates a transaction or a savepoint. So, nested @in_tx calls are no-ops, otherwise it works like commit_on_success, nested @atomic calls create savepoints.

Why not change commit_on_success? First, commit_on_success can't be changed due to backwards compatibility. Second, commit_on_success should commit on success.

Why both in_tx and atomic? Using savepoints everywhere will cause performance problems - quickly testing 1000 raw SQL UPDATEs takes roughly 2.5x the time if wrapped in savepoints (using postgres), and this time is spent in the backend, not in Django.

I think it would be a good idea to block using other transaction decorators or commit/rollback when inside one of the new decorators. The reason is that if you have called @atomic for example, doing a commit or rollback @atomic doesn't know about *will* break the atomicity.

I hate the idea that .commit/.rollback would begin to use savepoints. Transactions in Django are already hard to understand, now if commit or rollback would do something different than commit/rollback it would only make things worse. In addition, at this point this is a major backwards compatibility change.

If we want to support nested calls, then lets invent new names for nested tx primitives.

Patch available from https://github.com/akaariai/django/commit/3041aa61e37ba07819661cd81846922f23c5686c. The patch also includes some changes to core code to use the in_tx decorator.

Version 0, edited 19 months ago by akaariai (next)

comment:35 Changed 19 months ago by akaariai

It might be impossible to not reuse the state in situations like this:

my_atomic = atomic(using='other')

def foo(recurse):
    with my_atomic:
        if recurse:
            foo(False)
foo(True)

The answer is to error out if this is done. Then my_atomic should be defined as:

def my_atomic():
    return atomic(using='other')

A guard against reusing a stateful context manager, and also a fix to make the decorator style work without sharing state, is available from https://github.com/akaariai/django/compare/new_tx

comment:36 Changed 17 months ago by mlarente

The concurrency issue can be fixed by using a thread-local stack, such as this:

import threading

_atomic_state = threading.local()

def atomic(using=None):
    def entering(using):
        if not hasattr(_atomic_state, 'sid_stack'):
            _atomic_state.sid_stack = []
        sid = None

        if is_managed(using=using):
            sid = savepoint(using=using)
        else:
            enter_transaction_management(using=using)
            managed(True, using=using)

        _atomic_state.sid_stack.append(sid)

    def exiting(exc_value, using):
        sid = _atomic_state.sid_stack.pop()
        if sid is not None:
            if exc_value is not None:
                savepoint_rollback(sid, using=using)
        else:
            try:
                if exc_value is not None:
                    if is_dirty(using=using):
                        rollback(using=using)
                else:
                    if is_dirty(using=using):
                        try:
                            commit(using=using)
                        except:
                            rollback(using=using)
                            raise
            finally:
                leave_transaction_management(using=using)

    return _transaction_func(entering, exiting, using)

comment:37 Changed 17 months ago by aaugustin

  • Owner changed from nobody to aaugustin

comment:38 Changed 17 months ago by mitar

  • Cc mmitar@… added

comment:39 Changed 16 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 14cddf51c5f001bb426ce7f7a83fdc52c8d8aee9:

Merged branch 'database-level-autocommit'.

Fixed #2227: atomic supports nesting.
Fixed #6623: commit_manually is deprecated and atomic doesn't suffer from this defect.
Fixed #8320: the problem wasn't identified, but the legacy transaction management is deprecated.
Fixed #10744: the problem wasn't identified, but the legacy transaction management is deprecated.
Fixed #10813: since autocommit is enabled, it isn't necessary to rollback after errors any more.
Fixed #13742: savepoints are now implemented for SQLite.
Fixed #13870: transaction management in long running processes isn't a problem any more, and it's documented.
Fixed #14970: while it digresses on transaction management, this ticket essentially asks for autocommit on PostgreSQL.
Fixed #15694: atomic supports nesting.
Fixed #17887: autocommit makes it impossible for a connection to stay "idle of transaction".

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.