Opened 6 years ago

Closed 6 years ago

Last modified 3 years ago

#9479 closed (fixed)

Infinite loop in QuerySet delete() using MySQL 5.0.44 with InnoDB and default repeatable read transaction isolation

Reported by: jjpersch Owned by: nobody
Component: Database layer (models, ORM) Version: 1.0
Severity: Keywords:
Cc: aball Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Given: MySQL 5.0.44, InnoDB engine, default repeatable read transaction isolation.

If two database transactions attempt to perform a query set delete() at the same time that reference the same object, the the second delete will loop forever issuing SELECT and DELETE statements.

See django/db/models/query.py, line 385:

        # Delete objects in chunks to prevent the list of related objects from
        # becoming too long.
        while 1:
            # Collect all the objects to be deleted in this chunk, and all the
            # objects that are related to the objects that are to be deleted.
            seen_objs = CollectedObjects()
            for object in del_query[:CHUNK_SIZE]:
                object._collect_sub_objects(seen_objs)

            if not seen_objs:
                break
            delete_objects(seen_objs)

The following mysql console logs demonstrates why this will loop forever in this case:

Schema

% mysql test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 66944
Server version: 5.0.44-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table foo ( id int(11) not null auto_increment, primary key (id) ) engine=innodb;
Query OK, 0 rows affected (0.06 sec)

mysql> insert into foo values ();
Query OK, 1 row affected (0.00 sec)

mysql> insert into foo values ();
Query OK, 1 row affected (0.00 sec)

mysql> exit
Bye

Connection A

% mysql test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 67571
Server version: 5.0.44-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> set autocommit = 0;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from foo;
+----+
| id |
+----+
|  1 | 
|  2 | 
+----+
2 rows in set (0.00 sec)

Connection B

% mysql test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 67642
Server version: 5.0.44-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> set autocommit = 0;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from foo;
+----+
| id |
+----+
|  1 | 
|  2 | 
+----+
2 rows in set (0.00 sec)

Connection A

mysql> delete from foo where id = 1;
Query OK, 1 row affected (0.01 sec)

mysql> select * from foo;
+----+
| id |
+----+
|  2 | 
+----+
1 row in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

Connection B

mysql> delete from foo where id = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from foo;
+----+
| id |
+----+
|  1 | 
|  2 | 
+----+
2 rows in set (0.00 sec)

mysql> delete from foo where id = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from foo;
+----+
| id |
+----+
|  1 | 
|  2 | 
+----+
2 rows in set (0.00 sec)

Until the second transaction commits, it will continue to select the row that was deleted by the first transaction (and attempted to be deleted by the second transaction).

Attachments (2)

9479-r10858.diff (3.3 KB) - added by russellm 6 years ago.
Test case demonstrating problem
9479-r10858.2.diff (2.5 KB) - added by russellm 6 years ago.
Test case demonstrating problem

Download all attachments as: .zip

Change History (17)

comment:1 Changed 6 years ago by jacob

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 6 years ago by jacob

  • milestone set to 1.1

comment:3 Changed 6 years ago by aball

  • Cc aball added

comment:4 follow-up: Changed 6 years ago by russellm

This looks like it might be a bug with InnoDB itself, which has been corrected in 5.1.X. However, given the widespread nature of MySQL 5.0, it may still be worth including a workaround for this.

comment:5 in reply to: ↑ 4 Changed 6 years ago by kmtracey

Replying to russellm:

This looks like it might be a bug with InnoDB itself, which has been corrected in 5.1.X. However, given the widespread nature of MySQL 5.0, it may still be worth including a workaround for this.

That MySQL bug seems to be about being unable to see (read) things done within the same transaction, which I don't think is what this ticket is reporting. This is reporting that if two different transactions simultaneously try to delete the same (set of) object(s), one of them may enter an infinite loop, because once it has read the objects, it won't be able to see that the other thread has deleted them. I can recreate this on my MySQL install (5.0.67), via a shell session and by putting a pdb break within the while 1 loop noted above, but only if I use manual transaction management. Using the Django default autocommit mode, the attempted delete is automatically committed, allowing the 2nd thread to see the results of the delete by the first thread. This is what I see in the mysql log:

090523 15:01:11     174 Connect     kmt@localhost on Playground
                    174 Query       SET storage_engine=INNODB
                    174 Query       SET NAMES utf8
                    174 Query       set autocommit=0
                    174 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 21
090523 15:01:30     175 Connect     kmt@localhost on Playground
                    175 Query       SET storage_engine=INNODB
                    175 Query       SET NAMES utf8
                    175 Query       set autocommit=0
                    175 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 21
090523 15:01:57     175 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    175 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 8
                    175 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 9
                    175 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (9, 8)
                    175 Query       commit
090523 15:02:00     174 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    174 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 8
                    174 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 9
                    174 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (9, 8)
                    174 Query       commit
090523 15:02:03     175 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
090523 15:02:05     174 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100

Both treads attempt to delete the same pair of objects, and both commit the delete. No infinite loop. Only if I use transaction.enter_transaction_management() and transaction.managed(True) in the 2nd thread do I see:

090523 15:08:55     193 Connect     kmt@localhost on Playground
                    193 Query       SET storage_engine=INNODB
                    193 Query       SET NAMES utf8
                    193 Query       set autocommit=0
                    193 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 21
090523 15:09:49     194 Connect     kmt@localhost on Playground
                    194 Query       SET storage_engine=INNODB
                    194 Query       SET NAMES utf8
                    194 Query       set autocommit=0
                    194 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 21
090523 15:10:03     193 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    193 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 12
                    193 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 13
                    193 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (13, 12)
090523 15:10:04     193 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
090523 15:10:07     194 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 12
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 13
                    194 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (13, 12)
090523 15:12:55     193 Query       commit
090523 15:13:06     194 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 12
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 13
                    194 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (13, 12)
090523 15:13:08     194 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 12
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 13
                    194 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (13, 12)
090523 15:13:09     194 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 12
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 13
                    194 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (13, 12)
                    194 Query       SELECT `ttt_fcollection`.`id`, `ttt_fcollection`.`name`, `ttt_fcollection`.`user_id` FROM `ttt_fcollection` LIMIT 100
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 12
                    194 Query       SELECT `ttt_filet`.`id`, `ttt_filet`.`coll_id`, `ttt_filet`.`rating`, `ttt_filet`.`timage` FROM `ttt_filet` WHERE `ttt_filet`.`coll_id` = 13
                    194 Query       DELETE FROM `ttt_fcollection` WHERE `id` IN (13, 12)

Here the 2nd thread (194) enters an infinite loop because it read the objects before they were deleted by the first thread (193), and even though the delete by the first thread is committed (I did it manually, so it's not immediately after the delete in this trace, but it is in there), this won't ever be seen by the 2nd thread/transaction because of repeatable read. I suppose the referenced MySQL bug/fix might change this by making the 2nd transaction's delete "work" and thus allow the 2nd thread to see that the rows have been deleted, but I'm not at all sure of that from reading the bug. I think the 2nd transaction's delete is just a no-op since the rows have already been deleted...

Anyway, I suppose we could keep track of what objects DELETE has been called on and exclude them from seen_objs. Not sure this needs to be done for 1.1, though, it seems like a pretty rare oddball case to me? But since Jacob put it in 1.1 I won't take it out.

comment:6 Changed 6 years ago by russellm

FYI Karen - Malcolm, Jacob and I discussed this briefly off list. I agree that we could probably punt this one, but Malcolm was of the opinion that this had come up a semi-regularly on the mailing lists, but because it isn't something that is easy to diagnose, the underlying cause isn't always obvious.

I concur that the easy solution here is to just keep a log of the objects we have already deleted. I'm a little concerned that this will mean carrying around a lot of accounting information for big deletes, but the bulk deletion process isn't particularly lightweight to begin with, so this may not be a huge concern. I suspect bulk deletion will be revisited soon in the context of supporting ON DELETE style operations; a workaround will probably do for the interim.

Changed 6 years ago by russellm

Test case demonstrating problem

Changed 6 years ago by russellm

Test case demonstrating problem

comment:7 Changed 6 years ago by russellm

I've just attached a test case (well...two actually, but ignore the first one - it's testing the wrong thing). I'm having a fiddle to see if I can get a clean fix.

comment:8 follow-up: Changed 6 years ago by Alex

Are you sure this tests what you want? delete() on a queryset doesn't use the connection passed to the QuerySet (via query) at all: http://code.djangoproject.com/browser/django/trunk/django/db/models/query.py#L343 since the deletion happens in delete_objects(). If there is an issue it is triggered in the loop before delete_objects.

comment:9 in reply to: ↑ 8 Changed 6 years ago by kmtracey

Replying to Alex:

Are you sure this tests what you want?

The test certainly appears to cause an infinite loop, 100% CPU, keyboard interrupt produces:

No fixtures found.
Deletes on concurrent transactions don't collide and lock the database ... ^CTraceback (most recent call last):
  File "./runtests.py", line 191, in <module>
    django_tests(int(options.verbosity), options.interactive, args)
  File "./runtests.py", line 161, in django_tests
    failures = test_runner(test_labels, verbosity=verbosity, interactive=interactive, extra_tests=extra_tests)
  File "/home/kmt/django/trunk/django/test/simple.py", line 192, in run_tests
    result = unittest.TextTestRunner(verbosity=verbosity).run(suite)
  File "/usr/lib/python2.5/unittest.py", line 705, in run
    test(result)
  File "/usr/lib/python2.5/unittest.py", line 437, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.5/unittest.py", line 433, in run
    test(result)
  File "/home/kmt/django/trunk/django/test/testcases.py", line 249, in __call__
    super(TransactionTestCase, self).__call__(result)
  File "/usr/lib/python2.5/unittest.py", line 281, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.5/unittest.py", line 260, in run
    testMethod()
  File "/home/kmt/django/trunk/tests/regressiontests/delete_regress/models.py", line 49, in test_concurrent_delete
    query.QuerySet(Book, q).filter(pk=1).delete()
  File "/home/kmt/django/trunk/django/db/models/query.py", line 362, in delete
    for object in del_query[:CHUNK_SIZE]:
  File "/home/kmt/django/trunk/django/db/models/query.py", line 93, in _result_iter
    self._fill_cache()
  File "/home/kmt/django/trunk/django/db/models/query.py", line 660, in _fill_cache
    self._result_cache.append(self._iter.next())
  File "/home/kmt/django/trunk/django/db/models/query.py", line 207, in iterator
    for row in self.query.results_iter():
  File "/home/kmt/django/trunk/django/db/models/sql/query.py", line 287, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/home/kmt/django/trunk/django/db/models/sql/query.py", line 2359, in execute_sql
    cursor.execute(sql, params)
  File "/home/kmt/django/trunk/django/db/backends/mysql/base.py", line 84, in execute
    return self.cursor.execute(query, args)
  File "/var/lib/python-support/python2.5/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/var/lib/python-support/python2.5/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
KeyboardInterrupt
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method Cursor.__del__ of <MySQLdb.cursors.Cursor object at 0x9ae148c>> ignored
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method Cursor.__del__ of <MySQLdb.cursors.Cursor object at 0x9adcf0c>> ignored

when run on MySQL/InnoDB. Same test on MySQL/MyISAM completes successfully.

comment:10 Changed 6 years ago by russellm

@Alex - You are correct that this test doesn't use the second connection - but that's essentially the point. One connection is gathering objects; the second connection is deleting them. This shouldn't lead to an infinite loop, but because of InnoDB's casual relationship with the concept of transactions, it does.

Proof in point - the test fails if you run it under InnoDB, but passes under MyISAM, or using Postgres. I have worked up a patch that fixes the problem - I just need the test suite to finish running on all the databases before I commit.

I will grant that the test case relies on some internal behavior that would be nice to not expose in tests - especially given the changes that you are about to make in this area. Suggestions are welcome on other ways to write this test - if anyone can suggest a better way I'd be happy to change the test.

comment:11 Changed 6 years ago by Alex

Ok, didn't see that that was the behavior you wanted to test. What I would do to minimize the merge fail is to essentially "unroll" the delete method. That is test the internals explicitly.

comment:12 Changed 6 years ago by russellm

  • Resolution set to fixed
  • Status changed from new to closed

(In [10913]) Fixed #9479 -- Corrected an edge case in bulk queryset deletion that could cause an infinite loop when using MySQL InnoDB.

comment:13 Changed 6 years ago by russellm

(In [10915]) [1.0.X] Fixed #9479 -- Corrected an edge case in bulk queryset deletion that could cause an infinite loop when using MySQL InnoDB.

Merge of 10913-10914 from trunk.

comment:14 Changed 6 years ago by ccahoon

(In [10994]) Fixed #9479 -- Corrected an edge case in bulk queryset deletion that could cause an infinite loop when using MySQL InnoDB.

comment:15 Changed 3 years ago by jacob

  • milestone 1.1 deleted

Milestone 1.1 deleted

Note: See TracTickets for help on using tickets.
Back to Top