Opened 8 years ago

Closed 4 years ago

Last modified 4 years ago

#26552 closed Bug (fixed)

`TransactionTestCase.serialized_rollback` fails to restore objects due to ordering constraints

Reported by: Aymeric Augustin Owned by: Matthijs Kooijman
Component: Testing framework Version: dev
Severity: Normal Keywords:
Cc: Fabio Caritas Barrionuevo da Luz, Matthijs Kooijman Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I hit this problem in a fairly complex projet and haven't had the time to write a minimal reproduction case. I think it can be understood just by inspecting the code so I'm going to describe it while I have it in mind.


Setting serialized_rollback = True on a TransactionTestCase triggers rollback emulation. In practice, for each database:

  • BaseDatabaseCreation.create_test_db calls connection._test_serialized_contents = connection.creation.serialize_db_to_string()
  • TransactionTestCase._fixture_setup calls connection.creation.deserialize_db_from_string(connection._test_serialized_contents)

(The actual code isn't written that way; it's equivalent but the symmetry is less visible.)


serialize_db_to_string orders models with serializers.sort_dependencies and serializes them. The sorting algorithm only deals with natural keys. It doesn't do anything to order models referenced by foreign keys before models containing said foreign keys. That wouldn't be possible in general because circular foreign keys are allowed.

deserialize_db_from_string deserializes and saves models without wrapping in a transaction. This can result in integrity errors if an instance containing a foreign key is saved before the instance it references. I'm suggesting to fix it as follows:

diff --git a/django/db/backends/base/creation.py b/django/db/backends/base/creation.py
index bca8376..7bed2be 100644
--- a/django/db/backends/base/creation.py
+++ b/django/db/backends/base/creation.py
@@ -4,7 +4,7 @@ import time
 from django.apps import apps
 from django.conf import settings
 from django.core import serializers
-from django.db import router
+from django.db import router, transaction
 from django.utils.six import StringIO
 from django.utils.six.moves import input
 
@@ -128,8 +128,9 @@ class BaseDatabaseCreation(object):
         the serialize_db_to_string method.
         """
         data = StringIO(data)
-        for obj in serializers.deserialize("json", data, using=self.connection.alias):
-            obj.save()
+        with transaction.atomic(using=self.connection.alias):
+            for obj in serializers.deserialize("json", data, using=self.connection.alias):
+                obj.save()
 
     def _get_database_display_str(self, verbosity, database_name):
         """

Note that loaddata doesn't have this problem because it wraps everything in a transaction:

    def handle(self, *fixture_labels, **options):
        # ...
        with transaction.atomic(using=self.using):
            self.loaddata(fixture_labels)
        # ...

This suggest that the transaction was just forgotten in the implementation of deserialize_db_from_string.


It should be possible to write a deterministic test for this bug because the order in which serialize_db_to_string serializes models depends on the app registry, and the app registry uses OrderedDict to store apps and models in a deterministic order.

Change History (12)

comment:1 by Claude Paroz, 8 years ago

Has patch: set
Needs tests: set
Triage Stage: UnreviewedAccepted

comment:2 by Fabio Caritas Barrionuevo da Luz, 5 years ago

Cc: Fabio Caritas Barrionuevo da Luz added

comment:3 by Matthijs Kooijman, 4 years ago

I've run into a problem related to this one (just reported as #31051), so I ended up looking into this problem as well. The original report still seems accurate to me, with the proposed solution valid.

I've been working on a fix and (most of the work), testcase for this problem. I'll do some more testing and provide a proper PR for this issue and #31051 soon. The testcase is not ideal yet (testing the testing framework is tricky), but I'll expand on that in the PR.

Furthermore, I noticed that loaddata does not just wrap everything in a transaction, it also explicitly disables constraint checks inside the transaction:

        with connection.constraint_checks_disabled():                                                                   
            self.objs_with_deferred_fields = []                                                                         
            for fixture_label in fixture_labels:                                                                        
                self.load_label(fixture_label)                                                                          
            for obj in self.objs_with_deferred_fields:                                                                  
                obj.save_deferred_fields(using=self.using)                                                              
                                                                                                                        
        # Since we disabled constraint checks, we must manually check for                                               
        # any invalid keys that might have been added                                                                   
        table_names = [model._meta.db_table for model in self.models]                                                   
        try:                                                                                                            
            connection.check_constraints(table_names=table_names)                                                       
        except Exception as e:                                                                                          
            e.args = ("Problem installing fixtures: %s" % e,)                                                           
            raise     

I had a closer look at how this works (since I understood that a transaction already implicitly disables constraint checks) and it turns out that MySQL/InnoDB is an exception and does *not* defer constraint checks to the end of the transaction, but instead needs extra handling (so constraint_checks_disabled() is a no-op on most database backends). See #3615.

comment:4 by Matthijs Kooijman, 4 years ago

Cc: Matthijs Kooijman added
Owner: changed from nobody to Matthijs Kooijman
Status: newassigned

comment:5 by Matthijs Kooijman, 4 years ago

Has patch: unset
Needs tests: unset

A PR is available at https://github.com/django/django/pull/12166. It is still marked as draft, since there are still some issues with running the tests, but review of the code, additions to the discussion in the PR and maybe ideas on how to fix the remaining testsuite issues are welcome already.

comment:6 by Matthijs Kooijman, 4 years ago

Has patch: set

comment:7 by Mariusz Felisiak, 4 years ago

Patch needs improvement: set
Version: 1.9master

Marking as Patch needs improvement, because it's a draft PR.

comment:8 by Matthijs Kooijman, 4 years ago

Patch needs improvement: unset

The issues with the testcases have been resolved, so the patch is ready for review.

comment:9 by Mariusz Felisiak, 4 years ago

Triage Stage: AcceptedReady for checkin

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

Resolution: fixed
Status: assignedclosed

In 98f23a8:

Fixed #26552 -- Deferred constraint checks when reloading the database with data for tests.

deserialize_db_from_string() loads the full serialized database
contents, which might contain forward references and cycles. That
caused IntegrityError because constraints were checked immediately.

Now, it loads data in a transaction with constraint checks deferred
until the end of the transaction.

comment:11 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In 12e6f57:

Refs #26552 -- Added test for reloading the database with self-referential objects.

comment:12 by GitHub <noreply@…>, 4 years ago

In eeab63e5:

Refs #26552 -- Made reloading the database for tests check only loaded tables constraints.

Note: See TracTickets for help on using tickets.
Back to Top