Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#20392 closed New feature (fixed)

Rearrange transactional behavior in django.test.TestCase: savepoints around tests

Reported by: Raphaël Barrois Owned by: Tim Graham <timograham@…>
Component: Testing framework Version: dev
Severity: Normal Keywords: test atomic savepoint
Cc: Simon Charette, loic84, mmanfre@…, mathieu.agopian@…, josh.smeaton@…, t.chaumeny@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Leveraging aaugustin's work on atomic(), this patch modifies the behavior of django.test.TestCase:

  • A transaction is opened during setUpClass (one atomic per database)
  • That transaction is closed during tearDownClass
  • The database connection is no longer closed after each test function

This allows to create objects and complex state in setUpClass, and reuse it through all test functions,
without inter-test conflicts.

This could lead to a single load of fixtures during setUpClass (instead of the current setUp), thus improving test speed.

Attachments (1)

refactor_testcase_transactions.diff (7.8 KB ) - added by Raphaël Barrois 11 years ago.
Patch against commit 327e362

Download all attachments as: .zip

Change History (30)

by Raphaël Barrois, 11 years ago

Patch against commit 327e362

comment:1 by Aymeric Augustin, 11 years ago

Triage Stage: UnreviewedAccepted

Yes, this is a good idea.

comment:2 by Aymeric Augustin, 11 years ago

I'm not sure about reset_connections_per_test. Closing the connections during the execution of the tests breaks with in-memory SQLite. What is this parameter designed to achieve? Is it intended as a public API?

comment:3 by Raphaël Barrois, 11 years ago

The reset_connections_per_test attribute is here to avoid duplicating the logic in TransactionTestCase._post_teardown.
The current behavior (see https://github.com/django/django/blob/master/django/test/testcases.py#L516) is to close all database connections after self.tearDown.

I wanted to be able to distinguish between TransactionTestCase, where this connection closure is required for database reset, and TestCase, where it should be avoided to gain the features discussed above.

Put another way, the current behaviour is:

  • TransactionTestCase recreates the database and loads fixtures before each test_foo; closes the connection after each test_foo.
  • TestCase creates the database once, loads TestCase-specific fixtures before each test_foo; rolls back the transaction and closes the connection after each test_foo.

The goal is:

  • No change to TransactionTestCase
  • TestCase creates the database once, begins a transaction at setUpClass where it loads TestCase-specific fixtures, makes a savepoint before each test_foo, rolls back to savepoint after each test_foo, rolls back the transaction then closes connections in tearDownClass.

I'll try to upgrade the patch with more comments, and move the "load fixtures" code to setUpClass.
This discussion should probably move to django-developers@, what do you think?

comment:4 by Simon Charette, 11 years ago

Cc: Simon Charette added

comment:5 by loic84, 10 years ago

Cc: loic84 added

comment:6 by Aymeric Augustin, 10 years ago

comment:7 by Aymeric Augustin, 10 years ago

Xelnor, when you say "create the database", you mean "create the database connection", don't you?

comment:8 by Aymeric Augustin, 10 years ago

Judging by this discussion on django-developers: https://groups.google.com/d/msg/django-developers/N0HEAD1ht8k/GQJ77RLUydsJ and test cases included with the patch, it seems that the patch changes (for the better) the behavior of setUpClass: data changes made in setUpClass now get reset between test cases. This should be mentioned in the release notes.

comment:9 by Michael Manfre, 10 years ago

Cc: mmanfre@… added

comment:10 by jarshwah, 10 years ago

Owner: changed from nobody to jarshwah
Status: newassigned

in reply to:  7 comment:11 by Raphaël Barrois, 10 years ago

Replying to aaugustin:

Xelnor, when you say "create the database", you mean "create the database connection", don't you?

Hmm, yes, I think that's what I thought. Anyway, that's what the patch does ;)

Thanks for reviving this ticket!

comment:12 by jarshwah, 10 years ago

xelnor, I'm looking into bringing this ticket up-to-date and testing the run time of the test suite with loaddata in the setUpClass. Is there any other information that you could offer up, or if you're already maintaining an up-to-date patch, providing the details of that?

comment:13 by Mathieu Agopian, 10 years ago

Cc: mathieu.agopian@… added

in reply to:  12 comment:14 by Raphaël Barrois, 10 years ago

Replying to smeatonj:

xelnor, I'm looking into bringing this ticket up-to-date and testing the run time of the test suite with loaddata in the setUpClass. Is there any other information that you could offer up, or if you're already maintaining an up-to-date patch, providing the details of that?

I haven't maintained this patch ­— I could try to retrieve the original branch if you need; I think the patch was designed against 1.5 (to get atomic).

The biggest issue I had was that this patch requires setUpClass to be called: if a TestCase subclass defines a custom setUpClass but doesn't call super(), we'd have some issues — mostly that this setUpClass would occur outside a transaction.

This could be alleviated with custom hooks, similar to the _pre_setup and _post_teardown used in the Django TestRunner.

Unfortunately, unittest doesn't provide hooks around setUpClass / tearDownClass in the test runner: they are all called on the TestSuite object, and Django reuses the default one.

For me, we have a few options here:

  • Document that, obviously, super() should be called when overriding setUpClass
  • Add safeguards in setUpClass and check them, for instance, in _pre_setup : this could maybe raise if super() wasn't called?
  • Look harder and find a way to replicate the _pre_setup part for setUpClass, perhaps with a custom TestSuite wrapper?

comment:15 by jarshwah, 10 years ago

Thanks for the reply. I've already brought the patch up to date, so no need to chase down the branch. I also implemented the loaddata (fixtures) from inside the setUpClass, which has halved the run time for tests that use fixtures. I've run into a few problems with tests that manipulate the connections and transactions, but I haven't tracked down the exact causes yet.

The issues you raise are good ones. I think documenting the contract should suffice. If someone wanted to override setUpClass, they should have the option of not inheriting the behaviour if they wish - but they'd be responsible for all transactions and tear downs also.

The other way I thought about going was implementing a SharedFixtureTestCase, and moving the behaviour of this patch into that class. It would mean that the existing TestCase would be left as-is, and opt-in is required for the double transaction behaviour. I think this would be the safest route, but wouldn't reach as many people. Luckily, we can update the internal django tests that use fixtures to use this new class, and get quite a substantial speedup for those tests.

I profiled the entire test suite with this patch and loaddata is responsible for about 10% of the total time. That time might be skewed by the test failures I mentioned above, but it should be fairly indicative. I'll keep working on it until there are no more test failures, and then make a decision (hopefully with input from others) on whether to extract a new TestCase subclass.

comment:16 by Aymeric Augustin, 10 years ago

Don't make a new subclass. Either document the pitfall, or test in _pre_setup that the appropriate setUpClass has run and blow up loudly otherwise.

comment:17 by Raphaël Barrois, 10 years ago

I'd side with aaugustin on that one: the contract of TestCase is "whatever happens in the database during the test_ is temporary; if you use fixtures, they will be cleaned afterwards; be careful when creating instances in setUpClass."

I don't think there is any documentation stating that "if you want to reuse data created in setUpClass in other Cases, it will work", nor that this would actually be a good idea.

Currently, users who create objects in setUpClass carefully remove them during tearDownClass, and so the only tests that would be broken by this change would be tests where "one TestCase relies on the setUpClass of another one to have run before", which sounds pretty broken to me.

If you feel that users should be allowed to mess with transactions, this could be done with a simple class attribute, similar to fixtures = .

Besides, any improvement to Django's test suite is likely to benefit big projects using Django as well; please don't make them "hidden" for the sake of backwards-compatibility on undocumented behavior ;)

comment:18 by jarshwah, 10 years ago

This is the reason that I thought about implementing a new class:

# django/tests/file_uploads/tests.py

class FileUploadTests(TestCase):

    @classmethod
    def setUpClass(cls):
        if not os.path.isdir(MEDIA_ROOT):
            os.makedirs(MEDIA_ROOT)

Already we run into issues where the setUpClass doesn't call the parent - so any fixtures that would have been loaded no longer are. I'm unsure whether the behaviour here should be "blow up and make the users fix a backward-incompatible change", or to fall back to the previous behaviour, and load the fixtures in setUp rather than setUpClass. We could provide very clear guidance in the release notes (always call super().setUpClass()), but that no longer allows callers to completely customise their setUpClass (opt-out of class-based transactions). Which behaviour is preferred?

comment:19 by Aymeric Augustin, 10 years ago

Yes, you must blow up and make users add a super() call in that case.

comment:20 by jarshwah, 10 years ago

I had a go at implementing the ideas in this ticket: https://github.com/jarshwah/django/tree/testcase-optimization

Test cases that use fixtures improved by a factor of about 3. Unfortunately, this only represents about 10% of the entire test suite, so doesn't make a huge impact. I also ran into some weird threading issues with sqlite that was causing a segmentation fault.

I'm going to bow out at this point, but I'm more than happy to help if someone else would like to continue on the patch.

comment:21 by jarshwah, 10 years ago

Cc: josh.smeaton@… added
Owner: jarshwah removed
Status: assignednew

comment:22 by Thomas C, 10 years ago

I spent some time working on that and eventually came up with something.

Besides the fact that setUpClass/tearDownClass should call super, I encountered the following problems:

  • When @override_settings is decorating a TestCase, it should also affect setUpClass / tearDownClass

There already is a ticket for that (#21281), but I believe it is required if we are going to tell people they can interact with the database within setUpClass as the result of that interaction might depend on whether or not the settings were overridden when it happened (USE_TZ for instance).
(EDIT: Handled in https://github.com/django/django/commit/d89f56dc4d03f6bf6602536b8b62602ec0d46d2f)

  • Connections should no longer be closed after each test as a global TestCase transaction is maintained.

Some tests in Django test suite (e.g. django/tests/backends) generate database failures and implicitely rely on a new connection being created for each test, one test even close the connection explicitely. I also run into a bunch of failures under Oracle and MySQL, because those backends fail when rollbacking a transaction altering the structure of a table (see can_rollback_ddl = False). Those failures were hidden because the connection was always closed afterwards. I turned those tests into TransactionTestCase and made some adaptation when required.

  • If the database does not support transaction, TestCase will keep behaving as TransactionTestCase.

This behaviour is unchanged, but it means that tests refactored to take advantage of setUpClass should not run under MySQL with MyISAM (no transaction support).
(EDIT: I introduced a setUpTestData class method as suggested on https://groups.google.com/forum/#!topic/django-developers/deADv5ZaL6o so as to avoid this problem.)

I did those modifications in the PR https://github.com/django/django/pull/3464.

In order to demonstrate the advantage of using setUpClass, I modified select_related and queries tests so that data are generated once in setUpClass:

Current implementation:

$ ./tests/runtests.py select_related queries --settings=test_postgres
...
Ran 253 tests in 10.637s

OK (skipped=1, expected failures=2)

Using setUpClass:

$ ./tests/runtests.py select_related queries --settings=test_postgres
...
Ran 253 tests in 2.291s

OK (skipped=1, expected failures=2)
Last edited 10 years ago by Thomas C (previous) (diff)

comment:23 by Thomas C, 10 years ago

Cc: t.chaumeny@… added
Has patch: set

comment:24 by Claude Paroz, 10 years ago

Triage Stage: AcceptedReady for checkin

At least a second set of eyes would be nice!

comment:25 by Aymeric Augustin, 10 years ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

I reviewed the patch and had several comments or questions. Sorry for the delay.

comment:26 by Thomas C, 10 years ago

Patch needs improvement: unset

Thanks for the review. I updated the PR to address your comments.

comment:27 by Tim Graham <timograham@…>, 10 years ago

Owner: set to Tim Graham <timograham@…>
Resolution: fixed
Status: newclosed

In da9fe5c717c179a9e881a40efd3bfe36a21bf4a6:

Fixed #20392 -- Added TestCase.setUpTestData()

Each TestCase is also now wrapped in a class-wide transaction.

comment:28 by Tim Graham <timograham@…>, 10 years ago

In 26dd518b5c0c6f4a7678bfb7bd3ed0bf8dc8978e:

Refactored some tests to take advantage of refs #20392.

comment:29 by Tim Graham <timograham@…>, 10 years ago

In 119154ca7f65d423bb742aa0e32ffcfcb21d4b87:

Refs #20392 -- Load fixtures once within TestCase

Note: See TracTickets for help on using tickets.
Back to Top