Opened 6 years ago

Closed 4 years ago

#10868 closed Bug (fixed)

_destroy_test_db exposes the production database to possibly destructive actions from the unit tests

Reported by: ovidiu Owned by: nobody
Component: Testing framework Version:
Severity: Release blocker Keywords: django.test
Cc: adrian, anssi.kaariainen@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Latest SVN trunk, file django.db.backends.creation.py

414        cursor = self.connection.cursor()
415        self.set_autocommit()
416        time.sleep(1) # To avoid "database is being accessed by other users" errors.
417        cursor.execute("DROP DATABASE %s" % self.connection.ops.quote_name(test_database_name))
418        self.connection.close()

At line 414 django is connected to the production database (the rationale for this is explained in the comment above this code fragment). The connection is closed one second later. If the unit tests involve threads which for some reason become active before the connection is closed then those threads can potentially mess up the production database.

Attachments (5)

10868.diff (6.4 KB) - added by akaariai 4 years ago.
Solve the issue by checking threads created during testing
10868.settings-dict-copy.diff (2.6 KB) - added by julien 4 years ago.
10868.settings-dict-copy.2.diff (2.7 KB) - added by julien 4 years ago.
10868.2.diff (3.6 KB) - added by akaariai 4 years ago.
settings-dict-copy.2 with minor cleanup
tests.py (1.4 KB) - added by akaariai 4 years ago.
Manual postgresql specific tests, run in a separate project

Download all attachments as: .zip

Change History (29)

comment:1 Changed 6 years ago by Alex

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Design decision needed

comment:2 Changed 4 years ago by SmileyChris

  • Severity set to Release blocker
  • Type set to Bug
  • Version 1.1-beta-1 deleted

comment:3 Changed 4 years ago by akaariai

  • Easy pickings unset
  • UI/UX unset

The problem here is that the settings_dict is shared between threads. This means that after self.connection.settings_dict['NAME'] = old_database_name (line 323 in creation.py) every new connection taken by different threads will access the production database, not the test database. I don't know how likely this is to happen. You would need a thread taking a _new_ connection.

For potential fixes: on line 323, do a copy of the settings dict, then set the old database name. self.connection is threading.local subclass, so if the settings_dict is copied, the change affects only current thread (the one doing the drop database). Changing directly the settings dict will do the change for every thread, as the passed in settings_dict is the same instance for every thread. But I think Django does want to restore the environment to the previous state after tests have ran, and thus, if a thread wakes up at a later point (after the test db has been dropped) and takes a new connection, it will be directed to the production database. So 3 more solutions:

  • Do not restore the environment
  • The user should make sure the threads he has created are not running after the test suite is completed (in other words, mark as wont fix).
  • Check alive threads before dropping the test database. If there are more alive threads than the main thread, bail out. I think this is safest way to go forward. Anyways you should not have any test threads running after the test suite has completed. You would possibly want to check the running thread names before tests, and after test and see that they match, so that if there are threads running in the background before the test, you would still be able to run tests.

comment:4 Changed 4 years ago by adrian

Updated info: the code still lives in django/db/backends/creation.py but now it's lines 333-337, in the _destroy_test_db() method.

comment:5 Changed 4 years ago by adrian

akaariai: Thanks for the detailed explanation. Even if we fixed line 323 in backends/creation.py (self.connection.settings_dict['NAME'] = old_database_name), I think we'd still have a problem due to these lines (319-324) in django/test/simple.py --

        # Destroy all the non-mirror databases
        for connection, old_name, destroy in old_names:
            if destroy:
                connection.creation.destroy_test_db(old_name, self.verbosity)
            else:
                connection.settings_dict['NAME'] = old_name

I like your potential fix of making a copy of the settings dict to ensure it only affects the current thread -- so would we have to do the same thing in the django/test/simple.py lines above? Mind writing up a patch of this approach?

comment:6 Changed 4 years ago by adrian

  • Cc adrian added

comment:7 Changed 4 years ago by akaariai

  • Cc anssi.kaariainen@… added

I am going to wait what happens to #17258 before working on this. I am afraid working on this will be wasted work if that one gets committed.

But, just to make sure: it is OK if the setup isn't restored for all threads after the test suite has ran? The settings dict change would restore the non-testing database settings only for the main thread. I don't think it is possible to solve this in a way that:

  • blocks test threads from ever accessing the production DB
  • allows threads started after the tests have ran to access the production DB
  • allows usage of normal threads (instead of test.utils.threading) in the tests

The reason is that if a test-thread makes its first use of a connection after the tests have ran, there is no way to tell it apart from a thread that has been started after the tests have ran.

comment:8 Changed 4 years ago by julien

When comes the time of destroying the test database, there should be no thread running other than the main one. So the main thread could wait for all child threads to complete before proceeding to destroying the database.

comment:9 Changed 4 years ago by akaariai

You could assert there is just one thread running post-tests, but that places the assumption that there is no valid usage of additional background threads when running tests. Maybe this is fine. If not, how about:

  • When setting up the tests, record all running threads: pre_test_threads = set(t.name for t in threading.enumerate())
  • When tests have ran, check that there are no new threads compared to the pre-test set. If there are, call .join(timeout=1.0) on the threads. If they have not exited after the timeout, do not change the settings for the connections & destroy the databaseses. Error out.

This change would break tests for users who leave threads running knowing the threads are not doing anything. Is that a problem? Should there be a hook (or signal) you can use to setup/remove threads pre/post tests have ran?

comment:10 Changed 4 years ago by julien

That sounds like a good proposal. Another slightly different approach would be to wait indefinitely instead of erroring out. If one spawns new threads then they should be responsible for cleaning them up. The risk of damaging the production database is more important to consider than the risk of breaking some tests.

How would you go about writing tests for this proposal? Since this all happens before+after the general test suite is run, I'm not sure this would even be possible.

comment:11 Changed 4 years ago by akaariai

I was going to ask the same question about testing this :) I think it is possible to write a tester script (so that it is possible to confirm manually everything is fine), but including it in the test-suite seems problematic. I do not know a way to do that. Maybe setup a fake test runner + some threads, then call .teardown_databases with empty old_config and see that it does the proper thing. The right place for the check seems to be as the first thing in teardown_databases.

My proposal about the timeout is that if the threads haven't quit after the timeout, then do not try to destroy the testdb. Throw an error instead. Notably, there is no risk to the production database, because if the threads can't be joined, we would NOT change the connection settings back to the production settings. We would just quit & leave the testdb in there.

Changed 4 years ago by akaariai

Solve the issue by checking threads created during testing

comment:12 Changed 4 years ago by akaariai

  • Has patch set

The attached patch solves the issue by checking threads still alive which have been created during testing before tearing down databases. If there are any, bail out.

If it is OK to not restore fully the state of the connections, then this would be easier to solve by just changing the connection settings for the main thread (the thread doing the teardown).

Main complaint against the above patch is IMHO backwards compatibility: if you knowingly let threads exists after tests have ran, you have no choice other than to do thread-cleanup before the teardown stage. And there is no hook for doing that.

comment:13 Changed 4 years ago by julien

I've been having second thoughts about this. It feels plain wrong that any connection to the production database is ever made during any stage of the tests' execution. I haven't got the chance to investigate this yet, but I'd like to find a way to connect to the database server without actually selecting any database at all, and then run the "DROP DATABASE" statement on the test database. The same goes for the initial creation of the test database as well.

comment:14 Changed 4 years ago by akaariai

At least when using PostgreSQL, you need to connect to some database to issue any commands. That includes creating the test database. You can't drop a database somebody is connected to, and that includes dropping the test database from the test database. I am not sure about this, but Oracle is likely going to be even more problematic.

What could be done however is _not_ restoring the connection state after tests have ran. How backwards incompatible that would be is a good question. But not having a connection to the production database at all is even more backwards incompatible.

If the restoring of connection settings is not done, you would basically do the following in teardown: 1) close all existing connections, 2) alter the database wrapper for the main thread's connection settings. 3) take a new connection to the production DB, 4) drop the test databases 5) revert the changes to the running thread's database wrapper.

Now, the changes to the database wrapper are only for the test-db dropping thread. Other threads' connection settings are not altered. Thus, you will not expose the production database to other threads than the test-db dropping thread. The no.5 above is actually making the connection wrapper invalid, as connecting to the non-existing test database is not possible.

Checking that there are no threads running seems like a bad idea to me now. I wonder what would happen when using for example Jython. If I am not mistaken, Jython uses a lot of threads.

It should be somewhat easy to just disable _any_ connections to anywhere after tests have been executed. That would be the safest way :)

comment:15 Changed 4 years ago by julien

Ok, considering those limitations with PostgreSQL, I would say the best way is probably to temporarily swap the main thread's DBWrapper.settings_dict attribute with a thread-local copy, then modify that copy to point to the production database, then drop the test database, and finally re-swap it with the original settings_dict value for good measure, as I guess you had already suggested in comment:3.

comment:16 Changed 4 years ago by akaariai

If leaving the connection settings such that new connections will be made to the (non-existing) test database is considered backwards compatible this should be the sanest solution. Anything else I can think of will end up being an ugly hack.

I think the move of threading.local from DatabaseWrapper to django.db.connections will make patching this easier. Just close the connection to the databases, and you are free to alter the settings dictionary of the thread's DatabaseWrapper. The changes will not be visible to other threads.

This probably needs at least a mention in release notes.

comment:17 Changed 4 years ago by julien

Users who intentionally let threads live and create new connections after the test database is supposedly destroyed are really asking for trouble and are likely already affecting their production database :)
So in my opinion this change would be backwards compatible, at least in the general case. However, if some users come up with genuine use cases then we could consider wrapping this new behaviour into a TestCase method so that one can modify it to their heart's content.

Changed 4 years ago by julien

comment:18 Changed 4 years ago by julien

I've attached a patch as a way of illustrating our recent discussions. A few comments and questions about this patch:

  • I'm not sure if it's worth eventually restoring the original settings dict references. That'd be straight-forward in BaseDatabaseCreation.destroy_test_db() but probably less so in DjangoTestSuiteRunner.teardown_databases().
  • I'm not sure how/if this can be tested.
  • How would be handle DBWrappers potentially being shared between threads?

comment:19 Changed 4 years ago by akaariai

  • Why do the settings_dict.copy(), set old_name for the mirrors? If the connection settings are not going to be restored, then why restore them for the main thread? The same goes for the else branch in the if destroy: ... else.
  • This is going to be hard to test. I don't know how the test runner is tested...
  • I don't think there is anything to do for thread-shared DBWrappers. Except, maybe this idea would work:

Why not create a new DBWrapper just for dropping the test database? This way the connection is guaranteed to not be shared, and it is guaranteed that the changes do not need reverting (as you don't touch anything in django.db.connections at all). So, all in all, it might be possible that teardown_databases() could be this:

for connection, old_name, destroy in old_names: 
    if destroy: 
        connection.creation.destroy_test_db(old_name, self.verbosity)

And connection.creation.destroy_test_db would instantiate a totally new DBWrapper connecting to old_name for test database teardown.

Changed 4 years ago by julien

comment:20 Changed 4 years ago by julien

Thanks, I think your suggestions make sense. I've updated the patch to reflect that.

comment:21 Changed 4 years ago by akaariai

Maybe it would be better to write these lines:

self.connection = backend.DatabaseWrapper( 
    settings_dict, 
    alias='__destroy_test_db__', 
    allow_thread_sharing=False) 
self._destroy_test_db(test_database_name, verbosity) 

As just

newconn = backend.DatabaseWrapper(...)
newconn.ops._destroy_test_db(test_database_name, verbosity)

The coding in the patch breaks the symmetrical linking between DBWrapper and Ops. If you go from DBWrapper to Ops and then back to the connection again, you don't actually end up into the same DBWrapper (or: ops.connection.ops != ops). While that isn't critical at all, it might cause a surprise some day.

But in any case the basic approach seems now correct. It is simple and should prevent accidental access to production DB. It is easy to override the default behavior by a custom TestRunner. So, I am +1 to commit of this. It would be nice to test this somehow, even if the test can not be included in the test suite. I try to see if I have time to write a small custom testproject testing threads sharing etc this weekend.

comment:22 Changed 4 years ago by akaariai

Attached is a tests.py file, which will demonstrate that the attached patch does work. However, there are no tests included in the test suite. You will need to create a new test project, and set up that project to use PostgreSQL. Then start an app, copy the tests to that app and run the tests. Using master, the leftover threads connect to the production DB, but using the attached patch the connection is made to the just dropped test database. The attached patch is basically the same as 10868.settings-dict-copy.2.diff, but it has the minor cleanup mentioned in the above comment.

To include a full test of this in the test suite would require testing DjangoTestRunner.teardown_databases(). That is naturally a bit problematic.

There are release notes in the patch, but they do require rewriting, and it is questionable if anything about this needs to be included.

I think this is now ready for checkin, but maybe somebody else needs to tick that box as I have altered the patch slightly.

Changed 4 years ago by akaariai

settings-dict-copy.2 with minor cleanup

Changed 4 years ago by akaariai

Manual postgresql specific tests, run in a separate project

comment:23 Changed 4 years ago by julien

  • Triage Stage changed from Design decision needed to Ready for checkin

Perfect, thanks for running those tests! I'll commit the patch in a minute after doing a bit of PEP-cleanup (those files are a bit messy).

comment:24 Changed 4 years ago by julien

  • Resolution set to fixed
  • Status changed from new to closed

In [17411]:

Fixed #10868 -- Stopped restoring database connections after the tests' execution in order to prevent the production database from being exposed to potential threads that would still be running. Also did a bit of PEP8-cleaning while I was in the area. Many thanks to ovidiu for the report and to Anssi Kääriäinen for thoroughly investigating this issue.

Note: See TracTickets for help on using tickets.
Back to Top