Opened 14 years ago

Closed 9 years ago

#14415 closed Bug (fixed)

Multiple aliases for one database: testing problems

Reported by: Shai Berger Owned by: Boaz Shvartzman
Component: Testing framework Version: dev
Severity: Normal Keywords: multidb, multiple databases, multiple aliases
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

In a setting where the multiple-databases feature is used to create multiple aliases for one database, the testing framework can get a little confused.

For ease of reference, assume we have the following dictionary defining database parameters:

db_params = dict (
    NAME = 'main',
    ENGINE = 'django.db.backends.whatever',
    USER = 'user',
    PASSWORD = 'password',
    HOST = '',
    PORT = ''
)

The relatively benign cases

In these cases, the test framework just treats the two aliases as separate databases; which means it first creates the test database for one alias, then tries to create the test database for the second one, running into an error because (of course) it already exists. The user is asked to choose between destroying the existing database (in order to recreate it) and canceling the test. In cases where database routers constrict models to a specific alias, this may completely prevent testing (or at least some of the tests).

This happens if at least one of two conditions hold:

The dictionaries used to define the alias are distinct

That is, the databases are defined by

DATABASES = {
    'default' : db_params,
    'other' :   db_params.copy()
}

or some equivalent

Test name is explicitly specified

That is, db_params looks like:

db_params = dict (
    NAME = 'main',
    TEST_NAME = 'test_main',
    ... # same as above
)

The databases may then be specified by the more natural

DATABASES = {
    'default' : db_params,
    'other' :   db_params
}

The more dangerous case

In this case, the testing framework creates spurious databases and
finalizes be destroying the main, non-test database.

This happens when none of the above applies -- that is, db_params does not include
the TEST_NAME entry, relying on the default test_ prefix addition, and databases
are specified by two references to the same dictionary (as in the last definition above).
Regretfully, one may expect this to be the common use case in a multiple-aliases-for-one-database
approach.

In detail: With the definitions above, the testing framework first creates a test_main database, then a test_test_main database, and runs its tests with both of them present; then, during tear-down, it drops the test_test_main and *main* databases, leaving test_main in place.

A word for fixers

For whoever wants to work on this ticket (perhaps myself...), Russel Keith-Magee has said that

...this almost rates as an 'easy pickings' ticket. This is a situation where you'll get a pass on having a test as part of the patch; testing test setup is one of those areas that's almost impossible to validate, so whoever commits (probably me) will just take it for a very good walk before checking in.

Attachments (4)

14415.1.diff (2.7 KB ) - added by Ramiro Morales 14 years ago.
t14415.diff (3.9 KB ) - added by Russell Keith-Magee 14 years ago.
Alternative solution based around the test runner
t14415.2.diff (4.2 KB ) - added by Russell Keith-Magee 14 years ago.
Updated patch for test database issue
t14415.3.diff (4.2 KB ) - added by Shai Berger 14 years ago.
just adding PORT to t14415.2.diff

Download all attachments as: .zip

Change History (31)

comment:1 by Gabriel Hurley, 14 years ago

Triage Stage: UnreviewedAccepted

comment:2 by Ramiro Morales, 14 years ago

Triage Stage: AcceptedDesign decision needed

This is what I've found:

I think the maximum that Django can provide is what I've tried to implement in the patch:

  • Make sure no cross pollution of DATABASES['alias']['NAME'] and DATABASES['alias']['TEST_NAME'] happens between DB aliases/connections, even when they are described by the same Python dict so in no event the production DATABASES['alias']['NAME'] DB gets removed.
  • Don't report an error if the test database doesn't exist when trying to drop it (or, for SQLite3, don't try to remove the DB file).

What's described above is generally applicable to Postgres, MySQL and SQlite3.

But after fixing that, there are other problems (at test DB teardown time): In Postgres and MySQL the test databases are removed by using the production one as a pivot (a connection is made to the production DB and from there the test one is DROPped). If you get the setup you describe, it might happen (It happens to me) that 'default' will get is test DB ('test_main') dropped and when 'other' tries to connect to its production one to do the same (remove it 'test_test_main' DB) it won't find it.

Solving this means Django would need to detect and keep track of the fact that one alias' test DB name is the same as other alias' production DB, and DATABASES being a dict, there is no guaranteed order of creation/removal of DBs.

The conclusion I get is this is what you describe as the The more dangerous case and one may expect this to be the common use case in a multiple-aliases-for-one-database approach is enough of an edge case that can't be completely solved by Django. If you really need to create more that one DATABASES aliases to the same database and you decide to represent them by pointing them to the same dict, then maybe you will need to implement your own test runner that will need to have some inside knowledge of this and e.g. will need to setup/tear down the test DBs in the right order, etc.

The Oracle backed has its own tools so (again) the user, not Django, tries to implement a workaround to the scenarios you describe (see wiki:OracleTestSetup Understanding the database test setup section.)

by Ramiro Morales, 14 years ago

Attachment: 14415.1.diff added

comment:3 by Ramiro Morales, 14 years ago

Has patch: set

comment:4 by Russell Keith-Magee, 14 years ago

milestone: 1.3
Patch needs improvement: set

I think you've tried to solve this in the wrong place. Rather than fixing this at the level of the database backend, it seems to me that the solution lies in fixing the test runner's setup_databases call. There is already some handling for avoiding duplicate database setup (via the handling for the TEST_MIRROR setting); it seems to me that the problem described here is really just an TEST_MIRROR that is implied because the database name is the same.

So - if we refactor the handling of TEST_MIRROR so that we pre-evaluate the list of database that need to be created and the list that needs to be redirected, we should be able to avoid this problem entirely.

Of course, this is just based on first inspection; there might be some detail that I'm missing. I'll take a closer look later tonight (my time).

Also - bumping to 1.3 milestone because of the potential for unintentional data loss.

by Russell Keith-Magee, 14 years ago

Attachment: t14415.diff added

Alternative solution based around the test runner

comment:5 by Ramiro Morales, 14 years ago

(In [14666]) Added printing of the name of test database being created/destroyed, at verbosity level 2. Refs #14415.

comment:6 by Shai Berger, 14 years ago

I think Russell's approach is closer to home, but there are still problems.

For starters, the patch has a "surface" problem of identifying databases by their (Engine,Name) combination. The current docs about TEST_MIRROR (http://docs.djangoproject.com/en/dev/topics/testing/#testing-master-slave-configurations) show an example where this would be invalid (same name, different hosts). To fix this, it seems, we would need a notion of equivalence of settings -- == between dicts seems close enough for my taste (if a user defines two MySql databases, one with PORT='' and another with PORT='3306', I think they deserve the brouhaha they'll get), but I may be wrong (different OPTIONS'es, for example, for different transactional behavior, would seem legitimate).

But there is a deeper problem: solving the issue by directing the actual connections -- not just fixing the creation/destruction issue -- would cause the test process to have different transaction semantics. Consider:

def do_stuff(alias1, alias2):

    with transaction.commit_manually(using=alias1)

        # Make an invalid change in an object

        m = MyModel.objects.get(pk=17)
        m.fld = 'invalid'
        m.save(using=alias1)

        # Commit other changes, but not the invalid change
    
        transaction.commit(using=alias2) 
    
        # Roll back the invalid change
    
        transaction.rollback(using=alias1)
    
        # Check the outcome
    
        m1 = MyModel.objects.get(pk=17)
        assert m1.fld1 != 'invalid'

This will work in production, but fail in testing under t14415.diff.

by Russell Keith-Magee, 14 years ago

Attachment: t14415.2.diff added

Updated patch for test database issue

comment:7 by Russell Keith-Magee, 14 years ago

@shai - Good call on the HOST issue - I've made that change; new patch incoming shortly.

However, regarding the connection issue - I'm unclear whether what you consider correct behavior, and under what circumstances.

The 'connection copying' behavior is only performed for the TEST_MIRROR case, and that behavior exists historically. To my mind, this is a bug; using connection obtained from the alias that is a TEST_MIRROR shouldn't affect transactions created using the connection that is being mirrored. The new patch addresses this bug.

However, you seem to be implying that the same connection copying behavior exists for connections with a copied settings dictionary, which isn't the case. The patch I've provided causes your sample code to raise an exception at "transaction.commit(using=alias2)", because the code isn't under transaction management (for alias2) at that point.

Have I misunderstood something here?

comment:8 by Shai Berger, 14 years ago

@russelm -- Sorry, I had misread your original patch; you are right, the connections seem to be handled correctly.

I'll look at it again tonight (Israel time), to comment on the change in TEST_MIRROR handling.

One note, though, about adding HOST: While it is, as far as I know, good enough for Django's officially supported DBMSs, it is not enough for SqlServer or Sybase, where a host includes a set of databases, each containing a set of "schemata". Not sure if preparing support for this is easy, and if not easy, if it is worthwhile, but I thought it's worth mentioning.

comment:9 by Shai Berger, 14 years ago

After a second look: Yes, the new TEST_MIRROR behavior seems to be correct and even required to preserve production semantics (interleaved operations on master and slave with no commit may yield different result in production and test with current code).

One other note, though: PORT needs to be treated just like HOST, orthogonal to the SqlServer issue mentioned in my earlier comment.

If there is an easy way to provide tests for these issues, I'd like to do that.

Thanks,

Shai.

comment:10 by Shai Berger, 14 years ago

Fixed the patch, ran some tests with it. It seems to work (though using the same parameters for the two databases makes many tests fail, nothing seems to blow up, and nothing is generated or destroyed more than once).

by Shai Berger, 14 years ago

Attachment: t14415.3.diff added

just adding PORT to t14415.2.diff

comment:11 by Russell Keith-Magee, 14 years ago

To clarify -- when you say "makes many tests fail", is that a good thing or a bad thing? Is this an indication that the test suite is now running and revealing problems in your code, or that the test setup is broken in some way?

comment:12 by Shai Berger, 14 years ago

I think it's a good thing -- essentially, the Django test suite assumes that different aliases point at different databases. I didn't check every failure in detail, but as an example, there was a test in multiple_database that failed because an object saved through one alias was accessible through the other.

comment:13 by Russell Keith-Magee, 14 years ago

Django's test suite has a very specific set of requirements for the test database, and yes, independence between default and other is one of those requirements. If you're running Django's test suite with a mirrored or aliased database, there will be failures.

Thanks for the help testing this, Shai.

comment:14 by Russell Keith-Magee, 14 years ago

Resolution: fixed
Status: newclosed

(In [14696]) Fixed #14415 -- Corrected the process of creating and destroying test databases to prevent accidental deletion of the source database. Also improves the mirroring process to prevent some cross-connection effects. Thanks to Shai Berger for the help diagnosing and testing.

comment:15 by Russell Keith-Magee, 14 years ago

(In [14697]) [1.2.X] Fixed #14415 -- Corrected the process of creating and destroying test databases to prevent accidental deletion of the source database. Also improves the mirroring process to prevent some cross-connection effects. Thanks to Shai Berger for the help diagnosing and testing.

Backport of r14696 from trunk.

comment:16 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

comment:17 by Shai Berger, 9 years ago

Easy pickings: set
Has patch: unset
Resolution: fixed
Severity: Normal
Status: closednew
Type: Uncategorized
UI/UX: unset
Version: 1.2master

It's quite amazing to realize, more than 5 years later, that the bug was actually not properly fixed.

The problem is in django.db.backends.base.creation:

    def test_db_signature(self):
        """
        Returns a tuple with elements of self.connection.settings_dict (a
        DATABASES setting value) that uniquely identify a database
        accordingly to the RDBMS particularities.
        """
        settings_dict = self.connection.settings_dict
        return (
            settings_dict['HOST'],
            settings_dict['PORT'],
            settings_dict['ENGINE'],
            settings_dict['NAME']
        )

The decision whether a database is a mirror of another, when TEST['MIRROR'] is not set explicitly, is taken based on the production settings rather than test settings. You are not likely to notice until you try to test separation of a model into a dedicated database -- where you try to set only the test databases to be separate.

comment:18 by Mainak Gachhui, 9 years ago

Could you give an example where the tests currently fail? I'm not very clear on the model separation.

comment:19 by Boaz Shvartzman, 9 years ago

Owner: changed from nobody to Boaz Shvartzman
Status: newassigned

Let's say you have two db's - "default" and "other".
In db "other" you have table x, which you don't have In db "default" (I used db router to accomplish that).
During a test, you are trying to add a row to table x, only on db "other".

Let's assume this is your DATABASES setting:

DATABASES = {
    'default': {
        'NAME': 'some_name',
        'ENGINE': 'django.db.backends.postgresql_psycopg2'',
        'USER': ...,
        'PASSWORD': ...
    }
}

other = DATABASES['default'].copy()
other['TEST']['NAME'] = 'other_name'
DATABASES['other'] = other

Django will ignore

other['TEST']['NAME'] = 'other_name'

as long as you don't set

other['NAME']

to a value different from

DATABASES['default']['name']

comment:20 by Boaz Shvartzman, 9 years ago

https://github.com/boaz85/django/tree/ticket_14415

I have changed test_db_signature to use

settings_dict['TEST']['NAME']

if set, and

settings_dict['NAME']

as a fallback.

Version 0, edited 9 years ago by Boaz Shvartzman (next)

comment:21 by Boaz Shvartzman, 9 years ago

Has patch: set

comment:22 by Boaz Shvartzman, 9 years ago

Patch needs improvement: unset

comment:23 by Boaz Shvartzman, 9 years ago

Triage Stage: Design decision neededAccepted

comment:24 by Tim Graham, 9 years ago

Type: UncategorizedBug

Did you try to add a test for that change?

comment:25 by Tim Graham, 9 years ago

Needs tests: set

Left an idea for a test on the pull request.

comment:26 by Tim Graham, 9 years ago

Needs tests: unset

comment:27 by Tim Graham <timograham@…>, 9 years ago

Resolution: fixed
Status: assignedclosed

In 5f23f90:

Fixed #14415 -- Used the test database name in BaseDatabaseCreation.test_db_signature().

Note: See TracTickets for help on using tickets.
Back to Top