Code

Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#16622 closed Cleanup/optimization (invalid)

django.test.TestCase slow using PostgreSQL 8.4.4

Reported by: valhallasw Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords:
Cc: valhallasw Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Using very simple tests, such as

class To_confirm_the_stable_database_contains_the_right_data(TestCase):
    def test_adressen_that_should_not_exist_do_not_exist(self):
        adressen = [1409, 1410, 1411, 4094]
        for adres in adressen:
            self.failUnlessEqual(
                Adres.objects.using('stable').filter(pk=4095).count(),
                0
            )

results in very slow tests:

$ ./manage.py test
Creating test database for alias 'default'...
.....
----------------------------------------------------------------------
Ran 5 tests in 6.890s

OK

This is caused by TRUNCATE commands, which is clear when logging all queries using pdb:

b /lib/python2.6/site-packages/Django-1.3-py2.6.egg/django/db/backends/postgresql_psycopg2/base.py:43
commands 1
silent
import time
print "QUERY", time.time(), query
continue
continue

yields

QUERY 1313096004.8
            SELECT c.relname
            FROM pg_catalog.pg_class c
            LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
            WHERE c.relkind IN ('r', 'v', '')
                AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
                AND pg_catalog.pg_table_is_visible(c.oid)
QUERY 1313096004.83 TRUNCATE "models_opdracht", "models_dataholder", (...47 more...)
QUERY 1313096006.59 SELECT setval(pg_get_serial_sequence('"models_dataholder_audit"','_audit_id'), 1, false);
(..)
QUERY 1313096006.63 SELECT setval(pg_get_serial_sequence('"models_groepeninfo"','id'), 1, false);

I thought I read somewhere that PostgreSQL did not support transactioned TRUNCATE before 9.1, but I am unable to find the source. The documentation suggests neither 8.4 nor 9.1 fully support transactions.

I can think of two ways to approach this problem -- one: don't use TRUNCATE, use DELETE instead, which should work fine inside the transaction; or two: instead of TRUNCATEing in each transaction, TRUNCATE before the first transaction, and roll back all further transactions.

I was uncertain whether to file this under 'testing framework' or under 'database layer'. As the most direct effect is on testing, I chose the former.

Attachments (0)

Change History (12)

comment:1 Changed 3 years ago by valhallasw

  • Cc valhallasw added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 3 years ago by aaugustin

  • Triage Stage changed from Unreviewed to Accepted

It would be useful to benchmark this, for starters => accepting the ticket.

Test speed was recently discussed here: http://groups.google.com/group/django-developers/browse_thread/thread/65a36a8a5433144c/59974d3eaa463cb8

In particular, the slowness of TRUNCATE was mentionned in: http://groups.google.com/group/django-developers/msg/7cdfbc8bd5322b70

comment:3 Changed 3 years ago by valhallasw

It seems I forgot a relevant note -- the tests ran fast under Django 1.2.3 (0.274s). I've checked the queries in the same way as above, and there are no TRUNCATE queries sent.

comment:4 Changed 3 years ago by kmtracey

  • Triage Stage changed from Accepted to Unreviewed

The suggestion to use rollback instead of truncation is exactly what django.test.TestCase does, so long as the databases in use all support transactions. (See https://code.djangoproject.com/browser/django/trunk/django/test/testcases.py#L555). Seeing TRUNCATE used implies the database is being flushed between testcases, which in turn implies at least one of the database connections has reported that it does not support transactions. Whether a database connection supports transactions is determined by seeing whether a rollback has any effect (see https://code.djangoproject.com/browser/django/trunk/django/db/backends/__init__.py#L384). Pushing back to Unreviewed because this all generally does work for Postgres, until we get some specifics on why things are going wrong for the reporter it is hard to say whether the problem is in Django or in the project setup.

comment:5 Changed 3 years ago by valhallasw

_support_transactions seems to report transactions are supported, as can be seen from the following:

(using 1.3 release / https://code.djangoproject.com/browser/django/tags/releases/1.3/django/db/backends/__init__.py#L364)

$ python -m pdb ./manage.py test
> manage.py(2)<module>()
-> from django.core.management import execute_manager
(Pdb) b /lib/python2.6/site-packages/django/db/backends/__init__.py:364
Breakpoint 1 at /lib/python2.6/site-packages/django/db/backends/__init__.py:364
(Pdb) commands 1
(com) silent
(com) print "SUPPORTS TRANSACTIONS: ", (count == 0)
(com) continue
(Pdb) continue
Creating test database for alias 'default'...
SUPPORTS TRANSACTIONS:  True
.....
----------------------------------------------------------------------
Ran 5 tests in 11.715s

OK
The program finished and will be restarted
> manage.py(2)<module>()
-> from django.core.management import execute_manager
(Pdb)

comment:6 Changed 3 years ago by ramiro

  • Triage Stage changed from Unreviewed to Accepted

The TRUNCATE statement looks like this (from PostgreSQL 8.4.8 logs):

2011-08-12 23:13:30 ART LOG:  statement: TRUNCATE "auth_permission", "auth_group", "auth_group_permissions", "django_session", "auth_user_groups", "auth_user_user_permissions",
 "t16622_person", "auth_message", "django_site", "auth_user", "django_content_type";

1.2 shows the same behavior the OP repòrts for 1.2.3. So, ran bisection between 1.2 and 1.3 and it shows the commit where that TRUNCATE operation (plus others) got added is r15239 (for trunk, r15241 for the 1.2.X branch, so most surely there is a regression between 1.2.4 and 1.2.5 in this regard [haven't tested this myself])

Moving back to accepted, leaving component = Test framework for now.

comment:7 Changed 3 years ago by anonymous

I'm a little confused -- the initial report mentions truncate commands plural. The r15239/r15241 commit added a single database flush command that is performed when the test database is created. Is there just one truncate happening or are multiple truncates being seen?

comment:8 Changed 3 years ago by valhallasw

I am seeing a TRUNCATE for every single test case. I will try to do some more debugging this weekend to see why exactly the TRUNCATE is called. In any case, thanks for your comments.

comment:9 Changed 3 years ago by ramiro

Sorry, posted last comment when I was too tired so it lacked details, should have made clear my testing conditions: A simple app and a TestCase with only one test method:

from django.db import models

class Person(models.Model):
    age = models.IntegerField()
from django.test import TestCase

from models import Person


class SimpleTest(TestCase):
    def test_a(self):
        self.assertEqual(0, Person.objects.filter(age=42).count())

    # added afterwards:
    #def test_b(self):
    #    self.assertEqual(0, Person.objects.filter(pk=314).count())

Now (just in case, even when manage.py test already runs tests of all django.contrib apps listed in INSTALLED_APPS), I've added another test method and another similar simple app with its own tests. But number of TRUNCATE operation stays at one here (most surely because of this `flush` operation added in these commits.)

Count me as also confused on why valhallasw sees it once per test.

comment:10 Changed 3 years ago by valhallasw

  • Component changed from Testing framework to Database layer (models, ORM)
  • Type changed from Bug to Cleanup/optimization

tl;dr: it was our fault. sorry for wasting your time :-(

In the end, it was caused because we used a buggy way of adding a database programmatically (i.e. not using the configuration screen) to be able to test on different data sets (this is useful, for instance, for testing statistical functions). This method did not call features.confirm(). This, in turn means the testing framework got None when it asked for features.supports_transactions .
which returned None (instead of True). This was intepreted as 'it's not True', and thus transactions were not used.

I'm not quite sure how sensible returning None is: IMO is would make more sense to raise an Exception if no data is available, or calling confrm() to actually /get/ the data (although creating and deleting a table while in a query might not be the most sensible thing to do).

To give you a full log of my debugging:

(Pdb) b /lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py:43, "TRUNCATE"  in query

the breakpoint is hit a few times. The first time, this is the stack trace:

(Pdb) w
  /usr/lib/python2.6/bdb.py(372)run()
-> exec cmd in globals, locals
  <string>(1)<module>()
  /home/project/src/manage.py(16)<module>()
-> execute_manager(settings)
  /lib/python2.6/site-packages/django/core/management/__init__.py(438)execute_manager()
-> utility.execute()
  /lib/python2.6/site-packages/django/core/management/__init__.py(379)execute()
-> self.fetch_command(subcommand).run_from_argv(self.argv)
  /lib/python2.6/site-packages/django/core/management/base.py(191)run_from_argv()
-> self.execute(*args, **options.__dict__)
  /lib/python2.6/site-packages/django/core/management/base.py(220)execute()
-> output = self.handle(*args, **options)
  /lib/python2.6/site-packages/django/core/management/commands/test.py(37)handle()
-> failures = test_runner.run_tests(test_labels)
  /lib/python2.6/site-packages/django/test/simple.py(359)run_tests()
-> old_config = self.setup_databases()
  /home/project/src/djatabase/testing/PassableTestRunner.py(16)setup_databases()
-> retval = DjangoTestSuiteRunner.setup_databases(self, **kwargs)
  /lib/python2.6/site-packages/django/test/simple.py(296)setup_databases()
-> test_db_name = connection.creation.create_test_db(self.verbosity, autoclobber=not self.interactive)
  /lib/python2.6/site-packages/django/db/backends/creation.py(376)create_test_db()
-> database=self.connection.alias)
  /lib/python2.6/site-packages/django/core/management/__init__.py(166)call_command()
-> return klass.execute(*args, **defaults)
  /lib/python2.6/site-packages/django/core/management/base.py(220)execute()
-> output = self.handle(*args, **options)
  /lib/python2.6/site-packages/django/core/management/base.py(351)handle()
-> return self.handle_noargs(**options)
  /lib/python2.6/site-packages/django/core/management/commands/flush.py(54)handle_noargs()
-> cursor.execute(sql)
> /lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py(43)execute()
-> try:

which is the TRUNCATE called by the test database creation.

The second and all later times, this is the stack trace:

(Pdb) w
  /usr/lib/python2.6/bdb.py(372)run()
-> exec cmd in globals, locals
  <string>(1)<module>()
  /home/project/src/manage.py(16)<module>()
-> execute_manager(settings)
  /lib/python2.6/site-packages/django/core/management/__init__.py(438)execute_manager()
-> utility.execute()
  /lib/python2.6/site-packages/django/core/management/__init__.py(379)execute()
-> self.fetch_command(subcommand).run_from_argv(self.argv)
  /lib/python2.6/site-packages/django/core/management/base.py(191)run_from_argv()
-> self.execute(*args, **options.__dict__)
  /lib/python2.6/site-packages/django/core/management/base.py(220)execute()
-> output = self.handle(*args, **options)
  /lib/python2.6/site-packages/django/core/management/commands/test.py(37)handle()
-> failures = test_runner.run_tests(test_labels)
  /lib/python2.6/site-packages/django/test/simple.py(360)run_tests()
-> result = self.run_suite(suite)
  /lib/python2.6/site-packages/django/test/simple.py(316)run_suite()
-> return unittest.TextTestRunner(verbosity=self.verbosity, failfast=self.failfast).run(suite)
  /lib/python2.6/site-packages/django/utils/unittest/runner.py(162)run()
-> test(result)
  /lib/python2.6/site-packages/django/utils/unittest/suite.py(64)__call__()
-> return self.run(*args, **kwds)
  /lib/python2.6/site-packages/django/utils/unittest/suite.py(84)run()
-> self._wrapped_run(result)
  /lib/python2.6/site-packages/django/utils/unittest/suite.py(116)_wrapped_run()
-> test(result)
  /lib/python2.6/site-packages/django/test/testcases.py(292)__call__()
-> self._pre_setup()
  /lib/python2.6/site-packages/django/test/testcases.py(259)_pre_setup()
-> self._fixture_setup()
  /lib/python2.6/site-packages/django/test/testcases.py(569)_fixture_setup()
-> return super(TestCase, self)._fixture_setup()
  /lib/python2.6/site-packages/django/test/testcases.py(271)_fixture_setup()
-> call_command('flush', verbosity=0, interactive=False, database=db)
  /lib/python2.6/site-packages/django/core/management/__init__.py(166)call_command()
-> return klass.execute(*args, **defaults)
  /lib/python2.6/site-packages/django/core/management/base.py(220)execute()
-> output = self.handle(*args, **options)
  /lib/python2.6/site-packages/django/core/management/base.py(351)handle()
-> return self.handle_noargs(**options)
  /lib/python2.6/site-packages/django/core/management/commands/flush.py(54)handle_noargs()
-> cursor.execute(sql)
> /lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py(43)execute()
-> try:

Traversing the chain up to

> /lib/python2.6/site-packages/django/test/testcases.py(271)_fixture_setup()
-> call_command('flush', verbosity=0, interactive=False, database=db)

https://code.djangoproject.com/browser/django/tags/releases/1.3/django/test/testcases.py#L260

where it's clear flush is called if _fixture_setup() is called.

traversing one more step up to

-> self._fixture_setup()
  /lib/python2.6/site-packages/django/test/testcases.py(569)_fixture_setup()
-> return super(TestCase, self)._fixture_setup()

https://code.djangoproject.com/browser/django/tags/releases/1.3/django/test/testcases.py#L560

with the relevant code

if not connections_support_transactions():
    return super(TestCase, self)._fixture_setup()

and indeed:

(Pdb) connections_support_transactions()
False

connections_support_transactions():

return all(conn.features.supports_transactions
    for conn in connections.all())

and if I request [conn.features.supports_transactions for conn in connections.all()]:

(Pdb) [conn.features.supports_transactions for conn in connections.all()]
[True, None]

OK, so that is the reason I'm seeing flushes. But why is support_transactions None? Apparently it's not being filled.

Calling

[conn.features.confirm() for conn in djconn.all()]

seems to fix that.

End result:

$ ./manage.py test
Creating test database for alias 'default'...
.....
----------------------------------------------------------------------
Ran 5 tests in 0.567s

OK

yay!

Version 0, edited 3 years ago by valhallasw (next)

comment:11 Changed 3 years ago by mtredinnick

  • Resolution set to invalid
  • Status changed from new to closed

I feel the current behaviour is fine (returning None by default). The functional behaviour is correct, it's only a particular performance optimisation that is not invoked in this case. We can leave things as they are for now.

comment:12 Changed 3 years ago by mtredinnick

(By the way, thanks for the detailed explanation of what was going wrong. Saved me having to wonder why it suddenly started working and meant it was possible to evaluate the impact and likelihood of others triggering this problem. That was very helpful!)

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.