Opened 2 years ago

Closed 11 months ago

Last modified 11 months ago

#23372 closed Cleanup/optimization (fixed)

Prevent loaddata from disabling and enabling constraints when no fixtures are found

Reported by: Michael Manfre Owned by: nobody
Component: Testing framework Version: 1.7
Severity: Normal Keywords:
Cc: baryshev@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

On my primary development system (Intel i7 quad core with SSD drives), it takes nearly 20 minutes to get from executing runtests.py to the start of the first test. Most of the time (~18 minutes) is spent on the "loaddata" call_command in the migrate command's sync_apps method.

https://github.com/django/django/blob/master/django/core/management/commands/migrate.py#L292

For comparison, using the same test environment and only changing the version of Django (to stable/1.6.x), it takes less than 2 minutes to get to the first test.

Change History (19)

comment:1 Changed 2 years ago by Michael Manfre

Set as release blocker because this is an order of magnitude performance regression added by new functionality.

comment:2 Changed 2 years ago by Claude Paroz

You didn't mention the database on which you encounter this problem, default SQLite?
It would be nice to track a bit further what is so slow inside loaddata.

comment:3 Changed 2 years ago by Tim Graham

MS SQL Server I presume. I'd guess the slowness might be the transaction.atomic() calls that happen in loaddata (one for app app in Django's test suite) even if no fixtures exist.

comment:4 Changed 2 years ago by Andrew Godwin

Severity: Release blockerNormal

Not a release blocker, I'm afraid; at this point release blocker is reserved for bugs that either crash Django or remove features that previously existed. I'm not willing to delay the 1.7 release, which is literally on the verge of happening, for test slowdown, or we'll literally never get round to releasing.

Notably, the code in question has not been touched by migrations; we do a dumpdata as part of #22487, but that doesn't seem to be part of this. I'm not sure what the cause of the regression could be.

comment:5 Changed 2 years ago by Aymeric Augustin

Not a blocker for 1.7 indeed, but fairly high priority for 1.7.1.

comment:6 Changed 2 years ago by Tim Graham

Triage Stage: UnreviewedAccepted

comment:7 Changed 2 years ago by Shai Berger

I've seen loaddata take eons on SQL Server, and even in the context of migrations (with South). As far as we could figure out, the problem was not transaction management per se, but the handling of constraints: For loaddata to avoid complex and sometimes non-satisfiable ordering requirements, it needs FK constraints to be deferred while it works. On core databases, each for its own reasons, this is not an issue.

SQL Server does not support defering in the normal sense -- what you need to do is turn the constraints off in the beginning of worlk and back on in the end. I don't remember if this is an operation you can do within a transaction, but for sure it is not tied to a transaction -- you can turn the constraints off and then run several transactions before turning them back on. When you do turn them back on, however, you can do it WITH CHECK OPTION -- which loaddata does, or else it might put constraint-violating data into the database.

SQL Server does not -- perhaps, given the semantics, cannot -- keep track of data changed while constraints were off. When you turn them on WITH CHECK OPTION, it checks every single record in the database -- or, at least, takes long enough to do so.

We had a project with several dozen apps, and relatively simple sets of migrations on a full database (for upgrades) would take minutes on Oracle and hours on SQL Server. Then we told SQL Server users to use migrate --all --no-initial-data (we didn't have any initial data anywhere), and sanity returned.

I'm not sure what this implies with regards to a solution. Some of the problem should go away if loaddata only defered constraints when initial data actually existed. Perhaps there's a way to avoid the check in most cases (say, most data loaded in the Django test suite can be given a free ticket because it is verified by the other backends).

comment:8 Changed 2 years ago by Tim Graham

Version: 1.7-rc-31.7

comment:9 Changed 2 years ago by Winter Lion Software

I have an application in 1.7 that I have been building for a while. I have about 75 migrations. When I run

python manage.py test

with SQLite, it takes about 30 seconds to create the test database. In 1.6 this was almost instant (in fact, I just confirmed this. I ran my tests in an old environment at 1.6 and the test database was created instantly. I ran "pip install django --upgrade" and re-ran the tests. It took 30 seconds to create the test database).

I don't know if my issue is the same as this issue (being as I'm not using MSSQL Server), but I'd just like to add that there appears to be a significant slow down when going from 1.6 to 1.7, regardless of the database type.

I'm using "django.db.backends.sqlite3" on Windows 7 64-bit with a Core i7 processor and 8 GB ram.

comment:10 Changed 2 years ago by Tim Graham

Robert, when you run your tests in 1.7+ all migrations are applied, which takes several seconds per migration on 1.7 as reported in #23745. The performance of that will be improved in 1.8 as described in that ticket. You can check if you can use the --keepdb option in 1.8. Also, we might add an option to disable migrations when testing in the future. But ultimately, no, I don't think your issue is the one tracked by this ticket.

comment:11 Changed 2 years ago by Ilya Baryshev

Cc: baryshev@… added

comment:12 in reply to:  10 Changed 2 years ago by Markus Holtermann

Resolution: needsinfo
Status: newclosed

If this is related to the performance of the migration operations - which doesn't seem to be completely clear at this point - the issue should be fixed or at leased reduced by an order of magnitude by the optimizations the migrate process received for 1.8.

I'm closing this with needsinfo for the reasons posted in comment 10

comment:13 Changed 23 months ago by Tim Graham

I wonder if this might be better on master (1.9) after commits like 67235fd4ef1b006fc9cdb2fa20e7bb93b0edff4b and 4aa089a9a9504c4a833eee8161be013206da5d15 -- the migrate command should be more "lightweight" after them.

comment:14 Changed 14 months ago by Tim Graham

Has patch: set
Resolution: needsinfo
Status: closednew

The proposed PR for Django 1.8 reduces the time of the entire suite on MSSQL from about 7.75 hours to 5.25 hours (benchmarks from @manfre). I'm not particularly proud of that hack, but it seems worthwhile unless someone can spot a problem with it.

comment:15 Changed 12 months ago by Tim Graham

Patch needs improvement: set

Anssi has proposed an alternate approach.

comment:16 Changed 12 months ago by Anssi Kääriäinen

A yet new alternate approach is here: https://github.com/django/django/pull/5905. The idea is to skip disabling constraints when no fixture files are found in loaddata. The 1.8 test suite runs in 2500 seconds on my laptop after the patch (plus one line fix in django-mssql, see the pull request).

The previously proposed alternate approach takes off another 10 minutes from the runtime of the test suite on mssql, yielding a runtime of around 30 minutes. The changes needed there are a lot more invasive, as we will need to pre-walk the objects in the fixture files. This could lead to performance problems. The change might be worth it for master branch, but backpatching seems too risky.

comment:17 Changed 11 months ago by Tim Graham

Patch needs improvement: unset
Summary: Test suite initial database construction is too slowPrevent loaddata from disabling and enabling constraints when no fixtures are found
Triage Stage: AcceptedReady for checkin

comment:18 Changed 11 months ago by Tim Graham <timograham@…>

Resolution: fixed
Status: newclosed

In ee9f4686:

Fixed #23372 -- Made loaddata faster if it doesn't find any fixtures.

Django's test suite often tries to load fixture files from apps that have
no fixtures at all. This creates a lot of unnecessary disabling and
enabling of constraints which can be expensive on some database.

To speed this up, loaddata now first checks if any fixture file matches.
If no fixture file is matched, then the command exits before disabling
and enabling of constraints is done.

The main benefit of this change is seen on MSSQL, where tests on
Django 1.8 run hours faster.

comment:19 Changed 11 months ago by Tim Graham <timograham@…>

In 99569b22:

[1.9.x] Fixed #23372 -- Made loaddata faster if it doesn't find any fixtures.

Django's test suite often tries to load fixture files from apps that have
no fixtures at all. This creates a lot of unnecessary disabling and
enabling of constraints which can be expensive on some database.

To speed this up, loaddata now first checks if any fixture file matches.
If no fixture file is matched, then the command exits before disabling
and enabling of constraints is done.

The main benefit of this change is seen on MSSQL, where tests on
Django 1.8 run hours faster.

Backport of ee9f4686b19e2b4a68f5cb4f9d61dc045c1d4c63 from master

Note: See TracTickets for help on using tickets.
Back to Top