Opened 4 years ago

Closed 4 years ago

#31275 closed Cleanup/optimization (fixed)

Optimize MariaDB/MySQL sql_flush

Reported by: Adam Johnson Owned by: Masashi SHIBATA
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This was suggested as an extra feature for Django-MySQL by @jonatron in https://github.com/adamchainz/django-mysql/pull/611/files , but it would make more sense to make it in Django core.

Currently MySQL's sql_flush emits TRUNCATE TABLE statements. These are relatively slow on small tables since they force recreation of the table on disk. The alternative, DELETE FROM, is faster. I can see a diference even for a single empty table locally (MariaDB 10.4):

chainz@localhost [21]> truncate t2;
Query OK, 0 rows affected (0.005 sec)

chainz@localhost [22]> delete from t2;
Query OK, 0 rows affected (0.000 sec)

TransactionTestCase uses sql_flush, via the flush management command, to reset the database. Most test suites don't use very large tables, so using DELETE FROM should normally be faster in this case. Optimizing it MySQL to use DELETE FROM for flushing, at least for tables up to a certain size (1000 rows or similar heuristic) could save lot of time when using TransactionTestCase.

Change History (13)

comment:1 by Claude Paroz, 4 years ago

Triage Stage: UnreviewedAccepted

I confirm that TransactionTestCase is horrrrribly slow on MySQL/MariaDB! So a big +1 from me.

comment:2 by jonatron, 4 years ago

Owner: changed from nobody to jonatron
Status: newassigned

comment:3 by jonatron, 4 years ago

Owner: jonatron removed
Status: assignednew

If I recall correctly, TRUNCATE will reset AUTO_INCREMENT but DELETE FROM won't.

comment:4 by Adam Johnson, 4 years ago

If that's the case we'd want to ALTER TABLE tablename AUTO_INCREMENT = 1 for those tables with auto-increment keys.

comment:5 by Masashi SHIBATA, 4 years ago

Hi! Can I work on this ticket?

comment:6 by Claude Paroz, 4 years ago

No need to ask when there is no owner. Please do!

comment:7 by Masashi SHIBATA, 4 years ago

Owner: set to Masashi SHIBATA
Status: newassigned

Thank you!

comment:8 by Masashi SHIBATA, 4 years ago

Has patch: set

comment:9 by Masashi SHIBATA, 4 years ago

comment:10 by Simon Charette, 4 years ago

Cc: Simon Charette added

If that's the case we'd want to ALTER TABLE tablename AUTO_INCREMENT = 1 for those tables with auto-increment keys.

I decided to guide the patch submitter towards avoiding these queries and suggesting the use of TransactionTestCase.reset_sequences instead while documenting the behaviour changed in Django 3.1 and that this feature should be explicitly enabled to achieve this previously implicit reset.

It made more sense to me since the goal of the ticket is to make this operation faster and these sequence alteration queries took longer than plain TRUNCATE calls from my local testing which completely defeats the purpose of this ticket. The fact we have an explicit feature to achieve this behaviour only makes it slightly backward incompatible for users relying on this undefined behaviour.

Based on the fact ALTER TABLE tablename AUTO_INCREMENT = 1 is slower than TRUNCATE tablename I guess MySQL's sql_flush could be optimized further to only issue the latter when asked to flush both a table and its sequence. Given TransactionTestCase calls the flush command with reset_sequences=False that would allow us to favour DELETE over TRUNCATE when no sequences are meant to be flushed and grasp the benefit in the test suite without relying on the 1000 rows heuristics which is a bit arbitrary and require a schema introspection query which isn't cheap either.

I'll add that the benefits for Django's test suite and any suite that heavily uses TransactionTestCase.available_apps will likely be minimal as this ms difference between DELETE and TRUNCATE is only apparent when the number flushed tables after every test is large and the available_apps feature greatly reduces it this number in large Django project. In summary projects that will benefits from this patch would benefit way more by defining available_apps on their tests.

comment:11 by Adam Johnson, 4 years ago

I decided to guide the patch submitter towards avoiding these queries and suggesting the use of TransactionTestCase.reset_sequences instead while documenting the behaviour changed in Django 3.1 and that this feature should be explicitly enabled to achieve this previously implicit reset.

Very sensible. I forget about this option!

In summary projects that will benefits from this patch would benefit way more by defining available_apps on their tests.

Yes indeed, although it can be tedious to define and keep correct.

comment:12 by Mariusz Felisiak, 4 years ago

Triage Stage: AcceptedReady for checkin

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

Resolution: fixed
Status: assignedclosed

In 89032876:

Fixed #31275 -- Optimized sql_flush() without resetting sequences on MySQL.

Co-Authored-By: Simon Charette <charettes@…>

Note: See TracTickets for help on using tickets.
Back to Top