Opened 5 years ago

Last modified 2 weeks ago

#31804 assigned New feature

Parallelize database cloning process

Reported by: Ahmad A. Hussein
Owned by: Ahmed Ibrahim
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: parallel, mysqlpump
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no
Pull Requests: 18668 build:success, 13314 unmerged, 13217 unmerged

Description (last modified by Ahmad A. Hussein)

Parallelizing the database cloning process would yield a nice speed-up for running Django's own test suite (and for all Django projects that use the default test runner).

So far, I see three main ways we could implement this:

  • Use a multiprocessing pool at the setup_databases level that'll create workers which run clone_test_db for each method
  • Use a pool at the clone_test_db level which parallelizes the internal _clone_test_db call
  • Scrap parallelizing the cloning in general, and instead parallelize the internals of specific backends (at least MySQL fits here)

In the first two options, we'd have to refactor MySQL's cloning process, since it has another call to _clone_db. Otherwise, a dump would be created inside each parallel process, slowing the workers greatly.
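To illustrate the "dump once, restore in parallel" refactor described above, here is a minimal sketch. It is not Django's implementation: SQLite stands in for MySQL so the example is runnable, a thread pool is used instead of a multiprocessing pool to keep it short, and the names dump_once, restore_clone and clone_in_parallel are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing
from pathlib import Path


def dump_once(source_path):
    """Create the SQL dump a single time, before any worker starts."""
    with closing(sqlite3.connect(source_path)) as conn:
        return "\n".join(conn.iterdump())


def restore_clone(dump_sql, target_path):
    """Worker: load the shared dump into one clone database."""
    with closing(sqlite3.connect(target_path)) as conn:
        conn.executescript(dump_sql)
    return target_path


def clone_in_parallel(source_path, parallel):
    """Dump the source once, then restore `parallel` clones concurrently."""
    source_path = Path(source_path)
    dump_sql = dump_once(source_path)  # hypothetical stand-in for the mysqldump step
    targets = [
        source_path.with_name(f"{source_path.stem}_{i}{source_path.suffix}")
        for i in range(1, parallel + 1)
    ]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # Each worker only restores; none of them re-creates the dump.
        return list(pool.map(lambda t: restore_clone(dump_sql, t), targets))
```

The point of the split is that the expensive export happens exactly once, outside the pool, so adding workers does not multiply the dump cost.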

In the last option, we could consider using mysqlpump instead of mysqldump for both exporting the database and restoring it. The con of this approach is that it isn't general enough to apply to the other backends.
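As a rough sketch of what the export side of this option might look like, the helper below builds a mysqlpump command line that uses its --default-parallelism option. The function name build_mysqlpump_args and its parameters are hypothetical, and the sketch only assembles arguments; it does not run the tool or cover restoring.

```python
def build_mysqlpump_args(db_name, parallelism, user=None, host=None):
    """Return an argv list for dumping `db_name` with mysqlpump.

    Hypothetical helper: only assembles the command, it does not execute it.
    """
    args = ["mysqlpump", f"--default-parallelism={parallelism}"]
    if user:
        args.append(f"--user={user}")
    if host:
        args.append(f"--host={host}")
    # The database name goes last, after all options.
    args.append(db_name)
    return args
```

The resulting list could be passed to subprocess.run, with the output captured or piped into the restore step.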

Oracle's cloning process (although not merged into the current master) has internal support for option 3 (users can specify a PARALLEL variable to speed up the expdp/impdp utilities), and it can also use the first two options.

The major con of the first two options, though, is forcing parallelization.

According to the ticket's flags, the next step(s) to move this issue forward are:

  • For anyone except the patch author to review the patch using the patch review checklist and either mark the ticket as "Ready for checkin" if everything looks good, or leave comments for improvement and mark the ticket as "Patch needs improvement".

Change History (8)

comment:1 by Ahmad A. Hussein, 5 years ago

Owner: changed from nobody to Ahmad A. Hussein

comment:2 by Ahmad A. Hussein, 5 years ago

Description: modified (diff)

comment:3 by Carlton Gibson, 5 years ago

Triage Stage: Unreviewed → Accepted

Hi Ahmad. Yes: if you can get this going, super.

One thing that's been bugging me about #31169 is how slow the DB cloning appears on Windows. (I need to measure exact times...) So if we can speed that up, it would be a big win.

Thanks.

comment:4 by Ahmad A. Hussein, 5 years ago

PR

Still needs more work

comment:5 by Mariusz Felisiak, 20 months ago

Owner: Ahmad A. Hussein removed
Status: assigned → new

comment:6 by Ahmed Ibrahim, 6 months ago

Owner: set to Ahmed Ibrahim
Status: new → assigned

I'm taking this one!

comment:7 by Ahmed Ibrahim, 6 months ago

Has patch: set
Needs tests: set

comment:8 by Ahmed Ibrahim, 2 weeks ago

Needs tests: unset