Opened 6 weeks ago

Closed 6 days ago

#35904 closed New feature (wontfix)

Speed up fixture loading by adding options bulk insert/create

Reported by: JorisBenschop
Owned by:
Component: Testing framework
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by JorisBenschop)

As per this forum discussion, I have created a patch to improve load times for the loaddata command under some circumstances.

Currently, the “loaddata” management command calls obj.save() for each deserialized object in a fixture. This method first tries an UPDATE statement and, if no row is updated, falls back to an INSERT statement. Using the --force_insert option skips the UPDATE, which cuts the number of queries by 50%.

A second option is to use bulk_create to insert multiple records at once. This reduces the number of INSERT statements by (n-1)/n, or 99% when inserting 100 records.

These options are not meant to cover every use case and are therefore optional.
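The query counts described above can be illustrated with a minimal sketch that uses only the standard-library sqlite3 module. It is not the patch and not Django's implementation: the `article` table and the three `load_*` helpers are hypothetical stand-ins for the default save() path, a forced INSERT, and a single multi-row INSERT of the kind bulk_create issues.

```python
import sqlite3


def make_db():
    """Create an empty in-memory database with a hypothetical article table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
    return conn


def load_update_then_insert(conn, rows):
    """Mimic the default path: try an UPDATE, fall back to an INSERT."""
    statements = 0
    for pk, title in rows:
        cur = conn.execute("UPDATE article SET title = ? WHERE id = ?", (title, pk))
        statements += 1
        if cur.rowcount == 0:  # no existing row matched, so insert it
            conn.execute("INSERT INTO article (id, title) VALUES (?, ?)", (pk, title))
            statements += 1
    return statements


def load_force_insert(conn, rows):
    """Mimic a forced insert: one INSERT per object, no UPDATE attempt."""
    statements = 0
    for pk, title in rows:
        conn.execute("INSERT INTO article (id, title) VALUES (?, ?)", (pk, title))
        statements += 1
    return statements


def load_bulk(conn, rows):
    """Mimic a bulk insert: one multi-row INSERT for the whole batch."""
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    conn.execute(f"INSERT INTO article (id, title) VALUES {placeholders}", params)
    return 1


rows = [(i, f"Article {i}") for i in range(1, 101)]
loaders = {
    "save": load_update_then_insert,
    "force_insert": load_force_insert,
    "bulk": load_bulk,
}

counts = {}
for name, loader in loaders.items():
    conn = make_db()
    counts[name] = loader(conn, rows)
    # Every strategy must end up with the same 100 rows.
    assert conn.execute("SELECT COUNT(*) FROM article").fetchone()[0] == len(rows)
    conn.close()

print(counts)
```

For 100 rows loaded into an empty table, the default path issues twice as many statements as the forced insert, and the bulk path issues a single statement, which is where the 50% and (n-1)/n figures come from. Real backends cap the number of parameters per statement, so an actual bulk insert would be split into batches.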

Benchmark results
===============
test to insert 1000 records from a single fixture (using the Article model on SQLite)
current: 0.116s
with --force_insert: 0.066s
with --bulk_create: 0.010s

test to insert 10000 records from a single fixture
current: 1.07s
with --force_insert: 0.39s
with --bulk_create: 0.104s
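To put the timings above in relative terms, the following short sketch computes the time reduction and speedup of each option against the current behaviour. The numbers are copied from the benchmark results; the `summarize` helper and the dictionary layout are only illustrative.

```python
# Timings in seconds, copied from the benchmark results above.
benchmarks = {
    1000: {"current": 0.116, "force_insert": 0.066, "bulk_create": 0.010},
    10000: {"current": 1.07, "force_insert": 0.39, "bulk_create": 0.104},
}


def summarize(timings):
    """Return the percentage time reduction and speedup factor per option."""
    base = timings["current"]
    return {
        name: {
            "reduction_pct": round(100 * (base - seconds) / base, 1),
            "speedup": round(base / seconds, 1),
        }
        for name, seconds in timings.items()
        if name != "current"
    }


summary = {records: summarize(timings) for records, timings in benchmarks.items()}
for records, result in summary.items():
    print(records, result)
```

On these numbers, --force_insert saves roughly 43% of the load time at 1000 records and 64% at 10000, while --bulk_create saves about 90% in both cases. The measured gain is smaller than the reduction in statement count, since part of the load time is spent outside the database.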

I expect larger models to show an even more significant improvement.

Change History (13)

comment:1 by Simon Charette, 6 weeks ago

Resolution: wontfix
Status: new → closed

Hello Joris,

This sounds interesting, particularly given that features like test case serialized rollbacks (which are quite slow) are built on top of model serialization. It would have to be a distinct option, as bulk_create doesn't fire signals, which some setups might require.
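The signal concern can be illustrated with a toy model in plain Python. Nothing here is Django code: `Signal`, `ToyTable`, and the `audit_log` receiver are hypothetical stand-ins, written only to show how a per-object save path notifies receivers while a bulk path bypasses them.

```python
class Signal:
    """Toy stand-in for a signal dispatcher (not Django's Signal class)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, instance):
        for receiver in self._receivers:
            receiver(instance)


post_save = Signal()


class ToyTable:
    """Toy stand-in for a model's table, keyed by primary key."""

    def __init__(self):
        self.rows = {}

    def save(self, pk, data):
        # Per-object path: store the row, then notify every receiver.
        self.rows[pk] = data
        post_save.send((pk, data))

    def bulk_create(self, items):
        # Bulk path: store all rows at once without notifying receivers.
        self.rows.update(items)


audit_log = []
post_save.connect(lambda instance: audit_log.append(instance[0]))

table = ToyTable()
table.save(1, "first")
table.bulk_create({2: "second", 3: "third"})

print(sorted(table.rows))  # all three rows are stored
print(audit_log)           # only the row saved individually was announced
```

A project whose fixtures depend on receivers running at load time would see different results under a bulk path, which is why it would need to be opt-in.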

Just like any new feature request, though, this should be discussed on the forum to reach a consensus before being accepted. Given this is a performance-related new feature, I suggest your proposal come equipped with some details about what kind of improvements users should expect (profiles and benchmarks, instead of solely claiming it's fairly inefficient), backed by steps to reproduce, as well as a PoC that properly deals with other features of the serde framework such as natural keys, and a plan for how to deal with backends that don't support ignore_conflicts. It might even be a good opportunity to augment our performance tracking system with serde benchmarks.
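The conflict-handling question can be sketched with the standard-library sqlite3 module. This is an illustration under assumptions rather than Django's implementation: the `article` table is hypothetical, and SQLite's `INSERT OR IGNORE` is used here as a rough analogue of what an ignore-conflicts bulk insert amounts to on that backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
# One row already exists, as when a fixture is loaded a second time.
conn.execute("INSERT INTO article (id, title) VALUES (1, 'old title')")
conn.commit()

rows = [(1, "new title"), (2, "second")]
placeholders = ", ".join(["(?, ?)"] * len(rows))
params = [value for row in rows for value in row]

# A plain multi-row INSERT fails as a whole on the duplicate primary key.
try:
    conn.execute(f"INSERT INTO article (id, title) VALUES {placeholders}", params)
    plain_insert_failed = False
except sqlite3.IntegrityError:
    plain_insert_failed = True

# Ignoring conflicts lets the new row through but leaves the existing row as is.
conn.execute(f"INSERT OR IGNORE INTO article (id, title) VALUES {placeholders}", params)
result = conn.execute("SELECT id, title FROM article ORDER BY id").fetchall()

print(plain_insert_failed)
print(result)
```

Note the behavioural difference from the UPDATE-then-INSERT path: the existing row keeps its old title instead of being overwritten by the fixture, so ignoring conflicts is not a drop-in replacement for loading over existing data.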

> If there is genuine interest in this feature I will develop it and submit. I have this running in my own codebase already.

If that's the case, then sharing this code as a standalone package (e.g. django-fast-loaddata) might be a good way to get traction on the above.

Assuming there is interest in moving forward we can then re-open this issue.

Last edited 6 weeks ago by Simon Charette

comment:2 by Sarah Boyce, 2 weeks ago

Summary: Speed up fixture loading by bulk insert → Speed up fixture loading by adding options bulk insert/create
Type: Uncategorized → New feature

comment:3 by JorisBenschop, 2 weeks ago

Description: modified (diff)
Has patch: set
Resolution: wontfix
Status: closed → new

comment:4 by JorisBenschop, 2 weeks ago

Description: modified (diff)

comment:5 by JorisBenschop, 2 weeks ago

Description: modified (diff)

comment:6 by JorisBenschop, 2 weeks ago

As requested by Simon, I have re-opened the ticket and specified the expected improvements more precisely. Steps to reproduce are covered by the tests in the PR. I am open to adding code to the serde testing if there is interest.

comment:7 by JorisBenschop, 2 weeks ago

Description: modified (diff)

comment:8 by JorisBenschop, 2 weeks ago

Description: modified (diff)

comment:9 by JorisBenschop, 7 days ago

Is there any way I can progress this ticket? To my knowledge, I have addressed all the issues in the PR.

comment:10 by JorisBenschop, 7 days ago

Version: 5.0 → dev

comment:11 by Jacob Walls, 7 days ago

Hi Joris. The beige notice at the top of the ticket advises that the next step is for someone besides the author to accept the ticket. This is a new feature, so some engagement on the forum thread is desired and expected. It's been less than two weeks since the forum post was raised, which in most cases is too short a window to allow all voices to participate. I would advise allowing a little more time, especially around the holidays. Thanks for your dedication.

comment:12 by JorisBenschop, 7 days ago

Hi Jacob, thank you so much for explaining this. I understand there are many tickets that ask for your attention. As a submitter, there is always a fine line between allowing time and losing momentum. By no means am I trying to rush you in any way, so I highly appreciate that you explained the expected timelines for this process.

comment:13 by Natalia Bidart, 6 days ago

Resolution: wontfix
Status: new → closed

Closing as per my latest comment in the forum post: there is a lack of community traction for this, and a third-party app seems like the best next step for this feature request.
