Opened 8 years ago

Closed 5 years ago

Last modified 5 years ago

#7596 closed New feature (fixed)

Multiple Database Inserts using 1 query

Reported by: ErikW Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: database, create, multirow, feature
Cc: Trevor Caira, Alexander Koshelev, fnl, sfllaw@…, anssi.kaariainen@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It would be really nice (ie convenient, reduce load on db, improve performance) to be able to create a "collection" of model objects with a single query. I've tested the raw SQL, and it is significantly faster than using a loop to insert a bunch of database rows. All of the backends seem to support it.

I'm thinking something like
[queryset] = Model.objects.bulk_create([{object}, {object}, {object}])

http://www.desilva.biz/mysql/insert.html
http://www.postgresql.org/docs/8.2/interactive/sql-insert.html (search for: To insert multiple rows using the multirow VALUES syntax)
http://codingforums.com/archive/index.php?t-109802.html

Multirow inserts are not supported in sqlite yet (http://www.sqlite.org/cvstrac/wiki?p=UnsupportedSql), but the backend could fall back to a python loop in this case....

Thanks!

Attachments (3)

t7596-alex.diff.txt (24.5 KB) - added by Russell Keith-Magee 5 years ago.
Alex Gaynor's patch, prepared at DjangoCon Europe 2011
t7596-alex.diff (24.5 KB) - added by Karen Tracey 5 years ago.
Same patch, extension that trac will hopefully recognize properly
7596.diff (26.6 KB) - added by Simon Meers 5 years ago.
Updated/tweaked version of Alex's patch

Download all attachments as: .zip

Change History (21)

comment:1 Changed 8 years ago by Johannes Dollinger

Keywords: feature added
milestone: post-1.0
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: UnreviewedDesign decision needed

comment:2 Changed 8 years ago by goodrone

Maybe use something like sessions in SQLAlchemy?

comment:3 Changed 8 years ago by Trevor Caira

Cc: Trevor Caira added

comment:4 Changed 8 years ago by (none)

milestone: post-1.0

Milestone post-1.0 deleted

comment:5 Changed 8 years ago by anonymous

Triage Stage: Design decision neededAccepted

comment:6 Changed 8 years ago by Russell Keith-Magee

Triage Stage: AcceptedDesign decision needed

Reverting anonymous change.

comment:7 Changed 8 years ago by vkailas

Quick question: A multiple db inserts using one query would achieve more of a speed of than just using transaction managent e.g. @transaction.commit_on_success?

Thanks

comment:8 Changed 7 years ago by Alexander Koshelev

Cc: Alexander Koshelev added

comment:9 Changed 7 years ago by fnl

Cc: fnl added

comment:10 Changed 6 years ago by Carl Meyer

This came up just now in discussing optimizing the Django test suite. A lot of time in TransactionTestCases is wasted in model-by-model get_or_create of content-types and permissions. As it stands, the lookups there could be optimized, but to fully optimize it a bulk-insert ORM operation would also be needed.

comment:11 Changed 6 years ago by Carl Meyer

Triage Stage: Design decision neededAccepted

The API here still requires design discussion, but accepting this in principle.

comment:12 Changed 5 years ago by Simon Law

Cc: sfllaw@… added
Severity: Normal
Type: Uncategorized

comment:13 Changed 5 years ago by Luke Plant

Type: UncategorizedNew feature

Changed 5 years ago by Russell Keith-Magee

Attachment: t7596-alex.diff.txt added

Alex Gaynor's patch, prepared at DjangoCon Europe 2011

comment:14 Changed 5 years ago by Russell Keith-Magee

Easy pickings: unset
Has patch: set
UI/UX: unset

Changed 5 years ago by Karen Tracey

Attachment: t7596-alex.diff added

Same patch, extension that trac will hopefully recognize properly

Changed 5 years ago by Simon Meers

Attachment: 7596.diff added

Updated/tweaked version of Alex's patch

comment:15 Changed 5 years ago by Simon Meers

I've attached a reviewed/updated/tweaked version of Alex's patch.


Looks pretty good to me, mostly stylistic changes though I think the previous
patch would have had some issues with primary key values which evaluated to
False; I've updated this to explicitly test for None.


I'm not a database expert, so I may have missed some intricacies, but the
code itself looks robust.


I've also added a simple proof-of-concept change to the MySQL backend which
works as far as I've tested, but I usually steer as far away from MySQL as
possible so I'm not the person to write this. So far it's identical to
PostgreSQL; I don't think MySQL can do the SQLite trick for combining inserts
with/without auto-increment primary keys?

comment:16 Changed 5 years ago by Anssi Kääriäinen

Cc: anssi.kaariainen@… added

Some notes on the approach.

Combining instances with and without autoincrement primary keys set, use this sql:

    insert into table_with_serial_id(id, other_col) values(1, 'foo'),(default, 'bar');

A different question is if this is something that needs to be supported at all. I don't see it particularly useful to insert anything else than default values into serial columns. Except for bulk loading, but then you don't want to have None values in the mix. Maybe there is an use case?

Also, why doesn't the code use "RETURNING serial_pk" where available? Maybe that would make the situation more confusing, because sometimes the bulk_save method would include the PKs and sometimes not depending on used backend. Maybe a kwarg 'returning_autopk' and if that is set then raise error if the backend doesn't support RETURNING?

comment:17 Changed 5 years ago by Alex Gaynor

Resolution: fixed
Status: newclosed

In [16739]:

Fixed #7596. Added Model.objects.bulk_create, and make use of it in several places. This provides a performance benefit when inserting multiple objects. THanks to Russ for the review, and Simon Meers for the MySQl implementation.

comment:18 Changed 5 years ago by Ramiro Morales

In [16819]:

Added an implementation of bulk insert via the ORM to the Oracle DB backend.

Refs #7596, r16739 and http://troels.arvin.dk/db/rdbms/#insert-multiple

Note: See TracTickets for help on using tickets.
Back to Top