Opened 7 years ago

Last modified 5 months ago

#28821 assigned New feature

Allow QuerySet.bulk_create() on multi-table inheritance when possible

Reported by: Joey Wilhelm
Owned by: HAMA Barhamou
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: multi-tabel, bulk-creation, optimization, queryset, sql
Cc: Abhishek Gautam, Sardorbek Imomaliev, jon.dufresne@…, Shai Berger, Adam Johnson, Arthur, Paolo Melchiorre
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

According to this comment in bulk_create:

        # When you bulk insert you don't get the primary keys back (if it's an
        # autoincrement, except if can_return_ids_from_bulk_insert=True), so
        # you can't insert into the child tables which references this.

This implies that, if we do retrieve primary keys from the parent model's bulk insert, then it is possible to bulk insert into the child tables automatically.

Now that Django does have the ability to automatically retrieve, and set, primary keys on a bulk create operation, it would be nice to allow this use case when possible (specifically, when the backend has can_return_ids_from_bulk_insert=True). Keying it off this feature would give PostgreSQL this ability immediately, and then let it work for Oracle as soon as retrieval of PKs is fully supported on that engine as well.
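To make the proposed control flow concrete, here is a toy model in plain Python. It is only a sketch of the idea, not Django's implementation: `ToyBackend`, `bulk_insert_parents`, `bulk_insert_children`, and `bulk_create_restaurants` are hypothetical names, and the `can_return_ids` flag merely stands in for the backend feature mentioned above.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyBackend:
    """In-memory stand-in for a database backend (hypothetical, not Django's API)."""

    can_return_ids: bool
    tables: dict = field(default_factory=lambda: {"place": {}, "restaurant": {}})
    _ids: count = field(default_factory=lambda: count(1))

    def bulk_insert_parents(self, names):
        # Insert every parent row; report the new PKs only if the backend can.
        pks = []
        for name in names:
            pk = next(self._ids)
            self.tables["place"][pk] = {"name": name}
            pks.append(pk)
        return pks if self.can_return_ids else None

    def bulk_insert_children(self, rows):
        # Each child row is keyed by the PK of its parent row.
        for parent_pk, serves_pizza in rows:
            self.tables["restaurant"][parent_pk] = {"serves_pizza": serves_pizza}


def bulk_create_restaurants(backend, objs):
    """objs is a list of (name, serves_pizza) pairs; returns the parent PKs."""
    if not backend.can_return_ids:
        # Without the parent PKs there is nothing to point the child rows at.
        raise ValueError("backend cannot return PKs from a bulk insert")
    parent_pks = backend.bulk_insert_parents([name for name, _ in objs])
    backend.bulk_insert_children(
        [(pk, pizza) for pk, (_, pizza) in zip(parent_pks, objs)]
    )
    return parent_pks
```

The point of the sketch is that the child insert needs nothing beyond the list of parent PKs, which is why keying the feature off the backend flag seems sufficient.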

Also, regardless of whether Django does this automatically, I would like to be able to manually set the _ptr fields on the child records in order to effect a bulk_create without the need for automatic retrieval of IDs. However, even that is not possible, as the bulk_create method fails on multi-table inheritance in all cases.
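At the SQL level, the manual variant amounts to two bulk inserts that share primary keys chosen by the caller. The sketch below uses the standard-library sqlite3 module rather than the Django ORM. The `place` and `restaurant` tables and both function names are illustrative assumptions; the layout follows the usual multi-table inheritance convention of a child table whose `place_ptr_id` primary key is also a foreign key to the parent.

```python
import sqlite3


def make_schema(conn):
    # Child table's primary key doubles as the foreign key to the parent row.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE restaurant ("
        " place_ptr_id INTEGER PRIMARY KEY REFERENCES place (id),"
        " serves_pizza INTEGER NOT NULL)"
    )


def bulk_insert_with_known_pks(conn, restaurants):
    """restaurants is an iterable of (pk, name, serves_pizza) tuples."""
    rows = list(restaurants)
    with conn:  # one transaction: commit on success, roll back on error
        # Parents first, so the child rows always reference an existing row.
        conn.executemany(
            "INSERT INTO place (id, name) VALUES (?, ?)",
            [(pk, name) for pk, name, _ in rows],
        )
        conn.executemany(
            "INSERT INTO restaurant (place_ptr_id, serves_pizza) VALUES (?, ?)",
            [(pk, int(pizza)) for pk, _, pizza in rows],
        )
```

Because the caller supplies every primary key, no ID has to be read back from the database between the two statements.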

Change History (18)

comment:1 by Tim Graham, 7 years ago

Summary: Allow bulk_create on multi-table inheritance when possible → Allow QuerySet.bulk_create() on multi-table inheritance when possible
Triage Stage: Unreviewed → Accepted

comment:2 by Abhishek Gautam, 7 years ago

Cc: Abhishek Gautam added
Owner: changed from nobody to Abhishek Gautam
Status: new → assigned

comment:3 by Abhishek Gautam, 7 years ago

Owner: Abhishek Gautam removed
Status: assigned → new

comment:4 by Sardorbek Imomaliev, 5 years ago

Cc: Sardorbek Imomaliev added

comment:5 by Daniel Alley, 4 years ago

This would be an excellent feature and we would really love to be able to use it.

I would like to be able to manually set the _ptr fields on the child records in order to affect a bulk_create without the need for automatic retrieval of IDs.

In certain circumstances (think UUID PKs) it might also be possible, at least with PostgreSQL (I don't know about the others), to do it the other way around. Foreign key integrity checks are deferred to the end of the transaction, so you could save all of the child rows first, and then save all of the parent rows. As long as the entire operation is wrapped in a single transaction, it wouldn't matter that _ptr temporarily held an invalid FK.
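The child-first ordering can be tried out with any database that supports deferred foreign key checks. The sketch below uses SQLite from the Python standard library as a stand-in for PostgreSQL, so it only illustrates the principle. The table names and both functions are hypothetical, and the foreign key is declared `DEFERRABLE INITIALLY DEFERRED` explicitly.

```python
import sqlite3
import uuid


def make_deferred_schema():
    # isolation_level=None means autocommit; transactions are managed by hand.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE place (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE restaurant ("
        " place_ptr_id TEXT PRIMARY KEY"
        " REFERENCES place (id) DEFERRABLE INITIALLY DEFERRED,"
        " serves_pizza INTEGER NOT NULL)"
    )
    return conn


def insert_children_first(conn, rows):
    """rows is a list of (name, serves_pizza) pairs; returns the generated PKs."""
    pks = [str(uuid.uuid4()) for _ in rows]  # PKs are known before any insert
    conn.execute("BEGIN")
    # Child rows go in first; their foreign keys are dangling for the moment.
    conn.executemany(
        "INSERT INTO restaurant (place_ptr_id, serves_pizza) VALUES (?, ?)",
        [(pk, int(pizza)) for pk, (_, pizza) in zip(pks, rows)],
    )
    conn.executemany(
        "INSERT INTO place (id, name) VALUES (?, ?)",
        [(pk, name) for pk, (name, _) in zip(pks, rows)],
    )
    conn.execute("COMMIT")  # the deferred foreign key check runs here
    return pks
```

If a parent row were still missing at commit time, the `COMMIT` would fail with an integrity error, so the temporary inconsistency never becomes visible outside the transaction.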

comment:6 by Jon Dufresne, 4 years ago

Cc: jon.dufresne@… added
Has patch: set

comment:7 by Mariusz Felisiak, 4 years ago

Owner: set to Jon Dufresne
Status: new → assigned

comment:8 by Mariusz Felisiak, 4 years ago

Patch needs improvement: set

comment:9 by Shai Berger, 3 years ago

Cc: Shai Berger added

There's a naive implementation of a special case in the new broken-down-models library. Could be interesting to compare.
https://github.com/Matific/broken-down-models/blob/main/bdmodels/models.py#L114 (actual line number may have changed by the time you read this, of course)

comment:10 by Adam Johnson, 3 years ago

Cc: Adam Johnson added

comment:11 by Mariusz Felisiak, 2 years ago

Owner: Jon Dufresne removed
Status: assigned → new

comment:12 by HAMA Barhamou, 12 months ago

Has patch: unset
Owner: set to HAMA Barhamou
Patch needs improvement: unset
Status: new → assigned

comment:13 by HAMA Barhamou, 11 months ago

Hi Django Team,

I am picking up where @jdufresne left off. My recent commits (https://github.com/django/django/pull/17754) introduce the initial steps towards enabling QuerySet.bulk_create to support multi-table inheritance.

This is just the beginning, and I plan to make iterative improvements to this feature. Looking forward to your feedback and suggestions as we progress.

comment:14 by HAMA Barhamou, 11 months ago

Has patch: set

comment:15 by Mariusz Felisiak, 11 months ago

Patch needs improvement: set

Marking as "needs improvement" as the author mentioned that it's a draft.

comment:16 by Ben Hubsch, 7 months ago

I have a use case where I would like to bulk create the child model, and the primary key / parent key is known in advance. That seems the most straightforward version to support; I would love to have that enabled.

comment:17 by Arthur, 7 months ago

Cc: Arthur added

comment:18 by Paolo Melchiorre, 5 months ago

Cc: Paolo Melchiorre added
Keywords: multi-tabel bulk-creation optimization queryset sql added