Opened 19 months ago

Last modified 7 months ago

#30138 new Cleanup/optimization

Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True

Reported by: saber solooki
Owned by:
Component: Database layer (models, ORM)
Version: 2.2
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description (last modified by Tim Graham)

As requested in ticket:28668#comment:24, here's a ticket for setting the primary keys of created objects. I tried it on PostgreSQL 10.6, and this could be supported. When ignore_conflicts=True, Django doesn't add ' RETURNING "mymodel"."id" ' to the query.
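For illustration, the statement being asked for would have roughly the following shape on PostgreSQL. This is only a sketch and not Django's SQL compiler: `build_insert_sql` and its parameters are hypothetical names, and the table and column names are borrowed from the example above.

```python
def build_insert_sql(table, columns, row_count, ignore_conflicts=False, returning=None):
    """Build a PostgreSQL-style multi-row INSERT (hypothetical helper).

    Identifiers are assumed to be trusted; values stay as %s placeholders.
    """
    cols = ", ".join(f'"{c}"' for c in columns)
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = f'INSERT INTO "{table}" ({cols}) VALUES ' + ", ".join([row] * row_count)
    if ignore_conflicts:
        # Skip rows that would violate a unique constraint.
        sql += " ON CONFLICT DO NOTHING"
    if returning:
        # Ask the database to send back the given column of the inserted rows.
        sql += f' RETURNING "{table}"."{returning}"'
    return sql


sql = build_insert_sql("mymodel", ["name"], 2, ignore_conflicts=True, returning="id")
print(sql)
```

PostgreSQL accepts ON CONFLICT DO NOTHING together with RETURNING, but RETURNING only yields rows that were actually inserted, so skipped rows produce no id.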

Change History (8)

comment:1 Changed 19 months ago by Tim Graham

Description: modified (diff)
Summary: changed from "Return pk of created objects when ignore_conflicts set True on QuerySet.bulk_create()" to "Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True"
Triage Stage: changed from Unreviewed to Accepted
Type: changed from Uncategorized to Cleanup/optimization

comment:2 Changed 18 months ago by Seunghun Lee

Owner: changed from nobody to Seunghun Lee
Status: changed from new to assigned

comment:3 Changed 18 months ago by Дилян Палаузов

Patches that accomplish this are attached at https://code.djangoproject.com/ticket/28668.

comment:4 Changed 18 months ago by Tim Graham

Has patch: set
Patch needs improvement: set

comment:5 Changed 16 months ago by Seunghun Lee

Owner: Seunghun Lee deleted
Status: changed from assigned to new

comment:6 Changed 7 months ago by Hasan Ramezani

Owner: set to Hasan Ramezani
Status: changed from new to assigned

Based on the ticket description, we should add ' RETURNING "mymodel"."id" ' to the query when ignore_conflicts=True. But according to Simon's comment on the PR, the database will not return the ids of any conflicting rows.
I am a little confused, because the database returns only the ids of newly created objects. See Simon's example in the comment.
What should we do?

comment:7 Changed 7 months ago by Simon Charette

Hasan, this was mentioned in https://code.djangoproject.com/ticket/28668#comment:21 but was dismissed as something the proposed implementation would take care of.

It does so by doing a LEFT JOIN against the existing rows, which would require the caller to provide the set of columns to join on.

Implementing this in a proper manner would likely require a lot of work. It would require ignore_conflicts to accept an iterable of constraint names or tuples of fields corresponding to Field.unique=True or Meta.unique_together definitions. These entries could then be matched to constraint names and used as conflict_target in the ON CONFLICT clause and in the join USING clause. I think we'd also like to allow exclusion constraints to be treated differently, because joining on fields won't help there: an insert could conflict with more than one existing row.
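The join-back idea can be sketched with the standard library's sqlite3 module. This is an illustration under stated assumptions, not the patch from ticket 28668: the `mymodel` table and its unique `name` column are made up, SQLite's INSERT OR IGNORE stands in for ON CONFLICT DO NOTHING, and the join runs as a separate query after the insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO mymodel (name) VALUES ('a')")  # pre-existing row

names = ["a", "b", "c"]  # 'a' conflicts with the existing row

# Insert the candidates, silently skipping rows that hit the unique constraint.
conn.executemany(
    "INSERT OR IGNORE INTO mymodel (name) VALUES (?)", [(n,) for n in names]
)

# LEFT JOIN the candidate values against the table on the unique column,
# so both newly inserted and pre-existing rows get their primary key back.
placeholders = ", ".join(["(?)"] * len(names))
rows = conn.execute(
    f"WITH candidates(name) AS (VALUES {placeholders}) "
    "SELECT candidates.name, mymodel.id "
    "FROM candidates LEFT JOIN mymodel USING (name)",
    names,
).fetchall()

pk_by_name = dict(rows)
print(pk_by_name)
```

The join only works because the column to match on (`name` here) is known up front, which is why the call would need to be told which unique fields or constraints to use.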

I assume callers would also like to be able to differentiate between newly inserted and conflicting entries, so the return value of bulk_create would need to be formalized/changed. I assume we'd like to return a list of tuples of the form (bool: inserted, Model: instance, List[Model]: conflicting_instances) and only assign the pk on non-conflicting rows.
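One possible shape for such a return value is sketched below. Nothing here is an existing Django API: `BulkCreateResult` and `classify` are hypothetical names, plain dicts stand in for model instances, and the existing rows are assumed to have been looked up already and grouped by the unique key.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class BulkCreateResult(NamedTuple):
    """Hypothetical per-object result: (inserted, instance, conflicting_instances)."""
    inserted: bool
    instance: Any
    conflicting_instances: List[Any]


def classify(
    candidates: List[Any],
    existing_by_key: Dict[Any, List[Any]],
    key: Callable[[Any], Any],
) -> List[BulkCreateResult]:
    """Pair each candidate with the existing rows it conflicts with, if any."""
    results = []
    for obj in candidates:
        conflicts = list(existing_by_key.get(key(obj), []))
        # A candidate counts as inserted only when nothing conflicts with it.
        results.append(BulkCreateResult(not conflicts, obj, conflicts))
    return results


existing = {"a": [{"pk": 1, "name": "a"}]}
candidates = [{"pk": None, "name": "a"}, {"pk": None, "name": "b"}]
results = classify(candidates, existing, key=lambda obj: obj["name"])
for result in results:
    print(result.inserted, result.instance["name"], result.conflicting_instances)
```

Keeping conflicting_instances as a list leaves room for cases such as exclusion constraints, where one insert may conflict with several existing rows. The sketch does not assign any pk; that step would apply only to results whose inserted flag is true.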

In summary, this ticket involves a lot of research work to get right and not paint ourselves into a corner with regard to the desired API.

Last edited 7 months ago by Simon Charette

comment:8 Changed 7 months ago by Hasan Ramezani

Owner: Hasan Ramezani deleted
Status: changed from assigned to new