Opened 6 years ago

Last modified 3 months ago

#30138 new Cleanup/optimization

Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True

Reported by: saber solooki
Owned by:
Component: Database layer (models, ORM)
Version: 2.2
Severity: Normal
Keywords:
Cc: Alex Vandiver, Ülgen Sarıkavak, şuayip üzülmez
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description (last modified by Tim Graham)

As requested in ticket:28668#comment:24, here's a ticket for setting the primary keys of created objects. I tried it on PostgreSQL 10.6 and this could be supported. When ignore_conflicts=True, Django doesn't put ' RETURNING "mymodel"."id" ' in the query.

Change History (11)

comment:1 by Tim Graham, 6 years ago

Description: modified (diff)
Summary: Return pk of created objects when ignore_conflicts set True on QuerySet.bulk_create() → Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True
Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Cleanup/optimization

comment:2 by Seunghun Lee, 6 years ago

Owner: changed from nobody to Seunghun Lee
Status: new → assigned

comment:3 by Дилян Палаузов, 6 years ago

Patches that accomplish this are attached to https://code.djangoproject.com/ticket/28668.

comment:4 by Tim Graham, 6 years ago

Has patch: set
Patch needs improvement: set

comment:5 by Seunghun Lee, 6 years ago

Owner: Seunghun Lee removed
Status: assigned → new

comment:6 by Hasan Ramezani, 5 years ago

Owner: set to Hasan Ramezani
Status: new → assigned

Based on the ticket description, we should add ' RETURNING "mymodel"."id" ' to the query when ignore_conflicts=True. But according to Simon's comment on the PR, the database will not return the ids of any conflicting rows.
I am a little confused, because the database returns only the ids of newly created objects. See Simon's example in that comment.
What should we do?

comment:7 by Simon Charette, 5 years ago

Hasan, this was mentioned in https://code.djangoproject.com/ticket/28668#comment:21 but was dismissed as something the proposed implementation would take care of.

It does so by doing a LEFT JOIN over the existing rows and would require a set of columns against which to join to be provided to the call.

Implementing this in a proper manner would likely require a lot of work. It would require ignore_conflicts to accept an iterable of constraint names or tuple of fields corresponding to Field.unique=True or Meta.unique_together definitions. These entries could then be matched to constraint names and used as conflict_target in the ON CONFLICT clause and in the join USING clause. I think we'd also like to allow exclusion constraints to be treated differently because joining on fields won't help here as an insert could be conflicting with more than one result.
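
To make the join idea concrete, here is a minimal sketch using Python's sqlite3 module rather than Django or PostgreSQL. It is not the patch attached to ticket 28668: the table layout, the unique name column, and the helper insert_and_fetch_pks are all assumptions made for illustration, and SQLite's INSERT OR IGNORE stands in for ON CONFLICT DO NOTHING. The candidate values are LEFT JOINed against the table on the unique column, once before and once after the insert, so every candidate gets a pk and we can tell which rows were new.

```python
import sqlite3


def insert_and_fetch_pks(conn, names):
    """Insert names, ignoring conflicts, and return (name, pk, inserted) tuples.

    Hypothetical helper: assumes a table mymodel(id, name) where name is unique.
    """
    if not names:
        return []
    # One "(?)" row per candidate value, joined against the existing rows.
    placeholders = ", ".join("(?)" for _ in names)
    lookup = (
        f"WITH candidate(name) AS (VALUES {placeholders}) "
        "SELECT candidate.name, mymodel.id "
        "FROM candidate LEFT JOIN mymodel ON mymodel.name = candidate.name"
    )
    # Before the insert, candidates with no matching row come back with id NULL.
    before = dict(conn.execute(lookup, names).fetchall())
    conn.executemany(
        "INSERT OR IGNORE INTO mymodel (name) VALUES (?)",
        [(name,) for name in names],
    )
    # After the insert, every candidate matches a row, new or pre-existing.
    after = dict(conn.execute(lookup, names).fetchall())
    return [(name, after[name], before[name] is None) for name in names]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO mymodel (name) VALUES ('a')")
print(insert_and_fetch_pks(conn, ["a", "b"]))
```

The sketch only works because the caller knows which column to join on, which is the reason the comment says a set of columns would have to be passed to the call. It also issues extra queries and is not safe against concurrent writers.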

I assume callers would also like to be able to differentiate between newly inserted and conflicting entries, so the return value of bulk_create would need to be formalized/changed. I assume we'd like to return a list of tuples of the form (bool: inserted, Model: instance, List[Model]: conflicting_instances) and only assign the pk on non-conflicting rows.
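
As a rough illustration of that return shape, the following pure-Python sketch uses an in-memory list in place of a database table. Nothing here is Django API: BulkCreateResult, Row, and bulk_create_reporting_conflicts are hypothetical names, and uniqueness on a single name attribute is an assumption.

```python
from dataclasses import dataclass
from typing import Any, List, NamedTuple, Optional


class BulkCreateResult(NamedTuple):
    """Hypothetical result entry: (inserted, instance, conflicting_instances)."""
    inserted: bool
    instance: Any
    conflicting_instances: List[Any]


@dataclass
class Row:
    """Stand-in for a model instance with a unique name."""
    name: str
    pk: Optional[int] = None


def bulk_create_reporting_conflicts(store: List[Row], rows: List[Row]) -> List[BulkCreateResult]:
    results = []
    for row in rows:
        # Rows already stored under the same unique value count as conflicts.
        conflicts = [existing for existing in store if existing.name == row.name]
        if conflicts:
            # The pk is deliberately left unset on conflicting rows.
            results.append(BulkCreateResult(False, row, conflicts))
        else:
            row.pk = len(store) + 1
            store.append(row)
            results.append(BulkCreateResult(True, row, []))
    return results
```

A caller could then split the results on the inserted flag and read the existing pk from conflicting_instances when one is needed.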

In summary, this ticket involves a lot of research work to get right and not paint ourselves into a corner with regard to the desired API.

Last edited 5 years ago by Simon Charette

comment:8 by Hasan Ramezani, 5 years ago

Owner: Hasan Ramezani removed
Status: assigned → new

comment:9 by Alex Vandiver, 15 months ago

Cc: Alex Vandiver added

comment:10 by Ülgen Sarıkavak, 11 months ago

Cc: Ülgen Sarıkavak added

comment:11 by şuayip üzülmez, 3 months ago

Cc: şuayip üzülmez added