Opened 7 years ago
Last modified 12 months ago
#30138 new Cleanup/optimization
Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True
| Reported by: | saber solooki | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 2.2 |
| Severity: | Normal | Keywords: | |
| Cc: | Alex Vandiver, Ülgen Sarıkavak, şuayip üzülmez | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
As requested in ticket:28668#comment:24, here's a ticket for setting the primary keys of created objects. I tried it on PostgreSQL 10.6 and it could be supported there. Currently, when ignore_conflicts=True, Django omits the ' RETURNING "mymodel"."id" ' clause from the query.
Change History (11)
comment:1 by , 7 years ago
| Description: | modified (diff) |
|---|---|
| Summary: | Return pk of created objects when ignore_conflicts set True on QuerySet.bulk_create() → Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True |
| Triage Stage: | Unreviewed → Accepted |
| Type: | Uncategorized → Cleanup/optimization |
comment:2 by , 7 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:3 by , 7 years ago
comment:5 by , 7 years ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
comment:6 by , 6 years ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
Based on the ticket description, we should add ' RETURNING "mymodel"."id" ' to the query when ignore_conflicts=True. But according to Simon's comment on the PR, the database will not return the ids of conflicting rows.
I am a little confused, because the database returns only the ids of newly created objects. Check Simon's example in the comment.
What should we do?
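To make the concern concrete, here is a small pure-Python simulation of what an INSERT ... ON CONFLICT DO NOTHING ... RETURNING id statement hands back on PostgreSQL: ids for the rows that were actually inserted, and nothing for the rows that were skipped. The function name and the dict standing in for the table are hypothetical; this is not Django or database code.

```python
def insert_do_nothing_returning(existing, names):
    """Simulate INSERT ... ON CONFLICT DO NOTHING RETURNING id.

    `existing` is a hypothetical stand-in for the table: it maps a unique
    name to its id. Only ids of rows that were really inserted are returned.
    """
    returned = []
    for name in names:
        if name in existing:
            # Conflict: the row is skipped and contributes nothing to RETURNING.
            continue
        new_id = max(existing.values(), default=0) + 1
        existing[name] = new_id
        returned.append(new_id)
    return returned


table = {"a": 1}
candidates = ["a", "b", "c"]
returned_ids = insert_do_nothing_returning(table, candidates)

# Three objects were passed in but only two ids came back, so assigning ids
# to objects by position would give "a" the id that belongs to "b".
print(len(candidates), returned_ids)
```

Because the result is shorter than the input and carries no marker for the skipped rows, the returned ids alone are not enough to set pk on the right objects.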
comment:7 by , 6 years ago
Hasan, this was mentioned in https://code.djangoproject.com/ticket/28668#comment:21 but was dismissed as something the proposed implementation would take care of.
It does so with a LEFT JOIN against the existing rows, which requires the caller to provide the set of columns to join on.
Implementing this in a proper manner would likely require a lot of work. It would require ignore_conflicts to accept an iterable of constraint names or tuples of fields corresponding to Field.unique=True or Meta.unique_together definitions. These entries could then be matched to constraint names and used as conflict_target in the ON CONFLICT clause and in the join USING clause. I think we'd also like to allow exclusion constraints to be treated differently, because joining on fields won't help there: an insert could conflict with more than one existing row.
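The join idea can be sketched with the standard-library sqlite3 module. This is only an illustration under stated assumptions: the table mymodel, its unique column name, and the helper insert_ignore_and_fetch_pks are hypothetical, SQLite's INSERT OR IGNORE stands in for ON CONFLICT DO NOTHING, the input names are assumed to be distinct, and the work is split across several statements rather than done in one query as a real implementation presumably would.

```python
import sqlite3


def insert_ignore_and_fetch_pks(conn, names):
    """Insert `names`, skipping unique conflicts, then recover every pk.

    Returns {name: (pk, inserted)}. Hypothetical helper, not a Django API.
    """
    if not names:
        return {}
    values = ", ".join("(?)" for _ in names)
    # LEFT JOIN the candidate values against the table on the unique column,
    # so that rows which do not exist yet show up with a NULL id.
    join_sql = (
        f"WITH candidates(name) AS (VALUES {values}) "
        "SELECT c.name, t.id FROM candidates AS c "
        "LEFT JOIN mymodel AS t ON t.name = c.name"
    )
    before = dict(conn.execute(join_sql, names).fetchall())
    conn.executemany(
        "INSERT OR IGNORE INTO mymodel (name) VALUES (?)",
        [(name,) for name in names],
    )
    after = dict(conn.execute(join_sql, names).fetchall())
    # A row counts as inserted if it had no id before the INSERT ran.
    return {name: (pk, before[name] is None) for name, pk in after.items()}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO mymodel (name) VALUES ('a')")
print(insert_ignore_and_fetch_pks(conn, ["a", "b"]))
```

The join only works because name identifies at most one existing row, which is why the caller would have to say which unique column or constraint to join on.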
I assume callers would also like to be able to differentiate between newly inserted and conflicting entries, so the return value of bulk_create would need to be formalized/changed. I assume we'd like to return a list of tuples of the form (bool: inserted, Model: instance, List[Model]: conflicting_instances) and only assign the pk on non-conflicting rows.
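As a rough sketch of that return shape, and not a proposed API, the tuple could look like the following. Item is a hypothetical stand-in for a model with a unique name field, the list table stands in for the database, and bulk_create_ignoring_conflicts is an illustrative name only.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Optional


@dataclass
class Item:
    """Hypothetical stand-in for a model with a unique `name` field."""
    name: str
    pk: Optional[int] = None


class BulkCreateResult(NamedTuple):
    inserted: bool
    instance: Item
    conflicting_instances: List[Item]


def bulk_create_ignoring_conflicts(table, objs):
    """Simulate the suggested return value against an in-memory `table`."""
    results = []
    for obj in objs:
        conflicts = [row for row in table if row.name == obj.name]
        if conflicts:
            # Conflicting row: pk stays unset, the existing rows are reported.
            results.append(BulkCreateResult(False, obj, conflicts))
        else:
            obj.pk = max((row.pk for row in table), default=0) + 1
            table.append(obj)
            results.append(BulkCreateResult(True, obj, []))
    return results


table = [Item("a", pk=1)]
for result in bulk_create_ignoring_conflicts(table, [Item("a"), Item("b")]):
    print(result.inserted, result.instance.pk, len(result.conflicting_instances))
```

With a single unique field the list of conflicting instances has at most one entry; it is a list because, as noted for exclusion constraints, one insert may conflict with several rows.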
In summary, this ticket involves a lot of research work to get right without painting ourselves into a corner with regard to the desired API.
comment:8 by , 6 years ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
comment:9 by , 2 years ago
| Cc: | added |
|---|
comment:10 by , 20 months ago
| Cc: | added |
|---|
comment:11 by , 12 months ago
| Cc: | added |
|---|
Patches that accomplish this are attached at https://code.djangoproject.com/ticket/28668.