Opened 6 years ago
Last modified 3 months ago
#30138 new Cleanup/optimization
Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True
Reported by: | saber solooki | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.2 |
Severity: | Normal | Keywords: | |
Cc: | Alex Vandiver, Ülgen Sarıkavak, şuayip üzülmez | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
As requested in ticket:28668#comment:24, here's a ticket for setting the primary keys of created objects. I tried it in PostgreSQL 10.6 and this could be supported. When ignore_conflicts=True, Django doesn't put ' RETURNING "mymodel"."id" ' in the query.
Change History (11)
comment:1 by , 6 years ago
Description: | modified (diff) |
---|---|
Summary: | Return pk of created objects when ignore_conflicts set True on QuerySet.bulk_create() → Allow QuerySet.bulk_create() to set pk of created objects when ignore_conflicts=True |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
comment:2 by , 6 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:3 by , 6 years ago
comment:5 by , 6 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:6 by , 5 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
Based on the ticket description, we should add ' RETURNING "mymodel"."id" ' to the query when ignore_conflicts=True. But based on Simon's comment on the PR, the database will not return the ids of any conflicting rows.
I am a little confused, because the database returns only the ids of newly created objects. Check Simon's example in the comment.
What should we do?
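The behavior in question can be reproduced outside Django. The following is a minimal sketch, assuming SQLite (3.35 or later, for RETURNING) as a stand-in for PostgreSQL and a hypothetical mymodel table with a unique name column; neither comes from the ticket. It shows that RETURNING yields rows only for the inserts that actually happened, so the result cannot be matched to the input objects by position.

```python
import sqlite3


def returned_rows_with_conflicts():
    """Insert three rows, one of which conflicts, and return what RETURNING yields."""
    # RETURNING needs SQLite 3.35+; report None when it is unavailable.
    if sqlite3.sqlite_version_info < (3, 35, 0):
        return None
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: "name" carries the unique constraint.
    conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute("INSERT INTO mymodel (name) VALUES ('a')")  # pre-existing row
    rows = conn.execute(
        "INSERT INTO mymodel (name) VALUES ('a'), ('b'), ('c') "
        "ON CONFLICT DO NOTHING RETURNING id, name"
    ).fetchall()
    conn.close()
    return rows


rows = returned_rows_with_conflicts()
if rows is not None:
    # Three objects were submitted, but only the non-conflicting ones come back.
    print(sorted(name for _, name in rows))
```

Because the conflicting row 'a' is missing from the result, assigning the returned ids to the submitted objects in order would give the wrong pk to every object after the first conflict.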
comment:7 by , 5 years ago
Hasan, this was mentioned in https://code.djangoproject.com/ticket/28668#comment:21 but was dismissed as something the proposed implementation would take care of.
It does so by doing a LEFT JOIN over the existing rows, and it would require a set of columns against which to join to be provided to the call.
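The idea of recovering ids by matching on caller-supplied columns can be sketched as follows. This is only an illustration under stated assumptions: it uses SQLite through the standard library rather than PostgreSQL, a hypothetical mymodel table with a unique name column, and a plain lookup by name in place of the LEFT JOIN from the proposed patch. The helper insert_ignoring_conflicts is invented for this example and is not a Django API.

```python
import sqlite3


def insert_ignoring_conflicts(conn, names):
    """Insert names, skipping duplicates, then map every name to its row id."""
    conn.executemany(
        "INSERT OR IGNORE INTO mymodel (name) VALUES (?)",
        [(n,) for n in names],
    )
    # Second query: look up ids for both new and pre-existing rows
    # through the unique column supplied by the caller.
    placeholders = ", ".join("?" for _ in names)
    rows = conn.execute(
        f"SELECT name, id FROM mymodel WHERE name IN ({placeholders})",
        list(names),
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO mymodel (name) VALUES ('a')")  # pre-existing row
pks = insert_ignoring_conflicts(conn, ["a", "b"])
print(pks)
```

The lookup only works because name is unique, which is why the call would need to be told which columns to match on.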
Implementing this in a proper manner would likely require a lot of work. It would require ignore_conflicts to accept an iterable of constraint names or tuples of fields corresponding to Field.unique=True or Meta.unique_together definitions. These entries could then be matched to constraint names and used as conflict_target in the ON CONFLICT clause and in the join USING clause. I think we'd also like to allow exclusion constraints to be treated differently, because joining on fields won't help here, as an insert could be conflicting with more than one result.
I assume callers would also like to be able to differentiate between newly inserted and conflicting entries, so the return value of bulk_create would need to be formalized/changed. I assume we'd like to return a list of tuples of the form (bool: inserted, Model: instance, List[Model]: conflicting_instances) and only assign the pk on non-conflicting rows.
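To make the proposed return shape concrete, here is a small pure-Python simulation. It is a sketch under stated assumptions, not Django code: the table is a list of dicts, the function bulk_create_report and its unique_field argument are hypothetical, and conflicts are detected on a single unique field only.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def bulk_create_report(
    table: List[Row], objs: List[Row], unique_field: str
) -> List[Tuple[bool, Row, List[Row]]]:
    """Return (inserted, instance, conflicting_instances) for each object."""
    results = []
    next_pk = max((row["pk"] for row in table), default=0) + 1
    for obj in objs:
        conflicts = [r for r in table if r[unique_field] == obj[unique_field]]
        if conflicts:
            # Conflicting object: not inserted, pk left unset.
            results.append((False, obj, conflicts))
        else:
            # New object: insert it and assign the pk.
            obj["pk"] = next_pk
            next_pk += 1
            table.append(obj)
            results.append((True, obj, []))
    return results


table = [{"pk": 1, "name": "a"}]
report = bulk_create_report(table, [{"name": "a"}, {"name": "b"}], "name")
for inserted, instance, conflicting in report:
    print(inserted, instance, conflicting)
```

With a return value of this shape, a caller can tell new rows from skipped ones and can still reach the existing row that caused each conflict.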
In summary, this ticket involves a lot of research work to get right and to avoid painting ourselves into a corner with regard to the desired API.
comment:8 by , 5 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:9 by , 15 months ago
Cc: | added |
---|
comment:10 by , 11 months ago
Cc: | added |
---|
comment:11 by , 3 months ago
Cc: | added |
---|
Patches that accomplish this are attached at https://code.djangoproject.com/ticket/28668.