Opened 4 weeks ago

Closed 3 weeks ago

#36702 closed Bug (fixed)

bulk_create not returning pk if it was set as an expression

Reported by: Johannes Westphal Owned by: Johannes Westphal
Component: Database layer (models, ORM) Version: 5.2
Severity: Normal Keywords:
Cc: Johannes Westphal, Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Currently, when the pk is set on an object passed to bulk_create the pk is explicitly not updated in the object:

https://github.com/django/django/blob/05ba1a9228128614fb3c475f1c4bdf0160f44dba/django/db/models/query.py#L843

This becomes a problem if the pk of the object has been set to an expression, because then we have no access to the actual generated pk even though the db returned it.

Would it break any functionality to just remove the line above and also update the pk even if it had been set?

Change History (7)

comment:1 by Simon Charette, 4 weeks ago

Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

Hello Jonathan, thank you for your report!

I went back to the change that introduced that line and it goes all the way back to when we added RETURNING support to bulk_create on Postgres (see #29444).

I couldn't any find mention of why we were doing it but I assume it was initially as form of optimization or a way to make sure the primary key preserves its exact identity

e.g.

  • UUID('70cf7ffa-000c-4dfd-b08b-a921f09e5ff2') == UUID('70cf7ffa-000c-4dfd-b08b-a921f09e5ff2')
  • but UUID('70cf7ffa-000c-4dfd-b08b-a921f09e5ff2') is not UUID('70cf7ffa-000c-4dfd-b08b-a921f09e5ff2')

Given the ORM makes favors equality of objects and hashability of objects over identity I think it would be safe to remove this. The bulk_create test suite also doesn't complain on Postgres and SQLite so I think this is a safe change to make, would you like to submit a PR with a test for it?

FWIW I stumbled upon this particular case when working on #36490 to avoid unnecessary transaction in bulk_create. I noticed that there doesn't seem to be a reason to perform two with and without pk partionned INSERT anymore. I demonstrated that both branches can be merged together.

comment:2 by Johannes Westphal, 4 weeks ago

I pushed a test for the bug, with your changes the issue should already be solved.

comment:3 by Simon Charette, 3 weeks ago

Cc: Simon Charette added

Thanks for the test!

I think we should merge your proposed if field != opts.pk: removal with your test first though as my branch is more invasive and this ticket is already focused on a particular accepted bug while the other is more of an optimization.

Would you mind pushing the removal to your branch and open a PR with it? Happy to review it one you do and we can trade reviews when I open the follow up ticket and PR to optimize the insertion if you're interested.

Last edited 3 weeks ago by Simon Charette (previous) (diff)

comment:4 by Johannes Westphal, 3 weeks ago

Owner: set to Johannes Westphal
Status: newassigned

comment:5 by Johannes Westphal, 3 weeks ago

Has patch: set

comment:6 by Simon Charette, 3 weeks ago

Triage Stage: AcceptedReady for checkin

comment:7 by GitHub <noreply@…>, 3 weeks ago

Resolution: fixed
Status: assignedclosed

In 6d4d99b3:

Fixed #36702 -- Made bulk_create() return pk values set by an expression.

Note: See TracTickets for help on using tickets.
Back to Top