Opened 3 months ago

Closed 2 months ago

Last modified 3 weeks ago

#35936 closed Cleanup/optimization (fixed)

Speeding up Postgres bulk_create by using unnest

Reported by: Simon Charette Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 5.1
Severity: Normal Keywords: postgres bulk create unnest
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Per this forum discussion there seems to be a consensus and peer validation of performance benefits of using unnest for the bulk insertion of literal values on Postgres.

Change History (8)

comment:1 by Simon Charette, 3 months ago

Has patch: set
Owner: set to Simon Charette
Patch needs improvement: set
Status: newassigned

comment:2 by David Sanders, 3 months ago

Triage Stage: UnreviewedAccepted

comment:3 by Simon Charette, 3 months ago

Patch needs improvement: unset

comment:4 by Mariusz Felisiak, 2 months ago

Triage Stage: AcceptedReady for checkin

comment:5 by Sarah Boyce <42296566+sarahboyce@…>, 2 months ago

In 2638b755:

Refs #35936 -- Avoided field placeholder lookup for each value inserted.

By building the list of placeholders for each inserted fields once it
doesn't have to be looked up for each inserted rows twice.

The query_values_10000.benchmark.QueryValues10000.time_query_values_10000 ASV
benchmark showed a 5% speed up for 10k items on SQLite for a single field
insertion. Larger performance gains are expected when more fields are involved.

comment:6 by Sarah Boyce <42296566+sarahboyce@…>, 2 months ago

Resolution: fixed
Status: assignedclosed

In a16eedc:

Fixed #35936 -- Used unnest for bulk inserts on Postgres when possible.

This should make bulk_create significantly faster on Postgres when provided
only literal values.

Thanks James Sewell for writing about this technique, Tom Forbes for
validating the performance benefits, David Sanders and Mariusz Felisiak
for the review.

comment:7 by Sarah Boyce <42296566+sarahboyce@…>, 3 weeks ago

In 22fc151b:

Fixed #36107 -- Adjusted UNNEST bulk_create strategy to opt-out sized arrays.

The array fields opt-out heuristic failed to account for sized arrays.

Note that we keep relying on db_type as opposed to performing an ArrayField
instance check against the column's field as there could be other
implementations of model fields that use Postgres arrays to store the
optimization must be disabled for all of them.

Refs #35936.

Thanks Claude Paroz for the report and test.

comment:8 by Sarah Boyce <42296566+sarahboyce@…>, 3 weeks ago

In ad6bca92:

[5.2.x] Fixed #36107 -- Adjusted UNNEST bulk_create strategy to opt-out sized arrays.

The array fields opt-out heuristic failed to account for sized arrays.

Note that we keep relying on db_type as opposed to performing an ArrayField
instance check against the column's field as there could be other
implementations of model fields that use Postgres arrays to store the
optimization must be disabled for all of them.

Refs #35936.

Thanks Claude Paroz for the report and test.

Backport of 22fc151bb86a553d84c62d7effd289356e9b6c6c from main.

Note: See TracTickets for help on using tickets.
Back to Top