#36823 closed Bug (duplicate)

Data loss using `bulk_create()` on Django 5.2 due to Postgres `UNNEST` and explicit cast truncating

Reported by: James Beith
Owned by:
Component: Database layer (models, ORM)
Version: 5.2
Severity: Normal
Keywords: postgres bulk create unnest
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

In Django 5.2, bulk_create() can use a faster strategy that inserts rows via UNNEST of typed arrays (feature ticket). That strategy typically involves explicitly casting the arrays to the destination column types (e.g. varchar(8)[]). So for a model field models.CharField(max_length=8), PostgreSQL performs an explicit cast to varchar(8), which silently truncates each value to 8 characters.

Previously, in Django 5.1, which doesn't use UNNEST, calling bulk_create() with an instance whose value is longer than 8 characters (e.g. "AAAABBBBC") for the field models.CharField(max_length=8) raised a DataError and no database row was persisted. On Django 5.2, the value is truncated to "AAAABBBB" and the row is persisted in the database.

This truncation is intended SQL behaviour for explicit casts, as the PostgreSQL documentation states:

However, if one explicitly casts a value to character varying(n) or character(n), then an over-length value will be truncated to n characters without raising an error. (This too is required by the SQL standard.)
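
The difference between the two outcomes can be modelled in plain Python. This is only a sketch of the documented PostgreSQL semantics, not Django or driver code: the function names assign_varchar and cast_varchar and the local DataError class are made up for illustration.

```python
class DataError(Exception):
    """Stand-in for the driver's DataError (hypothetical, for this sketch only)."""


def assign_varchar(value: str, n: int) -> str:
    """Model storing a value in a varchar(n) column with a plain INSERT."""
    if len(value) <= n:
        return value
    # PostgreSQL rejects over-length values unless the excess is all spaces.
    if value[n:].strip(" ") == "":
        return value[:n]
    raise DataError(f"value too long for type character varying({n})")


def cast_varchar(value: str, n: int) -> str:
    """Model an explicit cast such as value::varchar(n): silent truncation."""
    return value[:n]


# Explicit cast (the 5.2 outcome described above): the value is cut short.
print(cast_varchar("AAAABBBBC", 8))  # AAAABBBB

# Plain assignment (the 5.1 outcome described above): an error is raised.
try:
    assign_varchar("AAAABBBBC", 8)
except DataError as exc:
    print("DataError:", exc)
```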

Further examples can be seen here.
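
Until a fix is available, one possible way to fail loudly instead of truncating is a caller-side length check before calling bulk_create(). The sketch below is an assumption on my part, not a Django API: find_overlength is a hypothetical helper, and the max_lengths mapping has to be supplied by the caller to match the model's max_length values.

```python
from types import SimpleNamespace
from typing import Iterable, Mapping


def find_overlength(
    instances: Iterable[object], max_lengths: Mapping[str, int]
) -> list[tuple[int, str, int]]:
    """Return (index, field name, actual length) for each over-length string."""
    problems = []
    for index, instance in enumerate(instances):
        for name, limit in max_lengths.items():
            value = getattr(instance, name, None)
            # Only string values are checked; None and other types are skipped.
            if isinstance(value, str) and len(value) > limit:
                problems.append((index, name, len(value)))
    return problems


# Stand-in objects instead of real model instances, to keep this runnable.
rows = [SimpleNamespace(code="AAAABBBBC"), SimpleNamespace(code="OK")]
problems = find_overlength(rows, {"code": 8})
if problems:
    print("refusing to insert, over-length values:", problems)
```

With real model instances, the same check would run on the list passed to bulk_create(), and the caller would raise or skip the insert when the returned list is non-empty.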

Change History (1)

comment:1 by Jacob Walls, 81 minutes ago

Resolution: duplicate
Status: new → closed

Duplicate of #33647 (see recent comments e.g. ticket:33647#comment:27), fix expected with Django 6.0.1 next month.
