Opened 113 minutes ago
Closed 81 minutes ago
#36823 closed Bug (duplicate)
Data loss using `bulk_create()` on Django 5.2 due to Postgres `UNNEST` and explicit cast truncating
| Reported by: | James Beith | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 5.2 |
| Severity: | Normal | Keywords: | postgres bulk create unnest |
| Cc: | | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
In Django 5.2, bulk_create() can use a faster strategy that inserts rows via UNNEST of typed arrays (feature ticket). That strategy typically involves explicitly casting the arrays to the destination column types (e.g. varchar(8)[]). So for a model field models.CharField(max_length=8), PostgreSQL performs an explicit cast to varchar(8), and an explicit cast to varchar(n) silently truncates the value to n characters.
Previously, on Django 5.1, which doesn't use UNNEST, calling bulk_create() with an instance whose value is longer than 8 characters (e.g. "AAAABBBBC") for a models.CharField(max_length=8) field raised a DataError and no database row was persisted. On Django 5.2, the value is truncated to "AAAABBBB" and the row is persisted in the database.
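Until a fix lands, one way to avoid relying on the database to reject such values is to check lengths before calling bulk_create(). The following is only a minimal sketch, not part of Django: `Row` and `find_overlength` are hypothetical names, and `Row` is a plain dataclass standing in for a model instance so the example runs without a database.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical stand-in for a model instance with a CharField(max_length=8).
    code: str


def find_overlength(rows, field, max_length):
    """Return (index, value) pairs whose `field` is longer than `max_length`."""
    return [
        (i, getattr(row, field))
        for i, row in enumerate(rows)
        if len(getattr(row, field)) > max_length
    ]


rows = [Row("AAAABBBB"), Row("AAAABBBBC")]
offenders = find_overlength(rows, "code", 8)
print(offenders)  # [(1, 'AAAABBBBC')]
```

If `offenders` is non-empty, the caller can raise or skip those rows instead of letting them be truncated on insert.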
This truncation is documented SQL behaviour:
However, if one explicitly casts a value to character varying(n) or character(n), then an over-length value will be truncated to n characters without raising an error. (This too is required by the SQL standard.)
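The difference between the two code paths can be modelled in plain Python. This is an illustrative sketch of the semantics only, not how PostgreSQL or Django implement them: `assign_varchar`, `cast_varchar`, and the local `DataError` class are hypothetical names. The trailing-spaces case in `assign_varchar` reflects PostgreSQL's rule that excess characters are dropped on assignment when they are all spaces.

```python
class DataError(Exception):
    """Local stand-in for the database error; not the real driver class."""


def assign_varchar(value: str, n: int) -> str:
    """Model storing a value into a varchar(n) column without an explicit cast."""
    if len(value) <= n:
        return value
    # Over-length is tolerated only when the excess consists solely of spaces.
    if value[n:].strip(" ") == "":
        return value[:n]
    raise DataError(f"value too long for type character varying({n})")


def cast_varchar(value: str, n: int) -> str:
    """Model an explicit cast to varchar(n): over-length is silently truncated."""
    return value[:n]


print(cast_varchar("AAAABBBBC", 8))  # AAAABBBB
try:
    assign_varchar("AAAABBBBC", 8)
except DataError as exc:
    print("rejected:", exc)
```

In these terms, the Django 5.1 insert behaves like `assign_varchar`, while the UNNEST strategy with typed arrays behaves like `cast_varchar`.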
Further examples can be seen here.
Duplicate of #33647 (see recent comments e.g. ticket:33647#comment:27), fix expected with Django 6.0.1 next month.