Opened 2 years ago
Last modified 8 weeks ago
#34943 assigned New feature
Support passing unique constraint names to bulk_create().
| Reported by: | Alex Vandiver | Owned by: | HAMA Barhamou | 
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev | 
| Severity: | Normal | Keywords: | bulk insert update upsert | 
| Cc: | Vitor Pereira, Chih Sean Hsu, Simon Charette, Ülgen Sarıkavak | Triage Stage: | Accepted | 
| Has patch: | no | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | no | 
| Easy pickings: | no | UI/UX: | no | 
Description (last modified by )
#31685 added support for INSERT ... ON CONFLICT(col1, col2) DO UPDATE SET ...  In some cases, however, the unique constraint may be on computed columns, which cannot be used with unique_fields because they are directly quoted.
For instance:
from django.db.models.functions import Upper UserTopic.objects.bulk_create( [ut], update_conflicts=True, update_fields=['last_updated','visibility_policy'], unique_fields=['user_profile_id','stream_id',Upper('topic_name')], )
...fails because UserTopic has no field named 'Upper(F(topic_name))'.
And:
UserTopic.objects.bulk_create( [ut], update_conflicts=True, update_fields=['last_updated','visibility_policy'], unique_fields=['user_profile_id','stream_id','upper(topic_name)'], )
...fails similarly, with UserTopic has no field named 'upper(topic_name)'.
It would be a useful feature to be able to handle these cases.
Change History (12)
comment:1 by , 2 years ago
| Cc: | added | 
|---|---|
| Description: | modified (diff) | 
| Summary: | Support computed unique fields for INSERT ON CONFLICT ... UPDATE → Support passing database expressions to bulk_create()'s unique_fields. | 
| Triage Stage: | Unreviewed → Accepted | 
| Type: | Uncategorized → New feature | 
comment:2 by , 2 years ago
| Keywords: | bulk insert update upsert added | 
|---|
follow-up: 8 comment:3 by , 2 years ago
Maybe the would be an opportunity to deprecate/rename the unique_fields kwarg name to unique_expressions or unique_together to denote that not only fields are acceptable here.
Or we could take an hybrid approach like we did with Index and Constraint where either allow update_fields or update_expressions to be specified.
I get a sense that we should have preferred a unique_constraint kwarg that points to the name of an entry of Meta.constraints or the name of a Field(unique) in the first place instead of forcing users to specify the exact same index definition they have in their model, Meta.unique_together, or Meta.constraints instead.
For backends that support the ON CONFLICT ON CONSTRAINT clause we could have simply used it for the constraint name specified otherwise we know what the index expression associated with the specified constraint is so we can generate the proper SQL.
In other words, for a model like
class UserTopic(models.Model): ... class Meta: constraints = [ UniqueConstraint( 'user_profile_id','stream_id',Upper('topic_name'), name='unique_profile_stream_topic' ) ]
The call to bulk_create could simply have been
UserTopic.objects.bulk_create( [ut], update_fields=['last_updated','visibility_policy'], unique_constraint='unique_profile_stream_topic', )
---
My recommendation would be to introduce a unique_constraint: str | tuple[str | Expression] kwarg and deprecate unique_fields. When provided a str it would be a reference to a UniqueConstraint by .name and when it's a tuple it would be expected to be a index expression where str are resolved to field names.
I guess the str -> UniqueConstraint.name part could be a distinct feature request but it seems related to this issue given the only way Django supports creating unique constraints composed of expressions is through Meta.constaints and having to define the exact expression twice is error prone.
comment:4 by , 2 years ago
| Summary: | Support passing database expressions to bulk_create()'s unique_fields. → Support passing unique constraint names to bulk_create(). | 
|---|
Great suggestion, Simon!
comment:5 by , 2 years ago
| Owner: | changed from to | 
|---|---|
| Status: | new → assigned | 
comment:6 by , 2 years ago
It's worth noting that [PostgreSQL's INSERT documentation](https://www.postgresql.org/docs/current/sql-insert.html) suggests against ON CONFLICT ON CONSTRAINT, preferring the raw list of expressions:
It is often preferable to use unique index inference rather than naming a constraint directly using
ON CONFLICT ON CONSTRAINT constraint_name. Inference will continue to work correctly when the underlying index is replaced by another more or less equivalent index in an overlapping way, for example when usingCREATE UNIQUE INDEX ... CONCURRENTLYbefore dropping the index being replaced.
That is, naming a specific constraint is in general more brittle, since it depends on the specific name, rather than naming the effect of the constraint.
In our particular use case, we also currently only have a unique index, not a constraint -- and I'm not sure how well doing that migration would go.  But that's a separate issue.
comment:7 by , 2 years ago
| Cc: | added | 
|---|
Makes sense.
I guess that from an application maintenance perspective there are still benefits in using references to already defined constraint though, think of urlpattern names, annotations, lookups. 
To take back the example above nothing would prevent the ORM from simply resolving unique_constraint='unique_profile_stream_topic' to ('user_profile_id', 'stream_id', Upper('topic_name')) and avoid the database level usage of ON CONFLICT ON CONSTRAINT if that's discouraged.
In other words, the aliasing capabilities could be solely done at the application level assuming users opt-in into the usage of Meta.constraints. That would also happen to address the issue involved in dealing with the difference between unique indexes and constraints as Postgres treat them differently.
comment:8 by , 20 months ago
Replying to Simon Charette:
Maybe the would be an opportunity to deprecate/rename the
unique_fieldskwarg name tounique_expressionsorunique_togetherto denote that not only fields are acceptable here.
Or we could take an hybrid approach like we did with
IndexandConstraintwhere either allowupdate_fieldsorupdate_expressionsto be specified.
I get a sense that we should have preferred a
unique_constraintkwarg that points to the name of an entry ofMeta.constraintsor the name of aField(unique)in the first place instead of forcing users to specify the exact same index definition they have in their model,Meta.unique_together, orMeta.constraintsinstead.
For backends that support the
ON CONFLICT ON CONSTRAINTclause we could have simply used it for the constraint name specified otherwise we know what the index expression associated with the specified constraint is so we can generate the proper SQL.
In other words, for a model like
class UserTopic(models.Model): ... class Meta: constraints = [ UniqueConstraint( 'user_profile_id','stream_id',Upper('topic_name'), name='unique_profile_stream_topic' ) ]The call to
bulk_createcould simply have been
UserTopic.objects.bulk_create( [ut], update_fields=['last_updated','visibility_policy'], unique_constraint='unique_profile_stream_topic', )---
My recommendation would be to introduce a
unique_constraint: str | tuple[str | Expression]kwarg and deprecateunique_fields. When provided astrit would be a reference to aUniqueConstraintby.nameand when it's atupleit would be expected to be a index expression wherestrare resolved to field names.
I guess the
str->UniqueConstraint.namepart could be a distinct feature request but it seems related to this issue given the only way Django supports creating unique constraints composed of expressions is throughMeta.constaintsand having to define the exact expression twice is error prone.
Hi, If you depreciated unique_fields, what would that mean for ticket #34277 https://code.djangoproject.com/ticket/34277, which I'm working on?
comment:9 by , 20 months ago
| Owner: | removed | 
|---|---|
| Status: | assigned → new | 
comment:10 by , 20 months ago
| Cc: | added | 
|---|
comment:11 by , 20 months ago
| Owner: | set to | 
|---|---|
| Status: | new → assigned | 
I will make a proposal to work on the ticket for GSoC 2024. 
comment:12 by , 8 weeks ago
There's some more discussion about this on the new features board: https://github.com/django/new-features/issues/79
Sounds reasonable.