Opened 5 months ago
Closed 4 months ago
#35575 closed New feature (fixed)
Add support for constraint validation on GeneratedFields
Reported by: | Mark Gensler | Owned by: | Mark Gensler |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 5.0 |
Severity: | Normal | Keywords: | generatedfield uniqueconstraint checkconstraint |
Cc: | Mariusz Felisiak, Lily Foote, Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
GeneratedField
s on a model instance are not recalculated during Model.full_clean()
. Therefore, if a GeneratedField
is included in the *expressions
or fields
parameters of a UniqueConstraint
or CheckConstraint
in the instance's Model._meta
, the new values of the GeneratedField
attribute are not used by BaseConstraint.validate()
.
Instead, the model instance's GeneratedField
attribute will return the value which corresponds to the most recent refresh of the instance from the database. If the instance is new and has not yet been saved, the following error is raised:
AttributeError: Cannot read a generated field from an unsaved model.
An IntegrityError
is correctly raised when calling Model.save()
if any constraints involving GeneratedField
s are violated.
Instead, the recalculated values for GeneratedField
s should be used when checking constraints.
This ticket was raised after fixing the bug in #35560 with PR https://github.com/django/django/pull/18309.
Change History (18)
follow-up: 4 comment:1 by , 5 months ago
comment:2 by , 5 months ago
Description: | modified (diff) |
---|
comment:3 by , 5 months ago
Description: | modified (diff) |
---|
follow-up: 5 comment:4 by , 5 months ago
Cc: | added |
---|---|
Summary: | Model.full_clean() does not recalculate GeneratedFields prior to validating model constraints → Add support for constraint validation on GeneratedFields |
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → New feature |
Replying to Mark Gensler:
Before writing any code to address this, I'd like to discuss where best to apply the fix. I see two possible options:
- Recalculate
GeneratedField
attribute values on the fly as part of callingConstraint.validate()
. This will leave the instance attributes unchanged untilModel.save()
is called.- Recalculate
GeneratedField
attributes on the model instance duringModel.full_clean()
, prior to calls toConstraint.validate()
, and irrespective of whether relevant constraints are present.
My gut says option 1
comment:5 by , 5 months ago
Replying to Sarah Boyce:
My gut says option 1
Yes, that seems the simpler approach which directly addresses this problem and no more. Also from the definition of a GeneratedField
in the docs:
A field that is always computed based on other fields in the model. This field is managed and updated by the database itself.
I'll work on option 1.
comment:6 by , 5 months ago
I feel like we should avoid the naive approach of calling refresh_from_db(fields=generated_fields)
as that will mutate the in-memory instance and leave it in this form as well as perform an extra query per constraint.
What should be done IMO is for the query performed in UniqueConstraint
to call replace_expressions({F(gfield.name): gfield.expression.replace_expressions(....) for gfield in generated_field})
so that will make
class Contributor(models.Model): first_name = models.TextField() last_name = models.TextField() full_name = models.GeneratedField( Lower(Concat("first_name", models.Value(" "), "last_name")) ) class Meta: constraints = { UniqueConstraint(names="unique_full_name", MD5("full_name")) } Contributor(first_name='Mark', last_name='Gensler').full_clean()
result in the following the query, assuming first_name
and last_name
are available, would be
SELECT 1 FROM contributor WHERE md5(lower("first_name" || ' ' || "last_name")) = md5(lower('Mark' || ' ' || 'Gensler'))
The more we push to the database the less likely we are to run into race conditions and serde roudtrip issues.
Note that ExclusionConstraint
is affected as well and that the logic added here will need to be made so constraints that include generated fields
or have expressions
with references to generated fields that themselves references exclude
when provided will also need to skip validate
. See the whole flatten
logic for reference which that likely need to be updated to have F("some_generated_field")
flattened in a way that returns all the fields referenced by it.
follow-up: 8 comment:7 by , 5 months ago
I also think it's worth highlighting that this has come from us testing a PR and not "organically". I don't think this is a quick win and we should probably confirm that this is wanted, otherwise it might be best to document that it isn't supported.
comment:8 by , 5 months ago
Replying to Sarah Boyce:
I also think it's worth highlighting that this has come from us testing a PR and not "organically". I don't think this is a quick win and we should probably confirm that this is wanted, otherwise it might be best to document that it isn't supported.
How would we do that? Shall I raise a discussion on the django forum?
I don't see any gap in pure functionality at present. Any expression which is defined for a GeneratedField
could also be written directly as an input to the CheckConstraint
/ UniqueConstraint
/ ExclusionConstraint
, rather than by using the GeneratedField
as shorthand for that expression. This change would simply allow a more concise and readable definition of the constraint and assist with DRY. Or is there another advantage which I'm missing?
If this isn't required, the only change necessary would be to prevent a GeneratedField
appearing in the expression of any constraints. Otherwise a stale value would be returned for the GeneratedField
attribute of an instance during constraint.validate()
. This wouldn't be strictly backwards compatible, but users could re-write any existing constraints which use a GeneratedField
.
*EDIT* Actually rather than preventing GeneratedField
s from appearing in constraints, the system could call checks.Warning()
.
Replying to Simon Charette:
Thanks for the examples and for raising ExclusionConstraint
, I had missed that. I'll investigate a solution using the outline you provided.
follow-up: 10 comment:9 by , 5 months ago
FWIW I gave a shot at what I described above to ensure that it was implementable and not too invasive and it was relatively straightforward. I don't see why we shouldn't support constraints over GeneratedField
knowing that.
comment:10 by , 5 months ago
Replying to Simon Charette:
FWIW I gave a shot at what I described above to ensure that it was implementable and not too invasive and it was relatively straightforward. I don't see why we shouldn't support constraints over
GeneratedField
knowing that.
Thanks Simon, I will use what you've done as a starting point for the PR.
comment:11 by , 5 months ago
Has patch: | set |
---|
comment:12 by , 5 months ago
I have added a draft PR but there are still a couple of outstanding issues. I thought the PR would probably be a better place for review/discussion.
Simon, would you be able to take a look? Thanks Mark
comment:13 by , 4 months ago
The patch looks ready to be reviewed to me. I requested that a release note be added for 5.0.X just in case we wanted to backport but in the light of #35560 disabling validation at first I think we should just consider this a new feature for inclusion in 5.2.x. That would also gives us enough time to properly address #34871 by then without restricting ourselves to solutions that are not to invasive for a backport. Given any of the GeneratedField
validation can be emulated with expression constraints as Mark pointed out it should allow for users to lean into that in the mean time.
IMO the branch demonstrates that enabling this feature is achievable without too much boilerplate an ensures that GeneratedField
works just like any other Field
with regards to Meta.constraints
instead of crashing with IntegrityError
.
comment:14 by , 4 months ago
I think we should just consider this a new feature for inclusion in 5.2.x.
For me, definitely a new feature.
comment:15 by , 4 months ago
Patch needs improvement: | set |
---|
comment:16 by , 4 months ago
Patch needs improvement: | unset |
---|
comment:17 by , 4 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
I'm happy to start working on this. I will raise a PR and add failing tests to catch the bug shortly.
Before writing any code to address this, I'd like to discuss where best to apply the fix. I see two possible options:
GeneratedField
attribute values on the fly as part of callingConstraint.validate()
. This will leave the instance attributes unchanged untilModel.save()
is called.GeneratedField
attributes on the model instance duringModel.full_clean()
, prior to calls toConstraint.validate()
, and irrespective of whether relevant constraints are present.The second option would allow further checks against the new
GeneratedField
value of the instance to be made during validation. On the other hand, the first option would leave the instance's attribute returning the value currently returned/stored by the database.Which would be best?