Opened 2 years ago
Closed 2 years ago
#34160 closed Bug (wontfix)
Django 4.1 Expression contains mixed types for (Big/Small)IntegerFields.
Reported by: | Martin Lehoux | Owned by: | Martin Lehoux |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.1 |
Severity: | Release blocker | Keywords: | |
Cc: | Simon Charette, Luke Plant | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Hi,
During an update from Django 4.0.8 to 4.1.3, an unexpected bug occured in our tests.
django.core.exceptions.FieldError: Expression contains mixed types: IntegerField, SmallIntegerField. You must set output_field.
I understand what it means, but I couldn't find what changed between these two versions that could explain the appearance of this error.
Case( When( True, then=F("inventory_count") + Value(1), ), default=F("inventory_count"), )
I could easily fix it with
Case( When( True, then=F("inventory_count") + Value(1), ), default=F("inventory_count"), output_field=IntegerField(), )
Change History (20)
comment:1 by , 2 years ago
Summary: | Django 4.1 → Django 4.1 Expression contains mixed types |
---|
comment:2 by , 2 years ago
Cc: | added |
---|---|
Severity: | Normal → Release blocker |
Summary: | Django 4.1 Expression contains mixed types → Django 4.1 Expression contains mixed types for (Big/Small)IntegerFields. |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
comment:3 by , 2 years ago
I agree we should try to automatically resolve the type. Based on my experience with the related patch, I think it would be worth checking what MySQL and Postgres do for cases like this - there could be some unexpected surprises e.g. the database actually casts to the smaller of the type, or the first type. In this case, the user may need to know that it isn't going to do what you expect.
comment:4 by , 2 years ago
I agree with your suggested solution Markusz. Adding entries to _connector_combinations
seems like the way to go for a backport.
Based on my experience with the related patch, I think it would be worth checking what MySQL and Postgres do for cases like this - there could be some unexpected surprises
I know that Postgres will silently cast values outside of the bigint
range to numeric
with its unfortunate side effect so I'm pretty that it has operators defined to make integer comparison as implicit as possible.
follow-up: 7 comment:6 by , 2 years ago
Hi, I'd be happy to! I will have a look tonight, and will ask a few questions if I have trouble understanding what to do (but from what I read the infrastructure is already there for this feature)
comment:7 by , 2 years ago
Replying to Martin Lehoux:
Hi, I'd be happy to! I will have a look tonight, and will ask a few questions if I have trouble understanding what to do (but from what I read the infrastructure is already there for this feature)
Thanks, it should be enough to add extra combinations for integer fields (a regression test is also required, see test_resolve_output_field_number()).
comment:8 by , 2 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:9 by , 2 years ago
Isn't it the Case
that has an issue computing it's output_field
? If it's the case, I'm not sure that the files you pointed to me were the right ones. (I say that because some of the tests cases, including the one in my example, already worked)
django/db/models/expressions.py
@deconstructible(path="django.db.models.Case") class Case(SQLiteNumericMixin, Expression): template = "CASE %(cases)s ELSE %(default)s END" case_joiner = " " def __init__(self, *cases, default=None, output_field=None, **extra): if not all(isinstance(case, When) for case in cases): raise TypeError("Positional arguments must all be When objects.") super().__init__(output_field) self.cases = list(cases) self.default = self._parse_expressions(default)[0] self.extra = extra
comment:10 by , 2 years ago
What I believe is happening here is that F("inventory_count") + Value(1)
has its output_field
resolved to IntegerField
because it's the common base of inventory_count
(which I assume is a SmallIntegerField
since the models were not provided) and Value(1)
.
While CombinedExpression._resolve_output_field
resolving logic deals with subclassing properly BaseExpression._resolve_output_field
isn't smart enough to resolve a mix of IntegerField
subclasses to the largest denominator. By smart I mean that there isn't a proper way do this at such a low level for all subclasses of BaseExpression
hence why it's considered a bad idea to event try doing so.
The three ways we could be fixing that would be:
- Document this requirement the usage of explicit
output_field
either toValue
beyond the 3.2 release notes. - Make
Value._resolve_field
use different kind ofIntegerField
subclasses when dealing with smaller/larger values. This could have the unintended side effect of surfacing even more errors for mixed type asValue(n < 255)
would resolve toSmallIntegerField
so I think this is a bad idea. - Actually make
BasExpression._resolve_output_field
smarter with the deprecation period it requires.
comment:12 by , 2 years ago
If I understand well, you mean that the future vision (given the comment you pointed out) is not to try to infer everything, and that I should indeed in my code specify this output_field
in this case?
Just trying to understand what I should do on my side, and if there's a way for me to help you with that
comment:13 by , 2 years ago
Martin, the only way I see on how this can be solved is to revisit how BaseExpression._resolve_output_field
deals with variadic merge expressions (e.g. Case
, Coalesce
) in a way that tries to be a bit more clever with regards to type integer types handling. You could give it a shot but be aware that it's going to be exploration work, certainly a nice way to learn more about the ORM works but not necessarily one that will lead to a fool proof solution.
In the mean time, I think that the only solution is effectively to add explicit output_field
.
I'd be happy to review your exploration work but just a small disclaimer that I'm not sure what the proper solution might be.
comment:14 by , 2 years ago
Alright thank you for you explanation, I'd be happy to have a look. I will get back to you if I find something interesting
follow-up: 16 comment:15 by , 2 years ago
Case()
is trying to resolve an output type only for specific fields that cannot be implicitly cast on PostgreSQL (namely GenericIPAddressField
, IPAddressField
, TimeField
, and UUIDField
) in other cases PostgreSQL does the work, see docs. As far as I'm aware, we should be safe to ignore errors in resolving an output field in other cases:
-
django/db/models/expressions.py
diff --git a/django/db/models/expressions.py b/django/db/models/expressions.py index c270ef16c7..575712806f 100644
a b class Case(SQLiteNumericMixin, Expression): 1483 1483 sql_params.extend(default_params) 1484 1484 template = template or template_params.get("template", self.template) 1485 1485 sql = template % template_params 1486 if self._output_field_or_none is not None: 1487 sql = connection.ops.unification_cast_sql(self.output_field) % sql 1486 try: 1487 if self._output_field_or_none is not None: 1488 sql = connection.ops.unification_cast_sql(self.output_field) % sql 1489 except FieldError: 1490 pass 1488 1491 return sql, sql_params 1489 1492 1490 1493 def get_group_by_cols(self):
comment:16 by , 2 years ago
Replying to Mariusz Felisiak:
Case()
is trying to resolve an output type only for specific fields that cannot be implicitly cast on PostgreSQL (namelyGenericIPAddressField
,IPAddressField
,TimeField
, andUUIDField
) in other cases PostgreSQL does the work, see docs. As far as I'm aware, we should be safe to ignore errors in resolving an output field in other cases:
After a deeper investigation, this ☝️ only fixes one case:
Case( When(Value(True), then=F("small_integer") + Value(1)), default=F("integer"), )
the following still crashes:
Case( When(Value(True), then=F("small_integer") + Value(1)), default=F("small_integer"), )
I don't think it's worth changing anymore. I theory, 40b8a6174f001a310aa33f7880db0efeeb04d4c4 is a regression, however the previous behavior was also buggy, e.g. F("small_integer") + Value(100000000)
was resolved to SmallIntegerField
, so I think it better to raise an exception in all cases. I'm going to add a small release note and close this as "wontfix" when it gets merged.
comment:20 by , 2 years ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
Triage Stage: | Accepted → Unreviewed |
Thanks for the report. I think we should automatically resolved
output_field
for a mix of integer types, e.g.:IntegerField
andSmallIntegerField
->IntegerField
,IntegerField
andBigIntegerField
->BigIntegerField
,BigIntegerField
andSmallIntegerField
->BigIntegerField
.Regression in 40b8a6174f001a310aa33f7880db0efeeb04d4c4.