#34254 closed Bug (fixed)
Exists annotations can return non-boolean results (i.e. None) if used with an empty QuerySet.
Reported by: | Keryn Knight | Owned by: | rajdesai24 |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | Exists EmptyQuerySet sqlite postgres |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I suspect this is following on from, but potentially separate to #33018 -- because that ticket starts out using Django 3.2
to observe that an empty queryset (EmptyQuerySet
or whatever, via none()
) can short circuit evaluation to be 0 as ...
in the SQL query, which is exactly the same problem I observed.
However, as far as I can tell, the result of an Exists(queryset.none())
can still return values outside of True/False
, namely, None.
Using Django main
as of 4593bc5da115f2e808a803a4ec24104b6c7a6152
(from Wed Jan 11 ... 2023
), here's the behaviour on both postgres and sqlite. In both scenarios I'm using 3.10.2
with psycopg2==2.9.3
and sqlite3.sqlite_version
is 3.37.0
and sqlite3.version
is 2.6.0
.
IPython outputs 8 and 11 are the problems.
class A(models.Model): pass class B(models.Model): pass
In [1]: from app.models import A, B In [2]: A.objects.using("pg").create() Out[2]: <A: A object (1)> In [3]: B.objects.using("pg").create() Out[3]: <B: B object (1)> In [4]: A.objects.using("sqlite").create() Out[4]: <A: A object (1)> In [4]: B.objects.using("sqlite").create() Out[4]: <B: B object (1)> In [5]: from django.db.models import Exists In [6]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.all())).first().should_be_bool Out[6]: True In [7]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.filter(pk=99999999))).first().should_be_bool Out[7]: False In [8]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool # This is the problem, it returned neither True nor False In [9]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.all())).first().should_be_bool Out[9]: True In [10]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.filter(pk=99999999))).first().should_be_bool Out[10]: False In [11]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool # This is the problem, it returned neither True nor False
And the queries, which are the same for postgres & sqlite:
# ... {'sql': 'SELECT "app_a"."id", EXISTS(SELECT 1 AS "a" FROM "app_b" LIMIT 1) AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.001'}, {'sql': 'SELECT "app_a"."id", EXISTS(SELECT 1 AS "a" FROM "app_b" U0 WHERE U0."id" = 99999999 LIMIT 1) AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.001'}, {'sql': 'SELECT "app_a"."id", NULL AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.001'}
Given Exists
has an output_field
of BooleanField
and that definition doesn't have null=True
as an argument, it seems incongruent from both an expectations ("exists sounds boolean") and implementation ("it doesn't say it could be null") standpoint.
Whilst the problem exists in main
, it has also changed behaviour (presumably or potentially unexpectedly) since 3.2
, where postgres
and sqlite
actually do different things, hence we tested both above. So main
is now consistent, but I'd personally argue it's consistently wrong (for a given value thereof, no judgement made!)
In 3.2.16
, under sqlite, using annotate(x=Exists(y.none()))
returns False
but on main
it now returns None
(see above) -- the 3.2
behaviour is correct for my expectations
In [4]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool Out[4]: False In [5]: connections['sqlite'].queries Out[5]: {'sql': 'SELECT "app_a"."id", 0 AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.000'}
In 3.2.16
with postgres we get neither None
nor False
but the integer 0
instead:
In [4]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool Out[4]: 0 In [5]: connections['pg'].queries Out[5]: {'sql': 'SELECT "app_a"."id", 0 AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.001'}
So we can observe that under 3.2
using the 0 AS ...
behaviour
- sqlite appears to behave correctly (returning
False
) - postgres appears to behave incorrectly (failing to cast a raw integer to a boolean)
And under main
using the NULL AS ...
behaviour
- sqlite no longer behaves the same, returning
None
where I'd expectFalse
(or evenTrue
given the way SQL EXISTS works...) - postgres behaves differently,
0
is nowNone
but I'd still expectTrue
orFalse
as the output.
Change History (15)
follow-up: 4 comment:1 by , 22 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 22 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:4 by , 22 months ago
Replying to Simon Charette:
I suspect the solution is as simple as setting
Exists.empty_result_set_value = False
[...]
[...]
If you can confirm this addresses the problem you reported [...]
Confirming, it does seem to be as simple as that (a pleasant rarity in ORM edge cases, I'm sure!) for the 2 backends I was testing (postgres
and sqlite
-- I've notably not checked mysql
or oracle
etc):
In [3]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool Out[3]: False In [4]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool Out[4]: False
The queries are still recorded as different, but I expect that's normal coercion differences between sqlite & postgres:
In [6]: connections['sqlite'].queries Out[6]: [{'sql': 'SELECT "app_a"."id", 0 AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.001'}] In [7]: connections['pg'].queries Out[7]: [{'sql': 'SELECT "app_a"."id", false AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1', 'time': '0.001'}]
WRT to fixing it, as rajdesai24 has expressed an interest in doing so, I'll let them take the lead and get the contribution under their belt.
Replying to rajdesai24:
Hey Simon would love to submit a PR
comment:5 by , 22 months ago
Thanks for confirming Keryn, the difference in query generation is effectively expected on depending on whether the backend has a native boolean type.
rajdesai24, sure thing please go ahead and submit a PR. Make sure to include a regression test demonstrating the problem that fails without the patch applied and passes with it applied.
I suggest adding one in tests/annotations/tests.py
similarly to the one added in dd1fa3a31b4680c0d3712e6ae122b878138580c7 but using Exists
instead of Subquery
.
comment:7 by , 22 months ago
Hey Simon the changes are done. The only thing is my test is pretty simple, which is testing the exact query that Keryn reported. Do tell me if any issues with the PR or any changes to the tests.
comment:8 by , 22 months ago
Has patch: | set |
---|---|
Patch needs improvement: | set |
Thanks for the patch rajdesai24. I left some comments on the PR.
comment:9 by , 22 months ago
hey, Simon please review the changes and If everything seems right, merge them if possible.
comment:10 by , 22 months ago
Left some more comments on the PR, the test is not testing the right thing and passes even without the changes applied.
comment:11 by , 22 months ago
Needs tests: | set |
---|
comment:12 by , 22 months ago
Needs tests: | unset |
---|---|
Patch needs improvement: | unset |
Proposed MR now has an appropriate regression test.
comment:13 by , 22 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
Thanks for the detailed report Keryn.
I suspect the solution is as simple as setting
Exists.empty_result_set_value = False
which was missed in dd1fa3a31b4680c0d3712e6ae122b878138580c7 and sinceExists
sublassesSubquery
it inherited its.empty_result_set_value = None
.If you can confirm this addresses the problem you reported could you submit a PR with a regression test?