#34254 closed Bug (fixed)
Exists annotations can return non-boolean results (i.e. None) if used with an empty QuerySet.
| Reported by: | Keryn Knight | Owned by: | rajdesai24 |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | Exists EmptyQuerySet sqlite postgres |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
I suspect this is following on from, but potentially separate to #33018 -- because that ticket starts out using Django 3.2 to observe that an empty queryset (EmptyQuerySet or whatever, via none()) can short circuit evaluation to be 0 as ... in the SQL query, which is exactly the same problem I observed.
However, as far as I can tell, the result of an Exists(queryset.none()) can still return values outside of True/False, namely, None.
Using Django main as of 4593bc5da115f2e808a803a4ec24104b6c7a6152 (from Wed Jan 11 ... 2023), here's the behaviour on both postgres and sqlite. In both scenarios I'm using 3.10.2 with psycopg2==2.9.3 and sqlite3.sqlite_version is 3.37.0 and sqlite3.version is 2.6.0.
IPython outputs 8 and 11 are the problems.
class A(models.Model):
pass
class B(models.Model):
pass
In [1]: from app.models import A, B
In [2]: A.objects.using("pg").create()
Out[2]: <A: A object (1)>
In [3]: B.objects.using("pg").create()
Out[3]: <B: B object (1)>
In [4]: A.objects.using("sqlite").create()
Out[4]: <A: A object (1)>
In [4]: B.objects.using("sqlite").create()
Out[4]: <B: B object (1)>
In [5]: from django.db.models import Exists
In [6]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.all())).first().should_be_bool
Out[6]: True
In [7]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.filter(pk=99999999))).first().should_be_bool
Out[7]: False
In [8]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool
# This is the problem, it returned neither True nor False
In [9]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.all())).first().should_be_bool
Out[9]: True
In [10]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.filter(pk=99999999))).first().should_be_bool
Out[10]: False
In [11]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool
# This is the problem, it returned neither True nor False
And the queries, which are the same for postgres & sqlite:
# ...
{'sql': 'SELECT "app_a"."id", EXISTS(SELECT 1 AS "a" FROM "app_b" LIMIT 1) AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.001'},
{'sql': 'SELECT "app_a"."id", EXISTS(SELECT 1 AS "a" FROM "app_b" U0 WHERE U0."id" = 99999999 LIMIT 1) AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.001'},
{'sql': 'SELECT "app_a"."id", NULL AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.001'}
Given Exists has an output_field of BooleanField and that definition doesn't have null=True as an argument, it seems incongruent from both an expectations ("exists sounds boolean") and implementation ("it doesn't say it could be null") standpoint.
Whilst the problem exists in main, it has also changed behaviour (presumably or potentially unexpectedly) since 3.2, where postgres and sqlite actually do different things, hence we tested both above. So main is now consistent, but I'd personally argue it's consistently wrong (for a given value thereof, no judgement made!)
In 3.2.16, under sqlite, using annotate(x=Exists(y.none())) returns False but on main it now returns None (see above) -- the 3.2 behaviour is correct for my expectations
In [4]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool
Out[4]: False
In [5]: connections['sqlite'].queries
Out[5]:
{'sql': 'SELECT "app_a"."id", 0 AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.000'}
In 3.2.16 with postgres we get neither None nor False but the integer 0 instead:
In [4]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool
Out[4]: 0
In [5]: connections['pg'].queries
Out[5]:
{'sql': 'SELECT "app_a"."id", 0 AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.001'}
So we can observe that under 3.2 using the 0 AS ... behaviour
- sqlite appears to behave correctly (returning
False) - postgres appears to behave incorrectly (failing to cast a raw integer to a boolean)
And under main using the NULL AS ... behaviour
- sqlite no longer behaves the same, returning
Nonewhere I'd expectFalse(or evenTruegiven the way SQL EXISTS works...) - postgres behaves differently,
0is nowNonebut I'd still expectTrueorFalseas the output.
Change History (15)
follow-up: 4 comment:1 by , 3 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:2 by , 3 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:4 by , 3 years ago
Replying to Simon Charette:
I suspect the solution is as simple as setting
Exists.empty_result_set_value = False[...]
[...]
If you can confirm this addresses the problem you reported [...]
Confirming, it does seem to be as simple as that (a pleasant rarity in ORM edge cases, I'm sure!) for the 2 backends I was testing (postgres and sqlite -- I've notably not checked mysql or oracle etc):
In [3]: A.objects.using("sqlite").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool
Out[3]: False
In [4]: A.objects.using("pg").annotate(should_be_bool=Exists(B.objects.none())).first().should_be_bool
Out[4]: False
The queries are still recorded as different, but I expect that's normal coercion differences between sqlite & postgres:
In [6]: connections['sqlite'].queries
Out[6]:
[{'sql': 'SELECT "app_a"."id", 0 AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.001'}]
In [7]: connections['pg'].queries
Out[7]:
[{'sql': 'SELECT "app_a"."id", false AS "should_be_bool" FROM "app_a" ORDER BY "app_a"."id" ASC LIMIT 1',
'time': '0.001'}]
WRT to fixing it, as rajdesai24 has expressed an interest in doing so, I'll let them take the lead and get the contribution under their belt.
Replying to rajdesai24:
Hey Simon would love to submit a PR
comment:5 by , 3 years ago
Thanks for confirming Keryn, the difference in query generation is effectively expected on depending on whether the backend has a native boolean type.
rajdesai24, sure thing please go ahead and submit a PR. Make sure to include a regression test demonstrating the problem that fails without the patch applied and passes with it applied.
I suggest adding one in tests/annotations/tests.py similarly to the one added in dd1fa3a31b4680c0d3712e6ae122b878138580c7 but using Exists instead of Subquery.
comment:7 by , 3 years ago
Hey Simon the changes are done. The only thing is my test is pretty simple, which is testing the exact query that Keryn reported. Do tell me if any issues with the PR or any changes to the tests.
comment:8 by , 3 years ago
| Has patch: | set |
|---|---|
| Patch needs improvement: | set |
Thanks for the patch rajdesai24. I left some comments on the PR.
comment:9 by , 3 years ago
hey, Simon please review the changes and If everything seems right, merge them if possible.
comment:10 by , 3 years ago
Left some more comments on the PR, the test is not testing the right thing and passes even without the changes applied.
comment:11 by , 3 years ago
| Needs tests: | set |
|---|
comment:12 by , 3 years ago
| Needs tests: | unset |
|---|---|
| Patch needs improvement: | unset |
Proposed MR now has an appropriate regression test.
comment:13 by , 3 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
Thanks for the detailed report Keryn.
I suspect the solution is as simple as setting
Exists.empty_result_set_value = Falsewhich was missed in dd1fa3a31b4680c0d3712e6ae122b878138580c7 and sinceExistssublassesSubqueryit inherited its.empty_result_set_value = None.If you can confirm this addresses the problem you reported could you submit a PR with a regression test?