#33018 closed Bug (fixed)
Incorrect annotation value when doing a subquery with empty queryset
| Reported by: | decomorreno | Owned by: | David Wobrock |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.2 |
| Severity: | Normal | Keywords: | orm, annotate, EmptyQuerySet, empty, count |
| Cc: | David Wobrock | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
ORM seems to generate annotation/subqueries incorrectly if empty queryset is used.
Models:
class Article(models.Model):
author_name = models.CharField(max_length=100)
content = models.TextField()
is_public = models.BooleanField()
class Comment(models.Model):
article = models.ForeignKey(Article, related_name="comments", on_delete=models.CASCADE)
author_name = models.CharField(max_length=100)
content = models.TextField()
test data:
article = Article.objects.create(author_name="Jack", content="Example content", is_public=True) comment = Comment.objects.create(article=article, author_name="John", content="Example comment")
queries:
qs = Article.objects.all()
# keep one list_x uncommented to see the difference:
list_x = ["random_thing_that_is_not_equal_to_any_authors_name"] # list not empty, bug doesnt occur
#list_x = [] # if this list is empty, then the bug occurs
comment_qs = Comment.objects.filter(author_name__in=list_x)
qs = qs.annotate(
A=Coalesce(Subquery(
comment_qs.annotate(x=Count('content')).values('x')[:1], output_field=IntegerField(),
), 101) # if list_x == [], Coalesce wont work and A will be 0 instead of 101
)
# please note that above annotation doesnt make much logical sense, its just for testing purposes
qs = qs.annotate(
B=Value(99, output_field=IntegerField())
)
qs = qs.annotate(
C=F("A") + F("B") # if list_x == [], C will result in 0 sic! instead of 101 + 99 = 200
)
data = {
"A": qs.last().A,
"B": qs.last().B,
"C": qs.last().C,
}
print(data)
print(format_sql(qs.query))
console output for list_x=["random_thing_that_is_not_equal_to_any_authors_name"] (expected, correct):
{'A': 101, 'B': 99, 'C': 200}
SELECT "articles_article"."id",
"articles_article"."author_name",
"articles_article"."content",
"articles_article"."is_public",
COALESCE(
(SELECT COUNT(U0."content") AS "x"
FROM "articles_comment" U0
WHERE U0."author_name" IN (random_thing_that_is_not_equal_to_any_authors_name)
GROUP BY U0."id", U0."article_id", U0."author_name", U0."content"
LIMIT 1), 101) AS "A",
99 AS "B",
(COALESCE(
(SELECT COUNT(U0."content") AS "x"
FROM "articles_comment" U0
WHERE U0."author_name" IN (random_thing_that_is_not_equal_to_any_authors_name)
GROUP BY U0."id", U0."article_id", U0."author_name", U0."content"
LIMIT 1), 101) + 99) AS "C"
FROM "articles_article"
console output for list_x=[] (incorrect):
{'A': 0, 'B': 99, 'C': 0}
SELECT "articles_article"."id",
"articles_article"."author_name",
"articles_article"."content",
"articles_article"."is_public",
0 AS "A",
99 AS "B",
0 AS "C"
FROM "articles_article"
Background story: Above queries are made up (simplified), but based on some parts of logic that I had in my code. list_x was generated dynamically, and it was very hard to detect what is causing unexpected results. This behavior is very strange, I believe its a bug and needs to be fixed, because it is totally unintuitive that:
SomeModel.objects.filter(x__in=["something_that_causes_this_qs_lenth_to_be_0"])
and
SomeModel.objects.filter(x__in=[])
may yield different results when used in queries later, even though results of this querysets are logically equivalent
I will attach a minimal repro project (with code from above)
Attachments (1)
Change History (11)
by , 4 years ago
| Attachment: | bug_repro.zip added |
|---|
comment:1 by , 4 years ago
| Summary: | Incorrect annotation value when doing a subquery on empty queryset → Incorrect annotation value when doing a subquery with empty queryset |
|---|
comment:2 by , 4 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:3 by , 4 years ago
| Cc: | added |
|---|---|
| Has patch: | set |
| Owner: | changed from to |
| Status: | new → assigned |
Hi!
Thanks for the hints Simon, I tried a first patch where we catch empty values here https://github.com/django/django/pull/14770
Should we expand the solution a bit further and rename empty_aggregate_value as you suggested, and use it in the SQLCompiler too?
comment:5 by , 4 years ago
Hi, the patch is waiting on some review I believe. So once a maintainer has a bit of time available, we'll be able to move forward :)
comment:6 by , 4 years ago
| Needs tests: | set |
|---|---|
| Patch needs improvement: | set |
comment:7 by , 4 years ago
| Needs tests: | unset |
|---|---|
| Patch needs improvement: | unset |
comment:8 by , 4 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
The
0assignment on empty result set comes from this line. I assume we could adjust the logic to rely ongetattr(col, 'empty_aggregate_value', NotImplemented)and fallback to'0'if it's missing.Makes me wonder if we want to rename
empty_aggregate_valuetoempty_result_set_valueinstead since it's not entirely bound to aggregation anymore.e.g. the following should exhibit the same behavior
It also seems weird that we default to
0as opposed toNULLwhich would be a more correct value.Alternatively we could adjust
Coalesce.as_sqlto catchEmptyResultSetwhen it's compiling its source expressions but that's more involved as most of the logic for that currently lives inFunc.as_sql. We could also use both of these approaches.