#33018 closed Bug (fixed)
Incorrect annotation value when doing a subquery with empty queryset
| Reported by: | decomorreno | Owned by: | David Wobrock |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.2 |
| Severity: | Normal | Keywords: | orm, annotate, EmptyQuerySet, empty, count |
| Cc: | David Wobrock | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
ORM seems to generate annotation/subqueries incorrectly if empty queryset is used.
Models:
class Article(models.Model):
author_name = models.CharField(max_length=100)
content = models.TextField()
is_public = models.BooleanField()
class Comment(models.Model):
article = models.ForeignKey(Article, related_name="comments", on_delete=models.CASCADE)
author_name = models.CharField(max_length=100)
content = models.TextField()
test data:
article = Article.objects.create(author_name="Jack", content="Example content", is_public=True) comment = Comment.objects.create(article=article, author_name="John", content="Example comment")
queries:
qs = Article.objects.all()
# keep one list_x uncommented to see the difference:
list_x = ["random_thing_that_is_not_equal_to_any_authors_name"] # list not empty, bug doesnt occur
#list_x = [] # if this list is empty, then the bug occurs
comment_qs = Comment.objects.filter(author_name__in=list_x)
qs = qs.annotate(
A=Coalesce(Subquery(
comment_qs.annotate(x=Count('content')).values('x')[:1], output_field=IntegerField(),
), 101) # if list_x == [], Coalesce wont work and A will be 0 instead of 101
)
# please note that above annotation doesnt make much logical sense, its just for testing purposes
qs = qs.annotate(
B=Value(99, output_field=IntegerField())
)
qs = qs.annotate(
C=F("A") + F("B") # if list_x == [], C will result in 0 sic! instead of 101 + 99 = 200
)
data = {
"A": qs.last().A,
"B": qs.last().B,
"C": qs.last().C,
}
print(data)
print(format_sql(qs.query))
console output for list_x=["random_thing_that_is_not_equal_to_any_authors_name"] (expected, correct):
{'A': 101, 'B': 99, 'C': 200}
SELECT "articles_article"."id",
"articles_article"."author_name",
"articles_article"."content",
"articles_article"."is_public",
COALESCE(
(SELECT COUNT(U0."content") AS "x"
FROM "articles_comment" U0
WHERE U0."author_name" IN (random_thing_that_is_not_equal_to_any_authors_name)
GROUP BY U0."id", U0."article_id", U0."author_name", U0."content"
LIMIT 1), 101) AS "A",
99 AS "B",
(COALESCE(
(SELECT COUNT(U0."content") AS "x"
FROM "articles_comment" U0
WHERE U0."author_name" IN (random_thing_that_is_not_equal_to_any_authors_name)
GROUP BY U0."id", U0."article_id", U0."author_name", U0."content"
LIMIT 1), 101) + 99) AS "C"
FROM "articles_article"
console output for list_x=[] (incorrect):
{'A': 0, 'B': 99, 'C': 0}
SELECT "articles_article"."id",
"articles_article"."author_name",
"articles_article"."content",
"articles_article"."is_public",
0 AS "A",
99 AS "B",
0 AS "C"
FROM "articles_article"
Background story: Above queries are made up (simplified), but based on some parts of logic that I had in my code. list_x was generated dynamically, and it was very hard to detect what is causing unexpected results. This behavior is very strange, I believe its a bug and needs to be fixed, because it is totally unintuitive that:
SomeModel.objects.filter(x__in=["something_that_causes_this_qs_lenth_to_be_0"])
and
SomeModel.objects.filter(x__in=[])
may yield different results when used in queries later, even though results of this querysets are logically equivalent
I will attach a minimal repro project (with code from above)
Attachments (1)
Change History (11)
by , 4 years ago
| Attachment: | bug_repro.zip added |
|---|
comment:1 by , 4 years ago
| Summary: | Incorrect annotation value when doing a subquery on empty queryset → Incorrect annotation value when doing a subquery with empty queryset |
|---|
comment:2 by , 4 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:3 by , 4 years ago
| Cc: | added |
|---|---|
| Has patch: | set |
| Owner: | changed from to |
| Status: | new → assigned |
Hi!
Thanks for the hints Simon, I tried a first patch where we catch empty values here https://github.com/django/django/pull/14770
Should we expand the solution a bit further and rename empty_aggregate_value as you suggested, and use it in the SQLCompiler too?
comment:5 by , 4 years ago
Hi, the patch is waiting on some review I believe. So once a maintainer has a bit of time available, we'll be able to move forward :)
comment:6 by , 4 years ago
| Needs tests: | set |
|---|---|
| Patch needs improvement: | set |
comment:7 by , 4 years ago
| Needs tests: | unset |
|---|---|
| Patch needs improvement: | unset |
comment:8 by , 4 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
The
0assignment on empty result set comes from this line. I assume we could adjust the logic to rely ongetattr(col, 'empty_aggregate_value', NotImplemented)and fallback to'0'if it's missing.Makes me wonder if we'd want to rename
empty_aggregate_valuetoempty_result_set_valueinstead since it would not entirely be bound to aggregation anymore.e.g. the following should exhibit the same behavior
It also seems weird that we default to
0as opposed toNULLwhich would be a more correct value for a non-coalesced annotation.Alternatively we could adjust
Coalesce.as_sqlto catchEmptyResultSetwhen it's compiling its source expressions but that's more involved as most of the logic for that currently lives inFunc.as_sql. We could also use both of these approaches.