Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#33018 closed Bug (fixed)

Incorrect annotation value when doing a subquery with empty queryset

Reported by: decomorreno Owned by: David Wobrock
Component: Database layer (models, ORM) Version: 3.2
Severity: Normal Keywords: orm, annotate, EmptyQuerySet, empty, count
Cc: David Wobrock Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

ORM seems to generate annotation/subqueries incorrectly if empty queryset is used.

Models:

class Article(models.Model):
    author_name = models.CharField(max_length=100)
    content = models.TextField()
    is_public = models.BooleanField()


class Comment(models.Model):
    article = models.ForeignKey(Article, related_name="comments", on_delete=models.CASCADE)
    author_name = models.CharField(max_length=100)
    content = models.TextField()

test data:

article = Article.objects.create(author_name="Jack", content="Example content", is_public=True)
comment = Comment.objects.create(article=article, author_name="John", content="Example comment")

queries:

qs = Article.objects.all()

# keep one list_x uncommented to see the difference:
list_x = ["random_thing_that_is_not_equal_to_any_authors_name"] # list not empty, bug doesnt occur
#list_x = [] # if this list is empty, then the bug occurs

comment_qs = Comment.objects.filter(author_name__in=list_x)

qs = qs.annotate(
    A=Coalesce(Subquery(
        comment_qs.annotate(x=Count('content')).values('x')[:1], output_field=IntegerField(),
    ), 101) # if list_x == [], Coalesce wont work and A will be 0 instead of 101
)
# please note that above annotation doesnt make much logical sense, its just for testing purposes

qs = qs.annotate(
    B=Value(99, output_field=IntegerField())
)

qs = qs.annotate(
    C=F("A") + F("B") # if list_x == [], C will result in 0 sic! instead of 101 + 99 = 200
)

data = {
    "A": qs.last().A,
    "B": qs.last().B,
    "C": qs.last().C,
}

print(data)
print(format_sql(qs.query))

console output for list_x=["random_thing_that_is_not_equal_to_any_authors_name"] (expected, correct):

{'A': 101, 'B': 99, 'C': 200}
SELECT "articles_article"."id",
       "articles_article"."author_name",
       "articles_article"."content",
       "articles_article"."is_public",
       COALESCE(
                  (SELECT COUNT(U0."content") AS "x"
                   FROM "articles_comment" U0
                   WHERE U0."author_name" IN (random_thing_that_is_not_equal_to_any_authors_name)
                   GROUP BY U0."id", U0."article_id", U0."author_name", U0."content"
                   LIMIT 1), 101) AS "A",
       99 AS "B",
       (COALESCE(
                   (SELECT COUNT(U0."content") AS "x"
                    FROM "articles_comment" U0
                    WHERE U0."author_name" IN (random_thing_that_is_not_equal_to_any_authors_name)
                    GROUP BY U0."id", U0."article_id", U0."author_name", U0."content"
                    LIMIT 1), 101) + 99) AS "C"
FROM "articles_article"

console output for list_x=[] (incorrect):

{'A': 0, 'B': 99, 'C': 0}
SELECT "articles_article"."id",
       "articles_article"."author_name",
       "articles_article"."content",
       "articles_article"."is_public",
       0 AS "A",
       99 AS "B",
       0 AS "C"
FROM "articles_article"

Background story: Above queries are made up (simplified), but based on some parts of logic that I had in my code. list_x was generated dynamically, and it was very hard to detect what is causing unexpected results. This behavior is very strange, I believe its a bug and needs to be fixed, because it is totally unintuitive that:

SomeModel.objects.filter(x__in=["something_that_causes_this_qs_lenth_to_be_0"])

and

SomeModel.objects.filter(x__in=[])

may yield different results when used in queries later, even though results of this querysets are logically equivalent

I will attach a minimal repro project (with code from above)

Attachments (1)

bug_repro.zip (20.4 KB ) - added by decomorreno 3 years ago.

Download all attachments as: .zip

Change History (11)

by decomorreno, 3 years ago

Attachment: bug_repro.zip added

comment:1 by decomorreno, 3 years ago

Summary: Incorrect annotation value when doing a subquery on empty querysetIncorrect annotation value when doing a subquery with empty queryset

comment:2 by Simon Charette, 3 years ago

Triage Stage: UnreviewedAccepted

The 0 assignment on empty result set comes from this line. I assume we could adjust the logic to rely on getattr(col, 'empty_aggregate_value', NotImplemented) and fallback to '0' if it's missing.

Makes me wonder if we'd want to rename empty_aggregate_value to empty_result_set_value instead since it would not entirely be bound to aggregation anymore.

e.g. the following should exhibit the same behavior

Author.objects.annotate(annotation=Coalesce(Author.objects.empty(), 42))

It also seems weird that we default to 0 as opposed to NULL which would be a more correct value for a non-coalesced annotation.

Alternatively we could adjust Coalesce.as_sql to catch EmptyResultSet when it's compiling its source expressions but that's more involved as most of the logic for that currently lives in Func.as_sql. We could also use both of these approaches.

Last edited 3 years ago by Simon Charette (previous) (diff)

comment:3 by David Wobrock, 3 years ago

Cc: David Wobrock added
Has patch: set
Owner: changed from nobody to David Wobrock
Status: newassigned

Hi!

Thanks for the hints Simon, I tried a first patch where we catch empty values here https://github.com/django/django/pull/14770

Should we expand the solution a bit further and rename empty_aggregate_value as you suggested, and use it in the SQLCompiler too?

comment:4 by decomorreno, 3 years ago

Hi, thanks for your PR, do we have any updates on this?

comment:5 by David Wobrock, 3 years ago

Hi, the patch is waiting on some review I believe. So once a maintainer has a bit of time available, we'll be able to move forward :)

comment:6 by Mariusz Felisiak, 3 years ago

Needs tests: set
Patch needs improvement: set

comment:7 by David Wobrock, 3 years ago

Needs tests: unset
Patch needs improvement: unset

comment:8 by Mariusz Felisiak, 3 years ago

Triage Stage: AcceptedReady for checkin

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In dd1fa3a3:

Fixed #33018 -- Fixed annotations with empty queryset.

Thanks Simon Charette for the review and implementation idea.

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In b2a0978:

[4.0.x] Fixed #33018 -- Fixed annotations with empty queryset.

Thanks Simon Charette for the review and implementation idea.

Backport of dd1fa3a31b4680c0d3712e6ae122b878138580c7 from main

Note: See TracTickets for help on using tickets.
Back to Top