Opened 4 months ago

Closed 4 months ago

#32995 closed Bug (duplicate)

Incorrect GROUP BY in ORM query with Function

Reported by: Joshua "jag" Ginsberg Owned by: nobody
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


With this model (abbreviated to relevant fields):

class ListeningLineCall(models.Model):
    class CallResult(models.IntegerChoices):
        IN_PROGRESS = 1, "In progress"
        CONNECTED = 2, "Connected"
        LOST = 3, "Lost before connection"
        BLOCKED = 4, "Blocked"
        UNAVAILABLE = 5, "Unavailable"

    class Medium(models.IntegerChoices):
        VOICE = 1, "Voice"
        SMS = 2, "Text"

    caller = models.ForeignKey(ListeningLineCaller, on_delete=models.CASCADE)
    medium = models.PositiveIntegerField(choices=Medium.choices, default=Medium.VOICE)
    call_sid = models.CharField(max_length=34, unique=True, blank=True, null=True)
    result = models.PositiveIntegerField(choices=CallResult.choices, default=CallResult.IN_PROGRESS)
    called_at = models.DateTimeField(auto_now_add=True)

and the following QuerySet:

calls_by_date_medium_and_status = (
 ListeningLineCall.objects.annotate(called_at_date=TruncDate("called_at")).filter(called_at_date__gt=seven_days_ago, called_at_date__lt=today)
        .values_list("medium", "result", "called_at_date")

... the following SQL is generated:

>>> print(calls_by_date_medium_and_status.query)
SELECT "listeningline_listeninglinecall"."medium", "listeningline_listeninglinecall"."result", ("listeningline_listeninglinecall"."called_at" AT TIME ZONE 'US/Eastern')::date AS "called_at_date", COUNT("listeningline_listeninglinecall"."call_sid") AS "count" FROM "listeningline_listeninglinecall" WHERE (("listeningline_listeninglinecall"."called_at" AT TIME ZONE 'US/Eastern')::date > 2021-07-29 AND ("listeningline_listeninglinecall"."called_at" AT TIME ZONE 'US/Eastern')::date < 2021-08-05 AND NOT ("listeningline_listeninglinecall"."result" = 1)) GROUP BY "listeningline_listeninglinecall"."medium", "listeningline_listeninglinecall"."result", ("listeningline_listeninglinecall"."called_at" AT TIME ZONE 'US/Eastern')::date, "listeningline_listeninglinecall"."called_at"

I expected that the GROUP BY clause would contain the date-truncated version of called_at but I did not expect that it would include the non-truncated version of called_at, the inclusion of which is making my result-set unusable.

I do not see called_at without the function in the Query object's group_by attribute:

>>> pprint.pprint(calls_by_date_medium_and_status.query.group_by)
(Col(listeningline_listeninglinecall, listeningline.ListeningLineCall.medium),
 Col(listeningline_listeninglinecall, listeningline.ListeningLineCall.result),
 TruncDate(Col(listeningline_listeninglinecall, listeningline.ListeningLineCall.called_at)))

Change History (3)

comment:1 Changed 4 months ago by Mariusz Felisiak

Resolution: needsinfo
Status: newclosed
Type: UncategorizedBug

Thanks for the report, unfortunately I cannot reproduce your issue with provided models and querysets (at least on Django 3.1, 3.2, and the current main branch). Can you provide a sample project?

comment:2 Changed 4 months ago by Joshua "jag" Ginsberg

Resolution: needsinfo
Status: closednew

I think I found the issue/discrepancy between the model snippet and my actual one: the default ordering for queries included called_at -- which seems to have affected the GROUP BY, even though called_at wasn't in the query and the resulting SQL didn't have an ORDER BY clause.

This feels like a bug, though not the one I originally believed. Thoughts?

comment:3 Changed 4 months ago by Mariusz Felisiak

Resolution: duplicate
Status: newclosed

Thanks for extra details.

This feels like a bug, though not the one I originally believed. Thoughts?

It's a duplicate of #32546 (fixed in Django 4.0).

Note: See TracTickets for help on using tickets.
Back to Top