Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#34145 closed Bug (needsinfo)

Explicit GROUPing by aggregate is not supported

Reported by: Karolis Ryselis Owned by: nobody
Component: Database layer (models, ORM) Version: 4.1
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider the following models:

class PackingTag(models.Model):
    title = models.CharField(max_length=64)


class Packing(models.Model):
    tags = models.ManyToManyField(to='PackingTag')


class PackingItem(models.Model):
    packing = models.ForeignKey(to='Packing', on_delete=models.CASCADE)
    amount = models.DecimalField(max_digits=12, decimal_places=4)

Consider a queryset like this:

PackingItem.objects.annotate(tags=Min('packing__tags__title')).values('tags').annotate(total_amount=Sum('amount'))

Currently on both 4.1 and 3.2 it yields this query using SQLite backend (MySQL backend on 3.2 outputs a very similar query as well):

SELECT MIN("packings_packingtag"."title") AS "tags", CAST(SUM("packings_packingitem"."amount") AS NUMERIC) AS "total_amount"
FROM "packings_packingitem"
         INNER JOIN "packings_packing" ON ("packings_packingitem"."packing_id" = "packings_packing"."id")
         LEFT OUTER JOIN "packings_packing_tags" ON ("packings_packing"."id" = "packings_packing_tags"."packing_id")
         LEFT OUTER JOIN "packings_packingtag" ON ("packings_packing_tags"."packingtag_id" = "packings_packingtag"."id")

I would expect this queryset to either group by tags (this example may not make much sense, my use case involves GroupConcat aggregate from django_mysql package, however, this is the same for all aggregates) because passing a regular field expression constructs a group by in the SQL, or crash with an error telling that grouping by annotations is not supported.

Change History (3)

comment:1 by Simon Charette, 2 years ago

Cc: Simon Charette added
Component: UncategorizedDatabase layer (models, ORM)
Resolution: needsinfo
Status: newclosed
Summary: It is currently not possible to group by annotated fieldsExplicit GROUPing by aggregate is not supported

I would expect this queryset to either group by tags...

I think you'll have to provide more details on what kind of SQL you are try to produce as it's not possible to GROUP BY an aggregate function and it's not clear from the example you've provided what exactly you are trying to achieve.

Are you looking to use a subquery?

PackingItem.objects.values(
    min_packing_tags_title=PackingTag.objects.values(
        "packings"
    ).filter(packings=OuterRef("packing")).values_list(
        Min("title"),
    )
).annotate(
    total_amount=Sum('amount')
)

Which should generate something along the lines of

SELECT
    (
        SELECT MIN(title)
        FROM packing_tag
        LEFT JOIN packing_tags ON (packing_tag.id = packing_tags.tag_id)
        WHERE packing_tags.packing_id = packtingitem.packing_id
        GROUP BY packing_tags.packing_id
    ) AS min_packing_tags_title,
    SUM(amount) AS sum_amount
FROM packtingitem
GROUP BY min_packing_tags_title
Version 0, edited 2 years ago by Simon Charette (next)

comment:2 by Karolis Ryselis, 2 years ago

The tricky part is when you try to add more fields to group by.
My actual Packing and PackingItem models have more fields, so what I am actually trying to achieve is something like

PackingItem.objects.filter(packing__in=my_packings, amount__gt=0).annotate(tags=GroupConcat('packing__tags', separator=',')).values('unit', 'good', 'tags').annotate(total_amount=Sum('amount'))

What happens inside is that unit and good (FK fields in my case) are added to group by while tags is silently ignored and I get an actual queryset grouped by part of the fields that I passed to values. I believe that this is a case where an exception could thrown stating that this is not supported rather than giving "best effort" result.

comment:3 by Simon Charette, 2 years ago

Let's decorticate the queryset your building step by step (I'll drop the filter as it's not significant)

PackingItem.objects.annotate(tags=GroupConcat('packing__tags'))

Generates

SELECT packingitem.*, GROUP_CONCAT(packing_tags.tag_id)
FROM packingitem
LEFT JOIN packing ON (packing.id = packingitem.packing_id)
LEFT JOIN packing_tags ON (packing_tags.packing_id = packing.id)
GROUP BY packingitem.id

Then, which is still entirely valid and we can't error out on

PackingItem.objects.annotate(
    tags=GroupConcat('packing__tags')
).values('unit', 'good', 'tags')
SELECT packingitem.unit, packingitem.good, GROUP_CONCAT(packing_tags.tag_id)
FROM packingitem
LEFT JOIN packing ON (packing.id = packingitem.packing_id)
LEFT JOIN packing_tags ON (packing_tags.packing_id = packing.id)
GROUP BY packingitem.unit, packingitem.good

Then you'd want

PackingItem.objects.annotate(
    tags=GroupConcat('packing__tags')
).values('unit', 'good', 'tags').annotate(
    total_amount=Sum('amount')
)

To either error out. The thing here is that I can't see an easy way to perform a deprecation; values() has been dual purposed for years (selecting fields and defining grouping) and changing that now will be highly backward incompatible.

If you want to give it a shot yourself and see how the test suite behaves to propose a patch here are the two locations you should look into

  1. annotate method which sets query.group_by when an annotation of an aggregate is made (source)
  2. set_values method which further adjust the group by based on the provided values (source)

My guess is that you'll want to adjust 1. to raise an exception if instance(clone.query.group_by, tuple) and any(expr.contains_aggregate for expr in clone.query) in the if alias in annotations and annotation.contains_aggregate branch.

Last edited 2 years ago by Simon Charette (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top