Opened 21 months ago

Closed 21 months ago

Last modified 21 months ago

#31377 closed Bug (fixed)

Django 3.0: "GROUP BY" clauses error with tricky field annotation

Reported by: Holovashchenko Vadym Owned by: Hasan Ramezani
Component: Database layer (models, ORM) Version: 3.0
Severity: Release blocker Keywords: group by, postgres, annotate
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Let's pretend that we have next model structure with next model's relations:

class A(models.Model):
    bs = models.ManyToManyField('B',
                                related_name="a",
                                through="AB")


class B(models.Model):
    pass


class AB(models.Model):
    a = models.ForeignKey(A, on_delete=models.CASCADE, related_name="ab_a")
    b = models.ForeignKey(B, on_delete=models.CASCADE, related_name="ab_b")

    status = models.IntegerField()


class C(models.Model):
    a = models.ForeignKey(
        A,
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name="c",
        verbose_name=_("a")
    )
    status = models.IntegerField()

Let's try to evaluate next query

ab_query = AB.objects.filter(a=OuterRef("pk"), b=1)
filter_conditions = Q(pk=1) | Q(ab_a__b=1)

query = A.objects.\
    filter(filter_conditions).\
    annotate(
        status=Subquery(ab_query.values("status")),
        c_count=Count("c"),
)

answer = query.values("status").annotate(total_count=Count("status"))
print(answer.query)
print(answer)

On Django 3.0.4 we have an error

django.db.utils.ProgrammingError: column reference "status" is ambiguous

and query is next:

SELECT (SELECT U0."status" FROM "test_app_ab" U0 WHERE (U0."a_id" = "test_app_a"."id" AND U0."b_id" = 1)) AS "status", COUNT((SELECT U0."status" FROM "test_app_ab" U0 WHERE (U0."a_id" = "test_app_a"."id" AND U0."b_id" = 1))) AS "total_count" FROM "test_app_a" LEFT OUTER JOIN "test_app_ab" ON ("test_app_a"."id" = "test_app_ab"."a_id") LEFT OUTER JOIN "test_app_c" ON ("test_app_a"."id" = "test_app_c"."a_id") WHERE ("test_app_a"."id" = 1 OR "test_app_ab"."b_id" = 1) GROUP BY "status"

However, Django 2.2.11 processed this query properly with the next query:

SELECT (SELECT U0."status" FROM "test_app_ab" U0 WHERE (U0."a_id" = ("test_app_a"."id") AND U0."b_id" = 1)) AS "status", COUNT((SELECT U0."status" FROM "test_app_ab" U0 WHERE (U0."a_id" = ("test_app_a"."id") AND U0."b_id" = 1))) AS "total_count" FROM "test_app_a" LEFT OUTER JOIN "test_app_ab" ON ("test_app_a"."id" = "test_app_ab"."a_id") LEFT OUTER JOIN "test_app_c" ON ("test_app_a"."id" = "test_app_c"."a_id") WHERE ("test_app_a"."id" = 1 OR "test_app_ab"."b_id" = 1) GROUP BY (SELECT U0."status" FROM "test_app_ab" U0 WHERE (U0."a_id" = ("test_app_a"."id") AND U0."b_id" = 1))

so, the difference in "GROUP BY" clauses
(as DB provider uses "django.db.backends.postgresql", postgresql 11)

Change History (8)

comment:1 Changed 21 months ago by Simon Charette

Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

This is due to a collision of AB.status and the status annotation.

The easiest way to solve this issue is to disable group by alias when a collision is detected with involved table columns. This can be easily worked around by avoiding to use an annotation name that conflicts with involved table column names.

comment:2 Changed 21 months ago by Hasan Ramezani

Owner: changed from nobody to Hasan Ramezani
Status: newassigned

@Simon I think we have the check for collision in annotation alias and model fields .
How can we find the involved tables columns?
Thanks

comment:3 Changed 21 months ago by Simon Charette

Hasan this is another kind of collision, these fields are not selected and part of join tables so they won't be part of names.

We can't change the behavior at the annotate() level as it would be backward incompatible and require extra checks every time an additional table is joined.

What needs to be adjust is sql.Query.set_group_by to set alias=None if alias is not None and alias in {... set of all column names of tables in alias_map ...} before calling annotation.get_group_by_cols

https://github.com/django/django/blob/fc0fa72ff4cdbf5861a366e31cb8bbacd44da22d/django/db/models/sql/query.py#L1943-L1945

Last edited 21 months ago by Simon Charette (previous) (diff)

comment:4 Changed 21 months ago by Hasan Ramezani

Has patch: set

Simon, I've created a patch to fix this problem.
I think the collision, in this case, is between C.status and the status annotation. if we remove c_count=Count("c") from the query we won't have the error. So, I created a Book1 test model in my patch to simulate the query.

PR

Last edited 21 months ago by Mariusz Felisiak (previous) (diff)

comment:5 Changed 21 months ago by Simon Charette

Related to #28078 and #28072.

comment:6 Changed 21 months ago by Mariusz Felisiak

Triage Stage: AcceptedReady for checkin

comment:7 Changed 21 months ago by Mariusz Felisiak <felisiak.mariusz@…>

Resolution: fixed
Status: assignedclosed

In 10866a10:

Fixed #31377 -- Disabled grouping by aliases on QuerySet.values()/values_list() when they collide with field names.

Regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.

Thanks Holovashchenko Vadym for the report.

comment:8 Changed 21 months ago by Mariusz Felisiak <felisiak.mariusz@…>

In 72652bcb:

[3.0.x] Fixed #31377 -- Disabled grouping by aliases on QuerySet.values()/values_list() when they collide with field names.

Regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.

Thanks Holovashchenko Vadym for the report.

Backport of 10866a10fe9f0ad3ffdf6f93823aaf4716e6f27c from master

Note: See TracTickets for help on using tickets.
Back to Top