Opened 3 years ago
Closed 3 years ago
#32658 closed Uncategorized (invalid)
Subquery ignores query ordering
Reported by: | Bálint Balina | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | Subquery, annotations, ordering |
Cc: | Simon Charette, Hannes Ljungberg | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I've started migrating a huge project from Django==2.2 to ==3.2, and many tests started to fail, because of the problem below:
When using a query with ordering in an annotated Subquery, the query's ordering is ignored (or removed).
In a bare new Django project, I have successfully reproduced the issue:
from django.db import models class MyModel(models.Model): parent = models.ForeignKey(to='self', on_delete=models.CASCADE, null=True) name = models.TextField()
MyModel.objects.create(name='parent') MyModel.objects.create(name='b', parent=parent) MyModel.objects.create(name='a', parent=parent)
children_query = MyModel.objects.filter(parent=OuterRef('pk')).annotate(child_names=Func(F('name'), function='group_concat')).values('child_names').order_by('name') parent_query = MyModel.objects.annotate(child_names=Subquery(children_query)) parent_query.query.__str__()
Outputs:
SELECT "app_mymodel"."id", "app_mymodel"."parent_id", "app_mymodel"."name", (SELECT group_concat(U0."name") AS "child_names" FROM "app_mymodel" U0 WHERE U0."parent_id" = "app_mymodel"."id") AS "child_names" FROM "app_mymodel"
The only thing I do, is pip install Django==2.2, run the same thing in a shell and:
Outputs:
SELECT "app_mymodel"."id", "app_mymodel"."parent_id", "app_mymodel"."name", (SELECT group_concat(U0."name") AS "child_names" FROM "app_mymodel" U0 WHERE U0."parent_id" = ("app_mymodel"."id") ORDER BY U0."name" ASC) AS "child_names" FROM "app_mymodel"
Notice the ORDER BY U0."name" ASC is missing in Django 3.2, though I have explicitly set the ordering on the query. This happens both on SQLite and PostgreSQL backends.
Change History (6)
comment:1 by , 3 years ago
comment:2 by , 3 years ago
Cc: | added |
---|---|
Resolution: | → invalid |
Status: | new → closed |
IIRC the ordering is cleared for optimization purposes when LIMIT
, DISTINCT
and aggregation are not used.
I think that ordering wouldn't be disabled if your function was properly marked as being an aggregate (as it's one)
children_query = MyModel.objects.filter( parent=OuterRef('pk'), ).annotate( child_names=Aggregate('name', function='group_concat'), ).values('child_names').order_by('name')
If you use Func
instead of Aggregate
then the ORM has to way to know your opaque expression is performing aggregation and that ordering might be relevant.
If using Aggregate
instead of Func
still results in cleared ordering then please re-open this ticket as it shouldn't be the case otherwise this a clearly a misuse of the ORM and likely only work in your case because MySQL is very lax about GROUP BY
requirements.
comment:3 by , 3 years ago
Resolution: | invalid |
---|---|
Status: | closed → new |
Okay, that sounds sane, but the solution you provided does not work:
children_query = MyModel.objects.filter( parent=OuterRef('pk'), ).annotate( child_names=Aggregate('name', function='group_concat'), ).values('child_names').order_by('name') queryset = MyModel.objects.annotate( other_names=Subquery(children_query) ).only('id') print(str(queryset.query))
SELECT "app_mymodel"."id", (SELECT ARRAY_AGG(U0."name" ) AS "child_names" FROM "app_mymodel" U0 WHERE U0."parent_id" = "app_mymodel"."id" GROUP BY U0."id") AS "other_names" FROM "app_mymodel"
comment:4 by , 3 years ago
I this case I think we should either make the logic in charge of clearing ordering consider contains_aggregate
as well.
An alternative which might more correct would be to move the ordering clearing logic to In(Lookup).process_rhs
after all that's what we do for clearing the select clause. The devils lurks in the details there though as the solution needs to handle Subquery
, sql.Query
, and QuerySet
(all resolves to sql.Query
though) right-hand-sides and the ordering can only be cleared under some particular conditions.
Maybe adding a sql.Query.as_haystack
(please suggest a better name) method that takes care of all this logic and duck-typing against it in In(Lookup).process_rhs
would provide a decoupled enough API?
comment:5 by , 3 years ago
Cc: | added |
---|
comment:6 by , 3 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
ORDER BY
clause should be included in the GROUP_CONCAT()
not in the main query, e.g. GROUP_CONCAT(U0."name" ORDER BY U0."name" ASC) AS "child_names")
. You can check OrderableAggMixin to see how it's implemented for ArrayAgg
, StringAgg
, and JSONBAgg
on PostgreSQL.
After digging deeper, I have found that this is the duplicate of https://code.djangoproject.com/ticket/31687 , which was closed without any real explanation about why this working behaviour changed. I am aware of the solution provided, but it would require quite some code changes, which is fine but seems unreasonable. At very least I would like to know why it is necessary to remove the ordering from the subquery.