Opened 6 years ago
Closed 3 years ago
#30209 closed Bug (duplicate)
Union with group by don't generate correct Subquery
Reported by: | Nikolas | Owned by: | Can Sarıgöl |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | union, group by |
Cc: | Can Sarıgöl | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
reopen: https://code.djangoproject.com/ticket/30203
Simplified for example:
have simple query
qs1: QuerySet = User.objects.filter(id__gt=10) qs1 = qs1.values('is_active').annotate(id=Count('pk'))
generate:
SELECT "suser_customuser"."is_active", COUNT("suser_customuser"."id") AS "id" FROM "suser_customuser" WHERE "suser_customuser"."id" > 10 GROUP BY "suser_customuser"."is_active"
all correct, now i wont use union
qs1: QuerySet = User.objects.filter(id__gt=10) qs2: QuerySet = User.objects.filter(id__lt=10) qs1 = qs1.values('is_active').annotate(id=Count('pk'))
now get not correct SQL:
(SELECT "suser_customuser"."is_active", "suser_customuser"."id" FROM "suser_customuser" WHERE "suser_customuser"."id" > 10) UNION (SELECT "suser_customuser"."is_active", "suser_customuser"."id" FROM "suser_customuser" WHERE "suser_customuser"."id" < 10)
expected:
select * from ((SELECT "suser_customuser"."is_active", "suser_customuser"."id" FROM "suser_customuser" WHERE "suser_customuser"."id" > 10) UNION (SELECT "suser_customuser"."is_active", "suser_customuser"."id" FROM "suser_customuser" WHERE "suser_customuser"."id" < 10)) U0 group by is_active, id
i think this is bug, if not why not show any warning or exception?
or just say how rewrite query for correct use Subquery
screenshot: http://dl4.joxi.net/drive/2019/02/24/0002/0003/143363/63/2eab4a5318.jpg
Attachments (1)
Change History (15)
by , 6 years ago
Attachment: | 2eab4a5318.jpg added |
---|
comment:1 by , 6 years ago
Type: | Uncategorized → Bug |
---|
follow-ups: 3 4 comment:2 by , 6 years ago
Needs documentation: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
Version: | 1.11 → master |
I agree that either an exception should be raised and the documentation be adjusted to explicitly mention aggregation won't work or adding an aggregate function should perform a subquery pushdown.
comment:3 by , 6 years ago
Replying to Simon Charette:
I agree that either an exception should be raised and the documentation be adjusted to explicitly mention aggregation won't work or adding an aggregate function should perform a subquery pushdown.
Please can help, how can rewrite query for correct work???? My boss shut me in close time, but i stock there)
comment:4 by , 6 years ago
Replying to Simon Charette:
I agree that either an exception should be raised and the documentation be adjusted to explicitly mention aggregation won't work or adding an aggregate function should perform a subquery pushdown.
I'm planning to make this a sub query pushdown as suggested. FYI, this is my first contribution to the code base, so suggestions and feedback would be most appreciated.
follow-ups: 6 7 comment:5 by , 6 years ago
I'm planning to make this a sub query pushdown as suggested. FYI, this is my first contribution to the code base, so suggestions and feedback would be most appreciated.
TKdka, not to discourage you but this is certainly going to be a hard first contribution as the ORM doesn't have any formalized API to perform such pushdown yet. If that can provide you any guidance the most similar code I can think of is in django.db.models.sql.Query.get_aggregation
.
comment:6 by , 6 years ago
Replying to Simon Charette:
I'm planning to make this a sub query pushdown as suggested. FYI, this is my first contribution to the code base, so suggestions and feedback would be most appreciated.
TKdka, not to discourage you but this is certainly going to be a hard first contribution as the ORM doesn't have any formalized API to perform such pushdown yet. If that can provide you any guidance the most similar code I can think of is in
django.db.models.sql.Query.get_aggregation
.
Thanks for the guidance and code reference. As nobody else has stepped up on this in 3 weeks, I'll give it a shot (-:
comment:7 by , 6 years ago
Replying to Simon Charette:
I'm planning to make this a sub query pushdown as suggested. FYI, this is my first contribution to the code base, so suggestions and feedback would be most appreciated.
TKdka, not to discourage you but this is certainly going to be a hard first contribution as the ORM doesn't have any formalized API to perform such pushdown yet. If that can provide you any guidance the most similar code I can think of is in
django.db.models.sql.Query.get_aggregation
.
After looking into this, you are completely right that it's a big endeavor. The difficult part is that the query needs to be manipulated even after the annotate call, which is significantly different from the aggregate example.
Therefore, I'll submit a small patch that raises an exception when this is attempted.
I'm not sure what exception class should be used here. TypeError? Thoughts?
follow-up: 9 comment:8 by , 6 years ago
I'm not sure what exception class should be used here. TypeError? Thoughts?
Good question. I know that we're currently preventing some operations on combined querysets so I'd try to see what type of exception we're currently raising and stick to it to be coherent. I think .filter()
is disallowed for example. It's possible that we simply crash though.
I think the best place to add this check would be in query.Queryset.annotate
just before set_group_by
is called if an annotation contains aggregates.
comment:9 by , 6 years ago
Replying to Simon Charette:
I'm not sure what exception class should be used here. TypeError? Thoughts?
Good question. I know that we're currently preventing some operations on combined querysets so I'd try to see what type of exception we're currently raising and stick to it to be coherent. I think
.filter()
is disallowed for example. It's possible that we simply crash though.
I think the best place to add this check would be in
query.Queryset.annotate
just beforeset_group_by
is called if an annotation contains aggregates.
On where the exception should be raised...
I agree with general placement, but I'm thinking the exception should be raised as soon as an aggregate function as noticed, so just after
line 1048 is called The other part of the conditional,
clone.query.group_by = True
is problematic as well. This appears to create a group by over all the fields, which is not supported. Thoughts?
On the type of exception that should be raised...
Using union+filter does not raise an exception (nor does it work). I think that the correct exception should be something along the lines of a SQLNotSupported exception. The exception should state what SQL feature combination is not supported and then suggest using a raw query or rewriting the query. Is there already something like this?
On related bugs...
I found two similar bugs while exploring this feature (one of which is mentioned above). I'll create tickets for them as well, but wanted to mention them here too.
As mentioned above, there is a bug with filter + union as well. Performing a filter with a union neither raises an exception nor adds the filter.
qs1 = Book.objects.filter(pk__lte=10) qs2 = Book.objects.filter(pk__gt=10) qs3 = qs1.union(qs2) qs4 = qs3.filter(publisher_id=1) print (qs4.query)
SELECT "aggregation_book"."id", "aggregation_book"."isbn", "aggregation_book"."name", "aggregation_book"."pages", "aggregation_book"."rating", "aggregation_book"."price", "aggregation_book"."contact_id", "aggregation_book"."publisher_id", "aggregation_book"."pubdate" FROM "aggregation_book" WHERE "aggregation_book"."id" <= 10 UNION SELECT "aggregation_book"."id", "aggregation_book"."isbn", "aggregation_book"."name", "aggregation_book"."pages", "aggregation_book"."rating", "aggregation_book"."price", "aggregation_book"."contact_id", "aggregation_book"."publisher_id", "aggregation_book"."pubdate" FROM "aggregation_book" WHERE "aggregation_book"."id" > 10
There's a similar bug when using filter + non-aggregating annotate functions. The annotated field is not added, and an exception is not raised.
qs1 = Book.objects.filter(pk__lte=10) qs2 = Book.objects.filter(pk__gt=10) qs3 = qs1.union(qs2) qs4 = qs3.annotate(annotateField=F('id')+1) print (qs4.query)
SELECT "aggregation_book"."id", "aggregation_book"."isbn", "aggregation_book"."name", "aggregation_book"."pages", "aggregation_book"."rating", "aggregation_book"."price", "aggregation_book"."contact_id", "aggregation_book"."publisher_id", "aggregation_book"."pubdate" FROM "aggregation_book" WHERE "aggregation_book"."id" <= 10 UNION SELECT "aggregation_book"."id", "aggregation_book"."isbn", "aggregation_book"."name", "aggregation_book"."pages", "aggregation_book"."rating", "aggregation_book"."price", "aggregation_book"."contact_id", "aggregation_book"."publisher_id", "aggregation_book"."pubdate" FROM "aggregation_book" WHERE "aggregation_book"."id" > 10
comment:10 by , 6 years ago
Cc: | added |
---|---|
Has patch: | set |
comment:11 by , 6 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
follow-up: 13 comment:12 by , 4 years ago
It seems the linked PR was already closed, so it seems implementing this properly has been abandoned?
As for showing a warning when a union is annotated with an aggregation, it seems that https://github.com/django/django/pull/11591 forbids *all* annotations on unions, so this case seems to be convered by that as well. Not sure what that means for the status of this ticket, though.
comment:13 by , 4 years ago
Replying to Matthijs Kooijman:
It seems the linked PR was already closed, so it seems implementing this properly has been abandoned?
As for showing a warning when a union is annotated with an aggregation, it seems that https://github.com/django/django/pull/11591 forbids *all* annotations on unions, so this case seems to be convered by that as well. Not sure what that means for the status of this ticket, though.
I opened a new PR but it's still WIP. There is a tricky parentheses issue on MySQL/MariaDB. I hope I will find time to finish it.
comment:14 by , 3 years ago
Resolution: | → duplicate |
---|---|
Status: | assigned → closed |
A proper exception when calling annotate()
after union()
, intersection()
, and difference()
was added in 1853383969a4c53bbeba998757c30410bd3df4bb.
Closing as a duplicate of #28519.
screenshot