Opened 11 years ago
Closed 11 years ago
#21524 closed Uncategorized (wontfix)
Annotation and .extra with group function breaks at database level
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.4 |
Severity: | Normal | Keywords: | oracle |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This is somewhat related to #14657 and #13274
For some awkward reason I have the following queryset:
qs = MyModel.objects.all() qs = qs.annotate( total=Sum('statstable__total'), closed=Sum('statstable__closed'), opened=Sum('statstable__opened'), #opened_percent=Avg('statstable__opened_percent'), # The average of the percents is not the correct ratio #closed_percent=Avg('statstable__closed_percent'), # value between closed and total ).filter(total__gte=min_total) qs = qs.extra( select={ 'closed_percent': 'SUM(statstable.closed)/SUM(statstable.total)*100', 'opened_percent': 'SUM(statstable.opend)/SUM(statstable.total)*100', } )
When you try to do this on Oracle, it will put 'closed_percent' and 'opened_percent' on GROUP BY list, which will raise a DatabaseError: ORA-00934: group function is not allowed here
.
Since we are not supposed to mess with ".group_by" without Django's inner code, there is no way to solve this problem.
The quick and dirty solution for me right now was to edit django/db/models/sql/compiler.py
at line 572 and add the following code:
if isinstance(col, (list, tuple)): result.append('%s.%s' % (qn(col[0]), qn(col[1]))) elif hasattr(col, 'as_sql'): result.append(col.as_sql(qn, self.connection)) else: if 'avg(' in col.lower() or 'sum(' in col.lower(): # This is what I added, just checking if the extra select/column has the # group functions that I need, if it does, skip it, since you # can't GROUP BY group functions continue result.append('(%s)' % str(col))
I'm using Django 1.4 on this project (with no ETA to upgrade, unfortunally), and I see that on HEAD it has changed a bit, but remains "Unconditionally adding the extra_select items", which is the root of my problems: https://github.com/django/django/blob/master/django/db/models/sql/compiler.py#L575
Change History (5)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
You're clearly pushing the boundaries of what the ORM can do... Writing the query in plain SQL (look for raw SQL in the documentation) could be easier at this point — and a short term solution.
I'll let someone who knows the aggregation code better than me make a decision on this ticket.
comment:3 by , 11 years ago
The solution suggested is indeed dirty -- and bogus (I can think up valid queries it breaks, the most trivial of which adds constant 'sum()' in the extra columns).
In a similar case, I did something like
def annotate_percents(qset): class AnnotatedQuerySet(qset.__class__): def iterator(self): for obj in super(AnnotatedQuerySet, self).iterator(): obj.closed_percent = obj.closed/obj.total * 100 obj.opened_percent = obj.opened/obj.total * 100 yield obj qset.__class__ = AnnotatedQuerySet return qset
Now, in the case you have with AVG you may really need to do things in the database; you can still do this efficiently by judicious use of prefetch_related
.
It can allow you to do things that look like you add a new related query per object returned, but really only cost you one more database access per the whole queryset.
I hope this helps; note that even if something like your use-case ends up supported, it will probably not be in Django 1.4.x.
comment:4 by , 11 years ago
@Shai, thanks for your help, but this query that I do is used in 2~3 tables with almost same columns (they are statistical tables), and I need to group data by ranges of dates and then order the result by the highest and lowest percentage, hence my need for all the math being done by the database. I don't think getting all the data and then math and sort on the python would be a good solution for now.
@aaugustin, I do indeed am pushing what the ORM can do, and will totally accept this ticket be closed as "invalid" or "wontfix", even if I'm just using two queryset methods: .extra
and .annotate
.
comment:5 by , 11 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I don't think we can support this. It is impossible to detect which parts of the SQL in extra select needs to be in group by. Trying to parse the SQL doesn't work. Consider for example:
account_balance - SUM(transaction_amount)
Here account_balance must be in group by, but transaction_amount not.
The .extra() as it is currently implemented in Django just doesn't work for your use case. I hope we will have something better later on, but for now I'll have to just wontfix this issue.
This "validation" also must happens at line 114: