#25307 closed Bug (fixed)
Cannot use .annotate with conditional expressions
Reported by: | Jared Proffitt | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | josh.smeaton@…, matt@…, Simon Charette, newport.travis@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
I am trying to create a query that first annotates a Count
, then aggregates some Sum
s together using Case
objects. If I do the query without the annotate, everything is fine. If I do the query without any Case
objects in the aggregate, everything is fine. But when they are both present I get this error:
'WhereNode' object has no attribute 'get_source_expressions'
My query is something like this:
counts = models.Package.objects.annotate( used_count=Count('projects') ).distinct().aggregate( no_parent=Sum(Case( When(parent_id__isnull=True, then=1), default=0, output_field=IntegerField() )) )
I would like to eventually add the used_count
value in the aggregate, but this itself won't even run.
Change History (20)
comment:1 Changed 8 years ago by
comment:2 Changed 8 years ago by
I have the same issue, here is enough code to reproduce the issue.
# In models.py from django.db import models class DateModel(models.Model): date1 = models.DateTimeField() date2 = models.DateTimeField() # elsewhere from django.db.models import Q, Case, When from django.db.models.functions import Coalesce from datetime import date annotate = {'date': Coalesce('date1', 'date2')} dms = DateModel.objects.annotate(**annotation) aggregation = { 'count_recent': Count(Case(When(Q(date__gte=date(year=2015, month=1, day=1))))) } dms = dms.aggregate(**aggregation)
AttributeError: 'WhereNode' object has no attribute 'get_source_expressions'
comment:3 Changed 8 years ago by
Cc: | josh.smeaton@… added |
---|---|
Triage Stage: | Unreviewed → Accepted |
Version: | 1.8 → master |
comment:5 Changed 8 years ago by
Please correct me if I'm wrong, but it seems as if the issue lies with the django.db.models.expressions.When
class, in that the condition
attribute is supposed to be a django.db.models.Q
object and yet the get_source_epxressions()
and set_source_expressions()
methods are treating the condition
attribute as an django.db.models.expressions.Expression
, when it most certainly is not.
This violates the API and hence client-code of the API (such as aggregation code in this case) will fail unexpectedly.
E.g.
In the example provided by @martsberger, the count_recent
expression gets down to django.db.models.sql.query.Query.rewrite_cols()
. That method recursively travels into the nested expressions until it hits the Q
object returned by the When
expression's get_source_expressions()
and that's where it fails.
The problem is only triggered (during aggregation) by column rewriting, when this line evaluates to True
in django.db.models.sql.query.Query.get_aggregation()
:
if (isinstance(self.group_by, list) or has_limit or has_existing_annotations or self.distinct):
As I said initially, I believe the root cause is the violation of the Expression
API by the When
subclass.
comment:6 Changed 8 years ago by
Cc: | matt@… added |
---|
comment:7 Changed 8 years ago by
I'm not sure when/how, but the condition
attribute of the When
Expression
gets converted to a django.db.models.sql.where.WhereNode
, so it's actually a WhereNode
object that is having .get_source_expressions()
called on it.
Regardless, django.db.models.expressions.When
's get_source_expressions()
and set_source_expressions()
methods need to bridge the gap between Expression
s and django.utils.tree.Node
s, in order to properly re-write the column names to the aliases from the subquery.
I can't see a quick/easy way to do this, but as I'm not experienced in the ORM code, I could be wrong.
comment:8 Changed 8 years ago by
We have a couple of things that *resolve* to an expression. For example, Q() and F() objects are not expressions, but will resolve to an expression when added to a query.
If WhereNode doesn't respect the expression API, then the fix is to make WhereNodes full expressions.
comment:9 Changed 8 years ago by
To clarify, I think it's the django.db.models.expressions.When
class (which is a subclass of Expression
) that is not respecting the API, with regards to its get_source_expressions()
and set_source_expressions()
methods.
comment:10 Changed 8 years ago by
Cc: | Simon Charette added |
---|
Also just hit this when upgrading an old project using django-aggregate-if on Django 1.7 to conditional aggregation on Django 1.8.
comment:11 Changed 8 years ago by
From a short investigation it looks like the issue is actually caused by a distinct()
and aggregate()
of conditional expression combination.
Can the people affected confirm that removing the distinct()
call from their query silence the exception?
comment:12 Changed 8 years ago by
That does make it break as well, but not simply just the use of distinct. If you use EITHER .distinct()
OR .annotate()
along with an .aggregate()
that contains a conditional expression. (not sure exactly what part of the condition expression, but it must be either because of the Case()
or the When()
.
I have made a completely useless example using the polls app in the django tutorial. I copied the polls models exactly, and created this query, which will cause the error to happen:
counts = models.Choice.objects.annotate( used_count=Count('question') ).distinct().aggregate( yes=Sum(Case( When(choice_text='yes', then=F('votes')), default=0, output_field=IntegerField() )), no=Sum(Case( When(choice_text='no', then=F('votes')), default=0, output_field=IntegerField() )), )
(Obviously, then used_count
will always be 1, and its pretty pointless in this example, but you get the idea).
So if you remove the either the .annotate()
OR the .distinct()
, the query still breaks the same way. The presence of either one causes it to break. But if you remove BOTH, the query runs fine.
I do not know anything about the internals of the Query api, or I would help fix the problem. But it seems there is a problem when combining certain queryset methods like annotate
or distinct
with aggregate
s that have condition expressions.
I know it has to do with condition expressions, because doing the query like the following works just fine:
counts = models.Choice.objects.annotate( used_count=Count('question') ).distinct().aggregate( yes=Sum('votes'), )
comment:14 Changed 8 years ago by
Cc: | newport.travis@… added |
---|
As mentioned on the pull request, this issue seems to exist for any of the 4 cases here:
if (isinstance(self.group_by, list) or has_limit or has_existing_annotations or self.distinct):
I was able to write unit tests for all but the isinstance(self.group_by, list)
case, if someone is able to provide a test for that case I can add it to my changes.
Edit: Unit test updates on my pull request should now cover all cases, my pull request can be closed in favor of #6267
comment:16 Changed 7 years ago by
Patch needs improvement: | unset |
---|
comment:18 Changed 7 years ago by
Triage Stage: | Accepted → Ready for checkin |
---|
Any chance you could provide a regression test for Django's test suite? It's difficult to reproduce the issue without your models.