Opened 10 years ago
Last modified 10 years ago
#24431 closed Bug
Combining extra(), annotate() or distinct() and count() can generate invalid SQL generation — at Initial Version
Reported by: | briankrznarich | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.8beta1 |
Severity: | Release blocker | Keywords: | extra distinct aggregate count annotate |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
When applying a count() to a queryset that contains extra() values and either an annotation, or a distinct() call, get_aggregation() discards any existing select_related entries. If the extra() components rely on those joins, the SQL will be invalid.
The failures are new to 1.8, and do not appear to be present in 1.7. (I discovered the problem when upgrading fom 1.7 to beta for testing).
Example:
Permission.objects.all().select_related('content_type').extra(select={'foo':"django_content_type.app_label>'q'"}).distinct().count()
ProgrammingError: missing FROM-clause entry for table "django_content_type"
LINE 1: SELECT COUNT('*') FROM (SELECT DISTINCT (django_content_type.
Remove the distinct(), or the count(), and it's fine (this is a contrived example, I know it makes no practical sense).
Another Example with some toy models:
class Address(models.Model):
city = models.CharField(max_length=40)
class Author(models.Model):
name = models.CharField(max_length=40)
address = models.ForeignKey(Address)
class BlogPost(models.Model):
data = models.CharField(max_length=40)
author = models.ForeignKey(Author)
For each author, select a count of their total blog entries, as well as a custom boolean value based on their address:
Author.objects.all().annotate(blogpost_count=Count('blogpost')).select_related('address').extra(select={'in_maryland':"address.state='MD'"}).count()
This is fine until you append the count()
We hit this issue where TastyPie applies count() to then end of some complicated querysets, far far away from where we constructed them. In our case, we're using extra() values for an order_by that requires several comparisons.
A potential fix is in django/db/models/sql/query.py line 396 in get_aggregation().
Change
inner_query.select_related = False
To
if not self._extra: #Skip optimization, extra() fields might need these joins
inner_query.select_related = False
The cause appears to be a performance optimization (don't do joins if the results don't appear the affect the aggregate).
This has similar keywords and errors to #11329, but I think the underlying cause is different and possibly easier to address.