Opened 9 years ago

Last modified 9 years ago

#24431 closed Bug

Combining extra(), annotate() or distinct() and count() can generate invalid SQL generation — at Version 1

Reported by: briankrznarich Owned by: nobody
Component: Database layer (models, ORM) Version: 1.8beta1
Severity: Release blocker Keywords: extra distinct aggregate count annotate
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description (last modified by briankrznarich)

When applying a count() to a queryset that contains extra() values and either an annotation, or a distinct() call, get_aggregation() discards any existing select_related entries. If the extra() components rely on those joins, the SQL will be invalid.

The failures are new to 1.8, and do not appear to be present in 1.7. (I discovered the problem when upgrading fom 1.7 to beta for testing).

Example:
Permission.objects.all().select_related('content_type').extra(select={'foo':"django_content_type.app_label>'q'"}).distinct().count()

ProgrammingError: missing FROM-clause entry for table "django_content_type"
LINE 1: SELECT COUNT('*') FROM (SELECT DISTINCT (django_content_type.

Remove the distinct(), or the count(), and it's fine (this is a contrived example, I know it makes no practical sense).

Another Example with some toy models:

class Address(models.Model):
    city = models.CharField(max_length=40)
class Author(models.Model):
    name = models.CharField(max_length=40)
    address = models.ForeignKey(Address)
class BlogPost(models.Model):
    data = models.CharField(max_length=40)
    author = models.ForeignKey(Author)

For each author, select a count of their total blog entries, as well as a custom boolean value based on their address:

Author.objects.all().annotate(blogpost_count=Count('blogpost')).select_related('address').extra(select={'in_maryland':"address.state='MD'"}).count()

This is fine until you append the count()

We hit this issue where TastyPie applies count() to then end of some complicated querysets, far far away from where we constructed them. In our case, we're using extra() values for an order_by that requires several comparisons.

A potential fix is in django/db/models/sql/query.py line 396 in get_aggregation().
Change

inner_query.select_related = False

To

if not self._extra: #Skip optimization, extra() fields might need these joins
    inner_query.select_related = False

The cause appears to be a performance optimization (don't do joins if the results don't appear the affect the aggregate).

This has similar keywords and errors to #11329, but I think the underlying cause is different and possibly easier to address.

Change History (1)

comment:1 by briankrznarich, 9 years ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top