Opened 4 years ago

Closed 4 years ago

#31687 closed Bug (invalid)

Subquery loses ordering when passed to Func even if Func needs ordering.

Reported by: john-parton Owned by: nobody
Component: Database layer (models, ORM) Version: 3.0
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Description of Problem:

When passing a Subquery object to a Func object with an order by clause specified, the order by clause is often dropped, even when it might be significant.

Here's a permalink to the relevant code on master:

https://github.com/django/django/blob/678c8dfee458cda77fce0d1c127f1939dc134584/django/db/models/sql/query.py#L1029-L1034

Here's the snippet from the resolve_expression method of django.db.models.sql.query.Query

        # It's safe to drop ordering if the queryset isn't using slicing,
        # distinct(*fields) or select_for_update().
        if (self.low_mark == 0 and self.high_mark is None and
                not self.distinct_fields and
                not self.select_for_update):
            clone.clear_ordering(True)

Here's some example code:

class Author(models.Model):
    name = models.CharField(max_length=50)
    alias = models.CharField(max_length=50, null=True, blank=True)
    age = models.PositiveSmallIntegerField(default=30)


class Article(models.Model):
    authors = models.ManyToManyField(Author, related_name='articles')
    title = models.CharField(max_length=50)
    
    
print(Article.objects.annotate(
    authors_by_age=Func(
        Subquery(
            Author.objects.filter(articles__id=OuterRef('id')).order_by('age').values('name')
        ), function='ARRAY'
    )
).query)

SELECT "example_article"."id", "example_article"."title", ARRAY((SELECT U0."name" FROM "example_author" U0 INNER JOIN "example_article_authors" U1 ON (U0."id" = U1."author_id") WHERE U1."article_id" = "example_article"."id")) AS "authors_by_age" FROM "example_article"

ARRAY would be a Postgres-specific function in this case.

I couldn't find this behavior documented anywhere. I only figured it out by digging through the code.

Possible Solutions:

Right now, I either have to supply a distinct field as a hack, monkey patch the query object, or write raw sql. Obviously none of those are great.

For my use-case, I would prefer for Django to just trust that I know when ordering is necessary and to leave it alone.

Change History (1)

comment:1 by Mariusz Felisiak, 4 years ago

Resolution: invalid
Status: newclosed
Summary: Subquery loses ordering when passed to Func even if Func needs orderingSubquery loses ordering when passed to Func even if Func needs ordering.

IMO you should use ArrayAgg (ARRAY_AGG) instead of ARRAY which is perfect for your use case and supports ordering. For example:

from django.contrib.postgres.aggregates import ArrayAgg

Article.objects.annotate(
    authors_by_age=ArrayAgg('authors__name', ordering=F('authors__age')),
)

Database functions that allow subqueries and depend on ordering in most of cases support separate ORDER BY clauses.

Note: See TracTickets for help on using tickets.
Back to Top