Opened 3 years ago

Closed 8 months ago

Last modified 8 months ago

#29095 closed Bug (fixed)

QuerySet.count() with Sum window function generates invalid SQL on PostgreSQL

Reported by: Shadi Akiki Owned by: nobody
Component: Database layer (models, ORM) Version: 3.0
Severity: Normal Keywords: window
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Using .count() on the queryset from Sum in Django 2.0 Window function yields the error

    django.db.utils.ProgrammingError: column "cms2_balances_base_v1.initial_balance" must appear in the GROUP BY clause or be used in an aggregate function
    LINE 1: ...T "cms2_balances_base_v1"."pub_date" AS Col1, SUM("cms2_bala...

Here is the exact django code

    from apps.cms2_balances.models import *
    from django.db.models import Avg, F, Max, Min, Window, Sum, Func
    
    t1 = obligor.basev1_set.annotate(
        cumulative_obligor=Window(
            expression=Sum(F('initial_balance')),
            partition_by=[F('currency'), F('obligor')],
            order_by=['pub_date', 'content_type_id', 'object_id']
        )
    )
    t1.all() # <<< no error
    t1.count() # <<< yields error

The query of this queryset works when I run it in postgres directly. Here it is

    SELECT
        "cms2_balances_base_v1"."obligor_id",
        "cms2_balances_base_v1"."originator_id",
        "cms2_balances_base_v1"."currency_id",
        "cms2_balances_base_v1"."content_type_id",
        "cms2_balances_base_v1"."object_id",
        "cms2_balances_base_v1"."pub_date",
        "cms2_balances_base_v1"."initial_balance",
        "cms2_balances_base_v1"."principal_balance",
        "cms2_balances_base_v1"."interest_balance",
        SUM("cms2_balances_base_v1"."initial_balance")
            OVER (
                PARTITION BY "cms2_balances_base_v1"."currency_id",
                    "cms2_balances_base_v1"."obligor_id"
                ORDER BY "cms2_balances_base_v1"."pub_date",
                    "cms2_balances_base_v1"."content_type_id",
                "cms2_balances_base_v1"."object_id"
            ) AS "cumulative_obligor"
    FROM "cms2_balances_base_v1"
    WHERE "cms2_balances_base_v1"."obligor_id" = 2164
    ;

Change History (7)

comment:1 Changed 3 years ago by Tim Graham

Summary: count on queryset with sum window function in django 2.0 + postgres throws errorQuerySet.count() with Sum window function generates invalid SQL on PostgreSQL
Triage Stage: UnreviewedAccepted

comment:2 Changed 2 years ago by holvianssi

I can't reproduce this. I'm using Django tests expressions_window app model, and trying with query:

Employee.objects.annotate(
    cumulative_obligor=Window(
        expression=Sum(F('salary')),
        partition_by=[F('department')],
        order_by=['hire_date']
    ) 
).filter(pk__gte=1).count()

What's the missing ingredient to make this crash? Or could this have been fixed subsequently?

comment:3 Changed 19 months ago by Simon Charette

Resolution: needsinfo
Status: newclosed

Closing as needsinfo as I couldn't reproduce either. Feel free to reopen if you can provide more details.

comment:4 Changed 8 months ago by Christopher G Johnson

Resolution: needsinfo
Status: closednew

Hi There! I ran across this old bug and have some additional information to share. In my case the "missing ingredient" is when you join to another table and then include a column from the join table in the window function's ORDER BY clause.

I created a test to show the problem here:

https://github.com/clickonchris/django/commit/29feb51531ac4d1a65a40003d609295b3d88c001

The .count() function generates SQL as below. I've included a code comment above the problematic GROUP BY at the end

SELECT COUNT(*) FROM (
  SELECT 
    "expressions_window_employee"."id" AS Col1, 
    SUM("expressions_window_employee"."salary") 
    OVER (PARTITION BY "expressions_window_employee"."department" ORDER BY "expressions_window_employee"."hire_date", "expressions_window_employeeclassification"."description") AS "department_sum" 
  FROM "expressions_window_employee" INNER JOIN "expressions_window_employeeclassification" ON ("expressions_window_employee"."classification_id" = "expressions_window_employeeclassification"."code") 
  -- This GROUP BY seems unnecessary
 GROUP BY "expressions_window_employee"."id"
) subquery

comment:5 Changed 8 months ago by felixxm

Triage Stage: AcceptedUnreviewed

comment:6 Changed 8 months ago by felixxm

Resolution: fixed
Status: newclosed
Version: 2.03.0

Fixed in 3f32154f40a855afa063095e3d091ce6be21f2c5. Christopher, thanks for tests.

comment:7 Changed 8 months ago by Mariusz Felisiak <felisiak.mariusz@…>

In 33ec01ca:

Refs #29095 -- Added test for using QuerySet.count() with window expressions ordered by related fields.

Fixed in 3f32154f40a855afa063095e3d091ce6be21f2c5.

Note: See TracTickets for help on using tickets.
Back to Top