Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#34368 closed Bug (fixed)

Subquery referencing WINDOW expression breaks query compilation

Reported by: Jannis Vajen Owned by: nobody
Component: Database layer (models, ORM) Version: 4.2
Severity: Release blocker Keywords: window orm filter subquery
Cc: Simon Charette Triage Stage: Ready for checkin
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

#28333 introduced the ability to filter on WINDOW expressions but when a WINDOW expression is referred to in a subquery, the query compiler fails: ValueError: Need either using or connection

I assume it has to do with self.connection not being passed on to inner_query.get_compiler in get_qualify_sql introduced here: https://github.com/django/django/commit/f387d024fc75569d2a4a338bfda76cc2f328f627#diff-f58de2deaccecd2d53199c5ca29e3e1050ec2adb80fb057cdfc0b4e6accdf14fR624

Below is the full traceback. A dummy test is available here: https://github.com/jnns/django/commit/04d59e27503fb376d60314806876ecbae89d9b62

Traceback (most recent call last):
  File "/usr/lib64/python3.11/unittest/case.py", line 57, in testPartExecutor
    yield
  File "/usr/lib64/python3.11/unittest/case.py", line 623, in run
    self._callTestMethod(testMethod)
  File "/usr/lib64/python3.11/unittest/case.py", line 579, in _callTestMethod
    if method() is not None:
  File "tests/expressions_window/tests.py", line 1025, in test_filter_subquery
    list(
  File "django/db/models/query.py", line 398, in __iter__
    self._fetch_all()
  File "django/db/models/query.py", line 1881, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "django/db/models/query.py", line 91, in __iter__
    results = compiler.execute_sql(
  File "django/db/models/sql/compiler.py", line 1545, in execute_sql
    sql, params = self.as_sql()
  File "django/db/models/sql/compiler.py", line 732, in as_sql
    extra_select, order_by, group_by = self.pre_sql_setup(
  File "django/db/models/sql/compiler.py", line 84, in pre_sql_setup
    self.setup_query(with_col_aliases=with_col_aliases)
  File "django/db/models/sql/compiler.py", line 73, in setup_query
    self.select, self.klass_info, self.annotation_col_map = self.get_select(
  File "django/db/models/sql/compiler.py", line 296, in get_select
    sql, params = self.compile(col)
  File "django/db/models/sql/compiler.py", line 542, in compile
    sql, params = node.as_sql(self, self.connection)
  File "django/db/models/expressions.py", line 1536, in as_sql
    subquery_sql, sql_params = self.query.as_sql(compiler, connection)
  File "django/db/models/sql/query.py", line 1150, in as_sql
    sql, params = self.get_compiler(connection=connection).as_sql()
  File "django/db/models/sql/compiler.py", line 751, in as_sql
    result, params = self.get_qualify_sql()
  File "django/db/models/sql/compiler.py", line 676, in get_qualify_sql
    inner_query_compiler = inner_query.get_compiler(
  File "django/db/models/sql/query.py", line 298, in get_compiler
    raise ValueError("Need either using or connection")
ValueError: Need either using or connection

Change History (10)

comment:1 by Mariusz Felisiak, 2 years ago

Cc: Simon Charette added
Triage Stage: UnreviewedAccepted

Thanks again for the report! Would you like to prepare a patch?

comment:2 by Jannis Vajen, 2 years ago

I started work on https://github.com/django/django/pull/16597 but now I'm in need of assistance. The query compiles fine if connection is added, but I am unsure about the correctness of the resulting query.

I added the test WindowFunctionTests.test_filter_subquery to describe the behaviour I expected to see. Maybe this is out of scope for the referenced feature #15922 or completely intentional. I am not sure.

This is the query in the failing test is currently generated:

SELECT "expressions_window_employee"."name",
  (SELECT "col1"
   FROM
     (SELECT *
      FROM
        (SELECT U0."code" AS "col1",
                RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                             ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0"
         FROM "expressions_window_classification" U0) "qualify"
      WHERE "col1" = ("qual0") ) "qualify_mask"
   LIMIT 1) AS "code"
FROM "expressions_window_employee"

Whereas this is what I would expect:

SELECT name,
  (SELECT "code"
   FROM "expressions_window_classification"
   WHERE "code" = "department_salary_rank"
   LIMIT 1) AS code
FROM (
  SELECT 
     "expressions_window_employee"."name",
     RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                  ORDER BY "expressions_window_employee"."salary" DESC) AS "department_salary_rank"
   FROM "expressions_window_employee"
) 

comment:3 by Simon Charette, 2 years ago

Passing connection will address the crash but there seems to be a more fundamental issue when using OuterRef to refer to a window expression

SELECT "expressions_window_employee"."id",
       "expressions_window_employee"."name",
       "expressions_window_employee"."salary",
       "expressions_window_employee"."department",
       "expressions_window_employee"."hire_date",
       "expressions_window_employee"."age",
       "expressions_window_employee"."classification_id",
       "expressions_window_employee"."bonus",
       RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                    ORDER BY "expressions_window_employee"."salary" DESC) AS "department_salary_rank",

  (SELECT "col1"
   FROM
     (SELECT *
      FROM
        (SELECT U0."name" AS "col1",
                RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                             ORDER BY "expressions_window_employee"."salary" DESC) AS "qual0",
                            U0."age" AS "qual1"
         FROM "expressions_window_employee" U0) "qualify"
      WHERE "qual1" = ("qual0") ) "qualify_mask"
   LIMIT 1) AS "foo"
FROM "expressions_window_employee";

The RANK() window within the subquery will be against the subquery table and not the outer one which only happens to work in this case because it happens to be the same table.

I believe the proper SQL here would be

SELECT
        *,
        (SELECT U0."name"
         FROM "expressions_window_employee" U0
         WHERE U0."age" = "department_salary_rank"
         LIMIT 1) AS "foo"
FROM (
        SELECT "expressions_window_employee"."id",
               "expressions_window_employee"."name",
               "expressions_window_employee"."salary",
               "expressions_window_employee"."department",
               "expressions_window_employee"."hire_date",
               "expressions_window_employee"."age",
               "expressions_window_employee"."classification_id",
               "expressions_window_employee"."bonus",
               RANK() OVER (PARTITION BY "expressions_window_employee"."department"
                            ORDER BY "expressions_window_employee"."salary" DESC) AS "department_salary_rank"
        FROM "expressions_window_employee"
) subquery

Things get even more complex when filtering against a subquery annotation that refers to a window function as that filter now requires QUALIFY emulation. Things get even more complex when using nested subqueries referring to an outerquery window function.

I would strongly suggest we opt for explicitly not supporting subqueries references to outerref window function instead of trying to get 4.2 working with them and create a new feature request to add support for them as they require special care to get working properly.

  • django/db/models/expressions.py

    diff --git a/django/db/models/expressions.py b/django/db/models/expressions.py
    index 2d8a5c5919..a426195b0d 100644
    a b def as_sql(self, *args, **kwargs):  
    857857
    858858    def resolve_expression(self, *args, **kwargs):
    859859        col = super().resolve_expression(*args, **kwargs)
     860        if col.contains_over_clause:
     861            raise NotSupportedError(
     862                f"Referencing outer query window expression is not supported: {self.name}"
     863            )
    860864        # FIXME: Rename possibly_multivalued to multivalued and fix detection
    861865        # for non-multivalued JOINs (e.g. foreign key fields). This should take
    862866        # into account only many-to-many and one-to-many relationships.
  • django/db/models/sql/compiler.py

    diff --git a/django/db/models/sql/compiler.py b/django/db/models/sql/compiler.py
    index 6929f216b4..21234fb6b0 100644
    a b def collect_replacements(expressions):  
    674674                )
    675675            )
    676676        inner_query_compiler = inner_query.get_compiler(
    677             self.using, elide_empty=self.elide_empty
     677            self.using, connection=self.connection, elide_empty=self.elide_empty
    678678        )
    679679        inner_sql, inner_params = inner_query_compiler.as_sql(
    680680            # The limits must be applied to the outer query to avoid pruning
  • tests/expressions_window/tests.py

    diff --git a/tests/expressions_window/tests.py b/tests/expressions_window/tests.py
    index 027fc9c25c..a27c4f222f 100644
    a b def test_invalid_filter(self):  
    15781578            list(qs.exclude(window=1, department="Accounting"))
    15791579
    15801580
    1581 class WindowUnsupportedTests(TestCase):
     1581class WindowUnsupportedTests(SimpleTestCase):
    15821582    def test_unsupported_backend(self):
    15831583        msg = "This backend does not support window expressions."
    15841584        with mock.patch.object(connection.features, "supports_over_clause", False):
    def test_unsupported_backend(self):  
    15871587                    dense_rank=Window(expression=DenseRank())
    15881588                ).get()
    15891589
     1590    def test_filter_subquery(self):
     1591        qs = Employee.objects.annotate(
     1592            department_salary_rank=Window(
     1593                Rank(), partition_by="department", order_by="-salary"
     1594            )
     1595        )
     1596        msg = "Referencing outer query window expression is not supported: department_salary_rank"
     1597        with self.assertRaisesMessage(NotSupportedError, msg):
     1598            qs.annotate(
     1599                employee_name=Subquery(
     1600                    Employee.objects.filter(
     1601                        age=OuterRef("department_salary_rank")
     1602                    ).values("name")[:1]
     1603                )
     1604            )
     1605
    15901606
    15911607class NonQueryWindowTests(SimpleTestCase):
    15921608    def test_window_repr(self):
Last edited 2 years ago by Simon Charette (previous) (diff)

comment:4 by Mariusz Felisiak, 2 years ago

Jannis, let's follow Simon's recommendation.

comment:5 by Jannis Vajen, 2 years ago

Dear Mariusz, dear Simon,

I updated the old pull request I opened before with a new commit containing Simon's suggestions. Please have a look.

comment:6 by Simon Charette, 2 years ago

Thanks Jannis! To make it clear, I think we should definitely have a ticket to add outer reference to window support but it seems impossible to do so before the 4.2 release.

in reply to:  6 comment:7 by Mariusz Felisiak, 2 years ago

Replying to Simon Charette:

Thanks Jannis! To make it clear, I think we should definitely have a ticket to add outer reference to window support but it seems impossible to do so before the 4.2 release.

A new ticket is probably not required as #28333 is still open.

comment:8 by Mariusz Felisiak, 2 years ago

Triage Stage: AcceptedReady for checkin

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

Resolution: fixed
Status: newclosed

In c67ea79a:

Fixed #34368 -- Made subquery raise NotSupportedError when referencing outer window expression.

Regression in f387d024fc75569d2a4a338bfda76cc2f328f627.

Co-authored-by: Jannis Vajen <jvajen@…>

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

In fc15d11f:

[4.2.x] Fixed #34368 -- Made subquery raise NotSupportedError when referencing outer window expression.

Regression in f387d024fc75569d2a4a338bfda76cc2f328f627.

Co-authored-by: Jannis Vajen <jvajen@…>

Backport of c67ea79aa981ae82595d89f8018a41fcd842e7c9 from main

Note: See TracTickets for help on using tickets.
Back to Top