Opened 4 hours ago
Last modified 24 minutes ago
#37097 assigned Bug
Nested double-unioned query for model with ordering fails on Postgres/MySql
| Reported by: | Shai Berger | Owned by: | Jacob Walls |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Release blocker | Keywords: | |
| Cc: | Shai Berger, Simon Charette | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Yep. Sorry about the title, that's the most succinct I can make it.
Add the following test in django/tests/queries/test_qs_combinators.py, say after test_union_in_with_ordering_and_slice
def test_double_union_in_with_ordering(self): qs1 = Author.objects.filter(num__gt=7) qs2 = Author.objects.filter(num__lt=2) qs3 = Author.objects.filter(num=5) self.assertQuerySetEqual( # Query formatted to allow easy commenting-out of parts Author.objects.exclude(id__in=qs1 .union(qs2, all=True) .union(qs3, all=True) .values("id")), [] )
On Postgres, this passes on 6.0, but fails on the main branch (towards 6.1 as I write this). The error is
django.db.utils.ProgrammingError: each UNION query must have the same number of columns LINE 1: ...) ORDER BY "__orderbycol2" ASC) UNION ALL (SELECT "U0"."id" ...
because the query (formatted for some readability) is
SELECT "queries_author"."id", "queries_author"."name", "queries_author"."num", "queries_author"."extra_id" FROM "queries_author" WHERE NOT ( "queries_author"."id" IN ( ( ( SELECT "U0"."id" AS "id", "U0"."name" AS "__orderbycol2" FROM "queries_author" "U0" WHERE "U0"."num" > 7 ORDER BY "U0"."name" ASC ) UNION ALL ( SELECT "U0"."id" AS "id", "U0"."name" AS "__orderbycol2" FROM "queries_author" "U0" WHERE "U0"."num" < 2 ORDER BY "U0"."name" ASC ) ORDER BY "__orderbycol2" ASC -- I think this is the culprit ) UNION ALL ( SELECT "U0"."id" AS "id" FROM "queries_author" "U0" WHERE "U0"."num" = 5 ORDER BY "U0"."name" ASC ) ) ) ORDER BY "queries_author"."name" ASC
Notes:
- Double union is required, a single union works
- This happens when the ordering is defined in the model Meta, but not if you just add ordering on the queries. I suspect this is because having the ordering in the Meta makes the query equivalent to
Author.objects.exclude(id__in=qs1 .union(qs2, all=True) .order_by("name") # This is implied .union(qs3, all=True) .values("id"))
- On Oracle and Sqlite, the inner order is removed by the fix of #36938. I haven't actually tested it on MySql, but I believe it has the same relevant feature flags and so should behave the same.
- The test as written fails on Sqlite on 6.0, with
django.db.utils.DatabaseError: ORDER BY not allowed in subqueries of compound statements-- similar tests are skipped there; but it passes on Postgres, and passes on Sqlite on main.
Change History (3)
comment:1 by , 4 hours ago
| Keywords: | Regression removed |
|---|---|
| Owner: | set to |
| Status: | new → assigned |
| Triage Stage: | Unreviewed → Accepted |
comment:2 by , 96 minutes ago
| Cc: | added |
|---|
Bisected to 087bb9e8f3478d53f12b1737af865992af17c5f2, which seems like an improvement.
SQL on stable/6.0.x:
SELECT ... FROM "queries_author" WHERE NOT ("queries_author"."id" IN (( (SELECT U0."id" AS "id" FROM "queries_author" U0 WHERE U0."num" > 7 ORDER BY U0."name" ASC) UNION ALL (SELECT U0."id" AS "id" FROM "queries_author" U0 WHERE U0."num" < 2 ORDER BY U0."name" ASC)) UNION ALL (SELECT U0."id" AS "id" FROM "queries_author" U0 WHERE U0."num" = 5 ORDER BY U0."name" ASC))) ORDER BY "queries_author"."name" ASC;
And on main:
SELECT ... FROM "queries_author" WHERE NOT ("queries_author"."id" IN (( (SELECT "U0"."id" AS "id", "U0"."name" AS "__orderbycol2" FROM "queries_author" "U0" WHERE "U0"."num" > 7 ORDER BY "U0"."name" ASC) UNION ALL (SELECT "U0"."id" AS "id", "U0"."name" AS "__orderbycol2" FROM "queries_author" "U0" WHERE "U0"."num" < 2 ORDER BY "U0"."name" ASC) ORDER BY "__orderbycol2" ASC) UNION ALL (SELECT "U0"."id" AS "id" FROM "queries_author" "U0" WHERE "U0"."num" = 5 ORDER BY "U0"."name" ASC))) ORDER BY "queries_author"."name" ASC;
Two options come to mind:
Option 1: take a lighter touch when setting default ordering after union? (EDIT: removed incorrect suggested diff.)
Option 2: Extend the fix from yesterday's Oracle/SQLite fixes to also clear unnecessary orderings on Postgres:
-
django/db/models/sql/compiler.py
diff --git a/django/db/models/sql/compiler.py b/django/db/models/sql/compiler.py index 764dc46cfc..bf90397ed8 100644
a b class SQLCompiler: 636 636 if selected is not None and compiler.query.selected is None: 637 637 compiler.query = compiler.query.clone() 638 638 compiler.query.set_values(selected) 639 if ( 640 ( 641 features.requires_compound_order_by_subquery 642 and not features.ignores_unnecessary_order_by_in_subqueries 643 ) 644 or not features.supports_parentheses_in_compound 645 ) and compiler.get_order_by(): 639 if compiler.get_order_by(): 646 640 compiler.query = compiler.query.clone() 647 641 compiler.query.clear_ordering(force=False) 648 642 part_sql, part_args = compiler.as_sql(with_col_aliases=True)
Simon, do you have an opinion about the best course of action here?
comment:3 by , 26 minutes ago
When an __in lookup is used we explicitly clear the ordering as it's unnecessary (including the default one).
This problem made me think of your initial approach at making Query.clear_ordering somewhat recursive Jacob. Should we instead consider making the following changes instead?
-
django/db/models/sql/query.py
diff --git a/django/db/models/sql/query.py b/django/db/models/sql/query.py index 45192b7809..603b7056ad 100644
a b class Query(BaseExpression): 2402 2402 self.extra_order_by = () 2403 2403 if clear_default: 2404 2404 self.default_ordering = False 2405 for query in self.combined_queries: 2406 query.clear_ordering(force=force, clear_default=clear_default) 2405 2407 2406 2408 def set_group_by(self, allow_aliases=True): 2407 2409 """
Thanks, I'll take a look today.