Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#33768 closed Bug (fixed)

MySQL ordering of nulls last/first is broken in combination with UNION

Reported by: Florian Apolloner Owned by: Simon Charette
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Adam Johnson, Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Florian Apolloner)

putting the following into test_qs_combinators.py:

    def test_ordering_with_null_first(self):
        Number.objects.filter(other_num=5).update(other_num=None)
        qs = (
            Number.objects.filter(num__lte=1)
            .union(Number.objects.filter(num__gte=2))
            .order_by(F("other_num").asc(nulls_first=True))
            .values_list("other_num", flat=True)
        )
        self.assertQuerysetEqual(qs, [None, 1, 2, 3, 4, 6, 7, 8, 9, 10])
        qs = (
            Number.objects.filter(num__lte=1)
            .union(Number.objects.filter(num__gte=2))
            .order_by(F("other_num").asc(nulls_last=True))
            .values_list("other_num", flat=True)
        )
        self.assertQuerysetEqual(qs, [1, 2, 3, 4, 6, 7, 8, 9, 10, None])

results in:

======================================================================
FAIL: test_ordering_with_null_first (queries.test_qs_combinators.QuerySetSetOperationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/florian/sources/django.git/tests/queries/test_qs_combinators.py", line 171, in test_ordering_with_null_first
    self.assertQuerysetEqual(qs, [1, 2, 3, 4, 6, 7, 8, 9, 10, None])
  File "/home/florian/sources/django.git/django/test/testcases.py", line 1328, in assertQuerysetEqual
    return self.assertEqual(list(items), values, msg=msg)
AssertionError: Lists differ: [None, 1, 2, 3, 4, 6, 7, 8, 9, 10] != [1, 2, 3, 4, 6, 7, 8, 9, 10, None]

First differing element 0:
None
1

- [None, 1, 2, 3, 4, 6, 7, 8, 9, 10]
+ [1, 2, 3, 4, 6, 7, 8, 9, 10, None]

----------------------------------------------------------------------
(0.000) UPDATE `queries_number` SET `other_num` = NULL WHERE `queries_number`.`other_num` = 5; args=(5,); alias=default
(0.000) (SELECT `queries_number`.`other_num` FROM `queries_number` WHERE `queries_number`.`num` <= 1) UNION (SELECT `queries_number`.`other_num` FROM `queries_number` WHERE `queries_number`.`num` >= 2) ORDER BY (1) ASC; args=(1, 2); alias=default
(0.000) (SELECT `queries_number`.`other_num` FROM `queries_number` WHERE `queries_number`.`num` <= 1) UNION (SELECT `queries_number`.`other_num` FROM `queries_number` WHERE `queries_number`.`num` >= 2) ORDER BY (1) IS NULL, (1) ASC; args=(1, 2); alias=default

----------------------------------------------------------------------

Due to the UNION we are rewriting queries to use column numbers so we can somewhat reliably target the columns. This breaks down when nulls_first/last is used because (1) IS NULL is interpreted as expression in mysql and not as the colname 1 being NULL. Which brings me back to https://twitter.com/fapolloner/status/1533512493208936450 -- I think that the rewriting of this expression should happen in the compiler and not in OrderBy itself. This way we could duplicate the OrderBy into two and push them into the relevant select clauses like we already do in https://github.com/django/django/blob/49b470b9187b6be60e573fed08c8f4a87f133750/django/db/models/sql/compiler.py#L410 for other things.

Sadly it is a mess… Thoughts? Paging Adam since he was the resident MySQL specialist IIRC :)

Change History (11)

comment:1 by Florian Apolloner, 2 years ago

Description: modified (diff)

comment:2 by Simon Charette, 2 years ago

Cc: Simon Charette added

What if we performed a subquery pushdown instead to make compilation more predictable so we can have the order by clause refer to the union columns by name?

SELECT `other_num`
FROM (
    (SELECT `queries_number`.`other_num` FROM `queries_number` WHERE `queries_number`.`num` <= 1)
    UNION
    (SELECT `queries_number`.`other_num` FROM `queries_number` WHERE `queries_number`.`num` >= 2)
)
ORDER BY `other_num` IS NULL

This is a strategy that is used successfully for aggregation for example. I don't see OrderBy.as_sql usage of template for compilation is an abuse at all; it's in charge of it's own compilation unit. As mentioned on #33767 I'd prefer if we errored out on constant order by usage instead of making leaps to support it because it has ambiguous evaluation rules.

comment:3 by Carlton Gibson, 2 years ago

Triage Stage: UnreviewedAccepted

comment:4 by Florian Apolloner, 2 years ago

What if we performed a subquery pushdown instead to make compilation more predictable so we can have the order by clause refer to the union columns by name?

Maybe, in hindsight that would probably be have the way to implement unions (etc) in the first place. Maybe now is the time to switch to that :) What does a subquery mean in terms of performance -- I'd hope the database wouldn't care to much…

I don't see OrderBy.as_sql usage of template for compilation is an abuse at all; it's in charge of it's own compilation unit.

That is fair, but then again it still requires help from the outside because we have to pull it into the select clause when used with distinct. Which then results in horrible queries like the one outlined by Tim in https://github.com/django/django/pull/15687#issuecomment-1133798092 (see "The old SQL") where we have the initial select value and then once with IS NULL and once without because we can't match it up to the initial select value that is aliased to last_date anymore. In practice part of this is caused by https://github.com/django/django/blob/49b470b9187b6be60e573fed08c8f4a87f133750/django/db/models/sql/compiler.py#L466 where we use a regex to parse the generated SQL from OrderBy and match it back to the select clause, which then obviously fails because the compilation of OrderBy generated two clauses. Imo this is all rather fragile…

Also, and this is where #33767 comes back into play, we should try as hard as possible to reuse aliases where possible. Tim's old SQL example executed on PG nowadays (his is CockroachDB) looks like this:

 SELECT DISTINCT "ordering_author"."id",
                "ordering_author"."name",
                "ordering_author"."editor_id",
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id") AS "last_date",
FROM   "ordering_author"
ORDER  BY (SELECT Max(U0."pub_date") AS "last_date"
           FROM   "ordering_article" U0
           WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                    AND Upper(U0."headline" :: text) LIKE Upper('%Article%') )
           GROUP  BY U0."author_id") DESC NULLS FIRST


when it could be

SELECT DISTINCT "ordering_author"."id",
                "ordering_author"."name",
                "ordering_author"."editor_id",
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id") AS "last_date",
FROM   "ordering_author"
ORDER  BY "last_date" DESC NULLS FIRST

Reusing aliases and references seems like a good idea. It makes the query imo more readable and also means less parsing overhead for the server (probably negligible but still).

Last but not least, the reason I discovered this is the new code for psycopg3 support. Psycopg3 makes use of server side param binding which introduces a whole new class of issues. To be fair this is not postgres alone but also informix and others. When actually using server side params you run into funny things like this: Assume Django ends up generating a query like this:

cursor.execute("""
    SELECT left(name, %s) AS prefix, count(*) AS number
    FROM people
    GROUP BY left(name, %s)
    ORDER BY left(name, %s)
    """,
    [2, 2, 2])

This works perfectly fine with client side merging (simply replace %s with 2), but breaks done once you use server side binding because the server gets $1 to $3 for the params and doesn't know that they are actually the same and as such fails because it no longer knows if the group by is part of the select clause or not. Mind you, this example is rather simplistic but Django actually generates those like shown in Tim's example above.

To fix this we have two options:

  • Switch to named parameters and introduce even more logic to always write %(same_param)s in the previous example.
  • Switch our usage of order by / group by to use aliases and column numbers whenever possible.
Version 0, edited 2 years ago by Florian Apolloner (next)

comment:5 by Aniruddh Singh, 2 years ago

Owner: changed from nobody to Aniruddh Singh
Status: newassigned

Hello Everyone, I am new to open source contribution. But had used Django for the last 3 years. I want to contribute to the resolution of this bug so that I can start my journey as an open-source contributor to Django.

comment:6 by Simon Charette, 2 years ago

Aniruddh, we appreciate your interest but this issue is quite complex so you might want to focus on another issue instead for your first contributions.


Thanks for the great summary Florian!

Maybe, in hindsight that would probably should have been the way to implement unions (etc) in the first place. Maybe now is the time to switch to that :) What does a subquery mean in terms of performance -- I'd hope the database wouldn't care to much…

From some limited local testing on SQLite, PostgreSQL, and MySQL and looking at the generate EXPLAIN plan it doesn't seem like they care too much. But it might not be necessary after some more thoughts as the ORDER BY clause allows for ordering by aliases as well.

Reusing aliases and references seems like a good idea. It makes the query imo more readable and also means less parsing overhead for the server (probably negligible but still).

I completely agree with this statement but I don't understand how this relates to using selected field indexes instead of aliases?

From my understanding and testing engines locally it seems that all engines allow to refer to the first query member of the union by column aliases just like by index so once we've addressed the issue with ordering by selected references not being used we could entirely drop the specialized combinator logic that uses RawSQL to order by column index?

comment:7 by Florian Apolloner, 2 years ago

Hi Simon, sorry I lost track of this one in my mails.

From my understanding and testing engines locally it seems that all engines allow to refer to the first query member of the union by column aliases just like by index

Interesting, that would be great indeed. I think I might have had issues with UNION at the time I implemented it but if aliases work around the board, that would be preferably imo.

we could entirely drop the specialized combinator logic that uses RawSQL to order by column index?

Hopefully yes :)

comment:8 by Simon Charette, 2 years ago

Submitted a PR for further discussion on the subject.

comment:9 by Mariusz Felisiak, 2 years ago

Has patch: set
Owner: changed from Aniruddh Singh to Simon Charette
Triage Stage: AcceptedReady for checkin

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

Resolution: fixed
Status: assignedclosed

In c58a8acd:

Fixed #33768 -- Fixed ordering compound queries by nulls_first/nulls_last on MySQL.

Columns of the left outer most select statement in a combined query
can be referenced by alias just like by index.

This removes combined query ordering by column index and avoids an
unnecessary usage of RawSQL which causes issues for backends that
specialize the treatment of null ordering.

comment:11 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

In f47fec3:

Refs #33768 -- Fixed ordering compound queries by NULLs on SQLite < 3.30.

The lack of support for native nulls last/first on SQLite 3.28 and 3.29
requires the compound query to be wrapped for emulation layer to work
properly.

Note: See TracTickets for help on using tickets.
Back to Top