Opened 22 months ago

Closed 22 months ago

Last modified 5 months ago

#31135 closed Bug (invalid)

OuterRef generates invalid SQL with __in filter.

Reported by: Bernd Wechner Owned by: nobody
Component: Database layer (models, ORM) Version: 3.0
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This is new to Django 3.0 and may relate to: https://code.djangoproject.com/ticket/31133

In migrating my project I found a SQL crash. I have traced it to a SubQuery that includes a filter using in on an OuterRef.

Here is the code for context:

        pfilter = Q(
            session__date_time=Subquery(
                (Performance.objects
                    .filter(Q(player__in=OuterRef('player')) & pfilter)
                    .values('player')
                    .annotate(max_date=Max('session__date_time'))
                    .values('max_date')[:1]
                ), output_field=models.DateField()
            )
        )

which in Django 2 produces this SQL (I remove irrelevant attributes on the first select that are identical across the two Django versions to help see the difference):

SELECT "Leaderboards_performance"."id" 
FROM "Leaderboards_performance"
INNER JOIN "Leaderboards_session" ON ("Leaderboards_performance"."session_id" = "Leaderboards_session"."id")
WHERE "Leaderboards_session"."date_time" =
        (SELECT MAX(U2."date_time") AS "max_date"
         FROM "Leaderboards_performance" U0
         INNER JOIN "Leaderboards_player" U1 ON (U0."player_id" = U1."id")
         INNER JOIN "Leaderboards_session" U2 ON (U0."session_id" = U2."id")
         WHERE (U0."player_id" IN ("Leaderboards_performance"."player_id")
                AND U2."game_id" = 29
                AND U2."date_time" <= '2019-11-02 08:30:00+00:00')
         GROUP BY U0."player_id",
                  U2."date_time",
                  U1."name_nickname"
         ORDER BY U2."date_time" DESC, U1."name_nickname" ASC
         LIMIT 1)
ORDER BY "Leaderboards_performance"."trueskill_eta_after" DESC

and in Django 3:

SELECT "Leaderboards_performance"."id"
FROM "Leaderboards_performance"
INNER JOIN "Leaderboards_session" ON ("Leaderboards_performance"."session_id" = "Leaderboards_session"."id")
WHERE "Leaderboards_session"."date_time" =
        (SELECT MAX(U2."date_time") AS "max_date"
         FROM "Leaderboards_performance" U0
         INNER JOIN "Leaderboards_player" U1 ON (U0."player_id" = U1."id")
         INNER JOIN "Leaderboards_session" U2 ON (U0."session_id" = U2."id")
         WHERE (U0."player_id" IN "Leaderboards_performance"."player_id"
                AND U2."game_id" = 29
                AND U2."date_time" <= '2019-11-02 08:30:00+00:00')
         GROUP BY U0."player_id",
                  U2."date_time",
                  U1."name_nickname"
         ORDER BY U2."date_time" DESC, U1."name_nickname" ASC
         LIMIT 1)
ORDER BY "Leaderboards_performance"."trueskill_eta_after" DESC

Essentially the clause:

Q(player__in=OuterRef('player'))

produces valid SQL syntax in 2 and invalid syntax in 3. It loses the braces around the OuterRef that the IN operator demands.

As it happens I can work around this by recasting this as:

Q(player=OuterRef('player'))

which is canonically more correct in any case given (on a code check) this Q object is only ever used inside a query that has a single player and not many and arguably was a code bug on this site. But the observation remains that Django3 is now producing invalid SQL for a query that Django2 produced valid SQL for and there is no mention or hint in the release notes as to an incompatible change in this area, so I've spent some considerable time drilling down to find the cause.

OuterRef should of course always be wrapped in braces when used with an IN ...

Change History (3)

comment:1 Changed 22 months ago by Mariusz Felisiak

Cc: Simon Charette added
Component: UncategorizedDatabase layer (models, ORM)
Resolution: invalid
Status: newclosed
Summary: OuterRef generates invalid SQL with __in filterOuterRef generates invalid SQL with __in filter.

Thanks for this report, behavior was changed in 3a505c70e7b228bf1212c067a8f38271ca86ce09, however as far as I'm concerned this usage is not supported (see documentation). So it revealed an issue in your code, not in Django.

comment:2 Changed 22 months ago by Bernd Wechner

Indeed an issue in my code. But also an issue with Django. There is no excuse to go from generating valid SQL to generating invalid SQL. If that usage is not supported, fine, but throw an exception before it bombs on a SQL syntax error. Half the raison d'etre of Django is surely to isolate a web designer from the innards of SQL queries as best possible, and not to thrust them into SQL land trying work out why their ORM is generating invalid SQL, when it could just as well through a lucid DJango exception.

To wit, I would suggest reopening it and flagging a fix if only for Django to provide more lucid feedback than a SQL syntax error.

comment:3 Changed 5 months ago by Martín Gaitán

Hi, I'm upgrading a project from 2.1 to 3.2 and also hit this issue.

My code does something like this

        agrupaciones_subquery = (
            AgrupacionCircuitos.objects.filter(id__in=self.agrupaciones_a_considerar())
            .filter(
                id__in=(OuterRef("carga__mesa_categoria__mesa__circuito__agrupaciones"))
            )
            .values_list("id", flat=True)
        )
        return (
            self.votos_reportados(categoria, mesas)
            .values_list("opcion__id")
            .annotate(id_agrupacion=Subquery(agrupaciones_subquery))
            .exclude(id_agrupacion__isnull=True)
            .annotate(sum_votos=Sum("votos"))
        )

I can't change id__in=(OuterRef("carga__mesa_categoria__mesa__circuito__agrupaciones")) by id=(OuterRef... ) because in my case, it may contains multiples related "agrupaciones" . Which should be the workaround?

Btw, I also agree with Bernd Wechner about the error raised should be more clear.

Last edited 5 months ago by Martín Gaitán (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top