Opened 80 minutes ago

Last modified 26 minutes ago

#37140 new Bug

NULL handling of exclude() with __in varies between nullable expressions versus literal None

Reported by: Sebastian Vera Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: exclude in subquery NULL NOT IN
Cc: Simon Charette Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The in field lookup reference (docs/ref/models/querysets.txt) documents
passing a queryset on the right-hand side of __in, but does not mention
that exclude(field__in=qs) where qs.values(...) can return NULL produces
a silently-empty result set on Postgres (and any database following SQL's
three-valued logic).

This is the subquery counterpart of #20024 (literal-list flavor, fixed in
cec10f99). The subquery flavor was identified by Anssi Kääriäinen and Simon
Riggs on django-developers in 2013 [1], but the ORM-side fix never landed,
and there is currently no warning in the docs.

We hit this in production on a real invoice queryset and tracked it down to
the documented NOT IN / NULL semantics. The workaround we shipped is the
same one Simon Riggs recommended in the 2013 thread: rewrite to
~Exists(OuterRef(...)), which compiles to NULL-safe NOT EXISTS.

Proposal: add a .. warning:: block to the in field lookup reference that

  1. describes the failure mode (NULL in subquery -> NOT IN evaluates to UNKNOWN -> empty queryset)
  2. points readers to Exists + OuterRef as the NULL-safe replacement
  3. cross-references #20024 for the literal-list case

Plus a one-line cross-reference under the exclude() method docs pointing
to the new warning.

[1] https://groups.google.com/g/django-developers/c/OG1unUV-MOU

Change History (2)

comment:1 by Sebastian Vera, 65 minutes ago

Has patch: set
Version 0, edited 65 minutes ago by Sebastian Vera (next)

comment:2 by Jacob Walls, 26 minutes ago

Cc: Simon Charette added
Component: DocumentationDatabase layer (models, ORM)
Has patch: unset
Summary: Document NULL handling of exclude() with __in subqueriesNULL handling of exclude() with __in varies between nullable expressions versus literal None
Type: Cleanup/optimizationBug

Thanks for the suggestion. It's rare for us to document bugs and workarounds, but we have done it in some cases, and it might make sense here.

To decide that, I want to get a handle first on whether we even plan to fix this.

ticket:31667#comment:7 (2020) mentioned the fact that filter(field__in=list(nullable_subquery)) started behaving differently from filter(field__in=nullable_subquery) as a reason to revert that change, a decision we never took.

ticket:20024#comment:14 (2020) suggests we can live with that drift after all.

ticket:32043#comment:3 (2020) suggested we wouldn't even try to fix this for nullable expressions, due to difficulty, but in a comment around the same time (ticket:20024#comment:9) Simon provided some hints. I'm not sure if Simon gave that a shot at some point and decided it was not practical.

Given all that, I'd like to triage the bug aspect first before accepting a docs fix.


I checked the WIP for a nullable expression flag (#24267), and it doesn't currently handle this. In it, I see this explanation:

class Query(...
    def is_nullable(self, nullable_aliases):
        # A subquery can always return no rows and thus be nullable but the ORM
        # currently has some expectations with regards to IN vs ANY lookups that
        # must be revisited to allow this logic to work properly.
        return False
Note: See TracTickets for help on using tickets.
Back to Top