#31667 closed Cleanup/optimization (fixed)
Avoid passing NULL to the IN lookup
| Reported by: | Adam Johnson | Owned by: | Adam Johnson | 
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev | 
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | yes | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | no | 
| Easy pickings: | no | UI/UX: | no | 
Description
Currently prefetch_related on a FK passes the NULL through to the database for e.g. author_id IN (NULL, 2). Passing NULL is always unnecessary, since it's not allowed in FK's. There's a small risk from passing NULL that it could lead to incorrect with complex prefetch querysets using PK refs because of NULL's weirdness in SQL.
For example with these models:
from django.db import models
class Author(models.Model):
    pass
class Book(models.Model):
    author = models.ForeignKey(Author, null=True, on_delete=models.DO_NOTHING)
Prefetching authors on Books, when at least one Book has author=None, uses IN (..., NULL, ...) in the query:
In [1]: from example.core.models import Author, Book
In [2]: a1 = Author.objects.create()
In [3]: Book.objects.create(author=a1)
Out[3]: <Book: Book object (3)>
In [4]: Book.objects.create(author=None)
Out[4]: <Book: Book object (4)>
In [5]: Book.objects.prefetch_related('author')
Out[5]: <QuerySet [<Book: Book object (3)>, <Book: Book object (4)>]>
In [6]: from django.db import connection
In [7]: connection.queries
Out[7]:
[{'sql': 'INSERT INTO "core_author" ("id") VALUES (NULL)', 'time': '0.001'},
 {'sql': 'INSERT INTO "core_book" ("author_id") VALUES (2)', 'time': '0.001'},
 {'sql': 'INSERT INTO "core_book" ("author_id") VALUES (NULL)',
  'time': '0.001'},
 {'sql': 'SELECT "core_book"."id", "core_book"."author_id" FROM "core_book" LIMIT 21',
  'time': '0.000'},
 {'sql': 'SELECT "core_author"."id" FROM "core_author" WHERE "core_author"."id" IN (NULL, 2)',
  'time': '0.000'}]
Maybe this could generally be extended to use of __in with non-nullable fields?
Change History (10)
comment:1 by , 5 years ago
| Triage Stage: | Unreviewed → Accepted | 
|---|
comment:4 by , 5 years ago
| Summary: | Avoid passing NULL to prefetch_related() queries → Avoid passing NULL to the IN lookup | 
|---|
comment:7 by , 5 years ago
As Adam Sołtysik pointed out in #20024.this change also introduced an inconsistency 
Foo.objects.filter(bar__in=list(Bar.objects.values_list('nullable_field', flat=True)))
Won't match
Foo.objects.filter(bar__in=Bar.objects.values_list('nullable_field', flat=True))
If any NULL values are returned.
We should make sure that a complete corrective for #20024 lands in 3.2 otherwise 5776a1660e54a95159164414829738b665c89916 should be reverted.
comment:8 by , 5 years ago
There is also an inconsistency between In and RelatedIn lookups. The latter seems to be working consistenly with this change (details in #20024).
Since
INtranslates toOR =for each elements andNULL != NULLI assume it could be done at the__inlookup level even for non-nullable fields.