Opened 11 months ago

Closed 2 days ago

Last modified 2 days ago

#36492 closed Bug (fixed)

Sqlite: lack of specificity when querying against booleans hurts the query-optimizer

Reported by: Klaas van Schelven Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 5.2
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When doing queries against booleanfields Django generates SQL of the the shape "WHERE" and "WHERE NOT" rather than matching exactly on 1 and 0.
This makes it harder than necessary for the sqlite query optimizer to optimize, because the selection appears much more broad than it is in practice.

This is exactly like described in this issue which was acknowledged and fixed years ago. (though I don't know about the "regression" bit, it's been going on since at least django 4.2):

https://code.djangoproject.com/ticket/32691 "Performance regression in Exact lookup on BooleanField on MySQL."

Here's a reproducer:

https://github.com/bugsink/zeroforfalse

$ python manage.py print_zero_or_false
SELECT "myapp_mymodel"."id", "myapp_mymodel"."boolean_field" FROM "myapp_mymodel" WHERE NOT "myapp_mymodel"."boolean_field"
SELECT "myapp_mymodel"."id", "myapp_mymodel"."boolean_field" FROM "myapp_mymodel" WHERE "myapp_mymodel"."boolean_field"

Here's the context of discovery of this bug, as well as a workaround:
https://github.com/bugsink/bugsink/pull/139

Change History (8)

comment:1 by Simon Charette, 11 months ago

Component: UncategorizedDatabase layer (models, ORM)
Owner: set to Simon Charette
Status: newassigned
Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

Managed to reproduce as well

sqlite> CREATE TABLE foo (a bool, b bool);
sqlite> CREATE INDEX some_idx ON foo (a, b);
sqlite> EXPLAIN SELECT * FROM foo WHERE NOT a AND NOT b;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     13    0                    0   Start at 13
1     OpenRead       0     2     0     2              0   root=2 iDb=0; foo
2     Explain        2     0     0     SCAN foo       0
3     Rewind         0     12    0                    0
4       Column         0     0     1                    0   r[1]=foo.a
5       If             1     11    1                    0
6       Column         0     1     1                    0   r[1]=foo.b
7       If             1     11    1                    0
8       Column         0     0     2                    0   r[2]=foo.a
9       Column         0     1     3                    0   r[3]=foo.b
10      ResultRow      2     2     0                    0   output=r[2..3]
11    Next           0     4     0                    1
12    Halt           0     0     0                    0
13    Transaction    0     0     2     0              1   usesStmtJournal=0
14    Goto           0     1     0                    0
sqlite> EXPLAIN SELECT * FROM foo WHERE a=false AND b=false;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     12    0                    0   Start at 12
1     OpenRead       1     3     0     k(3,,,)        2   root=3 iDb=0; some_idx
2     Explain        2     0     0     SEARCH foo USING COVERING INDEX some_idx (a=? AND b=?)  0
3     Integer        0     1     0                    0   r[1]=0
4     Integer        0     2     0                    0   r[2]=0
5     SeekGE         1     11    1     2              0   key=r[1..2]
6       IdxGT          1     11    1     2              0   key=r[1..2]
7       Column         1     0     3                    0   r[3]=foo.a
8       Column         1     1     4                    0   r[4]=foo.b
9       ResultRow      3     2     0                    0   output=r[3..4]
10    Next           1     6     1                    0
11    Halt           0     0     0                    0
12    Transaction    0     0     2     0              1   usesStmtJournal=0
13    Goto           0     1     0                    0

(notice how the NOT usage doesn't make use of the covering index)

Unfortunately Django doesn't have a has_native_boolean_field feature flag so the closest we could do here is override conditional_expression_supported_in_where_clause like we did in #32691.

I'll try to submit work that unifies the different code path we have in place today.

Last edited 11 months ago by Simon Charette (previous) (diff)

comment:2 by Simon Charette, 11 months ago

Has patch: set

Took the has_native_boolean_type approach as suggested on ticket:32691#comment:8

comment:3 by Simon Charette, 11 months ago

Patch needs improvement: set

comment:4 by Simon Charette, 9 months ago

Patch needs improvement: unset

comment:5 by Simon Charette, 5 months ago

Addressed the merge conflicts and confirmed that the surfaced problems with exclude(indexed_bool_field=False|True) on SQLite is a distinct issue.

comment:6 by Jacob Walls <jacobtylerwalls@…>, 2 days ago

Resolution: fixed
Status: assignedclosed

In 4bbc27c:

Fixed #36492 -- Restored exact boolean lookup against literals on SQLite.

Performance regression in 37e6c5b on SQLite. Just like MySQL, and presumably
Oracle, which don't have a native boolean type and incidently store booleans in
integer columns, indices on such columns cannot be used when explicit boolean
literal equalities are omitted.

Adapt the logic introduced by refs #32691 for MySQL to be used for all backends
that don't support native boolean fields instead of special casing MySQL,
SQLite, and Oracle in their own special way.

Note that review of this work surfaced that SQLite's query planner also cannot
make use of indices when dealing with expressions of form

WHERE NOT (indexed_bool_field = false)

but that's a long standing problem unrelated to the restorative work performed
in this patch.

Thanks Klaas van Schelven for the report.

comment:7 by Jacob Walls, 2 days ago

Triage Stage: AcceptedReady for checkin

comment:8 by Jacob Walls <jacobtylerwalls@…>, 2 days ago

In ecc51822:

[6.1.x] Fixed #36492 -- Restored exact boolean lookup against literals on SQLite.

Performance regression in 37e6c5b on SQLite. Just like MySQL, and presumably
Oracle, which don't have a native boolean type and incidently store booleans in
integer columns, indices on such columns cannot be used when explicit boolean
literal equalities are omitted.

Adapt the logic introduced by refs #32691 for MySQL to be used for all backends
that don't support native boolean fields instead of special casing MySQL,
SQLite, and Oracle in their own special way.

Note that review of this work surfaced that SQLite's query planner also cannot
make use of indices when dealing with expressions of form

WHERE NOT (indexed_bool_field = false)

but that's a long standing problem unrelated to the restorative work performed
in this patch.

Thanks Klaas van Schelven for the report.

Backport of 4bbc27c8686f10f9556cef02dbfa9f5157fbcf56 from main.

Note: See TracTickets for help on using tickets.
Back to Top