#36430 closed Cleanup/optimization (fixed)
bulk_batch_size() special-cases single field on SQLite according to outdated limit
| Reported by: | Jacob Walls | Owned by: | Jacob Walls |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | SQLITE_MAX_COMPOUND_SELECT, bulk_create |
| Cc: | Sage Abdullah | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
On SQLite, `bulk_batch_size()` special-cases a field list of length 1 and applies the `SQLITE_MAX_COMPOUND_SELECT` limit to arrive at a value of 500.
I think this must date from before bulk inserts used `VALUES` syntax, which became available in SQLite in 2012 (see the related discussion in Laravel).
When the list of fields exceeds 1, we go through the `elif` branch and arrive at much higher limits. I'm pretty sure this shows that the limit of 500 for `fields=["pk"]` is overly protective and can simply be removed.
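For reference, the branch in question looks roughly like this (my paraphrase of `django/db/backends/sqlite3/operations.py`; the exact wording varies by Django version):

```python
# Paraphrase of SQLite's DatabaseOperations.bulk_batch_size() before the fix
# (not verbatim source; details vary by version).
def bulk_batch_size(self, fields, objs):
    if len(fields) == 1:
        # The special case at issue: caps single-field batches at
        # SQLITE_MAX_COMPOUND_SELECT (500), a limit on compound SELECTs
        # that no longer applies to VALUES-based inserts.
        return 500
    elif len(fields) > 1:
        # The general case: divide the bound-parameter budget by the
        # number of parameters needed per row.
        return self.connection.features.max_query_params // len(fields)
    return len(objs)
```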
I don't have a unit test to provide, but you can play with changing the limit from 500 to 501 and see that `test_large_delete` still passes (no trouble selecting 501 objects).
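A back-of-the-envelope check (my illustration, not from the ticket) of why 501 is still safe: the binding constraint for a single-field batch is the bound-parameter limit, not the compound-SELECT limit.

```python
# My illustration (assumes the 999-variable default that older SQLite builds
# and Django's historical max_query_params reflect): a batch of 501 pks binds
# 501 parameters into one query, well under the real ceiling.
max_query_params = 999  # assumption: typical SQLITE_MAX_VARIABLE_NUMBER default
batch = 501
params_per_row = 1  # selecting/deleting by pk binds one parameter per object
assert batch * params_per_row <= max_query_params
```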
(I found this while trying to refactor a different call site away from `max_query_params`, hoping to just call `bulk_batch_size()`, until I saw how overly protective it was.)
I think it would be good to resolve this before anyone invests effort in reading the dynamic limit for this potentially irrelevant parameter.
Change History (12)
comment:1 by , 5 months ago
| Description: | modified (diff) |
|---|
comment:2 by , 5 months ago
| Description: | modified (diff) |
|---|
comment:3 by , 5 months ago
| Cc: | added |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
comment:4 by , 5 months ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
I can take a swing at this, given that we'll want to handle the cleanup mentioned above at the same time.
comment:5 by , 5 months ago
Thanks Jacob for reporting this, and thanks Sarah for cc'ing me on the ticket.
As mentioned, I did identify this in my previous PR. However, I wasn't sure why Django even uses `SQLITE_MAX_COMPOUND_SELECT` for a bulk query. I tried to look into where it might've come from, but I didn't find any good leads.
> I think this must date from before bulk inserts used `VALUES` syntax, which became available in SQLite in 2012, see discussion in Laravel.
Aaand, I think you're correct! I didn't know `VALUES` wasn't supported back then. My findings:
- The `500` limit, a.k.a. `SQLITE_MAX_COMPOUND_SELECT`, was added in 0a0a0d66b316598f7c296e8bf75749a14ce3ac49. As noted in the tests in that commit, Django used `UNION`ed selects to do the bulk insert for some reason (see the sketch after this list).
- Related: a27582484cf814554907d2d1ad077852de36963f and #19351
- Looking at c27932ec938217d4fbb0adad23c0d0708f83f690 and #33460, yep. It's pretty recent.
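To make the history concrete, here is an illustration of the two insert styles (my sketch, not from the ticket): the compound form is what `SQLITE_MAX_COMPOUND_SELECT` caps, while the multi-row `VALUES` form (SQLite ≥ 3.7.11, 2012) is constrained only by the bound-parameter limit.

```python
# My illustration of the two insert styles; runs against any modern SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT)")

# Old style: one SELECT per row, compounded with UNION ALL. The number of
# SELECT terms is capped by SQLITE_MAX_COMPOUND_SELECT (default 500).
conn.execute(
    "INSERT INTO country (name) "
    "SELECT 'A' UNION ALL SELECT 'B' UNION ALL SELECT 'C'"
)

# Modern style (SQLite >= 3.7.11): multi-row VALUES; only the number of
# bound parameters (SQLITE_MAX_VARIABLE_NUMBER) matters.
conn.execute("INSERT INTO country (name) VALUES (?), (?), (?)", ("D", "E", "F"))

print(conn.execute("SELECT COUNT(*) FROM country").fetchone()[0])  # 6
```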
If it's not needed anymore, I'm in favour of removing this. I'm not 100% sure, but it seems `bulk_batch_size` is only used for `bulk_create`, `bulk_update`, and deletion, so I don't think any of them needs to consider the "max items in a SELECT statement" limit on SQLite.
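For context, here is a paraphrase of the pattern at those call sites (my sketch, not verbatim Django source; `per_batch_op` is a hypothetical stand-in for the real per-batch operation):

```python
# Paraphrased call-site pattern (assumption: details differ across Django
# versions): ask the backend for a suggested batch size, then chunk the work.
from django.db import connection


def batched_operation(objs, fields, per_batch_op):
    # per_batch_op is a hypothetical helper standing in for the real
    # per-batch INSERT/UPDATE/DELETE.
    batch_size = max(connection.ops.bulk_batch_size(fields, objs), 1)
    for i in range(0, len(objs), batch_size):
        per_batch_op(objs[i : i + batch_size])
```

None of these statements build a compound SELECT, which is why the `SQLITE_MAX_COMPOUND_SELECT` cap looks irrelevant here.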
comment:7 by , 3 months ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
comment:8 by , 3 months ago
| Patch needs improvement: | set |
|---|---|
| Triage Stage: | Ready for checkin → Accepted |
comment:9 by , 3 months ago
| Patch needs improvement: | unset |
|---|
comment:10 by , 3 months ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
Here's a rough test after all to demo; just adjust `test_max_batch_size` like this:

`tests/bulk_create/tests.py` [diff fragment; only the tail of the fields line survived]: `…"name", "iso_two_letter", "description"]`

The test fails as it was *more* efficient than expected: