Opened 3 months ago

Last modified 3 weeks ago

#36430 closed Cleanup/optimization

bulk_batch_size() special-cases single field on SQLite according to outdated limit — at Version 2

Reported by: Jacob Walls
Owned by:
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: SQLITE_MAX_COMPOUND_SELECT, bulk_create
Cc: Sage Abdullah
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Jacob Walls)

On SQLite, bulk_batch_size() special-cases a field list of length 1 and applies the SQLITE_MAX_COMPOUND_SELECT limit to arrive at a value of 500.
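
For reference, the special case looks roughly like this (a paraphrase of DatabaseOperations.bulk_batch_size() in django/db/backends/sqlite3/operations.py, not the exact source):

    # Rough paraphrase of the SQLite backend's bulk_batch_size();
    # see django/db/backends/sqlite3/operations.py for the real thing.
    def bulk_batch_size(self, fields, objs):
        if len(fields) == 1:
            # Single field: capped at 500, the compile-time default of
            # SQLITE_MAX_COMPOUND_SELECT.
            return 500
        elif len(fields) > 1:
            # Multiple fields: limited only by the number of query
            # parameters (max_query_params, 999 by default on SQLite).
            return self.connection.features.max_query_params // len(fields)
        return len(objs)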

I think this must date from before bulk inserts used the VALUES syntax, which became available in SQLite in 2012 (see the related discussion in Laravel).
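
For context: before multi-row VALUES landed in SQLite 3.7.11 (2012), the usual bulk-insert workaround was a compound SELECT joined with UNION ALL, where each SELECT term counted against SQLITE_MAX_COMPOUND_SELECT. Illustrative SQL, not taken from either project:

    -- Old workaround: one SELECT per row; each term counts against
    -- SQLITE_MAX_COMPOUND_SELECT (default 500).
    INSERT INTO bulk_create_country (name)
    SELECT 'Country 0' UNION ALL SELECT 'Country 1' UNION ALL SELECT 'Country 2';

    -- Modern form (SQLite 3.7.11+): a single VALUES list, bounded by the
    -- query-parameter limit instead when placeholders are used.
    INSERT INTO bulk_create_country (name)
    VALUES ('Country 0'), ('Country 1'), ('Country 2');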

When the list of fields is longer than one, we go through the elif branch and arrive at much higher limits. I'm fairly sure this shows that the limit of 500 for fields=["pk"] is overly protective and can simply be removed.
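
Concretely, with Django's SQLite default of max_query_params = 999 (a sketch of the arithmetic, assuming stock compile-time limits):

    >>> max_query_params = 999      # Django's default for SQLite
    >>> max_query_params // 3       # three fields: 333 rows per batch
    333
    >>> max_query_params // 1       # one field could batch 999 rows...
    999
    >>> # ...yet the len(fields) == 1 special case caps it at 500.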

I don't have a unit test to provide, but you can play with changing the limit from 500 to 501 and see that test_large_delete still passes (no trouble selecting 501 objects).

(I found this while trying to refactor a different call site away from max_query_params, hoping to just call bulk_batch_size(), until I saw how overly protective it was.)

I think this would be good to resolve before anyone invests effort in reading the dynamic limit for a parameter that may be irrelevant altogether.
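
For what it's worth, reading the limit at runtime is straightforward on Python 3.11+, where the stdlib sqlite3 module exposes Connection.getlimit() (a sketch, not part of any proposed patch):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    # SQLITE_LIMIT_COMPOUND_SELECT is the runtime counterpart of the
    # compile-time SQLITE_MAX_COMPOUND_SELECT ceiling.
    print(conn.getlimit(sqlite3.SQLITE_LIMIT_COMPOUND_SELECT))  # typically 500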

Change History (2)

comment:1 by Jacob Walls, 3 months ago

Description: modified (diff)

Here's a rough test after all, to demonstrate; just adjust test_max_batch_size like this:

  • tests/bulk_create/tests.py

    diff --git a/tests/bulk_create/tests.py b/tests/bulk_create/tests.py
    index d590a292de..fd30ca48f7 100644
    --- a/tests/bulk_create/tests.py
    +++ b/tests/bulk_create/tests.py
    @@ -296,7 +296,7 @@ class BulkCreateTests(TestCase):
         @skipUnlessDBFeature("has_bulk_insert")
         def test_max_batch_size(self):
             objs = [Country(name=f"Country {i}") for i in range(1000)]
    -        fields = ["name", "iso_two_letter", "description"]
    +        fields = ["pk"]
             max_batch_size = connection.ops.bulk_batch_size(fields, objs)
             with self.assertNumQueries(ceil(len(objs) / max_batch_size)):
                 Country.objects.bulk_create(objs)

The test fails because the insert was *more* efficient than expected:

======================================================================
FAIL: test_max_batch_size (bulk_create.tests.BulkCreateTests.test_max_batch_size)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/.../django/tests/bulk_create/tests.py", line 301, in test_max_batch_size
    with self.assertNumQueries(ceil(len(objs) / max_batch_size)):
         ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 1 != 2 : 1 queries executed, 2 expected
Captured queries were:
1. INSERT INTO "bulk_create_country" ("name", "iso_two_letter", "description") VALUES ('Country 0', '', ''), ('Country 1', '', ''), ('Country 2', '', ''), ('Country 3', '', ...

----------------------------------------------------------------------
Ran 2 tests in 0.025s

FAILED (failures=1)

comment:2 by Jacob Walls, 3 months ago

Description: modified (diff)