Opened 18 months ago

Last modified 8 months ago

#21160 new Bug

in_bulk() fails on SQLite when passing more than 999 ids

Reported by: NiGhTTraX Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: in_bulk sqlite 999 1000
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by EvilDMP)

As we all know, SQLite doesn't allow more than 999 variables in a query, failing with the 'too many SQL variables' error if you pass in more. This is a problem when using the in_bulk() method.

I've looked through the code and found out that all backends define a bulk_batch_size ops feature and it is set to 999 in the SQLite backend.
https://github.com/django/django/blob/master/django/db/backends/sqlite3/base.py#L140

This ops feature is only used in _batched_insert. I've written a patch that makes use of it in the in_bulk method as well. You can check it here: https://github.com/django/django/pull/1679

I've also written two tests that test passing in a large number of ids and also the efficiency in this case. The efficiency test runs only on databases that don't support 1000 variables in a query.

Any comments are welcome.

Change History (6)

comment:1 Changed 18 months ago by NiGhTTraX

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Pull request: https://github.com/django/django/pull/1679

For the love of me I can't find the way to edit the original description.

comment:2 Changed 18 months ago by timo

  • Severity changed from Release blocker to Normal

comment:3 Changed 18 months ago by EvilDMP

  • Description modified (diff)

comment:4 Changed 18 months ago by aaugustin

  • Triage Stage changed from Unreviewed to Accepted

At first sight, the patch looks reasonable.

comment:5 Changed 15 months ago by NiGhTTraX

Any other thoughts on this?

comment:6 Changed 8 months ago by akaariai

  • Patch needs improvement set

Updated patch can be found from https://github.com/django/django/pull/3046.

The usage of bulk_batch_size() is incorrect in the patch. The method is documented to receive a list of objects, but for the use case in this ticket it receives an iterable of field values. We can either change the method definition so that it can receive either an iterable of field values, or a list of objects. Or we can just add a new feature "max_query_params" and use that instead.

I slightly prefer the approach of adding max_query_params in to database features. Then we can make supports_1000_query_params to be a property which is calculated based on max_query_params, and also make the base bulk_batch_size() to use the max_query_params feature.

Note: See TracTickets for help on using tickets.
Back to Top