Opened 10 years ago

Closed 7 years ago

#21160 closed Bug (fixed)

in_bulk() fails on SQLite when passing more than 999 ids

Reported by: Andrei Picus
Owned by: Mariusz Felisiak
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: in_bulk sqlite 999 1000
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Daniele Procida)

As we all know, SQLite doesn't allow more than 999 variables in a single query and fails with a 'too many SQL variables' error if you pass in more. This is a problem when using the in_bulk() method with a large list of ids.
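To illustrate the failure mode, here is a minimal sketch (Book is a hypothetical model with an integer primary key; assume its table holds at least 1000 rows in an SQLite database):

    from django.db import OperationalError
    from myapp.models import Book  # hypothetical app/model, for illustration only

    try:
        # Each requested id becomes one bound parameter in the generated
        # "WHERE id IN (?, ?, ...)" clause, so 1000 ids exceed SQLite's
        # 999-parameter limit.
        books = Book.objects.in_bulk(list(range(1, 1001)))
    except OperationalError as exc:
        print(exc)  # "too many SQL variables"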

I've looked through the code and found that all backends define a bulk_batch_size ops feature, which is set to 999 in the SQLite backend.
https://github.com/django/django/blob/master/django/db/backends/sqlite3/base.py#L140

This ops feature is only used in _batched_insert. I've written a patch that makes use of it in the in_bulk method as well. You can check it here: https://github.com/django/django/pull/1679
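Roughly, the idea is to split the id list into chunks no larger than the backend's bulk_batch_size() and merge the per-chunk results. The helper below is a simplified sketch of that idea, not the code in the pull request (the patch changes QuerySet.in_bulk() itself, and passing field values as the objs argument is the point questioned later in comment 6):

    from django.db import connections

    def in_bulk_batched(queryset, id_list):
        """Sketch: fetch objects by id in batches the backend can handle."""
        if not id_list:
            return {}
        connection = connections[queryset.db]
        # bulk_batch_size() is documented to take a list of objects; here it
        # receives the id list, mirroring the approach discussed in this ticket.
        batch_size = connection.ops.bulk_batch_size(['pk'], id_list) or len(id_list)
        result = {}
        for offset in range(0, len(id_list), batch_size):
            batch = id_list[offset:offset + batch_size]
            result.update({obj.pk: obj for obj in queryset.filter(pk__in=batch)})
        return result

On SQLite this splits an oversized lookup into several smaller queries instead of one query that exceeds the parameter limit; on backends without the limit it still issues a single query.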

I've also written two tests: one that passes a large number of ids to in_bulk(), and one that checks query efficiency in that case. The efficiency test runs only on databases that don't support 1000 variables in a query.

Any comments are welcome.

Change History (11)

comment:1 by Andrei Picus, 10 years ago

Pull request: https://github.com/django/django/pull/1679

For the life of me, I can't find a way to edit the original description.

comment:2 by Tim Graham, 10 years ago

Severity: Release blocker → Normal

comment:3 by Daniele Procida, 10 years ago

Description: modified (diff)

comment:4 by Aymeric Augustin, 10 years ago

Triage Stage: Unreviewed → Accepted

At first sight, the patch looks reasonable.

comment:5 by Andrei Picus, 10 years ago

Any other thoughts on this?

comment:6 by Anssi Kääriäinen, 10 years ago

Patch needs improvement: set

An updated patch can be found at https://github.com/django/django/pull/3046.

The patch uses bulk_batch_size() incorrectly: the method is documented to receive a list of objects, but for this ticket's use case it receives an iterable of field values. We could change the method's definition so that it accepts either an iterable of field values or a list of objects, or we could add a new feature, "max_query_params", and use that instead.

I slightly prefer adding max_query_params to the database features. Then supports_1000_query_params can become a property calculated from max_query_params, and the base bulk_batch_size() can use the max_query_params feature as well.
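A sketch of that layout, assuming the names proposed here (the classes below are standalone stand-ins for illustration, not the actual per-backend Django classes):

    class DatabaseFeatures:
        # Maximum number of bound parameters per statement; None means no limit.
        # Older SQLite builds cap this at 999.
        max_query_params = 999

        @property
        def supports_1000_query_params(self):
            return self.max_query_params is None or self.max_query_params >= 1000


    class DatabaseOperations:
        def __init__(self, features):
            self.features = features

        def bulk_batch_size(self, fields, objs):
            # Each object contributes len(fields) parameters, so cap the batch
            # size accordingly; with no limit, everything fits in one batch.
            if self.features.max_query_params is None:
                return len(objs)
            return max(self.features.max_query_params // max(len(fields), 1), 1)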

comment:7 by Mariusz Felisiak, 7 years ago

Owner: changed from nobody to Mariusz Felisiak
Status: new → assigned

comment:8 by GitHub <noreply@…>, 7 years ago

In f42c7cc:

Refs #21160 -- Replaced DatabaseFeatures.supports_1000_query_parameters by a DatabaseFeatures.max_query_params.

comment:9 by Mariusz Felisiak, 7 years ago

Patch needs improvement: unset

comment:10 by Tim Graham, 7 years ago

Triage Stage: Accepted → Ready for checkin

comment:11 by GitHub <noreply@…>, 7 years ago

Resolution: fixed
Status: assigned → closed

In 1b6f05e9:

Fixed #21160 -- Fixed QuerySet.in_bulk() crash on SQLite when requesting more than 999 ids.

Thanks Andrei Picus and Anssi Kääriäinen for the initial patch
and Tim Graham for the review.
