Opened 11 years ago
Closed 8 years ago
#21160 closed Bug (fixed)
in_bulk() fails on SQLite when passing more than 999 ids
Reported by: | Andrei Picus | Owned by: | Mariusz Felisiak |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | in_bulk sqlite 999 1000 |
Cc: | | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
As we all know, SQLite doesn't allow more than 999 variables in a single query and fails with a 'too many SQL variables' error if you pass more. This is a problem when calling the in_bulk() method with a large list of ids.
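For illustration, a minimal reproduction sketch (the Note model and the id count are made up; any model and any list of more than 999 ids against a SQLite database behaves the same way):

```python
# Hypothetical model and ids, purely to illustrate the failure on SQLite.
from myapp.models import Note  # assumed example model

ids = list(range(1, 1500))
# in_bulk() builds a single "WHERE id IN (?, ?, ...)" query with one bound
# parameter per id, so on SQLite this raises
# django.db.utils.OperationalError: too many SQL variables
notes_by_id = Note.objects.in_bulk(ids)
```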
I've looked through the code and found that all backends define a bulk_batch_size() operation, and the SQLite backend caps it at 999.
https://github.com/django/django/blob/master/django/db/backends/sqlite3/base.py#L140
This operation is currently only used in _batched_insert(). I've written a patch that makes use of it in the in_bulk() method as well. You can check it here: https://github.com/django/django/pull/1679
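Roughly, the idea is to split the id list into backend-sized chunks and merge the per-chunk results. A minimal sketch of that batching (not the actual code from the PR; the helper name and the exact way bulk_batch_size() is called here are illustrative only):

```python
from django.db import connections

def in_bulk_batched(queryset, id_list):
    """Illustrative helper: fetch objects by pk in backend-sized batches."""
    if not id_list:
        return {}
    connection = connections[queryset.db]
    # SQLite's bulk_batch_size() keeps each batch under its 999-variable
    # limit; other backends return a size covering the whole list.
    batch_size = connection.ops.bulk_batch_size(['pk'], id_list)
    result = {}
    for start in range(0, len(id_list), batch_size):
        batch = id_list[start:start + batch_size]
        result.update({obj.pk: obj for obj in queryset.filter(pk__in=batch)})
    return result
```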
I've also written two tests: one passes a large number of ids to in_bulk(), and the other checks that the lookup stays efficient in that case. The efficiency test runs only on databases that don't support 1000 variables in a query.
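A rough sketch of what such tests might look like (the Note model is hypothetical and the actual tests in the PR may be organized differently; the second test would additionally need to be skipped on backends without a parameter limit):

```python
from django.test import TestCase
from myapp.models import Note  # assumed example model

class InBulkLargeIdListTests(TestCase):
    def test_in_bulk_with_more_than_999_ids(self):
        notes = [Note.objects.create() for _ in range(1001)]
        result = Note.objects.in_bulk([note.pk for note in notes])
        self.assertEqual(len(result), 1001)

    def test_in_bulk_batches_queries(self):
        # On a backend limited to 999 parameters, 1001 ids should be
        # fetched in exactly two batched queries.
        notes = [Note.objects.create() for _ in range(1001)]
        ids = [note.pk for note in notes]
        with self.assertNumQueries(2):
            Note.objects.in_bulk(ids)
```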
Any comments are welcome.
Change History (11)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
Severity: | Release blocker → Normal |
---|
comment:3 by , 11 years ago
Description: | modified (diff) |
---|
comment:4 by , 11 years ago
Triage Stage: | Unreviewed → Accepted |
---|
At first sight, the patch looks reasonable.
comment:6 by , 10 years ago
Patch needs improvement: | set |
---|
An updated patch can be found at https://github.com/django/django/pull/3046.
The usage of bulk_batch_size() in the patch is incorrect. The method is documented to receive a list of objects, but for this ticket's use case it receives an iterable of field values. We can either change the method definition so that it accepts either an iterable of field values or a list of objects, or we can add a new "max_query_params" feature and use that instead.
I slightly prefer the approach of adding max_query_params to the database features. Then supports_1000_query_params can become a property calculated from max_query_params, and the base bulk_batch_size() can use the max_query_params feature as well.
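A minimal sketch of that suggestion, with a made-up class layout; the attribute names follow this comment, not any particular Django release:

```python
class BaseDatabaseFeatures:
    # None means the backend has no practical limit on query parameters.
    max_query_params = None

    @property
    def supports_1000_query_params(self):
        return self.max_query_params is None or self.max_query_params >= 1000


class SQLiteDatabaseFeatures(BaseDatabaseFeatures):
    # SQLite's historical SQLITE_MAX_VARIABLE_NUMBER default.
    max_query_params = 999


def bulk_batch_size(features, fields, objs):
    # The base implementation could derive the batch size from
    # max_query_params instead of each backend hard-coding its own limit.
    if features.max_query_params is None:
        return len(objs)
    return max(features.max_query_params // max(len(fields), 1), 1)
```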
comment:7 by , 8 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:9 by , 8 years ago
Patch needs improvement: | unset |
---|
comment:10 by , 8 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Pull request: https://github.com/django/django/pull/1679
For the life of me, I can't find a way to edit the original description.