Opened 9 years ago
Last modified 3 weeks ago
#27833 new Bug
prefetch_related fails with SQLite when used with 1000 parent records
| Reported by: | Jason Barnabe | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.10 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
This is described in ticket:16937#comment:3, but I don't see any issue filed for it.
"[prefetch_related] will not work on some backends if you have a lot of objects in your queryset. For example the SQLite backend has a limitation of 999 parameters to single SQL query, so if you have 1000 objects, prefetch_related will fail as you need to supply 1000 id values to the query."'
Batch it up like in #16426 and #17788, which dealt with the same limitation?
Change History (15)
comment:1 by , 9 years ago
| Description: | modified (diff) |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
comment:2 by , 9 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:3 by , 9 years ago
| Has patch: | set |
|---|
comment:4 by , 9 years ago
| Patch needs improvement: | set |
|---|
comment:5 by , 9 years ago
| Patch needs improvement: | unset |
|---|
comment:6 by , 8 years ago
| Patch needs improvement: | set |
|---|
comment:7 by , 8 years ago
| Patch needs improvement: | unset |
|---|
comment:8 by , 8 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
comment:9 by , 8 years ago
| Triage Stage: | Ready for checkin → Accepted |
|---|
comment:10 by , 8 years ago
| Patch needs improvement: | set |
|---|
comment:11 by , 3 years ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
comment:12 by , 2 years ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:13 by , 10 months ago
| Owner: | removed |
|---|---|
| Status: | assigned → new |
Two notes worth of mention here.
- In the case of records referenced through multiple fields (e.g.
ForeignObject(from_felds, to_fields)where more than one field is involved) then the problem can manifest itself with(1000 / len(fields)) + 1records as SQLite doesn't support tuple comparison and thus must expand(field0, ..., fieldN) IN %sinto(field0 = v0_0 ..., fieldN = vn_0) ... OR ... (field0 = v0_m ..., fieldN = vn_m). See #36116 where this was uncovered. - Now that we changed the prefetching interface from
get_prefetch_querysettoget_prefetch_querysetsin #33651 it paves the way for prefetcher implementations to be adapted to sliceinstancesbased onconnection.ops.bulk_batch_sizewhich makes solving this ticket way easier.
comment:14 by , 10 months ago
Another important point is that SQLite 3.32 (released on 2020-05-22) increased the default upper bound on the number of parameters from 999 to 32766 so while there is still a limit that warrants the usage of bulk_batch_size users are way less likely to run into this issue than before.
This was discovered by Sarah while working on a patch for ticket-36118.
As mentioned in the PR, I think there is one major issue:
prefetch_relatedexpects the results of the prefetch query to be unique.Splitting the query in batches do not enforce uniqueness and can lead to incorrect results. In short, if related objects are shared between several instances, when instances are split into batches, nothing prevents a related objects from appearing twice in the results. For example:
Prefetching in batches of 2 results in prefetch queries for:
[A, B]then[C].Which gives the following mapping of related objects:
[(A, 1), (B, 2)]then[(A, 1)]. Therel_obj_cachewould then be:A more detailed example is available on the PR.
Maybe the way forward is to use a set or a dict to store the related queries' results (either for
all_related_objectsor forrel_obj_cache).