Opened 13 years ago
Closed 13 years ago
#16378 closed Bug (fixed)
locmem should use pickle.HIGHEST_PROTOCOL to match memcached et. al.
Reported by: | Paul McMillan | Owned by: | Paul McMillan |
---|---|---|---|
Component: | Core (Cache system) | Version: | dev |
Severity: | Release blocker | Keywords: | |
Cc: | kubo@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
Currently the locmem cache backend does not specify a pickle protocol, meaning it defaults to the first version. There are subtle differences between the protocol versions, with some objects pickleable in one version but not another (see #15863). Since HIGHEST_PROTOCOL is what we expect to use in production, we should make sure that all the caching backends behave the same way.
Attachments (1)
Change History (13)
comment:1 by , 13 years ago
Easy pickings: | set |
---|---|
Has patch: | set |
Triage Stage: | Unreviewed → Accepted |
by , 13 years ago
Attachment: | 16378.diff added |
---|
comment:2 by , 13 years ago
Cc: | added |
---|
comment:3 by , 13 years ago
Has patch: | unset |
---|---|
Triage Stage: | Accepted → Unreviewed |
Version: | 1.3 → SVN |
It seems like this ticket should be expanded to change all uses of pickle.dumps() rather than just locmem and dbcache - neither of which are ever used by performance-sensitive people anyway.
comment:4 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
@adamnelson: Changing a ticket from Accepted to Unreviewed isn't usually done unless the original acceptance was grossly wrong.
I agree with the expansion of the scope - Django should use pickle.HIGHEST_PROTOCOL everywhere it uses pickle.
The reason this ticket was opened re: locmem and dbcache was not about performance, but consistency. In the current configuration, it's possible to have code that works just fine in testing with locmem that mysteriously stops working in production on memcached.
comment:5 by , 13 years ago
There's no other solution to this situation with the current triage process.
Anyway, it looks like we're in agreement. I only expanded the scope because it didn't make sense to talk about consistency while at the same time having hard coded protocol numbers (or default values) scattered around the codebase.
comment:6 by , 13 years ago
Usually I just leave the ticket as accepted and add the suggestion. I'm glad you pointed that we have this problem in other places... I'm a big fan of consistency.
comment:8 by , 13 years ago
Interestingly the base64 of the binary pickle versions is actually larger than the older text-formats, based on some rude tests i made up. If any change were made to the pickle format used in the cache, it should be to not base64 the DB cache backend and store the pickle data into proper blobs that can handle binary data.
comment:9 by , 13 years ago
Owner: | changed from | to
---|
comment:10 by , 13 years ago
Owner: | changed from | to
---|
comment:11 by , 13 years ago
Owner: | changed from | to
---|---|
Severity: | Normal → Release blocker |
Marking this as a release blocker and assigning to myself.
Python's docs give no hint that some objects may be serializable with protocol 0 but not with protocol 2; if this occurs, it's a bug in Python. For this reason, I think we can change the protocol — it should have been set to
pickle.HIGHEST_PROTOCOL
in the first place — and we don't need to mention it in the release notes.Attached patch makes all calls to
pickle.dumps
consistent.The constant
2
was used indjango/core/cache/backends/db.py
. It's been there "forever" (> 5 years). Until Python 2.3pickle.HIGHEST_PROTOCOL
didn't exist; it must have been common practice at the time to use2
. I didn't find any reason why we shouldn't switch to protocol3
, if it was ever introduced, so I putpickle.HIGHEST_PROTOCOL
instead.