sqlite backend is using row_factory when it should be using text_factory
|Reported by:||(removed)||Owned by:||adrian|
|Component:||Database layer (models, ORM)||Version:||master|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||yes|
Currently, the sqlite backend has
    def utf8rowFactory(cursor, row):
        def utf8(s):
            if type(s) == unicode:
                return s.encode("utf-8")
            return s
        return [utf8(r) for r in row]
for row_factory. The problem is that it rebuilds every record, regardless of whether the utf8 conversion is required. Doing
Database.text_factory = lambda s:s.decode("utf-8")
limits the conversion to just TEXT objects.
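To illustrate the difference, here is a minimal sketch that counts how many values each hook touches. It assumes Python 3's sqlite3 module (where a custom text_factory receives bytes), and the table, the counter names and both factory functions are made up for the illustration; they are not taken from the attached patch.

```python
import sqlite3

# Hypothetical counters, only used to show how much work each hook does.
calls = {"values_seen_by_row_factory": 0, "values_seen_by_text_factory": 0}

def counting_row_factory(cursor, row):
    # A row factory receives every row, so a per-value conversion
    # inside it has to inspect every column of every record.
    calls["values_seen_by_row_factory"] += len(row)
    return list(row)

def counting_text_factory(raw):
    # A text factory is only invoked for TEXT values; in Python 3
    # it is handed the raw UTF-8 bytes.
    calls["values_seen_by_text_factory"] += 1
    return raw.decode("utf-8")

conn = sqlite3.connect(":memory:")
conn.row_factory = counting_row_factory
conn.text_factory = counting_text_factory

conn.execute("CREATE TABLE t (id INTEGER, score REAL, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1.5, "a"), (2, 2.5, "b")],
)

rows = conn.execute("SELECT id, score, name FROM t ORDER BY id").fetchall()
print(rows)
print(calls)
```

With two rows of three columns, the row factory sees all six values, while the text factory is called only for the two TEXT values.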
This is a bit faster. That said, I'm wondering why the conversion is forced at all: sqlite stores data in utf8, so if
Database.text_factory = str
were set, the whole decoding/encoding step would be bypassed, and the native encoding (utf8) would be passed back.
In terms of performance, the gains from using Database.text_factory = lambda s:s.decode("utf-8") are dependent upon the column types: the greater the number of non-text fields, the greater the gain.
The real gain comes from turning off the encode/decode and using str directly (underlying utf8); it gives the same gain in terms of avoiding the extra inspection, and also avoids all the extra conversion work.
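A rough sketch of the bypass variant follows. It is written for Python 3, where the byte string type is bytes rather than the Python 2 str this ticket refers to, so text_factory = bytes is used here as the assumed equivalent of text_factory = str. The table and variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.execute("INSERT INTO t VALUES (?)", ("caf\u00e9",))

# Bypass decoding: TEXT values come back as the stored UTF-8 bytes.
conn.text_factory = bytes
raw = conn.execute("SELECT name FROM t").fetchone()[0]

# Default behaviour in Python 3: TEXT values are decoded to str.
conn.text_factory = str
decoded = conn.execute("SELECT name FROM t").fetchone()[0]

print(type(raw), raw)
print(type(decoded), decoded)
```

The same query returns a different type depending on text_factory, which is the raw-SQL difference to keep in mind when choosing between the two variants.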
The only downside to either change that I can see is that raw sql queries would return str instead of sqlite's unicode. I'm not really sure this is an actual issue, however (I don't see any other such limitation in the backends).
A patch is attached for the encode/decode variant; unless there are good reasons not to, I would just bypass the encoding/decoding entirely.
Change History (7)
Changed 9 years ago by (removed)
comment:1 Changed 9 years ago by Simon G. <dev@…>
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
- Triage Stage changed from Unreviewed to Design decision needed
comment:2 Changed 9 years ago by mtredinnick
- Patch needs improvement set
- Triage Stage changed from Design decision needed to Accepted
comment:5 Changed 8 years ago by mtredinnick
- Resolution set to fixed
- Status changed from new to closed