Lazy evaluation doesn't work properly for returned Unicode objects
|Reported by:||Noah Slater||Owned by:||hugo|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
When you use utils.functional.lazy() to defer the return of a Unicode object, Python's deep Unicode magic screws up data access.
Python uses a backdoor to check that a class is a Unicode object and will call special methods if this is so.
The following will work (the Unicode string is not encoded):
unicode_string = u"\xe9"
another_unicode_string = "%s" % unicode_string
This is because in this instance Python spots the Unicode object and calls the
The following will not work (Python tries to encode the Unicode string to 'ASCII' or whatever the default is):
from django.utils.functional import lazy

def get_text(text):
    return text

lazy_get_text = lazy(get_text, unicode)
lazy_unicode_string = lazy_get_text(u"\xe9")
another_unicode_string = "%s" % lazy_unicode_string
The reason this doesn't work is that the ACTUAL type of lazy_unicode_string is __proxy__ or Promise. Python sees this and calls the __str__ method on the Unicode object, which forces an attempt to encode to 'ASCII' or whatever the default is.
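
To illustrate the type mismatch, here is a minimal sketch of a proxy object. It is not Django's implementation: LazyProxy and its str_calls counter are hypothetical names made up for this example, and it is written for Python 3, where str is the Unicode type, so it shows only the dispatch through __str__ and not the Python 2 encoding error itself.

```python
class LazyProxy:
    """Hypothetical stand-in for the object lazy() returns (not Django's code)."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self.str_calls = 0  # counts how often formatting goes through __str__

    def __str__(self):
        # The wrapped function is only evaluated here, on demand.
        self.str_calls += 1
        return str(self._func(*self._args))


def get_text(text):
    return text


real = "\xe9"
proxy = LazyProxy(get_text, "\xe9")

# The proxy is not an instance of the text type, so it gets no special treatment.
is_real_text = isinstance(real, str)
is_proxy_text = isinstance(proxy, str)

# Formatting the proxy with %s therefore goes through its __str__ method.
formatted = "%s" % proxy
```

The interpreter only ever sees the proxy class, so whatever special handling it reserves for real text objects never applies to a lazily wrapped one.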
I can't see a solution to this, but I thought I would raise a ticket just so you guys were aware. I have seen that the issue of full internal Unicode usage is being discussed, and this is quite relevant, I should imagine.
This issue is covered in PEP 349. It's worth noting that a patch DOES exist for Python which would remove these problems, though I do not know when, or in fact if, it will be rolled into Python proper.