Lazy evaluation doesn't work properly for returned Unicode objects
|Reported by:||Noah Slater||Owned by:||hugo|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
When you use utils.functional.lazy() to defer the return of a Unicode object, Python's deep Unicode magic screws up data access.
Python uses a backdoor to check that a class is a Unicode object and will call special methods if this is so.
The following will work (the Unicode string is not encoded):
    unicode_string = u"\xe9"
    another_unicode_string = "%s" % unicode_string
This is because in this instance Python spots the Unicode object and calls the __unicode__ method.
The following will not work (Python tries to encode the Unicode string to 'ASCII' or whatever the default encoding is):
    from django.utils.functional import lazy

    def get_text(text):
        return text

    # Declare unicode as the result class of the lazy call
    lazy_get_text = lazy(get_text, unicode)
    lazy_unicode_string = lazy_get_text(u"\xe9")
    another_unicode_string = "%s" % lazy_unicode_string
This doesn't work because the ACTUAL type of lazy_unicode_string is __proxy__ or Promise, not unicode. Python sees this and calls the __str__ method on the Unicode object, which forces an attempt to encode to 'ASCII' or whatever the default encoding is.
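To make the type mismatch concrete, here is a minimal sketch that does not depend on Django. LazyProxy is a hypothetical stand-in for the object lazy() returns, not Django's actual implementation. It only shows that the proxy fails the string type check and that "%s" formatting therefore goes through __str__. On Python 3 this runs cleanly because str is already Unicode; on Python 2 the str() call inside __str__ is where I would expect the encode to the default encoding to fail.

```python
class LazyProxy(object):
    """Hypothetical stand-in for the proxy returned by lazy(); not Django's code."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self.evaluated = False  # flips to True once the wrapped call has run

    def _resolve(self):
        # Run the deferred call only when the value is actually needed.
        self.evaluated = True
        return self._func(*self._args)

    def __str__(self):
        # "%s" formatting lands here because the proxy is not a string type.
        # On Python 2, str() of a non-ASCII unicode value would try to encode it.
        return str(self._resolve())


def get_text(text):
    return text


lazy_text = LazyProxy(get_text, u"\xe9")

# The proxy is not an instance of the text type, so the interpreter cannot
# treat it the way it treats a real Unicode string.
is_text = isinstance(lazy_text, type(u""))

# Formatting forces evaluation through __str__.
formatted = "%s" % lazy_text
```

The point of the sketch is the isinstance check: whatever the proxy wraps, the formatting machinery only ever sees the proxy's own type.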
I can't see a solution to this, but I thought I would raise a ticket just so you guys were aware. I have seen that the issue of full internal Unicode usage is being discussed, and this is quite relevant, I should imagine.
This issue is covered in PEP 349. It's worth noting that a patch DOES exist for Python which would remove these problems, though I do not know when, or in fact if, it will be rolled into Python proper.