Opened 18 years ago

Closed 17 years ago

#1664 closed defect (wontfix)

Lazy evaluation doesn't work properly for returned Unicode objects

Reported by: Noah Slater
Owned by: hugo
Component: Internationalization
Version:
Severity: normal
Keywords:
Cc: nslater@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

When you use utils.functional.lazy() to defer the return of a Unicode object, Python's deep Unicode magic screws up data access.

Python uses a backdoor to check whether an object is a Unicode object and, if so, calls special methods.

The following will work (the Unicode string is not encoded):

unicode_string = u"\xe9"
another_unicode_string = "%s" % unicode_string

This is because in this instance Python spots the Unicode object and calls the __unicode__ method.

The following will not work (Python tries to encode the Unicode string to 'ASCII' or whatever the default is):

from django.utils.functional import lazy

def get_text(text):
    return text

# The second argument names the type the wrapped function returns.
lazy_get_text = lazy(get_text, unicode)

lazy_unicode_string = lazy_get_text(u"\xe9")

# Fails: Python calls __str__ on the proxy, which tries to encode the string.
another_unicode_string = "%s" % lazy_unicode_string

This doesn't work because the ACTUAL type of lazy_unicode_string is __proxy__ or Promise, not unicode. Python sees this and calls the __str__ method on the Unicode object, which forces an attempt to encode to 'ASCII' or whatever the default is.
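The dispatch described above can be sketched as follows. This is only an illustration under stated assumptions, not Django's implementation: LazyProxy is a hypothetical stand-in for the __proxy__ class, and the sketch is written for Python 3, where str is already Unicode, so the failure itself does not reproduce. The ASCII encode that Python 2 performs implicitly is therefore done explicitly at the end.

```python
class LazyProxy:
    """Hypothetical stand-in for the __proxy__ class; not Django's code."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self.str_calls = 0  # counts how often formatting goes through __str__

    def __str__(self):
        # The wrapped function is only evaluated when the text is needed.
        self.str_calls += 1
        return str(self._func(*self._args))


def get_text(text):
    return text


lazy_value = LazyProxy(get_text, "\xe9")

# The proxy is not a string instance, so a type check cannot recognise it.
is_real_string = isinstance(lazy_value, str)

# "%s" formatting uses the generic protocol and calls __str__ on the proxy.
formatted = "%s" % lazy_value

# Assumption: this mirrors the implicit default-codec encode that Python 2
# applies to the result of __str__; here it is spelled out by hand.
try:
    "\xe9".encode("ascii")
    ascii_encode_failed = False
except UnicodeEncodeError:
    ascii_encode_failed = True
```

The point of the sketch is that the formatting operator only ever sees the proxy type, so any special handling keyed on the real string type is bypassed and everything goes through __str__.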

I can't see a solution to this - but I thought I would raise a ticket just so you guys were aware. I have seen that the issue of full internal Unicode usage is being discussed and this is quite relevant, I should imagine.

This issue is covered in PEP 349. It's worth noting that a patch DOES exist for Python which would remove these problems - though I do not know when, or in fact if, it will be rolled into Python proper.

Attachments (1)

unicode.py (1.3 KB) - added by Noah Slater 18 years ago.
A set of classes and tests for lazy Unicode evaluation

Change History (4)

comment:1 by anonymous, 18 years ago

Cc: nslater@… added

comment:2 by Noah Slater, 18 years ago

I have been struggling with this for a few days now.

I have a solution now which doesn't involve using the standard Django lazy method. Instead I have created a set of classes for Unicode containment and lazy evaluation.

I will attach a sample of my solution in case this helps you guys.

by Noah Slater, 18 years ago

Attachment: unicode.py added

A set of classes and tests for lazy Unicode evaluation

comment:3 by hugo, 17 years ago

Resolution: wontfix
Status: new → closed

Well, changing to a completely different lazy handling with wrapper classes doesn't cut the cake in a nice way, and the other solution, fixing Python with a patch, is outside the scope of Django. I'm closing this ticket for now, since it will be fixed either by the switch to internal Unicode handling or maybe by some future Python release. Since Django for now mostly uses UTF-8 encoded bytestrings internally, this shouldn't be a big problem anyway (and I haven't had any report of it besides this ticket).
