Opened 9 years ago

Closed 9 years ago

#1664 closed defect (wontfix)

Lazy evaluation doesn't work properly for returned Unicode objects

Reported by: Noah Slater
Owned by: hugo
Component: Internationalization
Version:
Severity: normal
Keywords:
Cc: nslater@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

When you use utils.functional.lazy() to defer the return of a Unicode object, Python's deep Unicode magic breaks data access.

Python uses a backdoor to check whether an object is a Unicode object and will call special methods if it is.

The following will work (the Unicode string is not encoded):

unicode_string = u"\xe9"
another_unicode_string = "%s" % unicode_string

This is because in this instance Python spots the Unicode object and calls the __unicode__ method.

The following will not work (Python tries to encode the Unicode string to ASCII, or whatever the default encoding is):

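from django.utils.functional import lazy

def get_text(text):
    return text

# Declare that the lazy result should behave like a unicode object.
lazy_get_text = lazy(get_text, unicode)

lazy_unicode_string = lazy_get_text(u"\xe9")

# This is where the implicit encode to the default encoding is attempted.
another_unicode_string = "%s" % lazy_unicode_string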

This doesn't work because the ACTUAL type of lazy_unicode_string is __proxy__ (a Promise), not unicode. Python sees this and calls the __str__ method, which forces an attempt to encode the Unicode value to ASCII, or whatever the default encoding is.
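The dispatch described above can be reproduced without Django. The following is only a minimal sketch: LazyProxy and its calls attribute are hypothetical names invented for illustration and are not Django's __proxy__ implementation. It is written to run on Python 3, where str is already Unicode, so the formatting succeeds. On Python 2 the str() call inside __str__ would be the point where the implicit ASCII encode is attempted.

```python
class LazyProxy(object):
    """Hypothetical stand-in for a lazy proxy; not Django's implementation."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self.calls = []  # records which special methods the interpreter invoked

    def _resolve(self):
        # Evaluate the deferred call only when the value is actually needed.
        return self._func(*self._args)

    def __str__(self):
        self.calls.append("__str__")
        # On Python 2, str() of a non-ASCII unicode value is where the
        # implicit encode to the default encoding would be attempted.
        return str(self._resolve())


def get_text(text):
    return text


lazy_value = LazyProxy(get_text, u"\xe9")

# The proxy is not an instance of the text type, so "%s" formatting
# falls back to calling __str__ on it.
formatted = "%s" % lazy_value
```

The point of the sketch is that the formatting operator only ever sees the proxy's type, so whatever the wrapped function would have returned plays no part in choosing which special method is called.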

I can't see a solution to this, but I thought I would raise a ticket just so you guys were aware. I have seen that the issue of full internal Unicode usage is being discussed, and this is quite relevant, I should imagine.

This issue is covered in PEP 349. It's worth noting that a patch DOES exist for Python which would remove these problems, though I do not know when, or in fact if, it will be rolled into Python proper.

Attachments (1)

unicode.py (1.3 KB) - added by Noah Slater 9 years ago.
A set of classes and tests for lazy Unicode evaluation


Change History (4)

comment:1 Changed 9 years ago by anonymous

  • Cc nslater@… added

comment:2 Changed 9 years ago by Noah Slater

I have been struggling with this for a few days now.

I have a solution now which doesn't involve using the standard Django lazy method. Instead I have created a set of classes for Unicode containment and lazy evaluation.

I will attach a sample of my solution in case this helps you guys.

Changed 9 years ago by Noah Slater

A set of classes and tests for lazy Unicode evaluation

comment:3 Changed 9 years ago by hugo

  • Resolution set to wontfix
  • Status changed from new to closed

Well, changing to a completely different lazy handling with wrapper classes doesn't cut the cake in a nice way, and the other solution, fixing Python with a patch, is outside the scope of Django. I am closing this ticket for now, since it will be fixed either by the switch to internal Unicode handling or maybe by some future Python release. Since Django for now internally mostly uses UTF-8 encoded bytestrings, this shouldn't be a big problem anyway (and I haven't had any report on it besides this ticket).
