Django

Ticket #1664 (closed: wontfix)

Opened 3 years ago

Last modified 2 years ago

Lazy evaluation doesn't work properly for returned Unicode objects

Reported by: Noah Slater
Assigned to: hugo
Milestone:
Component: Internationalization
Version:
Keywords:
Cc: nslater@gmail.com
Triage Stage: Unreviewed
Has patch: 0
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 0

Description

When you use utils.functional.lazy() to defer the return of a Unicode object, Python's deep Unicode magic breaks data access.

Python uses a backdoor to check whether an object is a Unicode object and will call special methods if it is.

The following will work (the Unicode string is not encoded):

unicode_string = u"\xe9"
another_unicode_string = "%s" % unicode_string

This is because in this instance Python spots the Unicode object and calls the __unicode__ method.

The following will not work (Python tries to encode the Unicode string to 'ASCII' or whatever the default is):

from django.utils.functional import lazy

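def get_text(text):
    return text

# The second argument is the result class the lazy proxy stands in for.
lazy_get_text = lazy(get_text, unicode)

lazy_unicode_string = lazy_get_text(u"\xe9")

# Fails: Python tries to encode the result with the default encoding.
another_unicode_string = "%s" % lazy_unicode_string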

This doesn't work because the ACTUAL type of lazy_unicode_string is __proxy__ (a Promise), not unicode. Python sees this and calls the __str__ method, which forces an attempt to encode the Unicode object to 'ASCII' or whatever the default is.
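To make the dispatch concrete, here is a minimal sketch of the mechanism. LazyProxy is a hypothetical stand-in written for this illustration; it is not Django's __proxy__ implementation, and the calls list exists only to record which special method gets invoked. The sketch is written so that it also runs on Python 3, where str is already Unicode and no encoding error arises. Under Python 2, the str() call inside __str__ would be the point where the encode to the default encoding is attempted.

```python
class LazyProxy(object):
    """Hypothetical stand-in for a lazy proxy; not Django's implementation."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self.calls = []  # records which special methods were invoked

    def _resolve(self):
        # Evaluate the deferred call only when the value is needed.
        return self._func(*self._args)

    def __str__(self):
        self.calls.append("__str__")
        # On Python 2 this str() call is where the implicit encode would happen.
        return str(self._resolve())


def get_text(text):
    return text


lazy_value = LazyProxy(get_text, u"\xe9")

# The proxy wraps a text value but is not itself an instance of the text type.
is_text = isinstance(lazy_value, type(u""))

# "%s" formatting looks at the proxy's own type and calls its __str__.
formatted = "%s" % lazy_value
```

The point of the sketch is only that string formatting decides what to do based on the proxy's real type, so the wrapped Unicode value never gets the special treatment a genuine Unicode object would.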

I can't see a solution to this, but I thought I would raise a ticket just so you guys were aware. I have seen that the issue of full internal Unicode usage is being discussed, and this is quite relevant, I should imagine.

This issue is covered in PEP 349. It's worth noting that a patch DOES exist for Python which would remove these problems, though I do not know when, or in fact if, it will be rolled into Python proper.

Attachments

unicode.py (1.3 kB) - added by Noah Slater on 04/22/06 11:48:35.
A set of classes and tests for lazy Unicode evaluation

Change History

04/20/06 20:41:52 changed by anonymous

  • cc set to nslater@gmail.com.

04/22/06 11:47:32 changed by Noah Slater

I have been struggling with this for a few days now.

I have a solution now which doesn't involve using the standard Django lazy method. Instead I have created a set of classes for Unicode containment and lazy evaluation.

I will attach a sample of my solution in case this helps you guys.

04/22/06 11:48:35 changed by Noah Slater

  • attachment unicode.py added.

A set of classes and tests for lazy Unicode evaluation

11/03/06 07:24:25 changed by hugo

  • status changed from new to closed.
  • resolution set to wontfix.

Well, changing to a completely different lazy handling with wrapper classes doesn't cut the cake in a nice way, and the other solution, fixing Python with a patch, is outside the scope of Django. I am closing this ticket for now, since it will be fixed either by the switch to internal Unicode handling or maybe by some future Python release. Since Django for now mostly uses UTF-8 encoded bytestrings internally, this shouldn't be a big problem anyway (and I haven't had any report of it besides this ticket).

