Opened 9 years ago

Closed 9 years ago

#5640 closed (fixed)

Request to improve error handling in utils.encoding.force_unicode.

Reported by: frank.hoffsummer@…
Owned by: Malcolm Tredinnick
Component: Core (Other)
Version: master
Severity:
Keywords: unicode, force_unicode, UnicodeDecodeError
Cc:
Triage Stage: Design decision needed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description (last modified by Malcolm Tredinnick)

Migrating a large Django project to a trunk version after r5609 (i.e. after the Unicode branch was merged) can be very painful due to a bug(?!) in utils.encoding.force_unicode.

Basically, following the instructions here is the right thing to do, but in a larger project it is very difficult to catch all the strings that might need a u'' prefix.

Make the slightest oversight and you lose the ability to debug your models/views/templates, as utils.encoding.force_unicode will throw a UnicodeDecodeError exception like this one:


ProcessId:      9493
Interpreter:    'mcc'

ServerName:     ''
DocumentRoot:   '/Library/WebServer/Documents'

URI:            '/kluster/errors/'
Location:       '/kluster'
Directory:      None
Filename:       '/Library/WebServer/Documents/kluster'
PathInfo:       '/errors/'

Phase:          'PythonHandler'
Handler:        'django.core.handlers.modpython'

Traceback (most recent call last):

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/mod_python/", line 1537, in HandlerDispatch
    default=default_handler, arg=req, silent=hlist.silent)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/mod_python/", line 1229, in _process_target
    result = _execute_target(config, req, object, arg)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/mod_python/", line 1128, in _execute_target
    result = object(arg)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/core/handlers/", line 178, in handler
    return ModPythonHandler()(req)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/core/handlers/", line 151, in __call__
    response = self.get_response(request)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/core/handlers/", line 53, in get_response
    response = self._real_get_response(request)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/core/handlers/", line 115, in _real_get_response
    return debug.technical_500_response(request, *sys.exc_info())

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/views/", line 151, in technical_500_response
    return HttpResponseServerError(t.render(c), mimetype='text/html')

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 176, in render
    return self.nodelist.render(context)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 732, in render
    bits.append(self.render_node(node, context))

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 745, in render_node
    return node.render(context)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 135, in render

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 229, in render
    return self.nodelist_true.render(context)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 732, in render
    bits.append(self.render_node(node, context))

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 745, in render_node
    return node.render(context)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 135, in render

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 781, in render
    return self.filter_expression.resolve(context)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 599, in resolve
    obj = func(obj, *arg_vals)

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/template/", line 25, in _dec
    args[0] = force_unicode(args[0])

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/utils/", line 41, in force_unicode
    s = unicode(s, encoding, errors)

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3679-3681: invalid data

From this traceback, it is very difficult for laymen (like me) to figure out what is wrong and trace the cause of the error (in my case: a non-ASCII string that didn't have a u in front of it). This severely hampers the migration of larger projects to Django trunk post r5609.

My patch just catches the exception in force_unicode and enforces unicode conversion with the 'ignore' flag. This is certainly not the optimal solution, but at least it allows for debugging models/views/templates!
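The attached diff is not reproduced in this ticket, so the following is only a rough sketch of the approach described above, not the patch itself. It is written for Python 3, where bytes.decode() plays the role of the unicode(s, encoding, errors) call in the traceback; the name lenient_force_text is hypothetical.

```python
def lenient_force_text(s, encoding="utf-8", errors="strict"):
    """Hypothetical stand-in for force_unicode (not Django's code)."""
    if isinstance(s, str):
        return s
    if not isinstance(s, bytes):
        return str(s)
    try:
        return s.decode(encoding, errors)
    except UnicodeDecodeError:
        # Fall back to silently dropping undecodable bytes.
        # This keeps the page rendering but hides the bad data.
        return s.decode(encoding, "ignore")


# Valid UTF-8 decodes normally.
good = lenient_force_text("caf\u00e9".encode("utf-8"))

# A lone Latin-1 byte (0xE9) is invalid as UTF-8 and gets dropped.
lossy = lenient_force_text(b"caf\xe9 au lait")
```

Note that the second call returns text with a character missing and gives no hint that anything went wrong.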

Attachments (1)

force_unicode.diff (588 bytes) - added by frank.hoffsummer@… 9 years ago.
patch for utils.encoding.force_unicode


Change History (3)

Changed 9 years ago by frank.hoffsummer@…

Attachment: force_unicode.diff added

patch for utils.encoding.force_unicode

comment:1 Changed 9 years ago by Malcolm Tredinnick

Description: modified (diff)
Has patch: unset
Needs documentation: unset
Needs tests: unset
Owner: changed from nobody to Malcolm Tredinnick
Patch needs improvement: unset
Summary: changed from "utils.encoding.force_unicode should never throw UnicodeDecodeError" to "Request to improve error handling in utils.encoding.force_unicode."
Triage Stage: changed from Unreviewed to Design decision needed

Hiding genuine errors is not the solution here, so this patch can't go in. Porting an application is a one-time job, so I'm reluctant to even add a setting for this.

The solution is probably to write your own version of force_unicode and assign that to encoding.force_unicode during your porting phase. We might be able to make the error a bit more self-helpful, though.

Changing the title to reflect the real issue, since force_unicode should raise UnicodeDecodeError when there's a problem. That's not a bug; the data it's being given is invalid.
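As a minimal sketch of the "write your own version and assign it during porting" suggestion, assuming Python 3: the encoding object below is a stand-in namespace so the example runs without Django installed, and strict_force_text and porting_force_text are hypothetical names. In a real project the assignment would target django.utils.encoding instead.

```python
import logging
import types

log = logging.getLogger("porting")

# Stand-in for the django.utils.encoding module (assumption, so that
# the sketch is runnable on its own).
encoding = types.SimpleNamespace()


def strict_force_text(s, encoding_name="utf-8", errors="strict"):
    """Plays the role of the original, strict force_unicode."""
    if isinstance(s, str):
        return s
    if not isinstance(s, bytes):
        return str(s)
    return s.decode(encoding_name, errors)


encoding.force_unicode = strict_force_text


def porting_force_text(s, encoding_name="utf-8", errors="strict"):
    """Temporary replacement: log the bad input, then decode leniently."""
    try:
        return strict_force_text(s, encoding_name, errors)
    except UnicodeDecodeError:
        log.warning("undecodable bytestring: %r", s)
        return s.decode(encoding_name, "replace")


# Swap the function in for the duration of the porting phase only.
encoding.force_unicode = porting_force_text
```

One caveat with this kind of patching: any module that already ran "from ... import force_unicode" holds a reference to the original function, so the assignment has to happen before those imports to take effect there.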

comment:2 Changed 9 years ago by Malcolm Tredinnick

Resolution: fixed
Status: changed from new to closed

(In [6649]) Fixed #5640 -- Added some extra error reporting when smart_unicode() or
force_unicode() raise a UnicodeDecodeError. This should at least help people
identify which is the bad piece of data they passed in. About the best we can
do here.
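The general idea of reporting the offending data can be sketched as follows. This is an illustration in Python 3 under assumed names (VerboseUnicodeDecodeError, reporting_force_text), not the code committed in [6649]: the error is still raised, but its message now includes the object that failed to decode.

```python
class VerboseUnicodeDecodeError(UnicodeDecodeError):
    """Hypothetical UnicodeDecodeError that remembers the bad input."""

    def __init__(self, obj, *args):
        self.obj = obj
        super().__init__(*args)

    def __str__(self):
        original = super().__str__()
        return "%s. You passed in %r (%s)" % (
            original, self.obj, type(self.obj))


def reporting_force_text(s, encoding="utf-8", errors="strict"):
    """Decode strictly, but attach the input to any decode error."""
    if isinstance(s, str):
        return s
    if not isinstance(s, bytes):
        return str(s)
    try:
        return s.decode(encoding, errors)
    except UnicodeDecodeError as e:
        # e.args is (encoding, object, start, end, reason).
        raise VerboseUnicodeDecodeError(s, *e.args)
```

Because the new class subclasses UnicodeDecodeError, existing "except UnicodeDecodeError" handlers keep working, and genuine errors are not hidden.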
