Opened 8 years ago

Closed 7 years ago

#26731 closed Bug (wontfix)

UnicodeDecodeError when writing unicode to stdout of management command

Reported by: Darren Hobbs
Owned by: nobody
Component: Core (Management commands)
Version: 1.8
Severity: Normal
Keywords: py2
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description (last modified by Darren Hobbs)

In a management command on Python 2.7, writing a bytestring that contains non-ASCII characters to stdout (with self.stdout.write) raises a UnicodeDecodeError:

# coding=utf-8
from __future__ import absolute_import, unicode_literals

import sys

import pytest
from django.core.management.base import OutputWrapper
from django.utils.encoding import smart_bytes


def test_bad_unicode_names():
    bad_name = smart_bytes(u'£')
    ow = OutputWrapper(sys.stdout)
    with pytest.raises(UnicodeDecodeError):
        ow.write(bad_name)

Change History (17)

comment:1 by Darren Hobbs, 8 years ago

Description: modified

comment:2 by Tim Graham, 8 years ago

How do you end up with a situation where you cast a unicode string with non-ASCII characters to bytes?

comment:3 by Darren Hobbs, 8 years ago

The string came from the db. The actual error came from "django/core/management/base.py", line 111, in write.

I fixed my specific issue by importing unicode_literals and using self.stdout.write('{}'.format(possibly_unicode_string_from_db)). I'm afraid my understanding of Python's unicode string handling isn't great. Perhaps the answer is to update the documentation to suggest using unicode_literals in management commands. The alternative is a nasty surprise waiting to happen in production (as it did to me!).
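A minimal sketch of the kind of defensive normalization that sidesteps this class of error, written in Python 3 syntax. `to_text` and `write_line` are hypothetical helpers, not Django APIs, and treating incoming bytes as UTF-8 is an assumption:

```python
import io


def to_text(value, encoding='utf-8'):
    """Return value as text, decoding bytes explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        # Assumption: bytes reaching this point are UTF-8 encoded.
        return value.decode(encoding)
    return str(value)


def write_line(stream, value, ending='\n'):
    """Write value to a text stream, appending ending only if it is missing."""
    msg = to_text(value)
    if not msg.endswith(ending):
        msg += ending
    stream.write(msg)


buf = io.StringIO()
write_line(buf, '£'.encode('utf-8'))  # bytes input is decoded first
write_line(buf, 'un café ?\n')        # text input passes through unchanged
```

The point of the sketch is that the text/bytes decision is made once, explicitly, before any comparison with the line ending takes place.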

Last edited 8 years ago by Darren Hobbs

comment:4 by Tim Graham, 8 years ago

So the broken code is self.stdout.write('{}'.format(possibly_unicode_string_from_db)) without unicode_literals?

comment:5 by Claude Paroz, 8 years ago

Apart from the content of a BinaryField, I don't see how any non-ASCII bytestring can come from the database.

comment:6 by Tim Graham, 8 years ago

The issue is that the non-ASCII Unicode string from the database is coerced into the bytestring '{}' (basically the same situation as #21933).

comment:7 by Darren Hobbs, 8 years ago

It's also compounded by the fact that sys.stdout.write copes with it but self.stdout.write doesn't.

Last edited 8 years ago by Darren Hobbs

comment:8 by Tim Graham, 8 years ago

It's because OutputWrapper's default ending is u'\n', so we end up comparing a bytestring to a Unicode string in msg.endswith(ending). I'll leave the correct resolution to Claude or another Unicode expert.
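To illustrate the mechanism, here is a rough emulation that runs on Python 3. Python 2 implicitly decoded a bytestring as ASCII when it was combined with a Unicode string; the hypothetical `py2_style_endswith` helper below makes that step explicit. It is a sketch of the behaviour, not the code in OutputWrapper:

```python
def py2_style_endswith(msg, ending):
    """Emulate Python 2's implicit ASCII decode of a bytestring.

    Hypothetical helper for illustration only.
    """
    if isinstance(msg, bytes) and isinstance(ending, str):
        # Python 2 performed this decode implicitly, using the ASCII codec.
        msg = msg.decode('ascii')
    return msg.endswith(ending)


bad_name = '£'.encode('utf-8')  # a non-ASCII bytestring, as in the report

try:
    py2_style_endswith(bad_name, '\n')
    raised = False
except UnicodeDecodeError:
    raised = True
```

A pure-ASCII bytestring goes through the same path without error, which is why the problem only shows up once non-ASCII data appears.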

comment:9 by Claude Paroz, 8 years ago

@dhobbs It's still a bit mysterious to us how you got the non-ASCII bytestring; that *might* be the bug in the first place. Could you elaborate a bit on your use case?

comment:10 by Tim Graham, 8 years ago

'{}'.format(possibly_unicode_string_from_db) gives str on Python 2.

comment:11 by Claude Paroz, 8 years ago

>>> print('{}'.format(u'un café ?'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 6: ordinal not in range(128)

comment:12 by Tim Graham, 8 years ago

I'm using this management command:

# -*- coding: utf-8 -*-
from django.core.management.base import BaseCommand

from polls.models import Question

class Command(BaseCommand):

    def handle(self, *args, **options):
        v = 'Output: {}'.format(Question.objects.latest('id'))
        print(type(v))
        print(v)
        self.stdout.write(v)

with a question with some non-ASCII chars in the name.

comment:13 by Claude Paroz, 8 years ago

Component: Uncategorized → Core (Management commands)
Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Bug

Wow, I realize now that format or % (mod) are calling the __str__ of the model. Please, Python 3, come soon!
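A small sketch of that observation, runnable on Python 3. The `Question` class here is a hypothetical stand-in, not the real polls model, and it counts calls to `__str__` to show that both `format` and `%` go through it:

```python
class Question:
    """Hypothetical stand-in for a model; counts how often __str__ runs."""

    def __init__(self, text):
        self.text = text
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return self.text


q = Question('un café ?')
formatted = '{}'.format(q)  # object.__format__ falls back to str(q)
interpolated = '%s' % q     # %s also calls str(q)
```

On Python 3 `__str__` returns text, so nothing goes wrong. On Python 2, a model's `__str__` returned a UTF-8 encoded bytestring, so a bytestring template produced a non-ASCII bytestring.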

comment:14 by Claude Paroz, 8 years ago

Has patch: set

comment:15 by Tim Graham, 8 years ago

Patch needs improvement: set

Tests aren't passing on Windows.

comment:16 by Tim Graham, 8 years ago

Keywords: py2 added

If someone is interested in the fix that Claude proposed, they'll need to debug the Windows test failures and propose an updated patch.

comment:17 by Tim Graham, 7 years ago

Resolution: wontfix
Status: new → closed

Closing due to the end of Python 2 support in master in a couple of weeks.
