Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#23265 closed Bug (fixed)

runserver crashes with some locales on Python 2

Reported by: SpaceFox | Owned by: nobody
Component: Core (Management commands) | Version: 1.6
Severity: Normal | Keywords:
Cc: | Triage Stage: Accepted
Has patch: no | Needs documentation: no
Needs tests: no | Patch needs improvement: no
Easy pickings: no | UI/UX: no

Description

On Windows / Python 2.7, if Django is set to use the French locale ("fra"), a UTF-8 decoding error prevents it from starting, with this stack trace:

Validating models...

0 errors found
Unhandled exception in thread started by <function wrapper at 0x0000000003F33B38>
Traceback (most recent call last):
  File "C:\Users\SpaceFox\.virtualenvs\zdsenv\lib\site-packages\django\utils\autoreload.py", line 93, in wrapper
    fn(*args, **kwargs)
  File "C:\Users\SpaceFox\.virtualenvs\zdsenv\lib\site-packages\django\core\management\commands\runserver.py", line 104, in inner_run
    now = now.decode('utf-8')
  File "C:\Users\SpaceFox\.virtualenvs\zdsenv\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfb in position 2: invalid start byte

How to reproduce this:

  • Windows (tested on Windows 8.1 x64), all updates OK
  • Python 2.7 (Python 2.7.8 (default, Jun 30 2014, 16:08:48) [MSC v.1500 64 bit (AMD64)] on win32)
  • Django 1.6.5
  • In settings.py, set locale to "fra": locale.setlocale(locale.LC_TIME, 'fra')
  • Try to launch Django with python manage.py runserver. Kabooom!

The problem comes from django/core/management/commands/runserver.py, lines 102 to 104:

        now = datetime.now().strftime('%B %d, %Y - %X')
        if six.PY2:
            now = now.decode('utf-8')

These three lines assume that the strftime method returns a UTF-8 encoded string. This is false when Django runs on Windows with Python 2.7.

In French language, 3 month names have non-ASCII characters: "février", "août" and "décembre" (February, August and December).

With the parameters described above, now is set to "ao¹t 09, 2014 - 00:56:41". The garbled character that replaces the "û" in "août" is the byte 0xfb, which is not a valid start byte in UTF-8. This causes the crash.

The problem is the same in February and December due to the 0xe9 byte in now, caused by the "é" character.

With other non-ASCII characters, this may "work" as long as there is no illegal byte in the given string: the output of this function will be broken (not what is expected) but there will be no exception thrown.
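
The failure can be reproduced without Windows or Python 2. Below is a minimal sketch in Python 3, assuming the bytes returned by strftime are cp1252-encoded; the byte strings are written by hand for illustration, not captured from a real run, and the helper name utf8_decodes is hypothetical:

```python
# Hand-written cp1252 bytes for the three French month names with accents.
samples = {
    "février": b"f\xe9vrier",
    "août": b"ao\xfbt",
    "décembre": b"d\xe9cembre",
}


def utf8_decodes(raw):
    """Return True if raw is valid UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# None of the three byte strings is valid UTF-8, so decode('utf-8') raises.
results = {name: utf8_decodes(raw) for name, raw in samples.items()}

# The "works but is broken" case: these two cp1252 bytes mean "Ã©",
# yet they happen to form one valid UTF-8 sequence, so no exception is raised.
silently_wrong = b"\xc3\xa9".decode("utf-8")
intended = b"\xc3\xa9".decode("cp1252")
```

Decoding the same samples with cp1252 instead of utf-8 gives back the expected month names.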

I don't know how to correct this, therefore I can't provide any patch.

Change History (11)

in reply to:  description comment:1 by SpaceFox, 10 years ago

Summary: Django don't start on Windows with Python 2.7 (but only in august!) → Django don't start on French Windows with Python 2.7 (but only in February, August and December!)

comment:2 by Aymeric Augustin, 10 years ago

This problem was introduced in [cb1614f7b30f336db2a807b43696e20fdab7b78c] to implement feature request #18611. It affects Django 1.5+.

A first attempt at fixing it was made in #21358 for Django 1.6+ but it was incomplete.

Apparently we should use CP1252 for decoding in that case:

>>> print 'ao\xfbt'.decode('cp1252')
août

Which raises the more general question of what encoding to use. The best answer I found was on StackOverflow: http://stackoverflow.com/questions/19412915/how-determine-encoding-of-datetime-strftime-in-python

Can you provide the output of locale.getlocale(locale.LC_TIME) and of locale.getpreferredencoding() on your system?

in reply to:  2 comment:3 by SpaceFox, 10 years ago

Replying to aaugustin:

Can you provide the output of locale.getlocale(locale.LC_TIME) and of locale.getpreferredencoding() on your system?

>>> locale.getlocale(locale.LC_TIME)
('fr_FR', 'cp1252')
>>> locale.getpreferredencoding()
'cp1252'

comment:4 by Claude Paroz, 10 years ago

I'd suggest using the django.utils.encoding.get_system_encoding function instead of the hardcoded 'utf-8'.

comment:5 by Claude Paroz, 10 years ago

SpaceFox, could you test if this patch solves your issue?

diff --git a/django/core/management/commands/runserver.py b/django/core/management/commands/runserver.py
index 503cff2..dfff57d 100644
--- a/django/core/management/commands/runserver.py
+++ b/django/core/management/commands/runserver.py
@@ -11,6 +11,7 @@ import socket
 from django.core.management.base import BaseCommand, CommandError
 from django.core.servers.basehttp import run, get_internal_wsgi_application
 from django.utils import autoreload
+from django.utils.encoding import get_system_encoding
 from django.utils import six
 
 naiveip_re = re.compile(r"""^(?:
@@ -101,7 +102,7 @@ class Command(BaseCommand):
         self.validate(display_num_errors=True)
         now = datetime.now().strftime('%B %d, %Y - %X')
         if six.PY2:
-            now = now.decode('utf-8')
+            now = now.decode(get_system_encoding())
 
         self.stdout.write((
             "%(started_at)s\n"
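
For readers without a Django checkout, the idea behind the patch can be sketched on its own. This is an approximation under stated assumptions, not Django's actual get_system_encoding implementation: the names guess_system_encoding and decode_with_system_encoding are hypothetical, and the sketch relies only on the standard library's locale.getpreferredencoding and codecs.lookup.

```python
import codecs
import locale


def guess_system_encoding():
    """Return the locale's preferred encoding, falling back to ASCII.

    Hypothetical stand-in for a "system encoding" lookup; the real
    Django helper may differ in how it queries the locale.
    """
    try:
        encoding = locale.getpreferredencoding(False) or "ascii"
        codecs.lookup(encoding)  # reject names Python has no codec for
    except Exception:
        encoding = "ascii"
    return encoding


def decode_with_system_encoding(raw):
    """Decode bytes with the guessed encoding instead of a hardcoded one."""
    if not isinstance(raw, bytes):
        return raw  # already text, nothing to decode
    # errors="replace" is an added safety net so a bad guess cannot crash.
    return raw.decode(guess_system_encoding(), errors="replace")
```

On the reporter's machine, where the preferred encoding is reported as cp1252, this approach would decode the 0xfb byte as "û" rather than raising.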

comment:6 by Tim Graham, 10 years ago

Summary: Django don't start on French Windows with Python 2.7 (but only in February, August and December!) → runserver crashes with some locales on Python 2
Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Bug

comment:7 by SpaceFox, 10 years ago

claudep,

The server starts with your patch and works perfectly.

The only detail is that now the server tries to process UTF-8 characters as cp1252, which displays strange characters in Windows Powershell:

ao├╗t 13, 2014 - 19:56:34

I don't know if this detail is a problem for Django Team - this is OK for me.

comment:8 by Claude Paroz, 10 years ago

Thanks for checking.
I think the output problem is another issue, probably due to OutputWrapper.write using force_str with a default encoding of utf-8. In that case, using get_system_encoding might be more problematic, as the output may be redirected to some file where the desired encoding might still be utf-8. I'd like to get more opinions about the output encoding of non-ASCII content from management commands on Windows.

But I plan to fix the originally reported decoding issue ASAP.
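
The garbled console output reported above is consistent with UTF-8 bytes being displayed through a legacy console code page. A small sketch, assuming the PowerShell console used an OEM code page such as cp850 or cp437 (an assumption; the ticket does not state the console code page):

```python
text = "août 13, 2014 - 19:56:34"

# What a writer that forces utf-8 would send to the console.
utf8_bytes = text.encode("utf-8")

# What a console interpreting those bytes as an OEM code page would show.
seen_cp850 = utf8_bytes.decode("cp850")
seen_cp437 = utf8_bytes.decode("cp437")
```

Under that assumption, both code pages render the two UTF-8 bytes of "û" (0xc3 0xbb) as the box-drawing pair seen in the report.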

comment:9 by Claude Paroz <claude@…>, 10 years ago

Resolution: fixed
Status: new → closed

In 055d95fce0668e11f2dae48d2439f378349d2524:

Fixed #23265 -- Used system-specific encoding in runserver

Thanks SpaceFox for the report.

comment:10 by Claude Paroz <claude@…>, 10 years ago

In 63ccf64079f833ff93a3fe9d158df2ec99015147:

[1.7.x] Fixed #23265 -- Used system-specific encoding in runserver

Thanks SpaceFox for the report.
Backport of 055d95fce066 from master.

comment:11 by Claude Paroz <claude@…>, 10 years ago

In 99b5567796b2a4b96547a51e630f8c5b23d78531:

[1.6.x] Fixed #23265 -- Used system-specific encoding in runserver

Thanks SpaceFox for the report.
Backport of 055d95fce066 from master.
