Opened 9 years ago

Closed 8 years ago

#24324 closed Bug (fixed)

Crashes when project path or path to Django install contains a non-ascii character

Reported by: notsqrt
Owned by: Tim Graham
Component: Core (Management commands)
Version: 1.8
Severity: Release blocker
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Hi,

Checked on linux, python 2.7, with Django 1.7.4.

Steps to reproduce:

mktmpenv 
pip install Django
cd /tmp/
mkdir hého
cd hého/
django-admin startproject project
cd project/
python manage.py startapp app
# add app to INSTALLED_APPS
# add model to app.models
python manage.py makemigrations app

Location:

django/db/migrations/writer.py", line 224, in path
    return os.path.join(basedir, self.filename)

Root of the bug: just a mix of bytes and text:

>>> import os
>>> os.path.join(b'/tmp/hého', u'test')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
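
For illustration, a minimal sketch of the mix and of one common remedy: decode the byte-string component to text before joining, so that os.path.join only sees one type. It assumes the directory name is UTF-8 encoded (which matches the 0xc3 byte in the error); it is not the attached patch. On Python 3 the same mix is rejected with a TypeError instead of an implicit ASCII decode, so the sketch catches both.

```python
import os

# UTF-8 bytes of the directory from the report, plus a text filename.
basedir = b"/tmp/h\xc3\xa9ho"
filename = u"test"

try:
    os.path.join(basedir, filename)
    mixed_ok = True
except (TypeError, UnicodeDecodeError):
    # Python 2 raises UnicodeDecodeError, Python 3 raises TypeError.
    mixed_ok = False

# Decoding the bytes first (assumed UTF-8 here) makes both parts text.
joined = os.path.join(basedir.decode("utf-8"), filename)
```

Real code would normally take the encoding from sys.getfilesystemencoding() rather than hard-coding UTF-8.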

Attachments (1)

24324.diff (1010 bytes ) - added by Tim Graham 9 years ago.


Change History (35)

comment:1 by notsqrt, 9 years ago

As a side-note, doing the following:

cd /tmp
mkdir éhé
cd éhé
git clone https://github.com/django/django/
cd tests
pip install -r requirements/py2.txt  # this actually fails with pip < 6.0, due to the accents
PYTHONPATH=..:$PYTHONPATH ./runtests.py

fails immediately.

So there's probably a massive amount of work needed to fix all the problems.

comment:2 by notsqrt, 9 years ago

#19357 supposedly fixed all the problems in Django 1.5, but I also found that the template loader fails with a non-ASCII path.

Down the rabbit hole!

comment:3 by Claude Paroz, 9 years ago

Severity: Normal → Release blocker
Triage Stage: Unreviewed → Accepted

comment:4 by Tim Graham, 9 years ago

Adding a patch that fixes the reported instance, but there seems to be more work to do as the test suite still fails quite a bit.

To prevent future regressions, we could set up a Jenkins build with a name that includes some non-ASCII character like "djangō".
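
One way to approximate such a build locally is a test that runs path-handling code inside a temporary directory whose name contains a non-ASCII character. The following is only a sketch using the standard library; the function name and the file it writes are made up for illustration, and it assumes the filesystem and locale can represent the character.

```python
import io
import os
import shutil
import tempfile


def roundtrip_in_non_ascii_dir():
    """Create a file under a non-ASCII directory and read it back."""
    # u"djang\u014d-" is "djangō-", the name suggested in the comment.
    workdir = tempfile.mkdtemp(prefix=u"djang\u014d-")
    try:
        target = os.path.join(workdir, u"example.txt")
        with io.open(target, "w", encoding="utf-8") as handle:
            handle.write(u"ok")
        with io.open(target, "r", encoding="utf-8") as handle:
            return os.path.basename(workdir), handle.read()
    finally:
        # Always clean up the temporary directory.
        shutil.rmtree(workdir)


dirname, content = roundtrip_in_non_ascii_dir()
```

A check like this only covers the code it exercises; running the whole test suite from such a directory, as proposed, catches far more.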

by Tim Graham, 9 years ago

Attachment: 24324.diff added

comment:5 by Aymeric Augustin, 9 years ago

When ojii and I maintained the previous iteration of the CI infrastructure, we used a build name that contained a space and a non-ASCII character in order to catch such issues.

comment:6 by Tim Graham, 9 years ago

Okay, I tried to set up a build with a non-ASCII path, but virtualenv has some problems with it. Will look into that. Similarly, I encountered virtualenv problems with spaces in the path when setting up the current CI machines.

PR to fix this ticket and some related issues. 600+ test failures remain when running the test suite on a non-ASCII path, but I guess it will probably take only a handful of fixes. Should we try to fix all these issues on 1.7?

comment:7 by Tim Graham, 9 years ago

Has patch: set
Patch needs improvement: set

I've added more commits to the PR and am down to 8 failures on Python 2 on master. Investigation continues tomorrow...

virtualenv 12.0.7 (latest as of now) still seems to have trouble with non-ASCII chars in the path: https://github.com/pypa/virtualenv/issues/457 so I think we are out of luck unless we add some exceptions to our normal build script for the new build.

comment:8 by Tim Graham, 9 years ago

Patch needs improvement: unset

Tests are passing with the latest version of the patch.

As these issues are Python 2 only and no one has complained until five months after the 1.7 release, I think we can skip fixing these issues there (absent other opinions).

comment:9 by Tim Graham, 9 years ago

Owner: changed from nobody to Tim Graham
Status: new → assigned
Summary: makemigrations fails with UnicodeDecodeError when path to project contains special characters → Crashes when project path or path to Django install contains a non-ascii character
Version: 1.7 → 1.8alpha1

comment:10 by Tim Graham <timograham@…>, 9 years ago

In 4a0aeac1b5cfb7b6229a01119a596afb38d8a2a0:

Refs #24324 -- Fixed management command discovery on non-ASCII paths.

comment:11 by Tim Graham <timograham@…>, 9 years ago

In bcb3bfa5a2716454e15ca0203e0debf497b14273:

[1.8.x] Refs #24324 -- Fixed management command discovery on non-ASCII paths.

Backport of 4a0aeac1b5cfb7b6229a01119a596afb38d8a2a0 from master

comment:12 by Tim Graham <timograham@…>, 9 years ago

In d316b43d0ab9db0f9913b094b84b11362d36d054:

Refs #24324 -- Fixed UnicodeDecodeError in model_regress test on non-ASCII path.

comment:13 by Tim Graham <timograham@…>, 9 years ago

In b2f7daa4a6ba4f463dd79b19c337c738201479ad:

[1.8.x] Refs #24324 -- Fixed UnicodeDecodeError in model_regress test on non-ASCII path.

Backport of d316b43d0ab9db0f9913b094b84b11362d36d054 from master

comment:14 by Tim Graham <timograham@…>, 9 years ago

In 81a94cc616ab80decaa495cfa1c0c623527fc0e7:

Refs #24324 -- Fixed makemessages crash when Django is installed in a non-ASCII path.

comment:15 by Tim Graham <timograham@…>, 9 years ago

In 9dba901d9c44a117b35003e0c239476536c259aa:

[1.8.x] Refs #24324 -- Fixed makemessages crash when Django is installed in a non-ASCII path.

Backport of 81a94cc616ab80decaa495cfa1c0c623527fc0e7 from master

comment:16 by Tim Graham <timograham@…>, 9 years ago

In 63c5c9870129f6b81358c1ed7ed2392bbc46f77d:

Refs #24324 -- Fixed UnicodeEncodeError in SQLite backend while testing.

If 'name' contained non-ASCII characters, the comparison raised a
UnicodeEncodeError on Python 2.

comment:17 by Tim Graham <timograham@…>, 9 years ago

In 4f43e5c4353325eb8d1c455c58e299ef95e2e422:

[1.8.x] Refs #24324 -- Fixed UnicodeEncodeError in SQLite backend while testing.

If 'name' contained non-ASCII characters, the comparison raised a
UnicodeEncodeError on Python 2.

Backport of 63c5c9870129f6b81358c1ed7ed2392bbc46f77d from master

comment:18 by Tim Graham <timograham@…>, 9 years ago

In c9ece2e6b9365fa4be16bd0de25dd7b68c8dc97e:

Refs #24324 -- Fixed UnicodeDecodeError in makemigrations.

If the project path contained a non-ASCII character, Python 2 crashed.

comment:19 by Tim Graham <timograham@…>, 9 years ago

In ba3a7636f1bb8c02eaabbeff9a3731ad27a82c5d:

[1.8.x] Refs #24324 -- Fixed UnicodeDecodeError in makemigrations.

If the project path contained a non-ASCII character, Python 2 crashed.

Backport of c9ece2e6b9365fa4be16bd0de25dd7b68c8dc97e from master

comment:20 by Tim Graham <timograham@…>, 9 years ago

In bad6280c4e3f75f3ccd27f8fd85a4043bb296128:

Refs #24324 -- Fixed get_app_template_dirs() UnicodeDecodeError on Python 2.

The function implemented most of upath(), but skipped the check for
strings that are already unicode.
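
To illustrate why that check matters, here is a hedged sketch contrasting an unconditional decode with a checked one. Both function names are hypothetical, UTF-8 is an assumed encoding, and this is not Django's actual upath() or get_app_template_dirs() code.

```python
def decode_always(path, encoding="utf-8"):
    """Faulty variant: assumes `path` is always a byte string."""
    return path.decode(encoding)


def decode_if_needed(path, encoding="utf-8"):
    """Checked variant: leave text alone, decode only byte strings."""
    if isinstance(path, bytes):
        return path.decode(encoding)
    return path


text_path = u"/tmp/h\u00e9ho/templates"  # already text, non-ASCII

try:
    decode_always(text_path)
    always_ok = True
except (AttributeError, UnicodeError):
    # Python 3: text has no .decode(); Python 2: the implicit ASCII
    # encode that precedes the decode fails on the non-ASCII character.
    always_ok = False

# The checked variant is safe for both text and byte-string input.
checked = decode_if_needed(text_path)
from_bytes = decode_if_needed(b"/tmp/h\xc3\xa9ho/templates")
```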

comment:21 by Tim Graham <timograham@…>, 9 years ago

In a1fa0135ecb28911f31af4df994be26db59355e4:

[1.8.x] Refs #24324 -- Fixed get_app_template_dirs() UnicodeDecodeError on Python 2.

The function implemented most of upath(), but skipped the check for
strings that are already unicode.

Backport of bad6280c4e3f75f3ccd27f8fd85a4043bb296128 from master

comment:22 by Tim Graham <timograham@…>, 9 years ago

In bebc1e53a3ab059849e5c4e5a55b2f5e68b67169:

Refs #24324 -- Fixed UnicodeDecodeError in template_backends tests

The message for the SuspiciousFileOperation exception needs to
be a unicode string.

comment:23 by Tim Graham <timograham@…>, 9 years ago

In fa66ea75326e669cd3d51fb926a4364b8ba08959:

Refs #24324 -- Fixed UnicodeDecodeError in MigrationWriter on Python 2.

comment:24 by Tim Graham <timograham@…>, 9 years ago

In f9a99c410e4ccc2ca89fc6006c48a23b02bae873:

[1.8.x] Refs #24324 -- Fixed UnicodeDecodeError in template_backends tests

The message for the SuspiciousFileOperation exception needs to
be a unicode string.

Backport of bebc1e53a3ab059849e5c4e5a55b2f5e68b67169 from master

comment:25 by Tim Graham <timograham@…>, 9 years ago

In 09da1b465ea8ba9ecb99b1cd02a689bb831d0e1b:

[1.8.x] Refs #24324 -- Fixed UnicodeDecodeError in MigrationWriter on Python 2.

Backport of fa66ea75326e669cd3d51fb926a4364b8ba08959 from master

comment:26 by Tim Graham <timograham@…>, 9 years ago

In 307c0f299a6c26f5231d3516df5b4edc54b36553:

Refs #24324 -- Fixed Python 2 test failures when path to Django source contains non-ASCII characters.

comment:27 by Tim Graham <timograham@…>, 9 years ago

In 2aa06e439a29a1c24fa03744395cc57787e7198e:

[1.8.x] Refs #24324 -- Fixed Python 2 test failures when path to Django source contains non-ASCII characters.

Backport of 307c0f299a6c26f5231d3516df5b4edc54b36553 from master

comment:28 by Tim Graham <timograham@…>, 9 years ago

In 098fa12dd390e733c7568d824eea2c346550c75a:

Refs #24324 -- Fixed crash in {% debug %} tag on Python 2.

If Django is installed in a path that contains non-ASCII characters,
the tag failed with UnicodeDecodeError.

comment:29 by Tim Graham <timograham@…>, 9 years ago

In b8d6cdbcc90ff8af781d13131b79ce88a9eff66d:

Refs #24324 -- Skipped fixtures_regress tests that fail on Python 2 on a non-ASCII path.

comment:30 by Tim Graham <timograham@…>, 9 years ago

In 1153bccc1bc654a547a310d0614b989606d25950:

[1.8.x] Refs #24324 -- Fixed crash in {% debug %} tag on Python 2.

If Django is installed in a path that contains non-ASCII characters,
the tag failed with UnicodeDecodeError.

Backport of 098fa12dd390e733c7568d824eea2c346550c75a from master

comment:31 by Tim Graham <timograham@…>, 9 years ago

In 5068a51d88a7084bd0349d150816fc6041caa224:

[1.8.x] Refs #24324 -- Skipped fixtures_regress tests that fail on Python 2 on a non-ASCII path.

Backport of b8d6cdbcc90ff8af781d13131b79ce88a9eff66d from master

comment:32 by Tim Graham, 9 years ago

Resolution: fixed
Status: assigned → closed

Had trouble on the Jenkins build on Python 2 when running ./runtests.py:

Traceback (most recent call last):
  File "./runtests.py", line 431, in <module>
    options.debug_sql)
  File "./runtests.py", line 253, in django_tests
    extra_tests=extra_tests,
  File "/home/jenkins/workspace/master-ἥoἥascii-path/database/sqlite3/label/trusty/python/python2.7/django/test/runner.py", line 209, in run_tests
    suite = self.build_suite(test_labels, extra_tests)
  File "/home/jenkins/workspace/master-ἥoἥascii-path/database/sqlite3/label/trusty/python/python2.7/django/test/runner.py", line 150, in build_suite
    tests = self.test_loader.discover(start_dir=label, **kwargs)
  File "/usr/lib/python2.7/unittest/loader.py", line 206, in discover
    tests = list(self._find_tests(start_dir, pattern))
  File "/usr/lib/python2.7/unittest/loader.py", line 267, in _find_tests
    raise ImportError(msg % (mod_name, module_dir, expected_dir))
ImportError: u'tests' module incorrectly imported from '/home/jenkins/workspace/master-\xe1\xbc\xa5o\xe1\xbc\xa5ascii-path/database/sqlite3/label/trusty/python/python2.7/tests/shortcuts'. Expected u'/home/jenkins/workspace/master-\u1f25o\u1f25ascii-path/database/sqlite3/label/trusty/python/python2.7/tests/shortcuts'. Is this module globally installed?

but ./tests/runtests.py works so I'm using a different build script for the build with that invocation (and also using a different virtualenv path so we avoid the non-ASCII chars). It's green now!

comment:33 by Kiss György, 8 years ago

Resolution: fixed
Status: closed → new

I don't think the solution to the problem should be "Just don't use nonascii characters in path names".
I'm not totally sure, but I suppose the problem is the unicode_literals future import.

See Armin's opinion here: https://github.com/PythonCharmers/python-future/issues/22

I got a traceback like this:

a = '/var/lib/jenkins/jobs/K\xc3\xa1rtyarendel\xc5\x91/workspace/orders'
p = ('management.py',)
path = '/var/lib/jenkins/jobs/K\xc3\xa1rtyarendel\xc5\x91/workspace/orders'
b = 'management.py'

    def join(a, *p):
        """Join two or more pathname components, inserting '/' as needed.
        If any component is an absolute path, all previous path components
        will be discarded.  An empty last part will result in a path that
        ends with a separator."""
        path = a
        for b in p:
            if b.startswith('/'):
                path = b
            elif path == '' or path.endswith('/'):
                path +=  b
            else:
>               path += '/' + b
E               UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128)

/usr/lib64/python2.7/posixpath.py:80: UnicodeDecodeError

As far as I understand, os.path.join can't handle unicode inputs, but when you use unicode_literals, everything will be unicode.

comment:34 by Tim Graham, 8 years ago

Resolution: fixed
Status: new → closed
Version: 1.8alpha1 → 1.8

The issue reported in this ticket is fixed in Django 1.8. Please open a new bug with steps to reproduce if you are encountering a different issue. Thanks!
