#19397 closed Bug (fixed)
UnicodeDecodeError on binary file when using custom project template/skeleton
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Core (Management commands) | Version: | dev |
Severity: | Release blocker | Keywords: | project template, skeleton, utf8 |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
There is a regression in the current development version of Django 1.5 when using startproject (and startapp) with a custom project/app template/skeleton directory.
In Django 1.4 the following worked flawlessly, but in the current master version an error occurs while processing binary files (which, imho, should not be parsed unless explicitly requested). Steps to reproduce:
    $ virtualenv --no-site-packages testdj15; cd testdj15/; . bin/activate
    ...
    $ pip install git+http://github.com/django/django.git
    ...
    $ mkdir skeleton; dd if=/dev/urandom of=skeleton/test.png bs=1M count=1
    1+0 records in
    1+0 records out
    1048576 bytes (1.0 MB) copied, 0.0850216 s, 12.3 MB/s
    $ django-admin.py startproject --template skeleton abcproject
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 3: invalid start byte
    $ ls -al abcproject/test.png skeleton/test.png
    ls: cannot access abcproject/test.png: No such file or directory
    -rw-r--r-- 1 gw gw 1048576 Nov 30 13:47 skeleton/test.png
Change History (6)
comment:1 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 12 years ago
A use case for PNG, ICO or similar files in a project template is when someone is creating an educational template for a series of very similar projects and wants to put everything in it, such as apple-touch-icon.png and favicon.ico, which are binary.
Anyway, one solution would be to decide based on the extension whether the file is binary or not, like this:
django/django/core/management/templates.py:
    if filename.endswith(extensions) or filename in extra_files:
        ...
        codecs.open(old_file,.., 'utf-8'), read, template.render, codecs.open(new_file,.., 'utf-8'), write
        ...
    else:
        ...
        use a binary file copying method without rendering, e.g. shutil.copyfile(old_file, new_file)
        ...
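The extension-based idea above could be sketched roughly as follows. This is a minimal standalone illustration, not the code in templates.py: the function name copy_template_file, the RENDER_EXTENSIONS default and the render callable (standing in for template rendering) are all hypothetical.

```python
import os
import shutil

# Hypothetical whitelist for illustration; the real command derives its
# own list of extensions from its options.
RENDER_EXTENSIONS = ('.py',)


def copy_template_file(old_path, new_path, render,
                       extensions=RENDER_EXTENSIONS, extra_files=()):
    """Render whitelisted files as UTF-8 text; copy everything else as bytes.

    `render` is any callable mapping a str to a str (a stand-in for
    template rendering). Returns True if the file was rendered, False if
    it was copied verbatim.
    """
    filename = os.path.basename(old_path)
    if filename.endswith(tuple(extensions)) or filename in extra_files:
        # Text path: decode, render, encode.
        with open(old_path, encoding='utf-8') as src:
            content = src.read()
        with open(new_path, 'w', encoding='utf-8') as dst:
            dst.write(render(content))
        return True
    # Binary path: never decoded, so no UnicodeDecodeError is possible.
    shutil.copyfile(old_path, new_path)
    return False
```

With this split, a file such as test.png is copied byte for byte, while a file such as settings.py still goes through rendering.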
comment:3 by , 12 years ago
First, that commit isn't correct; it should have used settings.FILE_CHARSET instead of hardcoding utf-8.
I propose to load the file contents as a bytestring, attempt to decode it with FILE_CHARSET, and skip the file if that raises a UnicodeDecodeError.
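A minimal sketch of that decode-or-skip proposal, for illustration only. The helper name decode_or_none is hypothetical, and the charset parameter merely stands in for FILE_CHARSET; its 'utf-8' default is an assumption of this sketch.

```python
def decode_or_none(raw, charset='utf-8'):
    """Return the decoded text, or None if the bytes are not valid `charset`.

    `raw` is the file content read in binary mode. A None result means
    the caller should skip rendering this file.
    """
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


# Example: ASCII text decodes, PNG-like bytes do not.
text = decode_or_none(b"DEBUG = True\n")
skipped = decode_or_none(b"\x89PNG\x93\x00")
```

One limitation of this approach: with a single-byte charset such as latin-1, every byte sequence decodes successfully, so the check would never skip anything.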
comment:4 by , 12 years ago
Yay for speaking too fast...
Settings aren't available in startproject. utf-8 is a reasonable default; if that's a problem, we could add an option to specify a different charset.
There's a whitelist of file extensions that can be processed. The easiest solution is to decode/render/encode only those.
comment:5 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
While the example of using a PNG as a project template may not make sense at first glance, I've verified the error happens if you have a PNG inside a zipped tar as well - which is entirely possible.
This is in fact a regression introduced in https://github.com/django/django/commit/3afb5916b215c79e36408b729c9516bc435f5cb7
We will probably have to come up with a way of checking each file walked in the template.
http://stackoverflow.com/questions/898669/how-can-i-detect-if-a-file-is-binary-non-text-in-python has some food for thought.
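As one possible per-file check, a common heuristic is to look for a NUL byte in the first block of the file. The sketch below is an assumption for illustration, not necessarily what the linked thread recommends or what the fix used; the name looks_binary and the 1024-byte block size are hypothetical.

```python
def looks_binary(path, blocksize=1024):
    """Heuristic: treat a file as binary if its first block contains NUL.

    Only the first `blocksize` bytes are inspected, so this is cheap but
    not exhaustive.
    """
    with open(path, 'rb') as f:
        chunk = f.read(blocksize)
    return b'\0' in chunk
```

This is only a guess about the file's nature: UTF-16 encoded text contains NUL bytes and would be flagged as binary, while a binary file without a NUL in its first block would pass as text.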