Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#19397 closed Bug (fixed)

UnicodeDecodeError on binary file when using custom project template/skeleton

Reported by: gw.2012@…
Owned by: nobody
Component: Core (Management commands)
Version: master
Severity: Release blocker
Keywords: project template, skeleton, utf8
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

There is a regression in the current development version of Django (1.5) when using startproject (and startapp) with a custom project/app template (skeleton) directory.

In Django 1.4 the following worked flawlessly, but in the current master version an error occurs while processing binary files (which, imho, should not be parsed unless explicitly requested). Steps to reproduce:

$ virtualenv --no-site-packages testdj15; cd testdj15/; . bin/activate
...
$ pip install git+http://github.com/django/django.git
...
$ mkdir skeleton; dd if=/dev/urandom of=skeleton/test.png bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0850216 s, 12.3 MB/s
$ django-admin.py startproject --template skeleton abcproject
UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 3: invalid start byte
$ ls -al abcproject/test.png skeleton/test.png
ls: cannot access abcproject/test.png: No such file or directory
-rw-r--r-- 1 gw gw 1048576 Nov 30 13:47 skeleton/test.png

Change History (6)

comment:1 Changed 3 years ago by ptone

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

While the example of using a PNG as a project template may not make sense at first glance, I've verified that the error also happens if you have a PNG inside a zipped tar, which is entirely possible.

This is in fact a regression introduced in https://github.com/django/django/commit/3afb5916b215c79e36408b729c9516bc435f5cb7

We will probably have to come up with a way of checking each file walked in the template.

http://stackoverflow.com/questions/898669/how-can-i-detect-if-a-file-is-binary-non-text-in-python has some food for thought.
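One way to check each walked file could look like the sketch below. It uses a common NUL-byte heuristic, which is an assumption here and not necessarily what the linked discussion recommends or what Django ended up doing; the helper name looks_binary and the sample_size parameter are hypothetical.

```python
def looks_binary(path, sample_size=1024):
    """Guess whether a file is binary by looking for a NUL byte.

    Heuristic only: text files rarely contain NUL bytes, while most
    binary formats (PNG, ICO, ...) have one near the start.
    """
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    return b"\0" in sample
```

A heuristic like this can misclassify files (for example UTF-16 text, or random bytes without a NUL in the sample), so it would need a safe fallback.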

comment:2 Changed 3 years ago by gw.2012@…

A use case for PNG, ICO or similar files in a project template is when someone is creating an educational template for a series of very similar projects and wants to put everything in it, such as apple-touch-icon.png and favicon.ico, which are binary.

Anyway, a solution would be to decide based on the extension whether a file is binary or not, like this:

django/django/core/management/templates.py:
if filename.endswith(extensions) or filename in extra_files:
    ... codecs.open(old_file,.., 'utf-8'), read, template.render, codecs.open(new_file,.., 'utf-8'), write ...
else:
    ... use a binary file copying method without rendering, eg. with shutil.copyfile(old_file, new_file) ...
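
A runnable version of the pseudocode above might look like the following. This is only a sketch: copy_or_render, the render callable (standing in for the template rendering step) and the RENDERED_EXTENSIONS default are hypothetical names, not the actual code in templates.py.

```python
import os
import shutil

# Hypothetical default; the real list of rendered extensions is Django's own.
RENDERED_EXTENSIONS = (".py",)


def copy_or_render(old_path, new_path, render,
                   extensions=RENDERED_EXTENSIONS, extra_files=()):
    """Render whitelisted files as UTF-8 text; copy everything else as-is."""
    filename = os.path.basename(old_path)
    if filename.endswith(tuple(extensions)) or filename in extra_files:
        # Text path: decode, render, encode.
        with open(old_path, "rb") as src:
            text = src.read().decode("utf-8")
        with open(new_path, "wb") as dst:
            dst.write(render(text).encode("utf-8"))
    else:
        # Binary-safe path: the bytes are never decoded.
        shutil.copyfile(old_path, new_path)
```

With this split, a file such as test.png never reaches the decoder, so the UnicodeDecodeError from the report cannot be triggered by it.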

comment:3 Changed 3 years ago by aaugustin

First, that commit isn't correct; it should have used settings.FILE_CHARSET instead of hardcoding utf-8.

I propose to load the file contents as a bytestring, attempt to decode it with FILE_CHARSET, and skip the file if that raises a UnicodeDecodeError.
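
As a rough sketch of this proposal (read_text_or_none is a hypothetical helper, and the charset argument stands in for FILE_CHARSET):

```python
def read_text_or_none(path, charset="utf-8"):
    """Return the decoded file contents, or None if they are not valid text."""
    with open(path, "rb") as handle:
        raw = handle.read()  # always read as a bytestring first
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        # Not decodable with this charset: the caller should skip rendering.
        return None
```

The caller decides what "skip" means when None comes back, for example copying the file unchanged instead of rendering it. Note that binary data which happens to be valid in the chosen charset would still be decoded.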

comment:4 Changed 3 years ago by aaugustin

Yay for speaking too fast...

Settings aren't available in startproject. utf-8 is a reasonable default; if that's a problem, we could add an option to specify a different charset.

There's a whitelist of file extensions that can be processed. The easiest solution is to decode/render/encode only those.

comment:5 Changed 3 years ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from new to closed

In c9a47fb379cab4c0fe9be27c9924236e75327bd0:

[1.5.x] Fixed #19397 -- Crash on binary files in project templates.

Thanks gw 2012 at tnode com for the report.

Backport of baae4b8.

comment:6 Changed 3 years ago by Aymeric Augustin <aymeric.augustin@…>

In baae4b818778180fedfcfcfc7aa77acfb9b237fb:

Fixed #19397 -- Crash on binary files in project templates.

Thanks gw 2012 at tnode com for the report.
