#9696 closed (invalid)
FileField raises unhandled exception when filename contains non-ascii characters
| Reported by: | magarac | Owned by: | Karen Tracey |
|---|---|---|---|
| Component: | File uploads/storage | Version: | 1.0 |
| Severity: | | Keywords: | |
| Cc: | liangent@… | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
When a model has a models.FileField and a user uploads a file whose name contains non-ASCII characters, Django raises an unhandled exception.
Change History (11)
comment:1 by , 17 years ago
| milestone: | post-1.0 |
|---|---|
| Resolution: | → worksforme |
| Status: | new → closed |
comment:2 by , 17 years ago
| Resolution: | worksforme |
|---|---|
| Status: | closed → reopened |
I am having this problem too. It surfaced in the pinax svn version when using the avatar upload feature. The filename is in Japanese. The file was uploaded via Firefox 3.
- Aaron
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/django/core/handlers/base.py", line 86, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python2.5/site-packages/django/contrib/auth/decorators.py", line 67, in __call__
    return self.view_func(request, *args, **kwargs)
  File "/var/www/django/pinax0.5.0/apps/external_apps/avatar/views.py", line 48, in change
  File "/usr/lib/python2.5/site-packages/django/core/files/storage.py", line 44, in save
    name = self.get_available_name(name)
  File "/usr/lib/python2.5/site-packages/django/core/files/storage.py", line 66, in get_available_name
    while self.exists(name):
  File "/usr/lib/python2.5/site-packages/django/core/files/storage.py", line 188, in exists
    return os.path.exists(self.path(name))
  File "/usr/lib/python2.5/posixpath.py", line 171, in exists
    st = os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 67-69: ordinal not in range(128)
comment:3 by , 17 years ago
Sorry, I was wrong about the browser in the last report. I thought it was Firefox (as reported by a customer), but it turns out it was Lunascape. I have been unable to replicate this problem in Firefox, Chrome or IE7.
I haven't had a chance to install the browser, but I expect it is reporting a different encoding from the one it actually uses to send the filename. However, I can't find any other reports of this on the web.
comment:4 by , 17 years ago
| milestone: | → 1.1 |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
comment:5 by , 17 years ago
| Cc: | added |
|---|
comment:6 by , 17 years ago
I have this issue as well.
I reproduced this by uploading a file with a filename containing unicode characters: "Peña.cbz"
Environment:
Request Method: POST
Request URL: http://staging.getcomicstrips.com/library/upload
Django Version: 1.0.2 final
Python Version: 2.5.2
Installed Applications:
['django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.sites',
'django.contrib.admin',
'getcomics.library',
'getcomics.downloader',
'getcomics.rest_ws',
'getcomics.jobserver']
Installed Middleware:
('django.middleware.common.CommonMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware')
Traceback:
File "/usr/lib/python2.5/site-packages/django/core/handlers/base.py" in get_response
86. response = callback(request, *callback_args, **callback_kwargs)
File "/usr/lib/python2.5/site-packages/django/contrib/auth/decorators.py" in __call__
67. return self.view_func(request, *args, **kwargs)
File "/u/apps/getcomics/releases/20090325215603/getcomics/../getcomics/library/views.py" in user_upload
50. pub = new_form.save()
File "/usr/lib/python2.5/site-packages/django/forms/models.py" in save
319. return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/lib/python2.5/site-packages/django/forms/models.py" in save_instance
61. f.save_form_data(instance, cleaned_data[f.name])
File "/usr/lib/python2.5/site-packages/django/db/models/fields/files.py" in save_form_data
192. getattr(instance, self.name).save(data.name, data, save=False)
File "/usr/lib/python2.5/site-packages/django/db/models/fields/files.py" in save
74. self._name = self.storage.save(name, content)
File "/usr/lib/python2.5/site-packages/django/core/files/storage.py" in save
44. name = self.get_available_name(name)
File "/usr/lib/python2.5/site-packages/django/core/files/storage.py" in get_available_name
66. while self.exists(name):
File "/usr/lib/python2.5/site-packages/django/core/files/storage.py" in exists
188. return os.path.exists(self.path(name))
File "/usr/lib/python2.5/posixpath.py" in exists
171. st = os.stat(path)
Exception Type: UnicodeEncodeError at /library/upload
Exception Value: 'ascii' codec can't encode character u'\xf1' in position 49: ordinal not in range(128)
comment:7 by , 17 years ago
| Owner: | changed from to |
|---|---|
| Status: | reopened → new |
This looks similar to #9579. We can't necessarily pass unicode into Python's os.path routines. In fact I see there is an os.path.supports_unicode_filenames that may be what we are supposed to consult to determine whether we can pass unicode in or if we must encode to a bytestring before calling the os.path routines.
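The idea floated in this comment could be sketched roughly as follows. This is only an illustration of consulting os.path.supports_unicode_filenames before calling the os.path routines, not Django's code; the helper name `to_fs_path` and its `unicode_ok` and `encoding` parameters are hypothetical and exist only to make the two branches easy to exercise.

```python
import os
import sys


def to_fs_path(path, unicode_ok=None, encoding=None):
    """Return `path` as text if the platform accepts it, else as bytes.

    `unicode_ok` and `encoding` are illustrative overrides (assumptions);
    by default the values reported by the running interpreter are used.
    """
    if unicode_ok is None:
        unicode_ok = os.path.supports_unicode_filenames
    if unicode_ok or isinstance(path, bytes):
        # Text paths are accepted as-is, and bytes need no further encoding.
        return path
    return path.encode(encoding or sys.getfilesystemencoding())


print(repr(to_fs_path("Pe\u00f1a.cbz", unicode_ok=False, encoding="utf-8")))
```

Note that the encoding step itself would still fail under an ASCII filesystem encoding, so a check like this does not remove the need for a correct locale.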
comment:8 by , 17 years ago
For people who are experiencing this, have you (or the code you are using) done something to override the default django.core.files.storage.Storage.get_valid_name method? That routine strips all "non-filename-safe" characters from the name, so I cannot easily recreate the exception you are seeing by simply, say, trying to upload a file using admin and a model that contains a simple FileField. (Which indicates the 2nd part of #6009 had not been fixed, despite the testcases added, but that's a different issue.) It would help if someone could provide a small reproducible testcase that demonstrates this exception, as it involves more than just uploading a file with unicode chars in the name using all normal defaults in admin. When doing that, the unicode chars are stripped and the exception does not happen.
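To illustrate the stripping behaviour described in this comment, here is a minimal sketch. It approximates the effect with an ASCII-only whitelist and is not a copy of Django's implementation; the function name `strip_unsafe_chars` and the exact character class are assumptions.

```python
import re


def strip_unsafe_chars(name):
    """Approximate a 'filename-safe' cleanup: spaces become underscores,
    and anything outside ASCII letters, digits, '-', '_' and '.' is dropped."""
    name = name.strip().replace(" ", "_")
    return re.sub(r"[^-A-Za-z0-9_.]", "", name)


# With a whitelist like this, the non-ASCII character never reaches os.stat.
print(strip_unsafe_chars("Pe\u00f1a.cbz"))
```

With such a cleanup in the default path, the name passed to the storage layer is pure ASCII, which is why the exception would not occur with stock settings.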
comment:9 by , 17 years ago
I fixed my problem. It was a server configuration issue. For some reason, the LANG environment variable wasn't being set.
I could not reproduce the issue on my development machine, which had me look at my production server's environment.
The unicode characters are not stripped, but it's fine now that the env vars are set properly.
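A quick way to compare a development machine with a production server is to dump the locale variables the process actually sees. The sketch below is an assumption-laden helper (the names `locale_env_report` and `locale_is_unset_or_c` are hypothetical), and it would need to run inside the served process, not in an interactive shell, to reflect the web server's environment.

```python
import os

# Checked in POSIX precedence order for character encoding.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")


def locale_env_report(environ=None):
    """Return the locale-related variables, with None for any that are unset."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in LOCALE_VARS}


def locale_is_unset_or_c(environ=None):
    """True if no locale variable is set, or the effective one is C/POSIX."""
    report = locale_env_report(environ)
    for name in LOCALE_VARS:
        value = report[name]
        if value:
            # The first non-empty variable decides the effective locale.
            return value in ("C", "POSIX")
    return True


print(locale_env_report())
```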
comment:10 by , 17 years ago
| Resolution: | → invalid |
|---|---|
| Status: | new → closed |
OK, this is not in fact like #9579. os.stat accepts unicode paths just fine, so long as the LANG environment variable is set correctly. When it is not, for example if it is set to "C", things like sys.getfilesystemencoding() return odd values like 'ANSI_X3.4-1968', which is apparently a fancy way to spell 'ASCII', and os.stat runs into trouble attempting to encode the unicode path value into the supposed preferred fs encoding. The correct fix is to ensure that LANG is set properly.
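The failure mode can be reproduced in isolation, without Apache or Django, by encoding a path with the codec name quoted above. This is a minimal sketch: the path `/uploads/Peña.cbz` and the helper `can_encode` are made up for illustration, and the explicit `encode` call only stands in for the conversion that happens before the path reaches the operating system.

```python
import codecs


def can_encode(path, encoding):
    """Return True if `path` can be represented in `encoding`."""
    try:
        path.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Python resolves this name to its ASCII codec.
print(codecs.lookup("ANSI_X3.4-1968").name)

path = "/uploads/Pe\u00f1a.cbz"  # hypothetical upload path
print(can_encode(path, "ANSI_X3.4-1968"))  # the misconfigured case
print(can_encode(path, "utf-8"))           # the case with a UTF-8 locale
```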
Unfortunately LANG is often set incorrectly when running under Apache. Documenting the need to set LANG properly under Apache is the subject of #10426, so it doesn't need this ticket as well to track it.
[Also, the stripping of unicode chars from file names is covered by #10254. Not sure why the reporters of this problem don't see that, but I had to modify get_valid_filename as mentioned in a comment on that ticket (and run under Apache where the LANG setting was wrong) to even recreate this error. But that other issue I noticed is also covered by another ticket.]
So, since ultimately the exception here is due to a config error, and there's already another ticket to cover documenting the config requirements better, I'm closing this one as invalid.
I can't reproduce the error you're describing. If you're sure this is a bug, please reopen this ticket and provide more information, such as your model (at least the relevant portion of it) and/or the name of the file that's causing the issue.