Code

Opened 17 months ago

Closed 17 months ago

Last modified 17 months ago

#19367 closed Bug (fixed)

ContentFile fails to save with filesystem storage when initialized with unicode string

Reported by: void Owned by: nobody
Component: File uploads/storage Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Attached test fails for me on python 2.7 with the following traceback:

Traceback (most recent call last):
  File "/Users/voidus/Documents/workspace/django-trunk/tests/regressiontests/file_storage/tests.py", line 590, in test_content_saving
    self.storage.save('unicode.txt', ContentFile("español"))
  File "/Users/voidus/Documents/workspace/django-trunk/django/core/files/storage.py", line 48, in save
    name = self._save(name, content)
  File "/Users/voidus/Documents/workspace/django-trunk/django/core/files/storage.py", line 206, in _save
    _file.write(chunk)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 4: ordinal not in range(128)

Attachments (3)

ticket19367.txt (1.3 KB) - added by void 17 months ago.
Failing test
19367-2a.diff (2.7 KB) - added by claudep 17 months ago.
Approach 1: Different behaviour depending on python 2 or 3
19367-2b.diff (2.7 KB) - added by claudep 17 months ago.
Approach 2: Always force content to bytes

Download all attachments as: .zip

Change History (11)

Changed 17 months ago by void

Failing test

comment:1 Changed 17 months ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

IMHO it's an error to initialize a file-like object with unicode content; files are bytes.

That said, if Django currently supports it, we shouldn't remove this possibility outright (ie. without deprecation).

Is this a regression from Django 1.4?

comment:2 Changed 17 months ago by claudep

I was the bad guy who added support for unicode argument to ContentFile (in #11739).

After re-reading the ticket, I tend to think now that even if we should still be liberal in what we accept as parameter, we should force conversion to bytes. The downside is that the write/read cycle is not symetrical (write unicode (or bytes), read bytes), but as Aymeric said, files are bytes. I just quickly tested a patch with this idea, and except of the specific "symetrical" test committed in #11739, the test suite passes.

comment:3 Changed 17 months ago by void

Just a note: files are bytes, unless there're opened in python3 with 't' option

Changed 17 months ago by claudep

Approach 1: Different behaviour depending on python 2 or 3

Changed 17 months ago by claudep

Approach 2: Always force content to bytes

comment:4 Changed 17 months ago by claudep

  • Has patch set
  • Triage Stage changed from Unreviewed to Accepted

I'm still undecided about the best way to fix this. On one side, it would be easier and more consistent to always convert the content to bytes (Approach 2), but on the other side, opening text files on Python 3 returns unicode content, so we might also reflect that difference in our implementation (Approach 1).

comment:5 Changed 17 months ago by void

I'd prefer approach 1, because it is more consistent in my eyes with python2 and python3 files behaviour - and this behaviour is, in fact, different for python2 and python3.

comment:6 Changed 17 months ago by aaugustin

  • Triage Stage changed from Accepted to Ready for checkin

Patch 2a looks fine.

comment:7 Changed 17 months ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In 34dcf51e06fc57fc5462bd40f6a4c8ee949521da:

Fixed #19367 -- Fixed saving ContentFile in filesystem storage

This was not working properly when ContentFile was initialized with
an unicode string.
Thanks Alexey Boriskin for the report and the test.

comment:8 Changed 17 months ago by Claude Paroz <claude@…>

In e9301ae451f84b06687d27af0a43bcb986054f63:

[1.5.x] Fixed #19367 -- Fixed saving ContentFile in filesystem storage

This was not working properly when ContentFile was initialized with
an unicode string.
Thanks Alexey Boriskin for the report and the test.

Backport of 34dcf51e06 from master.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.