#21321 closed New feature (wontfix)

File.__iter__().chunk python3 compatibility

Reported by: jeremiahsavage@… Owned by: nobody
Component: Core (Other) Version: master
Severity: Normal Keywords: python3
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

In testing django-downloadview for a project, I found one of it's classes, PathDownloadView worked with python2.7, but not with python3.3. I've made the following change to django/core/files/base.py in Files.__iter() and now I am able to use PathDownloadView and StorageDownloadView.

from

chunk_buffer = BytesIO(chunk)

to

if type(chunk) is str:
    chunk_buffer = BytesIO(str.encode(chunk))
elif type(chunk) is bytes:
    chunk_buffer = BytesIO(chunk)

Change History (13)

comment:1 Changed 21 months ago by timo

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

If you're able to offer a patch along with a test, that will expedite us being able to fix this, thanks!

comment:2 Changed 21 months ago by benoitbryon

In Python 2.7, if "chunk" is unicode, we get the same kind of error.

What about the following changes?

- chunk_buffer = BytesIO(chunk)
+ chunk_buffer = BytesIO(force_bytes(chunk))

... where force_bytes is already imported from django.utils.encoding and used in ContentFile.

comment:3 Changed 21 months ago by daftshady

Using force_bytes looks good.
I will attach a test code on it.

comment:4 Changed 21 months ago by claudep

Until now, the File wrapper has always assumed the underlying file object was byte-oriented (in the sense of io.FileIO of the standard lib).
It must not absolutely remain so, but then isn't this report more about the new feature "Support text-oriented I/O in Django File wrapper"?

comment:5 Changed 21 months ago by daftshady

Attached pull request https://github.com/django/django/pull/1808 temporally fixing that errors.

comment:6 Changed 21 months ago by claudep

  • Has patch set
  • Patch needs improvement set

I think you missed the point. The issue is not about getting automagically decoded text by iterating a file, but supporting a text-oriented underlying file object.

comment:7 Changed 21 months ago by vajrasky

Attached the PR: https://github.com/django/django/pull/1814 fixing (I hope) this issue. Tested with Python 3.3, 3.4, 2.7.

Test with Python2.7 may need to be improved. I update the PR later while waiting for the confirmation about whether this PR is the correct way to solve the problem.

comment:8 Changed 21 months ago by vajrasky

Okay, fixed the test with Python2.7 (Same PR). https://github.com/django/django/pull/1814

It is so cumbersome to make both python2 and python3 happy. >.<

comment:9 Changed 20 months ago by claudep

  • Type changed from Bug to New feature
  • Version changed from 1.5 to master

Note that about the OP use case, it's actually a bug in django-downloadview. As the file type is not known in PathDownloadView, binary mode should be explicitly specified in the open() call.

comment:10 Changed 20 months ago by benoitbryon

Note that about the OP use case, it's actually a bug in django-downloadview. As the file type is not known in PathDownloadView, binary mode should be explicitly specified in the open() call.

Created https://github.com/benoitbryon/django-downloadview/issues/57 about binary VS text mode.

comment:11 Changed 20 months ago by benoitbryon

In Python 2.7, if "chunk" is unicode, we get the same kind of error.

Here is a simple example to reproduce the story...

With "str" it works fine:

>>> from StringIO import StringIO
>>> from django.core.files import File
>>> str_file = StringIO('Hello world')
>>> str_file_wrapper = File(str_file)
>>> [chunk for chunk in str_file_wrapper]
['Hello world']

But with unicode ValueError is raised:

>>> unicode_file = StringIO(u'Hello world')
>>> unicode_file_wrapper = File(unicode_file)
>>> [chunk for chunk in unicode_file_wrapper]
Traceback (most recent call last):
  ...
  File "/mnt/data/web/django-downloadview/lib/buildout/eggs/Django-1.5-py2.7.egg/django/core/files/base.py", line 97, in __iter__
    chunk_buffer = BytesIO(chunk)
TypeError: 'unicode' does not have the buffer interface

Is the use case valid? Is File intended to support such file objects?

In django-downloadview, I use something similar to support generators (files that are generated via "yield" statement). Having "yield u'Hello world'" is valid, isn't it?
Should I use another file wrapper that inherits File? Something similar to ContentFile, but for generators?

comment:12 Changed 20 months ago by benoitbryon

Note: I just released an update of django-downloadview where:

  • files on local filesystem are opened using binary mode by default
  • VirtualFile (generated files, StringIO) overrides __iter__() implementation to use force_bytes()

=> It looks like django-downloadview works even if Django's File is byte-oriented.

So, I cannot tell if the feature request / bug report is still valid.
Any idea jeremiahsavage?

comment:13 Changed 20 months ago by claudep

  • Resolution set to wontfix
  • Status changed from new to closed

I would say that we will keep the Django File interface byte-oriented for now. If we find more use cases which are harder to workaround, we might reconsider this.

Note: See TracTickets for help on using tickets.
Back to Top