Opened 16 years ago

Closed 14 years ago

Last modified 14 years ago

#8765 closed (fixed)

mod_python handler requires explicit str() cast for HttpResponse(mimetype=...)

Reported by: Ilya Semenov Owned by: nobody
Component: HTTP handling Version: dev
Severity: Keywords: mod_python content-type mimetype
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

mod_python assumes content_type to be an ASCII string, while django uses unicode strings everywhere.

For example, the following code:

att = Attachment.objects.get()
return HttpResponse(content=att.file, mimetype=att.content_type)

works in django dev server, but crashes in mod_python:

Traceback (most recent call last):

  File "/usr/lib/python2.5/site-packages/mod_python/importer.py", line 1537, in HandlerDispatch
    default=default_handler, arg=req, silent=hlist.silent)

  File "/usr/lib/python2.5/site-packages/mod_python/importer.py", line 1229, in _process_target
    result = _execute_target(config, req, object, arg)

  File "/usr/lib/python2.5/site-packages/mod_python/importer.py", line 1128, in _execute_target
    result = object(arg)

  File "/usr/lib/python2.5/site-packages/django/core/handlers/modpython.py", line 210, in handler
    return ModPythonHandler()(req)

  File "/usr/lib/python2.5/site-packages/django/core/handlers/modpython.py", line 193, in __call__
    req.content_type = response['Content-Type']

TypeError: content_type must be a string

so the only way to make it work is to cast explicitely:

return HttpResponse(content=att.file, mimetype=str(att.content_type))

The str() cast needs to be moved into core/handlers/modpython.py, which already applies it for all headers except Content-Type (since it's handled separately).

Attachments (3)

modpython.8693.diff (614 bytes ) - added by Ilya Semenov 16 years ago.
The proposed patch
8765.diff (1.6 KB ) - added by arien 15 years ago.
make sure any content_type (or mimetype) passed to HttpResponse is properly encoded. (Updated patch that is more explicit and with minimal changes to code.)
8765.2.diff (3.0 KB ) - added by Chris Beaven 15 years ago.

Download all attachments as: .zip

Change History (13)

by Ilya Semenov, 16 years ago

Attachment: modpython.8693.diff added

The proposed patch

comment:1 by Malcolm Tredinnick, 16 years ago

So this proposes a solution without describing the course of events that makes it arise. At the moment we don't do any casting there because it shouldn't be necessary. What normal piece of code is passing in a non-ASCII mimetype?

It's probably not too harmful -- although str() is the wrong way to do it, for error handling reasons; plus I'll have to look up the spec to see if only ASCII is permitted or if octets are also allowed -- but I'd like to understand how this can arise in the normal course of events before doing that.

comment:2 by Ilya Semenov, 16 years ago

Yes it does describe the course of events.

Let me paste it again:

att = Attachment.objects.get()
return HttpResponse(content=att.file, mimetype=att.content_type)

I think that is perfectly normal piece of code. In my project, I have the concept of attachments, and access to particular attachments depends on the user logged in. Therefore, I need to proxy the download requests via my custom view. In the model, I need to store the original content_type to help the browser properly recognize it when downloading (e.g., distinguish a text file from a HTML file).

For your reference, below is my model. As you can see, content_type is a CharField, which is represented as a unicode object when accessed.

class Attachment(models.Model):
        comment = models.ForeignKey(Comment)
        file = models.FileField(upload_to='attachments')
        filename = models.CharField(max_length=255)
        content_type = models.CharField(max_length=255)
        content_length = models.IntegerField()

comment:3 by Ilya Semenov, 16 years ago

Also, if you're still curious, that is how Model instances are created (i.e., how are unicode content_types originally created):

class Comment:

        # ...

        def create_attachments(self, uploaded_files):
                for uploaded_file in uploaded_files:
                        a = self.attachment_set.create(
                                filename = uploaded_file.name,
                                content_type = uploaded_file.content_type,
                                content_length = uploaded_file.size,
                        )
                        a.file.save('%s' % a.id, uploaded_file)

# ....

comment.create_attachments(request.FILES.getlist('attachment'))

comment:4 by arien, 15 years ago

Resolution: invalid
Status: newclosed

The syntax of the Content-Type header is...

      Content-Type   = "Content-Type" ":" media-type

... where the syntax of media-type is...

       media-type     = type "/" subtype *( ";" parameter )
       type           = token
       subtype        = token

... and token and parameter are sequences of characters from a subset of (7-bit) US-ASCII.

Marking invalid.

comment:5 by Ilya Semenov, 15 years ago

Resolution: invalid
Status: closedreopened

I don't think you have read through the ticket description and understood the problem.

Let me paste the problematic code again:

att = Attachment.objects.get()
return HttpResponse(content=att.file, mimetype=att.content_type) # crashes in mod_python
return HttpResponse(content=att.file, mimetype=str(att.content_type)) # workaround for mod_python

This is a real life example and I consider it to be perfectly natural. Please outline why do you consider the above code invalid or unnatural (and you do consider it as such, as your opinion was that it's fine that the given code crashed).

I agree that Content Type is supposed to accept 7-bit ASCII characters only. That is not the problem described in the ticket. I am not trying to make it work with non-ASCII data.

The problem is that the entire string data flow within Django is guaranteed to be unicode: Django natively supports Unicode data everywhere. Providing your database can somehow store the data, you can safely pass around Unicode strings to templates, models and the database. (http://docs.djangoproject.com/en/dev/ref/unicode/#ref-unicode). The problem is that it works as described everywhere except HttpResponse handled by mod_python, and that is highly inconsistent.

comment:6 by arien, 15 years ago

Triage Stage: UnreviewedAccepted

I'd read through the ticket, but you're right: I misunderstood the real issue. (I thought you were talking about ASCII vs. non-ASCII values, instead of bytestrings vs Unicode strings.)

As for opinions on the code you gave, I didn't really have any. They don't really matter anyway. :-)

How about fixing this issue in django.http.HttpResponse instead? I'll attach a patch.

comment:7 by Ilya Semenov, 15 years ago

Agreed, your patch is way better as it just makes content-type to pass the common conversion performed by HttpResponse.setitem

by arien, 15 years ago

Attachment: 8765.diff added

make sure any content_type (or mimetype) passed to HttpResponse is properly encoded. (Updated patch that is more explicit and with minimal changes to code.)

by Chris Beaven, 15 years ago

Attachment: 8765.2.diff added

comment:8 by Chris Beaven, 15 years ago

Triage Stage: AcceptedReady for checkin

Patch against new trunk. Functionally equivalent to arien's but I reordered the code in HttpResponse.__init__ to a more logical order (oh, and I change tests to test for failures against both content_type and mimetype).

comment:9 by Malcolm Tredinnick, 14 years ago

Resolution: fixed
Status: reopenedclosed

(In [13740]) Improved unicode-type, ASCII-convertible header handling in
HttpResponse.

Fixed #8765. Thanks to SmileyChris and semenov for working on this one.

comment:10 by Malcolm Tredinnick, 14 years ago

(In [13748]) [1.2.X] Improved unicode-type, ASCII-convertible header handling in
HttpResponse.

Fixed #8765. Thanks to SmileyChris and semenov for working on this one.

Backport of r13740 from trunk.

Note: See TracTickets for help on using tickets.
Back to Top