Code

Opened 6 years ago

Closed 4 years ago

Last modified 4 years ago

#8765 closed (fixed)

mod_python handler requires explicit str() cast for HttpResponse(mimetype=...)

Reported by: semenov Owned by: nobody
Component: HTTP handling Version: master
Severity: Keywords: mod_python content-type mimetype
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

mod_python assumes content_type to be an ASCII string, while django uses unicode strings everywhere.

For example, the following code:

att = Attachment.objects.get()
return HttpResponse(content=att.file, mimetype=att.content_type)

works in django dev server, but crashes in mod_python:

Traceback (most recent call last):

  File "/usr/lib/python2.5/site-packages/mod_python/importer.py", line 1537, in HandlerDispatch
    default=default_handler, arg=req, silent=hlist.silent)

  File "/usr/lib/python2.5/site-packages/mod_python/importer.py", line 1229, in _process_target
    result = _execute_target(config, req, object, arg)

  File "/usr/lib/python2.5/site-packages/mod_python/importer.py", line 1128, in _execute_target
    result = object(arg)

  File "/usr/lib/python2.5/site-packages/django/core/handlers/modpython.py", line 210, in handler
    return ModPythonHandler()(req)

  File "/usr/lib/python2.5/site-packages/django/core/handlers/modpython.py", line 193, in __call__
    req.content_type = response['Content-Type']

TypeError: content_type must be a string

so the only way to make it work is to cast explicitely:

return HttpResponse(content=att.file, mimetype=str(att.content_type))

The str() cast needs to be moved into core/handlers/modpython.py, which already applies it for all headers except Content-Type (since it's handled separately).

Attachments (3)

modpython.8693.diff (614 bytes) - added by semenov 6 years ago.
The proposed patch
8765.diff (1.6 KB) - added by arien 5 years ago.
make sure any content_type (or mimetype) passed to HttpResponse is properly encoded. (Updated patch that is more explicit and with minimal changes to code.)
8765.2.diff (3.0 KB) - added by SmileyChris 5 years ago.

Download all attachments as: .zip

Change History (13)

Changed 6 years ago by semenov

The proposed patch

comment:1 Changed 6 years ago by mtredinnick

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

So this proposes a solution without describing the course of events that makes it arise. At the moment we don't do any casting there because it shouldn't be necessary. What normal piece of code is passing in a non-ASCII mimetype?

It's probably not too harmful -- although str() is the wrong way to do it, for error handling reasons; plus I'll have to look up the spec to see if only ASCII is permitted or if octets are also allowed -- but I'd like to understand how this can arise in the normal course of events before doing that.

comment:2 Changed 6 years ago by semenov

Yes it does describe the course of events.

Let me paste it again:

att = Attachment.objects.get()
return HttpResponse(content=att.file, mimetype=att.content_type)

I think that is perfectly normal piece of code. In my project, I have the concept of attachments, and access to particular attachments depends on the user logged in. Therefore, I need to proxy the download requests via my custom view. In the model, I need to store the original content_type to help the browser properly recognize it when downloading (e.g., distinguish a text file from a HTML file).

For your reference, below is my model. As you can see, content_type is a CharField, which is represented as a unicode object when accessed.

class Attachment(models.Model):
        comment = models.ForeignKey(Comment)
        file = models.FileField(upload_to='attachments')
        filename = models.CharField(max_length=255)
        content_type = models.CharField(max_length=255)
        content_length = models.IntegerField()

comment:3 Changed 6 years ago by semenov

Also, if you're still curious, that is how Model instances are created (i.e., how are unicode content_types originally created):

class Comment:

        # ...

        def create_attachments(self, uploaded_files):
                for uploaded_file in uploaded_files:
                        a = self.attachment_set.create(
                                filename = uploaded_file.name,
                                content_type = uploaded_file.content_type,
                                content_length = uploaded_file.size,
                        )
                        a.file.save('%s' % a.id, uploaded_file)

# ....

comment.create_attachments(request.FILES.getlist('attachment'))

comment:4 Changed 5 years ago by arien

  • Resolution set to invalid
  • Status changed from new to closed

The syntax of the Content-Type header is...

      Content-Type   = "Content-Type" ":" media-type

... where the syntax of media-type is...

       media-type     = type "/" subtype *( ";" parameter )
       type           = token
       subtype        = token

... and token and parameter are sequences of characters from a subset of (7-bit) US-ASCII.

Marking invalid.

comment:5 Changed 5 years ago by semenov

  • Resolution invalid deleted
  • Status changed from closed to reopened

I don't think you have read through the ticket description and understood the problem.

Let me paste the problematic code again:

att = Attachment.objects.get()
return HttpResponse(content=att.file, mimetype=att.content_type) # crashes in mod_python
return HttpResponse(content=att.file, mimetype=str(att.content_type)) # workaround for mod_python

This is a real life example and I consider it to be perfectly natural. Please outline why do you consider the above code invalid or unnatural (and you do consider it as such, as your opinion was that it's fine that the given code crashed).

I agree that Content Type is supposed to accept 7-bit ASCII characters only. That is not the problem described in the ticket. I am not trying to make it work with non-ASCII data.

The problem is that the entire string data flow within Django is guaranteed to be unicode: Django natively supports Unicode data everywhere. Providing your database can somehow store the data, you can safely pass around Unicode strings to templates, models and the database. (http://docs.djangoproject.com/en/dev/ref/unicode/#ref-unicode). The problem is that it works as described everywhere except HttpResponse handled by mod_python, and that is highly inconsistent.

comment:6 Changed 5 years ago by arien

  • Triage Stage changed from Unreviewed to Accepted

I'd read through the ticket, but you're right: I misunderstood the real issue. (I thought you were talking about ASCII vs. non-ASCII values, instead of bytestrings vs Unicode strings.)

As for opinions on the code you gave, I didn't really have any. They don't really matter anyway. :-)

How about fixing this issue in django.http.HttpResponse instead? I'll attach a patch.

comment:7 Changed 5 years ago by semenov

Agreed, your patch is way better as it just makes content-type to pass the common conversion performed by HttpResponse.setitem

Changed 5 years ago by arien

make sure any content_type (or mimetype) passed to HttpResponse is properly encoded. (Updated patch that is more explicit and with minimal changes to code.)

Changed 5 years ago by SmileyChris

comment:8 Changed 5 years ago by SmileyChris

  • Triage Stage changed from Accepted to Ready for checkin

Patch against new trunk. Functionally equivalent to arien's but I reordered the code in HttpResponse.__init__ to a more logical order (oh, and I change tests to test for failures against both content_type and mimetype).

comment:9 Changed 4 years ago by mtredinnick

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [13740]) Improved unicode-type, ASCII-convertible header handling in
HttpResponse.

Fixed #8765. Thanks to SmileyChris and semenov for working on this one.

comment:10 Changed 4 years ago by mtredinnick

(In [13748]) [1.2.X] Improved unicode-type, ASCII-convertible header handling in
HttpResponse.

Fixed #8765. Thanks to SmileyChris and semenov for working on this one.

Backport of r13740 from trunk.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.