Opened 17 years ago

Closed 17 years ago

#4969 closed (fixed)

GZip middleware fails due to UnicodeDecodeError

Reported by: Johann Queuniet <johann.queuniet@…>
Owned by: nobody
Component: HTTP handling
Version: dev
Severity:
Keywords: unicode gzip middleware
Cc:
Triage Stage: Design decision needed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

The GZip middleware crashes after compression, when it fills in the Content-Length header with str(len(response.content)). Reading response.content makes HttpResponse call smart_str(), which fails because it expects UTF-8 content and the body is now gzip-compressed.

2007-07-25 18:32:48: (mod_fastcgi.c.2550) FastCGI-stderr: Traceback (most recent call last):
  File "/usr/lib64/python2.5/site-packages/flup/server/fcgi_base.py", line 558, in run
    protocolStatus, appStatus = self.server.handler(self)
  File "/usr/lib64/python2.5/site-packages/flup/server/fcgi_base.py", line 1112, in handler
    result = self.application(environ, start_response)
  File "/usr/lib/python2.5/django/core/handlers/wsgi.py", line 194, in __call__
  File "/usr/lib64/python2.5/site-packages/django/middleware/gzip.py", line 28, in process_response
    response['Content-Length'] = str(len(response.content))
  File "/usr/lib64/python2.5/site-packages/django/http/__init__.py", line 275, in _get_content
    content = smart_str(''.join(self._container), self._charset)
  File "/usr/lib/python2.5/django/utils/encoding.py", line 63, in smart_str
  File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte
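
The byte named in the error, 0x8b at position 1, is the second byte of the gzip magic number, so any gzip-compressed body will trip a UTF-8 decode in the same place. A minimal sketch that reproduces just that decode failure, written for Python 3 with only the standard library (it does not use Django or the middleware itself):

```python
import gzip

# Compress a small UTF-8 payload, as a gzip step would do to a response body.
compressed = gzip.compress("hello world".encode("utf-8"))

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
magic = compressed[:2]

# Treating the compressed bytes as UTF-8 text fails at the second byte:
# 0x1f is a valid ASCII control character, 0x8b is not a valid start byte.
try:
    compressed.decode("utf-8")
    failure_position = None
except UnicodeDecodeError as exc:
    failure_position = exc.start

print(magic, failure_position)
```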

I don't really know where the fault lies here. Is it smart_str's, for choking on gzip-encoded content? Or should the HttpResponse object do something else when the Content-Encoding header is set?

The patch included assumes the latter.
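
The attached diff is not reproduced on this page, so the following is only a rough sketch of the idea, not the actual patch: a response object that skips charset conversion once a Content-Encoding header is present. SketchResponse, its headers dict and gzip_response are hypothetical names for illustration (Python 3, standard library only), not Django's HttpResponse or GZipMiddleware.

```python
import gzip


class SketchResponse:
    """Hypothetical stand-in for a response object (not Django's HttpResponse)."""

    def __init__(self, content, charset="utf-8"):
        self.charset = charset
        self.headers = {}
        self._container = [content]

    @property
    def content(self):
        if "Content-Encoding" in self.headers:
            # The body was already transformed (for example gzipped), so
            # return the raw bytes instead of treating them as text.
            # Assumes every chunk is already bytes at this point.
            return b"".join(self._container)
        # Plain body: decode any byte chunks with the charset, then encode
        # the whole text again. Binary data would fail here.
        text = "".join(
            chunk if isinstance(chunk, str) else chunk.decode(self.charset)
            for chunk in self._container
        )
        return text.encode(self.charset)

    @content.setter
    def content(self, value):
        self._container = [value]


def gzip_response(response):
    """Compress the body, then set the headers in the order the traceback implies."""
    response.content = gzip.compress(response.content)
    response.headers["Content-Encoding"] = "gzip"
    # Reading .content again is safe because Content-Encoding is now set.
    response.headers["Content-Length"] = str(len(response.content))
    return response


resp = gzip_response(SketchResponse("héllo wörld"))
print(resp.headers)
```

Without the Content-Encoding check, the second read of content would try to decode the compressed bytes with the charset and raise UnicodeDecodeError, which is the failure reported above.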

Attachments (1)

gzip-http.diff (469 bytes) - added by Johann Queuniet <johann.queuniet@…> 17 years ago.
patch for HttpResponse


Change History (5)

by Johann Queuniet <johann.queuniet@…>, 17 years ago

Attachment: gzip-http.diff added

patch for HttpResponse

comment:1 by Simon G. <dev@…>, 17 years ago

Keywords: unicode added
Patch needs improvement: set
Summary: GZip middleware fails → GZip middleware fails due to UnicodeDecodeError
Triage Stage: Unreviewed → Ready for checkin

comment:2 by Simon G. <dev@…>, 17 years ago

Triage Stage: Ready for checkin → Design decision needed

My promotion here was a bit hasty; I'll move this back to DDN to get some comments on the best way to do this.

comment:3 by Malcolm Tredinnick, 17 years ago

I think it's pretty close as it is, Simon. It's on my list to look at shortly, anyway. Leave it in its current state, but if anybody else is looking at this, my gut feeling is the approach in the patch is along the right lines and we just have to make sure the edge-cases are handled correctly.

comment:4 by Malcolm Tredinnick, 17 years ago

Resolution: fixed
Status: new → closed

(In [6548]) Fixed #4969 -- Changed content retrieval in HttpResponse to be more robust in
the presence of an existing content encoding. Fixes some sporadic failures with
the GzipMiddleware, for example. Thanks, Johann Queuniet.
