Ticket #4969 (closed: fixed)

Opened 1 year ago

Last modified 1 year ago

GZip middleware fails due to UnicodeDecodeError

Reported by: Johann Queuniet <johann.queuniet@adh.naellia.eu>
Assigned to: nobody
Milestone:
Component: HTTP handling
Version: SVN
Keywords: unicode gzip middleware
Cc:
Triage Stage: Design decision needed
Has patch: 1
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 1

Description

The GZip middleware crashes after compression, when it fills in the Content-Length header and calls str(len(response.content)). Reading response.content makes HttpResponse call smart_str(), which fails because it expects UTF-8 content and the body is now gzip-compressed binary data.

2007-07-25 18:32:48: (mod_fastcgi.c.2550) FastCGI-stderr: Traceback (most recent call last):
  File "/usr/lib64/python2.5/site-packages/flup/server/fcgi_base.py", line 558, in run
    protocolStatus, appStatus = self.server.handler(self)
  File "/usr/lib64/python2.5/site-packages/flup/server/fcgi_base.py", line 1112, in handler
    result = self.application(environ, start_response)
  File "/usr/lib/python2.5/django/core/handlers/wsgi.py", line 194, in __call__
  File "/usr/lib64/python2.5/site-packages/django/middleware/gzip.py", line 28, in process_response
    response['Content-Length'] = str(len(response.content))
  File "/usr/lib64/python2.5/site-packages/django/http/__init__.py", line 275, in _get_content
    content = smart_str(''.join(self._container), self._charset)
  File "/usr/lib/python2.5/django/utils/encoding.py", line 63, in smart_str
  File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte
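
The byte 0x8b at position 1 in the error above is the second byte of the gzip magic number (0x1f 0x8b), so any gzip-compressed body triggers the same failure when decoded as UTF-8. A minimal sketch of that, written for Python 3 rather than the Python 2.5 used in this ticket, and not using Django at all (the helper name decode_as_utf8 is made up for illustration):

```python
import gzip


def decode_as_utf8(payload: bytes) -> str:
    """Hypothetical helper: treat the payload as UTF-8 text."""
    return payload.decode("utf-8")


# Compress a small body, as a gzip middleware would.
compressed = gzip.compress(b"<html>hello</html>")

# Try to decode the compressed bytes as UTF-8 and keep the error, if any.
failure = None
try:
    decode_as_utf8(compressed)
except UnicodeDecodeError as exc:
    failure = exc

# The gzip header starts with 0x1f 0x8b; 0x1f is plain ASCII, but 0x8b
# cannot start a UTF-8 sequence, so decoding stops at offset 1.
print(compressed[:2], failure)
```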

I don't really know where the fault lies here. Is it smart_str's fault for choking on gzip-encoded content? Or should the HttpResponse object do something else when the Content-Encoding header is set?

The patch included assumes the latter.
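
To illustrate the second option, here is a rough Python 3 sketch in which the response skips charset handling once a Content-Encoding header is present. SketchResponse and gzip_response are hypothetical stand-ins invented for this example; they are not Django's HttpResponse or GZipMiddleware, and this is not the attached patch.

```python
import gzip


class SketchResponse:
    """Hypothetical stand-in for an HTTP response (not Django's class)."""

    def __init__(self, content, charset="utf-8"):
        self.charset = charset
        self.headers = {}
        self._container = [content]

    @property
    def content(self) -> bytes:
        if "Content-Encoding" in self.headers:
            # The body is already encoded (e.g. gzip): return raw bytes
            # without running them through any charset conversion.
            return b"".join(self._container)
        # Plain text body: encode it with the response charset.
        return "".join(self._container).encode(self.charset)

    @content.setter
    def content(self, value):
        self._container = [value]


def gzip_response(response):
    """Hypothetical middleware step: compress the body, set the headers."""
    compressed = gzip.compress(response.content)
    response.content = compressed
    # Set Content-Encoding before reading the content back, so the
    # compressed bytes are returned untouched.
    response.headers["Content-Encoding"] = "gzip"
    response.headers["Content-Length"] = str(len(response.content))
    return response


response = gzip_response(SketchResponse("h\u00e9llo"))
print(response.headers)
```

With the header check in place, computing Content-Length no longer attempts to decode the compressed body.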

Attachments

gzip-http.diff (469 bytes) - added by Johann Queuniet <johann.queuniet@adh.naellia.eu> on 07/25/07 12:11:18.
patch for HttpResponse

Change History

07/25/07 12:11:18 changed by Johann Queuniet <johann.queuniet@adh.naellia.eu>

  • attachment gzip-http.diff added.

patch for HttpResponse

07/25/07 20:19:49 changed by Simon G. <dev@simon.net.nz>

  • needs_better_patch set to 1.
  • needs_tests changed.
  • summary changed from GZip middleware fails to GZip middleware fails due to UnicodeDecodeError.
  • keywords changed from gzip middleware to unicode gzip middleware.
  • needs_docs changed.
  • stage changed from Unreviewed to Ready for checkin.

08/20/07 04:54:56 changed by Simon G. <dev@simon.net.nz>

  • stage changed from Ready for checkin to Design decision needed.

My promotion here was a bit hasty. I'll move this back to DDN (Design decision needed) to get some comments on the best way to do this.

08/20/07 05:11:44 changed by mtredinnick

I think it's pretty close as it is, Simon. It's on my list to look at shortly, anyway. Leave it in its current state, but if anybody else is looking at this, my gut feeling is the approach in the patch is along the right lines and we just have to make sure the edge-cases are handled correctly.

10/20/07 01:50:17 changed by mtredinnick

  • status changed from new to closed.
  • resolution set to fixed.

(In [6548]) Fixed #4969 -- Changed content retrieval in HttpResponse to be more robust in the presence of an existing content encoding. Fixes some sporadic failures with the GzipMiddleware, for example. Thanks, Johann Queuniet.

