
#18916 closed Bug (fixed)

Django incorrectly restricts HTTP header values to ASCII

Reported by: aaugustin Owned by: aaugustin
Component: HTTP handling Version: master
Severity: Normal Keywords:
Cc: chris@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Whenever an HTTP header is set for an HttpResponse, Django raises an exception if its key or value contains non-ASCII characters.

However, RFC2616 defines message headers in section 4.2 as:

       message-header = field-name ":" [ field-value ]
       field-name     = token
       field-value    = *( field-content | LWS )
       field-content  = <the OCTETs making up the field-value
                        and consisting of either *TEXT or combinations
                        of token, separators, and quoted-string>

where

The TEXT rule is only used for descriptive field contents and values
   that are not intended to be interpreted by the message parser. Words
   of *TEXT MAY contain characters from character sets other than ISO-
   8859-1 [22] only when encoded according to the rules of RFC 2047
   [14].

       TEXT           = <any OCTET except CTLs,
                        but including LWS>

This indicates that an arbitrary bytestring is acceptable as a value.


I hit this issue while setting an X-SendFile header pointing to a non-ASCII file name. It seems to me that Django should:

  • at least accept any bytes content (since any bytestring can be interpreted as latin-1) and attempt converting text content to latin-1, raising an error if that isn't possible
  • even better, use MIME encoding for text values that don't fit in the latin-1 charset.
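The two options above can be sketched roughly as follows. This is only an illustration of the proposal, not the attached patch: the helper name encode_header_value is hypothetical, and choosing UTF-8 as the charset for the RFC 2047 fallback is an assumption.

```python
import sys
from email.header import Header


def encode_header_value(value):
    """Hypothetical helper: return an HTTP header value as bytes."""
    if isinstance(value, bytes):
        # Any bytestring can be interpreted as latin-1, so accept it as-is.
        return value
    try:
        # Text that fits in latin-1 is sent in that encoding.
        return value.encode("latin-1")
    except UnicodeEncodeError:
        # Otherwise fall back to RFC 2047 MIME encoding (assumed charset: UTF-8).
        # A very large maxlinelen keeps the encoded word on a single line,
        # since folded lines are not wanted in an HTTP header value here.
        mime = Header(value, "utf-8", maxlinelen=sys.maxsize).encode()
        return mime.encode("ascii")


# A latin-1 representable file name passes through the first branch.
x_sendfile = encode_header_value("/files/résumé.pdf")
```

With this sketch, the X-SendFile case from this ticket is encoded as latin-1 bytes, and only values outside latin-1 become MIME encoded words.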

The header keys must stay restricted to ASCII: RFC 2616 says they're of the token type, defined by:

token          = 1*<any CHAR except CTLs or separators>

with

CHAR           = <any US-ASCII character (octets 0 - 127)>
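For illustration, the token rule can be checked directly from the grammar. The function name is_token is hypothetical; the separator list is the one given in RFC 2616 section 2.2.

```python
# RFC 2616 separators: ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_token(name):
    """Hypothetical check: True if name is a valid RFC 2616 token."""
    # A token needs at least one character; each must be a US-ASCII
    # character that is neither a CTL (0-31, 127) nor a separator.
    return bool(name) and all(
        32 < ord(ch) < 127 and ch not in SEPARATORS for ch in name
    )
```

Under this check, "X-SendFile" is a valid header key, while a key containing a space, a colon, or any non-ASCII character is rejected.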

Finally, PEP 3333 says:

Note also that strings passed to start_response() as a status or as response headers must follow RFC 2616 with respect to encoding. That is, they must either be ISO-8859-1 characters, or use RFC 2047 MIME encoding.

On Python platforms where the str or StringType type is in fact Unicode-based (e.g. Jython, IronPython, Python 3, etc.), all "strings" referred to in this specification must contain only code points representable in ISO-8859-1 encoding (\u0000 through \u00FF, inclusive).
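A small sketch of what the PEP 3333 constraint means on Python 3. The function names below are hypothetical; the point is that latin-1 maps bytes 0x00 through 0xFF one-to-one onto code points U+0000 through U+00FF, so any bytestring survives the round trip.

```python
def is_wsgi_native_string(value):
    """Hypothetical check: str containing only ISO-8859-1 code points."""
    return isinstance(value, str) and all(ord(ch) <= 0xFF for ch in value)


def bytes_to_wsgi_string(raw):
    """Decode header bytes into a native string suitable for start_response()."""
    # latin-1 decoding never fails and is reversible for every byte value.
    return raw.decode("latin-1")


# Every possible byte value round-trips unchanged.
all_bytes = bytes(range(256))
round_tripped = bytes_to_wsgi_string(all_bytes).encode("latin-1")
```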


PS: RFC 2616 points to RFC 822, where section 3.1.2. restricts headers to ASCII. This may explain why Django has this restriction.

Attachments (1)

18916.diff (6.5 KB) - added by aaugustin 20 months ago.


Change History (8)

comment:1 Changed 20 months ago by claudep

  • Triage Stage changed from Unreviewed to Accepted

Your analysis seems correct. +1 from me.

comment:2 Changed 20 months ago by acdha

  • Cc chris@… added

Changed 20 months ago by aaugustin

comment:3 follow-up: Changed 20 months ago by aaugustin

  • Has patch set
  • Patch needs improvement set

Attached patch would work if it weren't for this line in core/mail/message.py:

Charset.add_charset('utf-8', Charset.SHORTEST, None, 'utf-8')

comment:4 Changed 20 months ago by aaugustin

  • Owner changed from nobody to aaugustin

comment:5 Changed 20 months ago by aaugustin

  • Patch needs improvement unset

Updated patch, pull request here: https://github.com/django/django/pull/339

comment:6 in reply to: ↑ 3 Changed 20 months ago by aaugustin

Replying to aaugustin:

Attached patch would work if it weren't for this line in core/mail/message.py:

Charset.add_charset('utf-8', Charset.SHORTEST, None, 'utf-8')

This problem is actually tracked in #12422. Thanks Claude for pointing this out.

comment:7 Changed 20 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from new to closed

In [9b07b5edeb770b037dc735d48dfd6f979422f586]:

Fixed #18916 -- Allowed non-ASCII headers.

Thanks Malcolm Tredinnick for the review.
