Opened 12 years ago

Closed 6 years ago

#20147 closed New feature (fixed)

Provide an alternative to request.META for accessing HTTP headers

Reported by: Luke Plant
Owned by: Santiago Basulto
Component: HTTP handling
Version: dev
Severity: Normal
Keywords:
Cc: marc.tamlyn@…, tom@…, ben@…, Zach Borboa, Santiago Basulto
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

From the docs:

HttpRequest.META
A standard Python dictionary containing all available HTTP headers...

With the exception of CONTENT_LENGTH and CONTENT_TYPE, as given above, any HTTP headers in the request are converted to META keys by converting all characters to uppercase, replacing any hyphens with underscores and adding an HTTP_ prefix to the name. So, for example, a header called X-Bender would be mapped to the META key HTTP_X_BENDER.
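For illustration, the transform quoted from the docs can be written as a few lines of Python. This is only a sketch of the documented rule; the function name header_to_meta_key is hypothetical and is not part of Django.

```python
def header_to_meta_key(name: str) -> str:
    """Map an HTTP header name to the META key described in the docs."""
    # Uppercase everything and turn hyphens into underscores.
    key = name.upper().replace("-", "_")
    # CONTENT_LENGTH and CONTENT_TYPE are the two documented exceptions:
    # they are stored without the HTTP_ prefix.
    if key in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return key
    return "HTTP_" + key


print(header_to_meta_key("X-Bender"))      # HTTP_X_BENDER
print(header_to_meta_key("content-type"))  # CONTENT_TYPE
```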

The question is, why? Why do we have this ridiculous transform? It is pure silliness, whose only explanation is a quirk of CGI, which is now totally irrelevant.

You should be able to look up a header in the HTTP spec and do something very simple to get it from the HTTP request. How about this API:

request.HEADERS['Host']

(for consistency with GET/POST/FILES etc.), or even

request['Host']

Dictionary access should obey HTTP rules about case-sensitivity of the header names.

This would also have the advantage that repr(request) wouldn't be full of junk you don't need, i.e. the entire content of os.environ, which, on a developer machine especially, can contain a lot of noise (mine does).

It also future-proofs us for when WSGI is replaced with something more sensible, and the whole silly round trip to os.environ can be removed completely, or if we want to support something else parallel to WSGI and client code wants to access HTTP headers in the same way for both.

This leaves a few things in META that are not derived from an HTTP header, and do not have a way of accessing them from the request object. I think these are just:

  • SCRIPT_NAME - this is a CGI leftover, that is only useful in constructing other things, AFAICS
  • QUERY_STRING - this can be easily constructed from request.get_full_path() for the rare times that you need the raw query string rather than request.GET
  • SERVER_NAME - should use get_host() instead
  • SERVER_PORT - use get_host()
  • SERVER_PROTOCOL - could use is_secure(), but perhaps it would be nice to have a convenience get_protocol() method.

(see http://wsgi.readthedocs.org/en/latest/definitions.html)
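As a rough sketch of the QUERY_STRING point in the list above: given a path with a query string, such as the kind of value request.get_full_path() returns, the raw query can be recovered with the standard library. The helper name raw_query_string and the sample path are assumptions made for this example.

```python
from urllib.parse import urlsplit


def raw_query_string(full_path: str) -> str:
    """Return the part after '?' in a path such as '/search/?q=x'."""
    return urlsplit(full_path).query


# Hypothetical value standing in for request.get_full_path().
print(raw_query_string("/search/?q=bender&page=2"))  # q=bender&page=2
```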

Change History (27)

comment:1 by Luke Plant, 12 years ago

A strong argument against the request['Referer'] API is the use of request in templates (e.g. if request.GET.some_flag), which conflates dictionary access and attribute access, probably making request.HEADERS['Referer'] a much safer API.

comment:2 by anonymous, 12 years ago

HTTP headers are case insensitive. You want to get rid of the transform, but what happens when someone sends "accept: " and you check for HEADERS["Accept"]?

Last edited 12 years ago by Luke Plant

comment:3 by Luke Plant, 12 years ago

As stated above, "Dictionary access should obey HTTP rules about case-sensitivity of the header names."

I didn't say get rid of the transform - it should be done within the API, not by the user of the API. In terms of implementation, request.HEADERS['Accept'] will map straight to request._META['HTTP_ACCEPT'], at least for wsgi, or do something equivalent that will ensure case-insensitivity.
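A minimal sketch of that idea, assuming a WSGI-style META dict: the lookup applies the transform internally, so callers can use any casing of the header name. The class name HeadersView is hypothetical, and this is not the implementation that was eventually merged.

```python
from collections.abc import Mapping


class HeadersView(Mapping):
    """Read-only, case-insensitive view of the HTTP headers in a META-style dict."""

    # Headers that WSGI stores without the HTTP_ prefix.
    UNPREFIXED = {"CONTENT_LENGTH", "CONTENT_TYPE"}

    def __init__(self, meta):
        self._meta = meta

    @classmethod
    def _meta_key(cls, name):
        # The transform happens inside the API, not in user code.
        key = name.upper().replace("-", "_")
        return key if key in cls.UNPREFIXED else "HTTP_" + key

    def __getitem__(self, name):
        return self._meta[self._meta_key(name)]

    def __iter__(self):
        # Only expose keys that came from HTTP headers, under readable names.
        for key in self._meta:
            if key.startswith("HTTP_"):
                yield key[len("HTTP_"):].replace("_", "-").title()
            elif key in self.UNPREFIXED:
                yield key.replace("_", "-").title()

    def __len__(self):
        return sum(1 for _ in self)


meta = {"HTTP_ACCEPT": "text/html", "CONTENT_TYPE": "text/plain", "SERVER_NAME": "localhost"}
headers = HeadersView(meta)
print(headers["Accept"], headers["accept"], headers["ACCEPT"])
print(sorted(headers))
```

Non-header entries such as SERVER_NAME stay in META and are not visible through the view, which also addresses the repr() noise mentioned in the description.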

comment:4 by Luke Plant, 12 years ago

There are a few more things that need considering if this is to be done:

  • RequestFactory and the test Client, and their APIs which pass directly to request.META.
  • REMOTE_ADDRESS, REMOTE_USER
  • SECURE_PROXY_SSL_HEADER

comment:5 by Carl Meyer, 12 years ago

Minor bikeshed-type question: is there really value in making request.HEADERS all-caps? I realize the parallel to request.POST, request.GET, and request.META, but the former two are all-caps simply because HTTP methods are usually written that way. I guess I'd just like to see a bit of rationale spelled out for how we decide whether a given request attribute ought to be all-caps; I'd probably lean towards just request.headers for the new API.

More discussion of this proposal (in particular, whether to deprecate/change request.META) is here: https://groups.google.com/d/topic/django-developers/Jvs3F79cY4Y/discussion

comment:6 by Marc Tamlyn, 12 years ago

Cc: marc.tamlyn@… added

It would be consistent for request.headers to be lowercase to match up with request.body for example.

comment:7 by anonymous, 12 years ago

Should we consider having request.headers return unicode values rather than byte values?

Correctly decoding HTTP headers is slightly fiddly - the default supported encoding is iso-8859-1,
but utf-8 can also be supported as per RFC 2231, RFC 5987.

Getting the decoding right probably isn't something we want developers to have to think about.

Note: For real-world usage see this example of browser support for utf-8 in uploaded filenames: https://code.google.com/p/chromium/issues/detail?id=57830

comment:8 by Tom Christie, 12 years ago

Cc: tom@… added

(Ooops, that anonymous comment is mine.)

comment:9 by Tom Christie, 12 years ago

Okay, noticed that the link to chrome's use of iso-8859-1 is actually for response headers, so disregard that.

The question regarding unicode vs byte values still stands, though.

comment:10 by Luke Plant, 12 years ago

I'm happy with request.headers instead of request.HEADERS - the parallel to request.body does make more sense than request.GET.

Regarding unicode/bytes, it's a very thorny issue, and the more I look into it the worse it gets. PEP 3333 might apply, if we are assuming a simple mapping to request.META, but that essentially leaves decoding issues to the user if I'm reading it correctly.

comment:11 by Tom Christie, 12 years ago

Okay, maybe it's not obvious if unicode values would be preferable or not.

I thought I'd take a look at what the requests library does, and found this similar ticket: https://github.com/kennethreitz/requests/pull/1181

If it is something that we decide to do, then the following looks like it ought to do the trick:

from email.header import decode_header
u''.join(header_bytes.decode(enc or 'iso-8859-1') for header_bytes, enc in decode_header(h))
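The snippet above was written for Python 2. On Python 3, email.header.decode_header returns str chunks when the value contains no encoded words and bytes chunks otherwise, so a version that handles both might look like the sketch below. The function name decode_header_value is hypothetical, and whether HTTP header values should be decoded this way at all is exactly the open question in this thread.

```python
from email.header import decode_header


def decode_header_value(raw: str, default: str = "iso-8859-1") -> str:
    """Decode RFC 2047 encoded words in a header value, if any are present."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Encoded word (or raw bytes next to one): decode with its
            # declared charset, falling back to the default encoding.
            parts.append(chunk.decode(charset or default))
        else:
            # No encoded words were found; the chunk is already text.
            parts.append(chunk)
    return "".join(parts)


print(decode_header_value("plain value"))
print(decode_header_value("=?utf-8?q?caf=C3=A9?="))
```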

For further reference note that the httpbis spec is proposed to obsolete RFC2616, cleaning up & clarifying underspecified bits of the spec.
The relevant section on header value encoding is here: http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-19#section-3.2.2

comment:12 by Aymeric Augustin, 12 years ago

Summary: Replace and deprecate request.META for HTTP headers → Provide an alternative to request.META for accessing HTTP headers
Triage Stage: Unreviewed → Accepted

The mailing list discussion converged towards keeping META, but recommending a dict-like request.headers.

I'm updating the summary to reflect this.

comment:13 by astupidog, 12 years ago

Regarding the transformation of request headers, for example from X-Bender to the META key HTTP_X_BENDER -

From what I can see, this transformation is not done in Django but in the WSGI implementation.

I tested with Apache mod_wsgi and with Python's wsgiref, and it seems that they perform this transformation, not Django.

I couldn't find it documented anywhere, but see this excerpt from Python's Lib/wsgiref/simple_server.py:

for h in self.headers.headers:
    k,v = h.split(':',1)
    k=k.replace('-','_').upper(); v=v.strip()
    if k in env:
        continue                    # skip content length, type,etc.
    if 'HTTP_'+k in env:
        env['HTTP_'+k] += ','+v     # comma-separate multiple headers
    else:
        env['HTTP_'+k] = v

comment:14 by Ben Spaulding, 10 years ago

Cc: ben@… added

comment:15 by Tim Graham, 10 years ago

See #16068 for a duplicate.

comment:16 by Collin Anderson, 9 years ago

Proof of concept: https://github.com/django/django/pull/6803

This makes request.headers use lowercase header names and replaces underscores with hyphens. (Header names are lowercase in HTTP/2.)

Also, on python3 we already get unicode headers from WSGI, and we're dropping py2 in January, so I don't think it's worth making the values unicode on python2. The header names are already unicode on python2.

{
  'accept-language': 'en-US,en;q=0.8',
  'accept-encoding': 'gzip, deflate, sdch',
  'host': 'localhost:8000',
  'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
  'upgrade-insecure-requests': '1',
  'connection': 'keep-alive',
  'cache-control': 'max-age=0',
  'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36',
}
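A dict of that shape could be derived from META roughly as follows. This is a sketch under the assumption that only HTTP_-prefixed keys plus CONTENT_LENGTH and CONTENT_TYPE count as headers; the function name headers_from_meta is hypothetical and this is not the code from the linked pull request.

```python
def headers_from_meta(meta):
    """Build a dict of lowercase, hyphenated header names from a META-style dict."""
    headers = {}
    for key, value in meta.items():
        if key.startswith("HTTP_"):
            name = key[len("HTTP_"):]
        elif key in ("CONTENT_LENGTH", "CONTENT_TYPE"):
            name = key
        else:
            # Not derived from an HTTP header (e.g. SERVER_PORT); skip it.
            continue
        headers[name.lower().replace("_", "-")] = value
    return headers


meta = {
    "HTTP_ACCEPT_LANGUAGE": "en-US,en;q=0.8",
    "HTTP_HOST": "localhost:8000",
    "CONTENT_TYPE": "text/plain",
    "SERVER_PORT": "8000",
}
print(headers_from_meta(meta))
```

Note that the reverse mapping is lossy: a header that originally contained an underscore cannot be told apart from one that contained a hyphen.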

comment:17 by Asif Saifuddin Auvi, 8 years ago

Version: 1.5 → master

comment:18 by Santiago Basulto, 7 years ago

Owner: changed from nobody to Santiago Basulto
Status: new → assigned

I will give it a try and submit a PR.

comment:19 by Asif Saifuddin Auvi, 7 years ago

Has patch: set
Last edited 7 years ago by Tim Graham

comment:20 by Tim Graham, 7 years ago

Patch needs improvement: set

comment:21 by Zach Borboa, 6 years ago

Cc: Zach Borboa added

comment:22 by Zach Borboa, 6 years ago

comment:23 by Tim Graham, 6 years ago

Patch needs improvement: unset

comment:24 by Carlton Gibson, 6 years ago

Triage Stage: Accepted → Ready for checkin

comment:25 by Tim Graham, 6 years ago

Patch needs improvement: set
Triage Stage: Ready for checkin → Accepted

comment:26 by Santiago Basulto, 6 years ago

Cc: Santiago Basulto added
Patch needs improvement: unset

comment:27 by Tim Graham <timograham@…>, 6 years ago

Resolution: fixed
Status: assigned → closed

In 4fc35a9c:

Fixed #20147 -- Added HttpRequest.headers.
