Code

Opened 5 years ago

Closed 4 years ago

Last modified 15 months ago

#11877 closed Uncategorized (fixed)

Document that request.get_host() fails when behind multiple reverse proxies

Reported by: tomevans222 Owned by: nobody
Component: Documentation Version: 1.0
Severity: Normal Keywords: proxy forwarded get_host
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

We run django behind several layers of proxies (please excuse ASCII art):

    Client
       |
       |
Apache (handling SSL connections, mod_proxy)
       |
       |
Apache (handling static content, load balancing, mod_proxy, mod_proxy_balancer)
    |                           | 
    |                           |
Apache (mod_fastcgi)       Apache (mod_fastcgi)
    |                           |
    |                           |
django instance             django instance

Each apache instance that proxies a request appends the ServerName of the proxy server into the X-Forwarded-For and X-Forwarded-Host headers. In our infrastructure, all of the apache instances respond to the same host name (which is preserved through the proxying using ProxyPreserveHost directive), so it ends up looking like this in request.META:

'HTTP_X_FORWARDED_FOR: '10.0.0.1, 10.0.0.2'
'HTTP_X_FORWARDED_HOST': 'portal.example.com, portal.example.com'
'HTTP_X_FORWARDED_SERVER': 'portal.example.com, portal.example.com'

This then breaks request.get_host(), which does the following:

   def get_host(self):
        """Returns the HTTP host using the environment or request headers."""
        # We try three options, in order of decreasing preference.
        if 'HTTP_X_FORWARDED_HOST' in self.META:
            host = self.META['HTTP_X_FORWARDED_HOST']
        elif 'HTTP_HOST' in self.META:
            host = self.META['HTTP_HOST']
        else:
            # Reconstruct the host using the algorithm from PEP 333.
            host = self.META['SERVER_NAME']
            server_port = str(self.META['SERVER_PORT'])
            if server_port != (self.is_secure() and '443' or '80'):
                host = '%s:%s' % (host, server_port)
        return host

IE, it completely ignores the possibility of more than one host in X-Forwarded-Host.

Breaking request.get_host() breaks request.build_absolute_uri(), which in turn breaks HttpResponseRedirect*, leading to redirects with location headers like:

  Location: http://portal.example.com, portal.example.com/foo/

I've not attached a patch, because I can think of a number of ways this could be fixed:

1) Change request parsing to notice multi value headers and correctly parse into arrays.
2) Change request.get_host() to look for multi value versions of X-Forwarded-Host and pull out the correct value.
3) Add separate middleware to rewrite request.META with just the 'closest' version of these headers.
... any number of other solutions.

This obviously is not a 'normal' deployment of django... We will probably work around with custom middleware for now, it is the least intrusive.

This is with 1.0, but the code for request.get_host() is the same in trunk.

Attachments (2)

proxy.py (1.3 KB) - added by tomevans222 5 years ago.
Middleware to work around the issue
patch_11877.diff (1.2 KB) - added by arnav 4 years ago.
patch with changes to documentation including example

Download all attachments as: .zip

Change History (8)

Changed 5 years ago by tomevans222

Middleware to work around the issue

comment:1 Changed 5 years ago by jacob

  • Component changed from Core framework to Documentation
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Summary changed from request.get_host() fails when behind multiple reverse proxies to Document that request.get_host() fails when behind multiple reverse proxies
  • Triage Stage changed from Unreviewed to Accepted

As you say, this is not a "normal" way of deploying Django, and I don't think there'll be a general solution here. See the back and forth about handling X-Forwarded-For for some background: the basic problem is that Django can't magically know which of a list of values is "correct" here. I think that your solution of handling this problem in middleware is as good as it's gonna get.

It would, however, be good to have this documented, so I'm hijacking this ticket and turning it into a doc request. We could add a note to the docs for request.get_host() and include your little bit of middleware as an example. Feel free to work up a doc patch if you wanna grease the wheels a bit.

comment:2 Changed 5 years ago by tomevans222

That sounds perfectly reasonable Jacob. I'll try to get a doc patch up as soon as $JOB allows :)

Changed 4 years ago by arnav

patch with changes to documentation including example

comment:3 Changed 4 years ago by arnav

  • Has patch set

documentation patch added

comment:4 Changed 4 years ago by gabrielhurley

  • Resolution set to fixed
  • Status changed from new to closed

(In [14493]) Fixed #11877 -- Documented that HttpRequest.get_host() fails behind multiple reverse proxies, and added an example middleware solution. Thanks to Tom Evans for the report, and arnav for the patch.

comment:5 Changed 4 years ago by gabrielhurley

(In [14494]) [1.2.X] Fixed #11877 -- Documented that HttpRequest.get_host() fails behind multiple reverse proxies, and added an example middleware solution. Thanks to Tom Evans for the report, and arnav for the patch.

Backport of [14493] from trunk.

comment:6 Changed 15 months ago by John Borwick <john_borwick@…>

  • Easy pickings unset
  • Severity set to Normal
  • Type set to Uncategorized
  • UI/UX unset

Note for other readers: when using the included MultipleProxyMiddleware, consider whether you want line 30 to pull the *last* element [-1] or the *first* element [0].

The right-most/last is supposed to be the most recent proxy (per <http://www.squid-cache.org/Doc/config/follow_x_forwarded_for/>). The left-most is the furthest upstream proxy.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.