Opened 18 months ago

Last modified 15 months ago

#27575 assigned Cleanup/optimization

Make host validation run on all requests

Reported by: JorisBenschop Owned by: Ketan Bhatt
Component: HTTP handling Version: 1.10
Severity: Normal Keywords:
Cc: desecho@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by JorisBenschop)

please ignore all this if its stupid, but I was trying to create a 1.10 middleware and learn from existing code. Looking at
django.middleware.common.CommonMiddleware I saw this in process_request

host = request.get_host()
must_prepend = settings.PREPEND_WWW and host and not host.startswith('www.')
redirect_url = ('%s://www.%s' % (request.scheme, host)) if must_prepend else ''

If I understand correctly, this code is ran for each request. Would we not get more performance by setting

redirect_url = ''
if settings.PREPEND_WWW:
    host = request.get_host()
    if host and not host.startswith('www.'):
        redirect_url = ('%s://www.%s' % (request.scheme, host)) 

as this evaluates most of the code only when settings.PREPEND_WWW is true.

Again, maybe I completely misunderstand how python code optimization works. But I hope it helps

Change History (12)

comment:1 Changed 18 months ago by JorisBenschop

Description: modified (diff)

comment:2 Changed 18 months ago by Simon Charette

Hi Joris,

I'd say it's worth investigating but one thing that you'll need to make sure is that get_host() is called inconditionally as this method has the side effect of performing ALLOWED_HOSTS validation.

comment:3 Changed 18 months ago by Simon Charette

Triage Stage: UnreviewedAccepted

comment:4 Changed 18 months ago by JorisBenschop

Hi Simon,
Thanks for this clarification.
Looking at the code, to my eye the get_host() only returns it output to a variable that is only used once, but now i understand there is some implicit 'magic' ran behind the screens as well. Indeed that would make the patch a lot less useful, but then would it maybe be worth annotating this hidden behaviour in a comment within the def instead?

Thanks for your gelp

comment:5 Changed 17 months ago by Tim Graham

Has patch: unset

comment:6 Changed 17 months ago by Anton Samarchyan

Cc: desecho@… added

We could add a comment in process_request method

host = request.get_host()  # Also perform host validation

And probably in get_host method as well

"""Return the HTTP host using the environment or request headers. Perform host validation."""

comment:7 Changed 17 months ago by JorisBenschop

Wouldnt refactoring this function (validate_host?) make sense if it does a lot more than retrieving a hostname?

comment:8 Changed 17 months ago by Anton Samarchyan

A function get_host could be renamed to get_host_and_validate

comment:9 Changed 17 months ago by Ketan Bhatt

Owner: changed from nobody to Ketan Bhatt
Status: newassigned

comment:10 Changed 17 months ago by Ketan Bhatt

@Simon Charette

So the final changes that need to be made are:

  1. Rename get_host() to get_host_and_validate()
  2. Handle occurrences of get_host everywhere in the docs and change it to the new name
  3. Mention the change in the changelist (will need help here regarding where to put this change exactly)

Please confirm once and I shall make a PR

comment:11 Changed 17 months ago by Tim Graham

I don't think there's value in renaming the function at this point. In fact, it might be better to move that validation elsewhere so it always runs during a request, regardless of if get_host() is called. There was some discussion about this at the time host header validation was added but I don't have time to look into this more right now.

comment:12 Changed 15 months ago by Tim Graham

Component: Core (Other)HTTP handling
Easy pickings: unset
Summary: potential commonmiddleware optimizationMake host validation run on all requests
Note: See TracTickets for help on using tickets.
Back to Top