Opened 3 years ago
Closed 3 years ago
#33546 closed Cleanup/optimization (fixed)
Optimise creation of HttpResponseBase and ResponseHeaders objects
Reported by: | Keryn Knight | Owned by: | Keryn Knight |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Nick Pope | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This is a series of small changes across HttpResponseBase
and ResponseHeaders
to decrease the time spent creating an HttpResponse
based on common calling conventions. As it's a few little tweaks, I'm frontloading the overall times and will explain the individual pieces thereafter.
main | patched | |
---|---|---|
ResponseHeaders({}) | 745 ns | 313 ns |
ResponseHeaders({'TEST': "1"}) | 2.17 µs | 1.8 µs |
ResponseHeaders({'TEST': "1", "TEST2": "2"}) | 3.51 µs | 2.77 µs |
ResponseHeaders({"Content-Type": "application/json"}) | 2.19 µs | 1.93 µs |
HttpResponseBase() | 7.89 µs | 4.18 µs |
HttpResponseBase(headers={"TEST": "1"}) | 9.75 µs | 6.02 µs |
HttpResponseBase(headers={"TEST": "1", "TEST2": "2"}) | 11 µs | 7.09 µs |
HttpResponseBase(headers={"Content-Type": "application/json"}) | 3.71 µs | 3.26 µs |
HttpResponseBase(content_type="test") | 4.48 µs | 3.03 µs |
HttpResponse() | 9.06 µs | 5.46 µs |
HttpResponse(headers={"TEST": "1"}) | 11.4 µs | 7.58 µs |
HttpResponse(headers={"TEST": "1", "TEST2": "2"}) | 13 µs | 8.83 µs |
HttpResponse(headers={"Content-Type": "application/json"}) | 5.22 µs | 4.82 µs |
HttpResponse(content_type="test") | 6.2 µs | 4.38 µs |
So by my estimation of things, given most responses don't contain that many headers (if any), these are all happy-path positives.
The changes themselves, then.
The first is to only call the ResponseHeaders._unpack_items
if data is given to it, and to update the HttpResponseBase
to pass the headers value straight through (which is None by default):
- for header, value in self._unpack_items(data): - self[header] = value + if data: + for header, value in self._unpack_items(data): + self[header] = value
&
- self.headers = ResponseHeaders(headers or {}) + self.headers = ResponseHeaders(headers)
That saves a few hundred nanoseconds when none are given, but adds about 50 when some are given.
So the next step is to try and claw back those 50 nanoseconds elsewhere. I won't paste the full diff here because it's more moving around than the above, but it basically involves optimizing/re-working the _convert_to_charset
method to do fewer isinstance/type-checking calls and deferring the BadHeaderError
check until we've got an actual string, in the happy path. Nick (ngnpope
on GitHub) independently came up with a similar/the same(?) idea when we were discussing #33532 (see this comment) in a slightly different manner, so would be listed in the commit as Co-Author.
The changes to _convert_to_charset
claw back the lost time well enough; let x
be pre-declared as ResponseHeaders({})
:
main | patched | |
---|---|---|
x['test'] = 'test' | 1.34 µs | 897 ns |
x._convert_to_charset('test', 'ascii') | 500 ns | 291 ns |
x._convert_to_charset('test', 'latin-1', mime_encode=True) | 565 ns | 359 ns |
The next change to stack up is in the HttpResponseBase.charset
property, which is called when neither a content_type
parameter nor a Content-Type
header key is passed in. That method actually asks for the Content-Type
Header, falling back to a blank string if not found (which it won't be, because it's currently being set) - but in asking for the header, doesn't test whether it was truthy, so always runs the regex search on an empty string. A bit of reworking and a walrus operator or two looks like:
- content_type = self.get("Content-Type", "") - matched = _charset_from_content_type_re.search(content_type) - if matched: + if content_type := self.headers.get("Content-Type", ""): + if matched := _charset_from_content_type_re.search(content_type):
Finally, rejig the HttpResponseBase.__init__
to do one fewer __contains__/__getitem__
on the ResponseHeaders
in case it needs to raise a ValueError
about content_type
and Content-Type
clashing, at least if I've thought through the boolean logic conditions correctly:
- if content_type and "Content-Type" in self.headers: - ... raise error - if "Content-Type" not in self.headers: - ... set content type + if "Content-Type" not in self.headers: + ... set content type + elif content_type: + ... raise error
And that's your lot. Technically you can go further with _convert_to_charset
in the unhappy path where it produces a UnicodeError
and then attempts to encode it with email.Header
, and defer the newlines/BadHeaderError
check until after that encoding ... But it would change the exception in certain circumstances from a BadHeaderError from UnicodeError
into just a UnicodeError
which may not be desirable, and it also means knowing how email.Header
handles newlines to grok the safety of the whole change, so I've not gone ahead with it.
Pull request for discussion of the various changes will be forthcoming once I get a ticket number and rebase.
Tests all pass for me locally (SQLite, Python 3.10, OSX)
Change History (9)
comment:1 by , 3 years ago
Has patch: | set |
---|
comment:2 by , 3 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:3 by , 3 years ago
Cc: | added |
---|
comment:4 by , 3 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:9 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
PR is https://github.com/django/django/pull/15466 ... all tests passed. Remarkable, not even a linter to trip me!