﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
33546	Optimise creation of HttpResponseBase and ResponseHeaders objects	Keryn Knight	Keryn Knight	"This is a series of small changes across `HttpResponseBase` and `ResponseHeaders` to decrease the time spent creating an `HttpResponse` based on common calling conventions. As it's a few little tweaks, I'm frontloading the overall times and will explain the individual pieces thereafter.

|| ||= main =||= patched =||
|| `ResponseHeaders({})` || 745 ns || 313 ns ||
|| `ResponseHeaders({'TEST': ""1""})` || 2.17 µs || 1.8 µs ||
|| `ResponseHeaders({'TEST': ""1"", ""TEST2"": ""2""})` || 3.51 µs || 2.77 µs ||
|| `ResponseHeaders({""Content-Type"": ""application/json""})` || 2.19 µs || 1.93 µs ||
|| `HttpResponseBase()` || 7.89 µs || 4.18 µs ||
|| `HttpResponseBase(headers={""TEST"": ""1""})` || 9.75 µs || 6.02 µs ||
|| `HttpResponseBase(headers={""TEST"": ""1"", ""TEST2"": ""2""})` || 11 µs || 7.09 µs ||
|| `HttpResponseBase(headers={""Content-Type"": ""application/json""})` || 3.71 µs || 3.26 µs ||
|| `HttpResponseBase(content_type=""test"")` || 4.48 µs || 3.03 µs ||
|| `HttpResponse()` || 9.06 µs || 5.46 µs ||
|| `HttpResponse(headers={""TEST"": ""1""})` || 11.4 µs || 7.58 µs ||
|| `HttpResponse(headers={""TEST"": ""1"", ""TEST2"": ""2""})` || 13 µs || 8.83 µs ||
|| `HttpResponse(headers={""Content-Type"": ""application/json""})` || 5.22 µs || 4.82 µs ||
|| `HttpResponse(content_type=""test"")` || 6.2 µs || 4.38 µs ||

So by my estimation of things, given most responses don't contain ''that'' many headers (if any), these are all happy-path positives.

The changes themselves, then.

The first is to only call the `ResponseHeaders._unpack_items` if data is given to it, and to update the `HttpResponseBase` to pass the headers value straight through (which is None by default):
{{{
-        for header, value in self._unpack_items(data):
-            self[header] = value
+        if data:
+            for header, value in self._unpack_items(data):
+                self[header] = value
}}}
&
{{{
-        self.headers = ResponseHeaders(headers or {})
+        self.headers = ResponseHeaders(headers)
}}}
That saves a few hundred nanoseconds when none are given, but adds about 50 when some ''are'' given.
So the next step is to try and claw back those 50 nanoseconds elsewhere. I won't paste the full diff here because it's more moving around than the above, but it basically involves optimizing/re-working the `_convert_to_charset` method to do fewer isinstance/type-checking calls and deferring the `BadHeaderError` check until we've got an actual string, in the happy path. Nick (`ngnpope` on GitHub) independently came up with a similar/the same(?) idea when we were discussing #33532 (see [https://github.com/django/django/pull/15453#issuecomment-1047296236 this comment]) in a slightly different manner, so would be listed in the commit as Co-Author.

The changes to `_convert_to_charset` claw back the lost time well enough; let `x` be pre-declared as `ResponseHeaders({})`: 
|| ||= main =||= patched =||
|| `x['test'] = 'test'` || 1.34 µs || 897 ns ||
|| `x._convert_to_charset('test', 'ascii')` || 500 ns || 291 ns ||
|| `x._convert_to_charset('test', 'latin-1', mime_encode=True)` || 565 ns || 359 ns ||

The next change to stack up is in the `HttpResponseBase.charset` property, which is called when neither a `content_type` parameter nor a `Content-Type` header key is passed in. That method actually ''asks'' for the `Content-Type` Header, falling back to a blank string if not found (which it won't be, because it's ''currently being set'') - but in asking for the header, doesn't test whether it was truthy, so always runs the regex search on an empty string. A bit of reworking and a walrus operator or two looks like:

{{{
-        content_type = self.get(""Content-Type"", """")
-        matched = _charset_from_content_type_re.search(content_type)
-        if matched:
+        if content_type := self.headers.get(""Content-Type"", """"):
+            if matched := _charset_from_content_type_re.search(content_type):
}}}

Finally, rejig the `HttpResponseBase.__init__` to do one fewer `__contains__/__getitem__` on the `ResponseHeaders` in case it needs to raise a `ValueError` about `content_type` and `Content-Type` clashing, at least if I've thought through the boolean logic conditions correctly:

{{{
-        if content_type and ""Content-Type"" in self.headers:
-            ... raise error
-        if ""Content-Type"" not in self.headers:
-            ... set content type
+        if ""Content-Type"" not in self.headers:
+            ... set content type
+        elif content_type:
+            ... raise error
}}}

And that's your lot. Technically you can go ''further'' with `_convert_to_charset` in the unhappy path where it produces a `UnicodeError` and then attempts to encode it with `email.Header`, and defer the newlines/`BadHeaderError` check until ''after'' that encoding ... But it would change the exception in certain circumstances from a `BadHeaderError from UnicodeError` into just a `UnicodeError` which may not be desirable, and it also means knowing how `email.Header` handles newlines to grok the safety of the whole change, so I've not gone ahead with it.

Pull request for discussion of the various changes will be forthcoming once I get a ticket number and rebase.

Tests all pass for me locally (SQLite, Python 3.10, OSX)"	Cleanup/optimization	closed	HTTP handling	dev	Normal	fixed		Nick Pope	Ready for checkin	1	0	0	0	0	0
