Opened 3 weeks ago
Last modified 2 weeks ago
#37025 assigned Bug
RemoteUserMiddleware.header does not correspond to request.META in the async path
| Reported by: | Jacob Walls | Owned by: | Jacob Walls |
|---|---|---|---|
| Component: | contrib.auth | Version: | 5.2 |
| Severity: | Normal | Keywords: | RemoteUserMiddleware |
| Cc: | Jon Janzen, MANAS MADESHIYA, Carlton Gibson, Jake Howard | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description
When Django 5.2 added an async path to RemoteUserMiddleware (#35303), it prefixed HTTP_ to the provided (or default) header name before looking it up in request.META.
I figure this was so that the default REMOTE_USER header would work out of the box. (For WSGI compatibility, ASGIRequest maps all user provided headers to HTTP_... variants.)
This has two problematic effects:
- It becomes tortured to document the security considerations, see recent change in #36862:
Under WSGI, this warning doesn’t apply to RemoteUserMiddleware in its default configuration with header = "REMOTE_USER", since a key that doesn’t start with HTTP_ in request.META can only be set by your WSGI server, not directly from an HTTP request header. This warning does apply by default on ASGI, because in the async path, the middleware prepends HTTP_ to the defined header name before looking it up in request.META.
Notice how an implementation detail is leaking into an admonition, making it harder to reason about what's potentially user-controlled (wait, under ASGI, this other stuff is user-controlled, too!)
- Custom headers can't be supplied in a way that works with both WSGI & ASGI:
If you try to do this the "right" way, and define HTTP_MY_CUSTOM_HEADER as the header attribute, then aprocess_request() will lookup HTTP_HTTP_MY_CUSTOM_HEADER. Forcing the 'wrong' version (CUSTOM_HEADER) works on ASGI but breaks on WSGI, because no WSGI server will set that.
I'm suggesting we should deprecate the current support for custom headers without "HTTP_" prefixes in the async path (aprocess_request) and remove it in Django 7. The final code in Django 7 would look like:
-
django/contrib/auth/middleware.py
diff --git a/django/contrib/auth/middleware.py b/django/contrib/auth/middleware.py index a28e1705a3..7136a64c44 100644
a b class RemoteUserMiddleware: 184 184 " before the RemoteUserMiddleware class." 185 185 ) 186 186 try: 187 username = request.META[ "HTTP_" +self.header]187 username = request.META[self.header] 188 188 except KeyError: 189 189 # If specified header doesn't exist then remove any existing 190 190 # authenticated remote-user, or return (leaving request.user set to
Regrettably, we might need a transitional setting (or class attribute) to be able to opt into (or out of) the future, since I'm reluctant to add any automatic checking of other forms of headers, which will just open an avenue for security reports describing header spoofing. Open to ideas.
Change History (15)
comment:1 by , 3 weeks ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:2 by , 3 weeks ago
| Cc: | added |
|---|
follow-up: 14 comment:3 by , 3 weeks ago
comment:4 by , 3 weeks ago
| Cc: | added |
|---|
comment:5 by , 3 weeks ago
Grrr. I'm not sure of the value of this change. It looks like a breaking change without benefit to my eye. Indeed, if I'm using this with ASGI I would be forced to subclass in order to provide the full header value, HTTP_REMOTE_USER, to use the middleware at all.
The WSGI implementation is based on the idea that in the normal case the request.META entry is not in fact an HTTP header, but an environment variable. In contrast in the ASGI case, the value must come from a header value, as there's no other way. Thus the difference in implementation.
I'm pretty sceptical that people are needing both WSGI and ASGI versions of the middleware with the same custom header value. Possibly would could add an resolve_meta_key helper, called on the aprocess_request path that formatted the header, maybe via a class attribute, which could then be overridden to unify the approach if a fully qualified header was provided. 🤷 (Edit: moving the HTTP_ to a header_prefix class attribute might even be enough here: if set to "" it would allow self.header to be the fully qualified version.)
comment:6 by , 3 weeks ago
I think perhaps the docs example may need an update to show that the async case of setting header does not need the HTTP_ prefix
If your authentication mechanism uses a custom HTTP header and not
``REMOTE_USER``, you can subclass ``RemoteUserMiddleware`` and set the
``header`` attribute to the desired ``request.META`` key. For example:
.. code-block:: python
:caption: ``mysite/middleware.py``
from django.contrib.auth.middleware import RemoteUserMiddleware
class CustomHeaderRemoteUserMiddleware(RemoteUserMiddleware):
header = "HTTP_AUTHUSER"
I haven't confirmed this but I wonder if this might be the breaking change in our implementation, perhaps custom headers set like this used to work when RemoteUserMiddleware was using the MiddlewareMixin on ASGI but now do not, yet the default implementation probably didn't work to begin with. So _possibly_ a bug in the #31224 implementation, partially fixed by #35303 and broke something else.
comment:7 by , 3 weeks ago
Thanks both. So the original proposal definitely involves breaking RemoteUserMiddleware + ASGI + default configuration, on the theory that if almost no one is using this, it's OK to make this "broken by default" , because "broken by default" = "secure by default". But the header_prefix idea is interesting!
That said, the recent doc edits are sufficient, so my motivation here was less about security and more "easier to reason about", given that if we lean toward Sarah's proposed doc edit in comment:6, then I wonder if we're doubling down on documenting a bug.
To Sarah's question in comment:3, to get the tests working, we would need to adjust how we parameterize the header names in the tests. I recently learned that the sync request factory expects prefixed header names, e.g. "HTTP_MYHEADER", but the async request factory expects unprefixed header names, e.g. "MYHEADER", since they go through ASGIRequest, which applies prefixing.
-
django/contrib/auth/middleware.py
diff --git a/django/contrib/auth/middleware.py b/django/contrib/auth/middleware.py index a28e1705a3..7136a64c44 100644
a b class RemoteUserMiddleware: 184 184 " before the RemoteUserMiddleware class." 185 185 ) 186 186 try: 187 username = request.META[ "HTTP_" +self.header]187 username = request.META[self.header] 188 188 except KeyError: 189 189 # If specified header doesn't exist then remove any existing 190 190 # authenticated remote-user, or return (leaving request.user set to -
tests/auth_tests/test_remote_user.py
diff --git a/tests/auth_tests/test_remote_user.py b/tests/auth_tests/test_remote_user.py index 4a97fc2120..6745009589 100644
a b from datetime import UTC, datetime 3 3 from django.conf import settings 4 4 from django.contrib.auth import aauthenticate, authenticate 5 5 from django.contrib.auth.backends import RemoteUserBackend 6 from django.contrib.auth.middleware import RemoteUserMiddleware6 from django.contrib.auth.middleware import PersistentRemoteUserMiddleware, RemoteUserMiddleware 7 7 from django.contrib.auth.models import User 8 from django.http.request import HttpHeaders 8 9 from django.middleware.csrf import _get_new_csrf_string, _mask_cipher_secret 9 10 from django.test import ( 10 11 AsyncClient, … … from django.test import ( 17 18 18 19 @override_settings(ROOT_URLCONF="auth_tests.urls") 19 20 class RemoteUserTest(TestCase): 20 middleware = "django.contrib.auth.middleware.RemoteUserMiddleware" 21 # Use a custom middleware if we want this test case to run on both WSGI & ASGI. 22 middleware = "auth_tests.test_remote_user.CustomHeaderMiddleware" 21 23 backend = "django.contrib.auth.backends.RemoteUserBackend" 22 header = " REMOTE_USER"24 header = "HTTP_AUTHUSER" 23 25 email_header = "REMOTE_EMAIL" 24 26 25 27 # Usernames to be passed in REMOTE_USER for the test_known_user test case. 26 28 known_user = "knownuser" 27 29 known_user2 = "knownuser2" 28 30 31 @property 32 def async_header(self): 33 if self.header.startswith(HttpHeaders.HTTP_PREFIX): 34 # The tests like to parameterize headers, but if the defined header 35 # is already prefixed, strip it before using it. 36 return self.header.lstrip(HttpHeaders.HTTP_PREFIX) 37 return self.header 38 29 39 @classmethod 30 40 def setUpClass(cls): 31 41 cls.enterClassContext( … … class RemoteUserTest(TestCase): 65 75 self.assertTrue(response.context["user"].is_anonymous) 66 76 self.assertEqual(await User.objects.acount(), num_users) 67 77 68 response = await self.async_client.get("/remote_user/", **{self. header: ""})78 response = await self.async_client.get("/remote_user/", **{self.async_header: ""}) 69 79 self.assertTrue(response.context["user"].is_anonymous) 70 80 self.assertEqual(await User.objects.acount(), num_users) 71 81 … … class RemoteUserTest(TestCase): 142 152 """See test_unknown_user.""" 143 153 num_users = await User.objects.acount() 144 154 response = await self.async_client.get( 145 "/remote_user/", **{self. header: "newuser"}155 "/remote_user/", **{self.async_header: "newuser"} 146 156 ) 147 157 self.assertEqual(response.context["user"].username, "newuser") 148 158 self.assertEqual(await User.objects.acount(), num_users + 1) … … class RemoteUserTest(TestCase): 150 160 151 161 # Another request with same user should not create any new users. 152 162 response = await self.async_client.get( 153 "/remote_user/", **{self. header: "newuser"}163 "/remote_user/", **{self.async_header: "newuser"} 154 164 ) 155 165 self.assertEqual(await User.objects.acount(), num_users + 1) 156 166 … … class RemoteUserTest(TestCase): 176 186 await User.objects.acreate(username="knownuser2") 177 187 num_users = await User.objects.acount() 178 188 response = await self.async_client.get( 179 "/remote_user/", **{self. header: self.known_user}189 "/remote_user/", **{self.async_header: self.known_user} 180 190 ) 181 191 self.assertEqual(response.context["user"].username, "knownuser") 182 192 self.assertEqual(await User.objects.acount(), num_users) 183 193 # A different user passed in the headers causes the new user 184 194 # to be logged in. 185 195 response = await self.async_client.get( 186 "/remote_user/", **{self. header: self.known_user2}196 "/remote_user/", **{self.async_header: self.known_user2} 187 197 ) 188 198 self.assertEqual(response.context["user"].username, "knownuser2") 189 199 self.assertEqual(await User.objects.acount(), num_users) … … class RemoteUserTest(TestCase): 221 231 await user.asave() 222 232 223 233 response = await self.async_client.get( 224 "/remote_user/", **{self. header: self.known_user}234 "/remote_user/", **{self.async_header: self.known_user} 225 235 ) 226 236 self.assertNotEqual(default_login, response.context["user"].last_login) 227 237 … … class RemoteUserTest(TestCase): 229 239 user.last_login = default_login 230 240 await user.asave() 231 241 response = await self.async_client.get( 232 "/remote_user/", **{self. header: self.known_user}242 "/remote_user/", **{self.async_header: self.known_user} 233 243 ) 234 244 self.assertEqual(default_login, response.context["user"].last_login) 235 245 … … class RemoteUserTest(TestCase): 259 269 await User.objects.acreate(username="knownuser") 260 270 # Known user authenticates 261 271 response = await self.async_client.get( 262 "/remote_user/", **{self. header: self.known_user}272 "/remote_user/", **{self.async_header: self.known_user} 263 273 ) 264 274 self.assertEqual(response.context["user"].username, "knownuser") 265 275 # During the session, the REMOTE_USER header disappears. Should trigger … … class RemoteUserTest(TestCase): 295 305 await User.objects.acreate(username="knownuser") 296 306 # Known user authenticates 297 307 response = await self.async_client.get( 298 "/remote_user/", **{self. header: self.known_user}308 "/remote_user/", **{self.async_header: self.known_user} 299 309 ) 300 310 self.assertEqual(response.context["user"].username, "knownuser") 301 311 # During the session, the REMOTE_USER changes to a different user. 302 312 response = await self.async_client.get( 303 "/remote_user/", **{self. header: "newnewuser"}313 "/remote_user/", **{self.async_header: "newnewuser"} 304 314 ) 305 315 # The current user is not the prior remote_user. 306 316 # In backends that create a new user, username is "newnewuser" … … class RemoteUserTest(TestCase): 315 325 async def test_inactive_user_async(self): 316 326 await User.objects.acreate(username="knownuser", is_active=False) 317 327 response = await self.async_client.get( 318 "/remote_user/", **{self. header: "knownuser"}328 "/remote_user/", **{self.async_header: "knownuser"} 319 329 ) 320 330 self.assertTrue(response.context["user"].is_anonymous) 321 331 … … class RemoteUserNoCreateTest(RemoteUserTest): 343 353 async def test_unknown_user_async(self): 344 354 num_users = await User.objects.acount() 345 355 response = await self.async_client.get( 346 "/remote_user/", **{self. header: "newuser"}356 "/remote_user/", **{self.async_header: "newuser"} 347 357 ) 348 358 self.assertTrue(response.context["user"].is_anonymous) 349 359 self.assertEqual(await User.objects.acount(), num_users) … … class AllowAllUsersRemoteUserBackendTest(RemoteUserTest): 362 372 async def test_inactive_user_async(self): 363 373 user = await User.objects.acreate(username="knownuser", is_active=False) 364 374 response = await self.async_client.get( 365 "/remote_user/", **{self. header: self.known_user}375 "/remote_user/", **{self.async_header: self.known_user} 366 376 ) 367 377 self.assertEqual(response.context["user"].username, user.username) 368 378 … … class RemoteUserCustomTest(RemoteUserTest): 436 446 self.assertEqual(newuser.email, "user@example.com") 437 447 438 448 449 """ 450 These tests are now duplicative in the current demonstration. We wouldn't merge 451 it in this state. 452 """ 439 453 class CustomHeaderMiddleware(RemoteUserMiddleware): 440 454 """ 441 455 Middleware that overrides custom HTTP auth user header. … … class CustomHeaderRemoteUserTest(RemoteUserTest): 454 468 header = "HTTP_AUTHUSER" 455 469 456 470 471 class CustomPersistentRemoteUserMiddleware(PersistentRemoteUserMiddleware): 472 header = "HTTP_AUTHUSER" 473 474 457 475 class PersistentRemoteUserTest(RemoteUserTest): 458 476 """ 459 477 PersistentRemoteUserMiddleware keeps the user logged in even if the 460 478 subsequent calls do not contain the header value. 461 479 """ 462 480 463 middleware = " django.contrib.auth.middleware.PersistentRemoteUserMiddleware"481 middleware = "auth_tests.test_remote_user.CustomPersistentRemoteUserMiddleware" 464 482 require_header = False 465 483 466 484 def test_header_disappears(self): … … class PersistentRemoteUserTest(RemoteUserTest): 482 500 await User.objects.acreate(username="knownuser") 483 501 # Known user authenticates 484 502 response = await self.async_client.get( 485 "/remote_user/", **{self. header: self.known_user}503 "/remote_user/", **{self.async_header: self.known_user} 486 504 ) 487 505 self.assertEqual(response.context["user"].username, "knownuser") 488 506 # Should stay logged in if the REMOTE_USER header disappears.
I had to be aware of that in caf90a971f09323775ed0cacf94eadaf39d040e0, where I added a little change in the AsyncRequestFactory to smooth out this difference with respect to underscores. It's possible we could deprecate and undo that going forward, if we want to standardize how headers are provided in tests? But that would be more disruptive than what I'm proposing here.
I'll definitely support wontfix'ing this if there is no value, but something seems off. I wonder if the quirks of AsyncRequestFactory put a little design pressure on the decision to have RemoteUserMiddleware.aprocess_request apply prefixing?
comment:8 by , 3 weeks ago
I didn’t check but if the ASGI flow is broken by default, we should fix that. The idea definitely wasn’t to have code that doesn’t work. 🫠
comment:9 by , 3 weeks ago
The flows work, but if we leave the status quo alone, this documented blurb is not correct for ASGI:
you can subclass
RemoteUserMiddlewareand set theheaderattribute to the desiredrequest.METAkey.
Under ASGI, you are forced to use a header attribute that is unprefixed -- if it is prefixed, it will be double-prefixed in request.META. Our existing test case accepts the double-prefix route:
class CustomHeaderMiddleware(RemoteUserMiddleware): """ Middleware that overrides custom HTTP auth user header. """ header = "HTTP_AUTHUSER"
When those test cases run, the client sends HTTP_AUTHUSER = ..., so request.META gets HTTP_HTTP_AUTHUSER.
That doesn't look right, and I doubt anybody is doing this, so I was interested to discuss a code change. However, a better idea than forcing ASGI projects to subclass the middleware would be to perform the lookup against self.headers, which already supports lookups by unprefixed header names. This is a little more defensible to document, I feel. The only behavior change would be that an ASGI-only project could now set header = "WITH-HYPHENS" and benefit from a more flexible lookup, which is backward compatible.
On the security list, Jake suggested roughly, "If we're doing a deprecation, let's change both WSGI/ASGI paths to use self.headers to have consistency". I think we can punt on whether to do a deprecation here, and just start with the docs fix paired with the more flexible lookup. I'll put up a PR to aid triage.
comment:10 by , 3 weeks ago
| Cc: | added |
|---|---|
| Has patch: | set |
| Keywords: | RemoteUserMiddleware added |
PR to adjust the async path to look up against request.headers without breaking anything.
comment:11 by , 3 weeks ago
| Summary: | Deprecate the prefixing of HTTP_ to header names in RemoteUserMiddleware.aprocess_request() → RemoteUserMiddleware.header does not correspond to request.META in the async path |
|---|
comment:12 by , 3 weeks ago
Thanks for the draft Jacob. I think that looks good. (And covers the ground that we need to by default IMO — Folks needing more can customise as necessary.)
comment:13 by , 3 weeks ago
| Triage Stage: | Unreviewed → Accepted |
|---|
Thank you Jacob, I like where this is going
comment:14 by , 2 weeks ago
Replying to Sarah Boyce:
Thank you for the ticket
From reviewing the PR that implemented this, I would say our approach was to get test parity by async-ifying the sync tests. This was then required for the tests to pass and, therefore, seemed required. I am now under the impression that our logic as to how we wrote the async tests must be off. Have you looked into what changes may be required there? I think that will help identify why we got this wrong
I don't think we can improve our test tooling to prevent something like this in the future. But we do have experience supporting sync/async middlewares with separate tests (like in the PR), and I can say with certainty that it's a bad approach for long-term maintenance 😌. At first it works fine, you just copy the sync tests, swap in the async equivalents, and you're done. But over time it becomes hard to maintain, and having separate tests doesn't actually guarantee that the sync and async paths behave the same way.
RemoteUserMiddleware is the first and only pure async middleware in Django, so we can experiment with testing tooling and approaches here, and then apply what we learn to future pure async middlewares.
Just a thought, if it makes sense 😌
comment:15 by , 2 weeks ago
| Patch needs improvement: | set |
|---|
Thank you for the ticket
From reviewing the PR that implemented this, I would say our approach was to get test parity by async-ifying the sync tests. This was then required for the tests to pass and, therefore, seemed required. I am now under the impression that our logic as to how we wrote the async tests must be off. Have you looked into what changes may be required there? I think that will help identify why we got this wrong
This is a partial duplicate of #36443 and #36300