#32778 closed Cleanup/optimization (fixed)
Avoided unnecessary recompilation of token regex in _sanitize_token().
Reported by: | Abhyudai | Owned by: | Abhyudai |
---|---|---|---|
Component: | CSRF | Version: | 3.2 |
Severity: | Normal | Keywords: | middleware, csrf |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
I was looking into the source code of the middleware for some reason, and saw that the regular expression is compiled inside the module. I think compiling it a module level could potentially save some time as the function _sanitize_token
is called twice inside the function process_view
for the CsrfMiddleware
class.
This is the intended patch.
diff --git a/django/middleware/csrf.py b/django/middleware/csrf.py index f323ffb..deaf7d8 100644 --- a/django/middleware/csrf.py +++ b/django/middleware/csrf.py @@ -22,6 +22,8 @@ from django.utils.log import log_response logger = logging.getLogger('django.security.csrf') +ASCII_ALPHANUMERIC_RE = re.compile('[^a-zA-Z0-9]') + REASON_BAD_ORIGIN = "Origin checking failed - %s does not match any trusted origins." REASON_NO_REFERER = "Referer checking failed - no Referer." REASON_BAD_REFERER = "Referer checking failed - %s does not match any trusted origins." @@ -107,7 +109,7 @@ def rotate_token(request): def _sanitize_token(token): # Allow only ASCII alphanumerics - if re.search('[^a-zA-Z0-9]', token): + if ASCII_ALPHANUMERIC_RE.search(token): return _get_new_csrf_token() elif len(token) == CSRF_TOKEN_LENGTH: return token
I'm not sure how exactly to profile this change. I tried using the djangobench package after some tinkering to its source code. Since it was reporting changes even on queries, I wasn't sure to trust it. Any leads on this front would be great.
I would be happy to make the change, if this seems reasonable.
Change History (9)
comment:1 by , 3 years ago
comment:2 by , 3 years ago
Summary: | Compile alphanumeric regex for CSRF middleware at module level. → Avoided unnecessary recompilation of token regex in _sanitize_token(). |
---|---|
Triage Stage: | Unreviewed → Accepted |
Thanks, sounds good. Would you like to prepare patch? Please use _lazy_re_compile()
to avoid compilation when importing the module.
comment:3 by , 3 years ago
Easy pickings: | set |
---|
comment:4 by , 3 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 3 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:8 by , 3 years ago
I added a follow-up PR here that renames token_re
: https://github.com/django/django/pull/14461
For what's it worth, the results for the
default_middleware
section when run on my machine when comparing the patch branch to the main branch were:Although, I'm not sure if this just tests the changes for the
Csrf
class since the default middleware includes a lot more things.