#32778 closed Cleanup/optimization (fixed)
Avoided unnecessary recompilation of token regex in _sanitize_token().
| Reported by: | Abhyudai | Owned by: | Abhyudai |
|---|---|---|---|
| Component: | CSRF | Version: | 3.2 |
| Severity: | Normal | Keywords: | middleware, csrf |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | yes | UI/UX: | no |
Description
I was looking into the source code of the middleware for some reason, and saw that the regular expression is compiled inside the module. I think compiling it a module level could potentially save some time as the function _sanitize_token is called twice inside the function process_view for the CsrfMiddleware class.
This is the intended patch.
diff --git a/django/middleware/csrf.py b/django/middleware/csrf.py
index f323ffb..deaf7d8 100644
--- a/django/middleware/csrf.py
+++ b/django/middleware/csrf.py
@@ -22,6 +22,8 @@ from django.utils.log import log_response
logger = logging.getLogger('django.security.csrf')
+ASCII_ALPHANUMERIC_RE = re.compile('[^a-zA-Z0-9]')
+
REASON_BAD_ORIGIN = "Origin checking failed - %s does not match any trusted origins."
REASON_NO_REFERER = "Referer checking failed - no Referer."
REASON_BAD_REFERER = "Referer checking failed - %s does not match any trusted origins."
@@ -107,7 +109,7 @@ def rotate_token(request):
def _sanitize_token(token):
# Allow only ASCII alphanumerics
- if re.search('[^a-zA-Z0-9]', token):
+ if ASCII_ALPHANUMERIC_RE.search(token):
return _get_new_csrf_token()
elif len(token) == CSRF_TOKEN_LENGTH:
return token
I'm not sure how exactly to profile this change. I tried using the djangobench package after some tinkering to its source code. Since it was reporting changes even on queries, I wasn't sure to trust it. Any leads on this front would be great.
I would be happy to make the change, if this seems reasonable.
Change History (9)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
| Summary: | Compile alphanumeric regex for CSRF middleware at module level. → Avoided unnecessary recompilation of token regex in _sanitize_token(). |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
Thanks, sounds good. Would you like to prepare patch? Please use _lazy_re_compile() to avoid compilation when importing the module.
comment:3 by , 4 years ago
| Easy pickings: | set |
|---|
comment:4 by , 4 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:6 by , 4 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
comment:8 by , 4 years ago
I added a follow-up PR here that renames token_re: https://github.com/django/django/pull/14461
For what's it worth, the results for the
default_middlewaresection when run on my machine when comparing the patch branch to the main branch were:Although, I'm not sure if this just tests the changes for the
Csrfclass since the default middleware includes a lot more things.