Opened 11 years ago

Closed 5 years ago

Last modified 5 years ago

#20917 closed New feature (wontfix)

Change the password hashers when testing

Reported by: Marc Tamlyn Owned by: Bruno Alla
Component: Testing framework Version: dev
Severity: Normal Keywords:
Cc: tom@… Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Disclaimer: I'm not completely sure this is a good idea as a default.

The default password hasher is very secure, and very slow to create passwords. This is never an issue in production, but in testing it is *amazingly* slow. Most of the time using the unsalted MD5 hasher as settings.PASSWORD_HASHERS[0] has resulted in a six-fold increase in speed in my test suites. To be honest, I think we could use a "non-hashing" hasher in these cases.

I'd like to change the "default" to insert this new non-hashing hasher as settings.PASSWORD_HASHERS[0] during setup_test_environment(). For anyone who does not know about this trick, their test suits will automatically speed up. Any tests expecting a certain hasher to have been used when creating would fail in a backwardsly incompatible manner. Any fixtures or similar with passwords created using another hasher would still be valid, but would then update to be the raw password on success. Of course, to validate these passwords the password text would need to be included in plain text in the test suite (See #20916 for an alternative solution to this issue).

Other comparable setting changes done in this way include: turning off translations, using the console email backend, removing allowed_hosts checking, turning debug off.

Am I mental, or is this a sensible optimisation?

Change History (17)

comment:1 by Tom Christie, 11 years ago

Cc: tom@… added

comment:2 by Tim Graham, 11 years ago

Triage Stage: UnreviewedAccepted

I think the idea is definitely worth exploring. There's a note in the docs, but it's rather buried: https://docs.djangoproject.com/en/dev/topics/testing/overview/#speeding-up-the-tests

comment:3 by Ash Christopher, 11 years ago

Owner: changed from nobody to Ash Christopher
Status: newassigned

comment:4 by Ash Christopher, 11 years ago

I was thinking of adding a TEST_PASSWORD_HASHER to the settings which by default is set to None. If TEST_PASSWORD_HASHER is None, then we don't override the PASSWORD_HASHER tuple and password hashing works as normal.

This would allow backwards compatibility.

comment:5 by anonymous, 11 years ago

I'm not sure "backwards compatibility" needs to be kept in this instance. Nothing is really broken by the change, unless somebody is manually inspecting the hash. In this case, fast by default may be the best solution. (But maybe keep the MD5 hasher instead of a no-op hasher, just in case somebody is doing something incredibly stupid.)

comment:6 by Ash Christopher, 11 years ago

If people have overridden the PASSWORD_HASHER with their own custom hasher, or if they have updated the PASSWORD_HASHER in say a test_settings.py file (which is how we solved this) then having an opinion on the password hasher in tests would break their tests.

I know that our test suite would break if we defaulted to a specific hasher in tests (though this is basically a data point of 1).

comment:7 by Ash Christopher, 11 years ago

Discussed this issue with julienphalip on irc (#django-sprints) and we both think that maybe this isn't actually something Django should have an opinion on.

[10:40:54] <+julienphalip> ashc: So I'm feeling a bit divided about #20917.
[10:40:54] <ticketbot> https://code.djangoproject.com/ticket/20917
[10:41:07] <ashc> julienphalip: re: #20917, I am not even sure if it is a real code problem
[10:41:08] <ticketbot> https://code.djangoproject.com/ticket/20917
[10:41:21] <ashc> part of me thinks it is a documentation issue
[10:41:32] <+julienphalip> ashc: I think so too
[10:41:34] <ashc> or rather, a "this is how you should do tests" problem
[10:42:06] <ashc> some people just rely on settings.py + local_setting.py and others (like us) have a special testing_settings.py file we use
[10:42:17] <ashc> and that is where we override the PASSWORD_HASHERS
[10:42:30] <+julienphalip> ashc: Yep
[10:42:56] <+julienphalip> I think it's best to let users consciously change the hashers setting for testing
[10:43:28] <ashc> Is there anything we can do to make this more clear for people? Does Django have an opinion on test settings files?
[10:44:44] <+julienphalip> ashc: I'm all for speeding up tests' execution, but if we enforced a weak hasher then I'd like to be sure there can be no weird surprise and no discrepancy between the test and production environments.
[10:45:14] <+julienphalip> ashc: I don't think Django as a project has any particular recommendation. There are many different techniques....
[10:48:08] <ashc> julienphalip: I am thinking things are kind of ok as they are, and maybe we just close the ticket with an explanation that Django shouldn't have an opinion on this matter, and that there is documentation that will allow developers to apply the speed fixes themselves?
[10:49:32] <+julienphalip> ashc: I tend to agree. It'd be great to have a chat with mjtamlyn about this first to see what he thinks.
[10:51:43] <+julienphalip> ashc: This section could be beefed up with more tips as well: https://docs.djangoproject.com/en/dev/topics/testing/overview/#speeding-up-the-tests

comment:8 by Cliff Dyer, 11 years ago

Apologies. Anonymous above (comment:5) is me. Is there a way to check if the hasher has been manually set, and use that if so, but to have it default to something fast if no hasher has been specified? I think a fast test suite is something we want people to be able to use by default. My concern is that requiring TEST_PASSWORD_HASHER to be set to get fast tests doesn't solve the problem, it just moves the workaround from supplying a separate settings file to supplying an extra setting. It's better, but we're still leaving new users with slow tests.

Or, if the solution requires TEST_PASSWORD_HASHER to be set, include it in the default settings file created by makeproject.

comment:9 by Marc Tamlyn, 11 years ago

I don't personally feel that this is any different to using an in memory database by default for SQLite rather than a file. The performance is different to in production (anyway, why are you using SQLite in production?), but this is what you expect from testing. I agree there are a number of ways to override settings, and there's the slight issue of what to do if someone wants to use something different, but in my experience I have *always* had to set this for tests to get any kind of performance. It's not a last-resort method to speed tests up, it's the first thing you should do, and as we continue to increase the iterations it will get worse.

There are a small number of users who may run into this problem, but for 90%+ of users it saves remembering to set it up, or not knowing about it at all, deciding the tests are too slow to run, getting bored and never writing any. It's not a marginal gain - on small examples where tests create users with passwords and log them in to the site I was seeing a 6x speed increase.

So in summary:

  • Most users won't notice
  • We need a mechanism so that those who do (and don't want it) can disable it
  • Will save most projects a few lines of code
  • Will help adoption of testing as it gives a better initial impression

comment:10 by Benoît Bryon, 11 years ago

I just released https://pypi.python.org/pypi/django-plainpasswordhasher, which is a no-op password hasher. Yes, very dangerous in production, but pretty quick for tests.

I have no benchmark actually, but I guess "no encoding" is the fastest encoding implementation, isn't it?

What about using it as default or recommended password hasher when running tests?

comment:11 by Bruno Alla, 5 years ago

I took a stab at implementing this, but @charettes raised the issue that MD5 might not be available on some platform (see #28401).

I like the idea though, which could be implemented by using another basic hasher. I did some test on my machine, and switching to a SHA1-based hasher would lead to very similar performances:

> ipython
Python 3.6.6 (default, Aug 28 2018, 17:15:25) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.5.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import hashlib

In [2]: %timeit hashlib.md5(('kdsglsdk' + 'jdbvkjdsbvbdksbk').encode()).hexdigest()
1.12 µs ± 5.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [3]: %timeit hashlib.sha1(('kdsglsdk' + 'jdbvkjdsbvbdksbk').encode()).hexdigest()
1.15 µs ± 21.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

By the look of the cPython patch, it doesn't look like this algorithm suffers from the same compatibility issues.

comment:12 by Claude Paroz, 5 years ago

What about the suggestion of Benoît Bryon to use a noop password hasher (could be in django/test/utils)?

comment:13 by Simon Charette, 5 years ago

I like Claude's suggestion as long as it lives in django.test.utils or django.contrib.auth.test to clearly denote it should only be used for testing purposes only.

comment:14 by Bruno Alla, 5 years ago

Owner: changed from Ash Christopher to Bruno Alla

comment:15 by Bruno Alla, 5 years ago

Has patch: set

comment:16 by Mariusz Felisiak, 5 years ago

Has patch: unset
Resolution: wontfix
Status: assignedclosed
Triage Stage: AcceptedUnreviewed

I'm not convinced the overhead of a new setting is justified in order to remove one line from tests settings: PASSWORD_HASHERS = ['PlainHasher']. I'll close as wontfix on that basis.

If a patch at level of the test runner turns up — automatically adjusting PASSWORD_HASHERS by default, or such — we'd be happy to access that.

comment:17 by Adam Johnson, 5 years ago

Here's such a patch:

diff --git django/test/utils.py django/test/utils.py
index d1f7d19546..6b3e9648ab 100644
--- django/test/utils.py
+++ django/test/utils.py
@@ -128,6 +128,8 @@ def setup_test_environment(debug=None):
     saved_data.email_backend = settings.EMAIL_BACKEND
     settings.EMAIL_BACKEND = 'django.core.mail.backends.locmem.EmailBackend'
 
+    settings.PASSWORD_HASHERS = ['django.contrib.auth.hashers.MD5PasswordHasher']
+
     saved_data.template_render = Template._render
     Template._render = instrumented_test_render
 

But I'm not sure how I feel about it. It's a coupling of the test runner to a contrib app.

Note: See TracTickets for help on using tickets.
Back to Top