Opened 13 years ago

Closed 12 years ago

#15367 closed New feature (fixed)

Improved Auth Password Hashing

Reported by: Paul Oswald Owned by: Paul McMillan
Component: contrib.auth Version:
Severity: Release blocker Keywords: password, hash, hashing, bcrypt, scrypt, pbkdf2, sha2, sha1
Cc: jtunney@…, Paul Oswald, Donald Stufft Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

As discussed on the django-developers mailing list, we aim to improve the default password hashing algorithm Django uses. This ticket will track the changes needed to upgrade the default algorithm to one more resistant to brute force attacks. This may include improvements ease allowing the developer to change the encryption used to a third party library wether that is by configuration or by decoupling the authentication code from the User model object.

If you would like to help, please be familiar with the summary and decisions made in the following discussion at least starting from this message:

http://groups.google.com/group/django-developers/browse_thread/thread/928ed5db00d5b1d8/919ce4798d30656a?#919ce4798d30656a

Also be aware of the past issues surrounding this issue, specifically ticket:13969 which currently has existing code:

ticket:3316 (Adding `crypt' to list of password hashes for legacy apps. - closed: fixed)

ticket:5600 (Patch to enhance cryptography on django.contrib.auth - closed: wontfix)

ticket:5787 (BCrypt password hashing support in Django - closed: duplicate)

ticket:6028 (add compatibility with glibc2 MD5-based crypt passwords - new )

ticket:9101 (Improved salt generation for django.contrib.auth - closed: wontfix)

ticket:9194 (Allow additional hashing algorithms for passwords - closed: duplicate)

ticket:13969 (auth module should use longer salt for hashing - new)

The plan at this point is to follow this path:

  • Django ships with PBKDF2 by default. This depends on SHA2 which should be python 2.5 compatible (due to hashlib being added in python 2.5) and PBKDF2 is short and simple enough that it could be included into the project. This satisfies NIST/US Gov requirements.
  • SHA1 is maintained for backwards compatibility
  • Salt size increased
  • Configurable settings for the number of hashing rounds to future-proof for faster hardware

We can also Investigate simplified ways of allowing developers to upgrade the hashing library based on their requirements, however it is more important to get the default improved and that is where we should focus.

Attachments (5)

django-passhash-2011-09-10.diff (27.0 KB ) - added by Justine Tunney 13 years ago.
django-passhash-2011-09-12.patch (33.3 KB ) - added by Justine Tunney 13 years ago.
django-passhash-2011-09-12.diff (32.8 KB ) - added by Justine Tunney 13 years ago.
django-passhash-2011-12-3.diff (32.9 KB ) - added by Paul McMillan 12 years ago.
15367.5.diff (32.9 KB ) - added by Jannis Leidel 12 years ago.
Minor nitpicks, cleanups and other improvments. More details can be found in the commits at https://github.com/jezdez/django/tree/feature/auth-hashing

Download all attachments as: .zip

Change History (32)

comment:1 by Russell Keith-Magee, 13 years ago

Triage Stage: UnreviewedDesign decision needed

Marking DDN -- there's something worth pursuing here, but there are still details to sort out.

comment:2 by Łukasz Rekucki, 13 years ago

Severity: Normal
Type: New feature

comment:3 by anonymous, 13 years ago

Easy pickings: unset
UI/UX: unset

cryptacular attempts to address the problem of supporting multiple hash functions. The delegating password manager accepts an array of supported hash functions, each hashing scheme implements a 'match()' method (does the given hash look like that hash function's hash?), and the 'check' method accepts a setter to upgrade to the most-preferred hash when someone logs in with an older hash. cryptacular includes PBKDF2 and bcrypt. https://bitbucket.org/dholth/cryptacular/src/tip/cryptacular/core/__init__.py

comment:4 by Justine Tunney, 13 years ago

Cc: jtunney@… added
Owner: changed from nobody to Justine Tunney

I'm going to start coding this in my github fork: https://github.com/jart/django/tree/auth-hashing

Here's a synopsis of the plan so far that a few of us came up with at the sprints:

  • PBKDF2 will be the new default algorithm. It'll try to compile/install a C module or fall-back to pure python. This will not introduce a dependency
  • bcrypt will also be supported through the py-bcrypt library. It can't be the default because there's no pure-python implementation
  • An plugable password hashing backend API will be provided
  • A setting will be added to specify module path of chosen algorithm, like: PBKDF2, bcrypt and SHA1
  • We're still looking into ways to mitigate the possibility of DDOS attacks due to high CPU usage of these algorithms
  • Rewriting of password hashes upon user login if needed
  • Magic for people to painlessly migrate from python-bcrypt

Any help from the community would be very much welcome :)

comment:5 by Paul McMillan, 13 years ago

Triage Stage: Design decision neededAccepted

This plan corresponds with what we worked out here at the sprints with Russell KM, Isaac Kelly, and myself. I'll also be writing a post to the django-dev list about this.

Last edited 13 years ago by Paul McMillan (previous) (diff)

by Justine Tunney, 13 years ago

comment:6 by Justine Tunney, 13 years ago

Has patch: set
Needs documentation: set
Needs tests: set

comment:7 by Jannis Leidel, 13 years ago

Patch needs improvement: set
Version: 1.2

Hm, in the patch, why is this being put in django.utils instead of staying in django.contrib.auth? I'm asking cause I've specifically created django.contrib.auth.utils a couple of weeks ago in trunk. What about having this in django.contrib.auth.hashers

Also, mind elaborating how the PASSWORD_HASHERS setting is supposed to be used? It seems a bit cumbersome to specify the setting if it can't be used like the DATABASES or CACHES setting. From what I understand is you only want to be able to specify the default hasher (atm the first item in the tuple, yuck) and some options.

Switching the password hasher or changing an option of a password hasher isn't a common use case and therefor shouldn't be exposed like that as a multi-value setting. Instead subclassing and setting a single setting (named AUTH_PASSWORD_HASHER) ought to be enough, e.g.::

AUTH_PASSWORD_HASHER  = 'mysite.security.MyCustomBCryptPasswordHasher'

Also, please don't use backslashes but brackets for line continuation.

Last edited 13 years ago by Jannis Leidel (previous) (diff)

by Justine Tunney, 13 years ago

by Justine Tunney, 13 years ago

comment:8 by Justine Tunney, 13 years ago

Sorry I didn't notice the incorrect references to the old utils file; I forgot to delete the pyc file :\

I uploaded a new patch with numerous changes including that backslash thing. The change log is here: https://github.com/jart/django/commits/auth-hashing

You might be right that people should overload classes to change the settings for a hasher so I included that in the new patch. We must however keep the list because if people use a third party password hasher, they need a way to migrate back to the standard ones. The only way to avoid this is to not allow people to write their own custom hashers.

I honestly think people should be discouraged from using the setting at all. We put a lot of time and effort into making sure the default behavior highly secure, stable, fast and portable without compromise. We still need more tests, review and research, but I believe this PBKDF2 implementation will end up being an excellent choice for everybody.

in reply to:  8 comment:9 by Jannis Leidel, 13 years ago

Replying to jart:

Sorry I didn't notice the incorrect references to the old utils file; I forgot to delete the pyc file :\

Hm, you seem to have put imports there now, please move the content of the passhash.py file which was previously in the utils.py file back there and the hasher classes to a file called hashers.py in the auth contrib app's. The utils.py *is not deprecated* as you claim. The reason is simple: there isn't any other password hashing happening outside the auth contrib app, so it should live there, too.

I uploaded a new patch with numerous changes including that backslash thing. The change log is here: https://github.com/jart/django/commits/auth-hashing

You might be right that people should overload classes to change the settings for a hasher so I included that in the new patch. We must however keep the list because if people use a third party password hasher, they need a way to migrate back to the standard ones. The only way to avoid this is to not allow people to write their own custom hashers.

You misunderstood me, the setting would *not* be a list of hashers, but just the module path of the activated hasher.

By default the global_settings.py should contain::

AUTH_PASSWORD_HASHER  = 'django.contrib.auth.hashers.PBKDF2PasswordHasher'

and users can override that in their own settings.py accordingly (in case they have any options to modify, which is unlikely):

AUTH_PASSWORD_HASHER  = 'mysite.security.MyCustomBCryptPasswordHasher'

I honestly think people should be discouraged from using the setting at all. We put a lot of time and effort into making sure the default behavior highly secure, stable, fast and portable without compromise. We still need more tests, review and research, but I believe this PBKDF2 implementation will end up being an excellent choice for everybody.

Well, even if it's not a common case, those users that need to switch the password algorithm (probably due to backwards compatibility or security requirements) need some kind of documentation of how to do it. That includes documenting the various hasher classes in that Django ships in the django.contrib.auth.hashers.

comment:10 by Jannis Leidel, 13 years ago

After talking to Carl on IRC, I'm sorry to say that I misunderstood the reason why you used a list for the setting (to enforce an order of precedence).

So the global_settings.py would indeed have this:

AUTH_PASSWORD_HASHERS = (
    'django.utils.passhash.PBKDF2PasswordHasher',
    'django.utils.passhash.BCryptPasswordHasher',
    'django.utils.passhash.SHA1PasswordHasher',
    'django.utils.passhash.MD5PasswordHasher',
    'django.utils.passhash.CryptPasswordHasher',
)

and users could add their own implementation if needed:

AUTH_PASSWORD_HASHERS = (
    'mysite.security.MyCustomBCryptPasswordHasher',
    'django.utils.passhash.PBKDF2PasswordHasher',
    # etc
)

That said, I'm still convinced this should be part of the auth app and not of django.utils.

Last edited 13 years ago by Jannis Leidel (previous) (diff)

comment:11 by Carl Meyer, 13 years ago

Yeah, we're still discussing the location question :-)

I agree with jezdez that the PASSWORD_HASHERS setting, and the built-in hasher classes, do need to be documented, even if we don't expect most people to need to use them.

comment:12 by Justine Tunney, 13 years ago

jezdez, could we please talk about this on IRC for a moment? I'm jart on freenode

comment:13 by Jacob, 13 years ago

#8647 was a duplicate and has some code as well.

comment:14 by Carl Meyer, 13 years ago

So the question of where the hashing code goes is a minor issue compared to the bulk of the work here (which is awesome, btw), but since there is an unresolved disagreement, I'll briefly summarize for posterity's sake the points on either side from the conversation in IRC (jezdez, please correct if I mis-represent anything):

In favor of putting it in django.utils.passhash, as the current patch does (or django.core.passhash?):

  • Contrib.auth is notorious for not being as flexible as people want (this was raised numerous times at DjangoCon), and some people build their own alternative auth solutions. It would be good for these people to use a standard, common, well-tested-and-reviewed password hashing implementation rather than rolling their own. Putting it in core encourages people to perceive it as a low-level standard utility rather than a part of contrib.auth, and doesn't require them to add a dependency to a contrib app they don't otherwise use.
  • Contrib.auth is likely to get a major overhaul soonish; it'd be good for the password hashing implementation to not be subject to any upheavals or namespace deprecations or whatnot that might be involved in that.

In favor of putting it in contrib.auth.utils:

  • Nothing else in Django besides contrib.auth uses it, and password-hashing isn't really useful outside of auth/auth, so it may as well be part of contrib.auth, which is our only built-in auth/auth solution.
  • django.utils is kind of a miscellaneous dumping ground for stuff, and not everything in there is currently well-maintained (this was asserted, anyway; I haven't checked myself to look for examples).
  • A django.contrib.auth.utils module was recently added, and is now the home of the current password-hashing code.

(In case it isn't clear, I favor putting it in core rather than contrib.)

comment:15 by Paul McMillan, 13 years ago

I think the passhash stuff belongs in contrib.auth. It's specific enough to our use case that it would almost certainly require tweaking to work elsewhere. The upgrade mechanism, the specific string format - all of these are implementation details related to auth. It's a good system, but there's not a huge amount of stuff in it that reflects tricky (security-wise) implementation details.

The PBKDF2 stuff belongs in utils.crypto.

For better or worse the majority of the hashing code[1] IS crypto code. It's defined in PKCS#5v2, and the implementation is designed primarily for deriving a key suitable for use in encryption. Our verification use case is an expected application of the spec, but is only a small part of what the code is intended to do.

I'm not currently in favor of adding "encryption" type features to Django, but if we ever did, this code would be relevant for that use case. It would also be relevant for other signing-type operations, if we decide to extend the current functionality. Even if Django itself does not ever make use of this code for any purpose other than hashing user passwords, I would like it to be available to developers as an explicit crypto function, rather than requiring them to repurpose code that lives as part of auth.

[1] Technically, this is key stretching, the "hashing" part is just the sha256 function.

A more readable version of the relevant bits of spec can be found here: http://www.ietf.org/rfc/rfc2898.txt

comment:16 by Carl Meyer, 13 years ago

That split works for me. I was under the impression that the hasher classes themselves would be useful in another auth implementation, since they provide a uniform API to a variety of hashing strategies (and that the string format we were using for hashes was a sort of semi-standard), but anything that wouldn't likely be used by others without tweaking makes more sense to keep in contrib.auth.

comment:17 by Jannis Leidel, 13 years ago

Yeah, that seems like a smart way to put it.

  • the hashing functions ought to be part of the django.utils package since they fit the goal of django.utils of providing low-level implementations
  • the specific implementation (the "hashers") ought to live in the auth app and use the low-level pieces by importing from django.utils and provide a high-level API for users to implement their own hasher

comment:18 by Jacob, 12 years ago

milestone: 1.4

Milestone 1.4 deleted

comment:19 by Paul McMillan, 12 years ago

Severity: NormalRelease blocker

Marking this as a release blocker so it doesn't get overlooked for 1.4. The patch is just about ready, aside from the minor location stuff discussed above.

comment:20 by Justine Tunney, 12 years ago

Would anyone mind picking up this ticket? I've been heavily involved in organizing Occupy Wall Street for the past month and my time is stretched pretty thin :(

Also yes, OccupyWallSt.org runs Django :)
https://github.com/jart/occupywallst

comment:21 by Justine Tunney, 12 years ago

Also if you absolutely need me to make this into 1.4 on time, I'll make time in my schedule :)

comment:22 by Paul McMillan, 12 years ago

Owner: changed from Justine Tunney to Paul McMillan

No problem. Assigning this ticket to myself, though I would appreciate code review from others.

comment:23 by Justine Tunney, 12 years ago

I'll be sure to do that <3

comment:24 by Paul Oswald, 12 years ago

Cc: Paul Oswald added

There has been some focus on closing down 1.4 and releasing an alpha. As a user, I absolutely want this code to go into that release. Is this ready for testing? If so, I'm willing to apply it and give it a shot on upgrading some data dumps of my projects. I note one thing I think we are missing is a description of how to use this and information about backward compatibility for the release notes.

comment:25 by Donald Stufft, 12 years ago

Cc: Donald Stufft added

by Paul McMillan, 12 years ago

comment:26 by Paul McMillan, 12 years ago

Needs tests: unset

I've attached my current version of the patch, merged up to trunk. There are a few things left to do, the most prominent being to improve the documentation. Overall this code is pretty close to what will be committed. Comments are encouraged, and I'd love it if people could give it a try. It can also be found on github.

Last edited 12 years ago by Paul McMillan (previous) (diff)

by Jannis Leidel, 12 years ago

Attachment: 15367.5.diff added

Minor nitpicks, cleanups and other improvments. More details can be found in the commits at https://github.com/jezdez/django/tree/feature/auth-hashing

comment:27 by Aymeric Augustin, 12 years ago

Resolution: fixed
Status: newclosed

This was committed at r17253 and r17254.

Note: See TracTickets for help on using tickets.
Back to Top