Version 10 (modified by russellm, 2 years ago)

Added some extra points based on discussions in IRC

Improving contrib.auth

Django's auth application has been largely unchanged since the initial release of Django. Unfortunately, the User model enforces a number of design decisions that have proven problematic over time. These include (but are not limited to):

  • Username is required, and limited to 30 characters.
  • Email has a 75 character limit.
  • Email isn't unique or required.
  • There is a "first name" and "last name" field, which is a western-centric organisation of naming and doesn't work well in other cultures.
  • There's no ability to add additional fields to the base user model.

The last point requires some elaboration. The overwhelming stated use case for adding additional fields is to avoid the need for a UserProfile-style 1-1 model on aesthetic or performance grounds. The Django core team has historically counselled users away from this technique.

However, there is also a use case for adding non-email, non-username identification marks to the base User object e.g., an authentication identifier from another login system. This requires additional fields, and needing to perform a join in order to perform the core function of a user object is wasteful.

These use cases have driven a long-lived request (#3011) to "fix" auth.User, either by removing the base limitations of the model, or making the User model user-configurable in some way.

Some django-developer discussions that cover this topic (there have been many, over many years):

A large number of solutions have also been proposed over the years. Here is a summary of the most viable candidates:

Solution 1: Superminimal update

Don't make any significant contribution to contrib.auth -- just make 2 changes to the User model, phased in over multiple release cycles.

  1. Increase the length of the username field to 254 characters, so it can hold an email if necessary
  2. Make the email field 254 characters long.

Implementation

  • Introduce a new USE_NEW_USER_SETTINGS setting. This would be set as False by default in global_settings.py, and True by default in the project template. This means existing projects get a value of False, but can opt-in to True at their convenience; new projects get a value of True.
  • Modify the existing User model to use this setting to determine the length/uniqueness of email and username fields
  • Introduce a "MigrationWarning" that will be raised if USE_NEW_USER_SETTINGS is False. Assuming this is introduced in 1.5, Django 1.5 users would get the warning if they have USE_NEW_USER_SETTINGS set to False. Django 1.6 would raise an error if USE_NEW_USER_SETTINGS is False. Django 1.7 would remove all references to the setting.
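The setting-driven behaviour in the steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not Django code: the `Settings` class stands in for Django's settings object, and `user_field_lengths()` and the `MigrationWarning` class are hypothetical names. The limits (30, 75 and 254) are the ones quoted in this document.

```python
import warnings


class MigrationWarning(UserWarning):
    """Hypothetical warning category named in the proposal."""


class Settings:
    """Stand-in for the project settings object (assumption for this sketch)."""

    def __init__(self, use_new_user_settings=False):
        self.USE_NEW_USER_SETTINGS = use_new_user_settings


def user_field_lengths(settings):
    """Return (username_max_length, email_max_length) for the User model."""
    if settings.USE_NEW_USER_SETTINGS:
        return 254, 254
    # Legacy projects keep the old limits but are told to migrate.
    warnings.warn(
        "USE_NEW_USER_SETTINGS is False; migrate the auth_user table "
        "and set it to True.",
        MigrationWarning,
        stacklevel=2,
    )
    return 30, 75
```

In this sketch a new project (setting True) gets the longer fields silently, while an existing project keeps its current schema and sees the warning whenever the field lengths are computed.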

Optionally:

  • We could introduce 2 settings -- one for username length, and one for email length, to separate the two issues (and allow for people who want to introduce the email length fix, but not the username length fix).

Advantages

  • Very little code to write -- not much more than a dozen lines.
  • Solves the most common request for auth.User -- making it easy to make email addresses the username.

Problems

  • Doesn't actually fix the problem for any use case other than "email address as username"
  • Introduces a setting that immediately becomes deprecated (since it won't be needed once the migration cycle is complete)
  • Doesn't address the problem with any other usage of EmailField having a max_length of 75.
  • Introduces a circular dependency between settings and models. When settings are loaded, INSTALLED_APPS is inspected, and each models file is loaded. If a models file contains a reference to settings, hilarity can ensue. This isn't a problem *most* of the time, but it can lead to some interesting side effects.

Solution 1a: Superminimal with forced migration

As for Solution 1, but don't have the setting -- require the database migration as part of the 1.5 upgrade path.

Advantages

As for Solution 1, plus:

  • Guarantees that every Django 1.5 user has the same user model

Problems

  • Is a huge backwards incompatibility: requires that every Django user upgrading read, understand, and act upon the instructions in the release notes.
  • Poor failure modes. If users fail to read and act on the instructions in the release notes, their projects won't fail until the first user enters an email that is longer than 75 characters, or a username longer than 30 characters, at which point DatabaseErrors will be raised.

Solution 2: AUTH_USER_MODEL setting

Allow users to specify a User model via a setting. This is essentially the original #3011 proposal.

Implementation

Introduce an AUTH_USER_MODEL setting; instead of assuming that auth.User is the user model, the model found at contrib.auth.models.User is determined at runtime by reading settings. This can be facilitated through the use of a 'get_user_model()' or similar helper function.
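A helper of this kind might look roughly like the following plain-Python sketch. It is not Django code: `MODEL_REGISTRY`, the two model classes and the "accounts.EmailUser" label are hypothetical, and falling back to "auth.User" when the setting is absent is an assumption made for the example.

```python
from types import SimpleNamespace


class User:
    """Stand-in for the default auth.User model."""


class EmailUser:
    """Stand-in for a project-supplied replacement model."""


# Hypothetical registry mapping "app_label.ModelName" to model classes.
MODEL_REGISTRY = {
    "auth.User": User,
    "accounts.EmailUser": EmailUser,
}


def get_user_model(settings):
    """Resolve the model named by settings.AUTH_USER_MODEL."""
    name = getattr(settings, "AUTH_USER_MODEL", "auth.User")
    try:
        return MODEL_REGISTRY[name]
    except KeyError:
        raise LookupError(
            "AUTH_USER_MODEL refers to unknown model %r" % name
        ) from None


# A project that swaps in its own model:
settings = SimpleNamespace(AUTH_USER_MODEL="accounts.EmailUser")
UserModel = get_user_model(settings)
```

Code that calls the helper rather than importing the class directly picks up whichever model the project has configured.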

There are several existing implementations of this approach:

Advantages

  • Allows any user model, providing it adheres to some basic contract (defined by Django, probably using contrib.admin as a baseline use case)
  • Existing apps require no migration -- references to auth.User continue to work as-is.
  • Mirrors current usage in contrib.comments

Optionally:

  • Introduce an AbstractUser base class that implements the core of the User contract, and encourage people developing User models to extend this base class.
  • Modify auth.User to extend the new AbstractUser. Care must be taken to ensure that the database expression of the new User model is identical to the old User model, to ensure backwards compatibility.
  • Introduce other concrete User models implementing common login patterns (e.g., login by email)

In order to support contrib.admin, there will be a need to include some permissions API in the User model -- this is required by the admin app, but doesn't necessarily have to be part of the AbstractUser. Care should be taken to ensure that the AbstractUser really does represent the minimal requirements for user authentication, and that authorisation concerns are kept separate.

All three points here are optional. We can introduce an AbstractUser without changing the base User. We also don't have to introduce other concrete models -- we could leave this up to the community at large to develop an ecosystem of User models.

Problems

  • Has the same settings-models circular dependency problem as Solution 1.
  • Doesn't address the EmailField length problem for existing users. We could address this by having a User model (reflecting current field lengths) and a new SimpleUser (that reflects better defaults); then use global_settings and project template settings to define which User is the default for new vs existing projects.
  • Doesn't solve the analogous problem for any other project. E.g., contrib.comments already has pluggable Comments models, and has invented a bespoke solution. Other projects will have similar needs; this solution doesn't address the duplication of code.
  • Has unpredictable failure modes if a third-party app assumes that User has a certain attribute or property which the project-provided User model doesn't support (or supports in a way different to the core auth.User model).
  • Prone to unpredictable problems if AUTH_USER_MODEL is modified after the initial syncdb (i.e., someone changes the settings to change the User model, but doesn't document/alert anyone that a migration will be required). This might be able to be managed by introducing a management table to the database to track the syncdb-time value for the AUTH_USER_MODEL setting, and raising a validation error if the current setting value doesn't match the value in the table.
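The management-table idea in the last point could work along these lines. This is a rough sketch using the standard-library sqlite3 module rather than Django's ORM; the table name `auth_user_model_record`, the `check_user_model()` function and the `UserModelMismatch` exception are all hypothetical.

```python
import sqlite3


class UserModelMismatch(Exception):
    """Raised when the setting no longer matches the syncdb-time value."""


def check_user_model(conn, current_value):
    """Record the setting on first run; validate it on later runs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS auth_user_model_record ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), model_name TEXT NOT NULL)"
    )
    row = conn.execute(
        "SELECT model_name FROM auth_user_model_record WHERE id = 1"
    ).fetchone()
    if row is None:
        # First syncdb: remember which model the schema was built for.
        conn.execute(
            "INSERT INTO auth_user_model_record (id, model_name) VALUES (1, ?)",
            (current_value,),
        )
        conn.commit()
        return
    if row[0] != current_value:
        raise UserModelMismatch(
            "Database was created with AUTH_USER_MODEL=%r but settings "
            "now say %r; a migration is required." % (row[0], current_value)
        )
```

Running the check at validation time would turn a silent schema mismatch into an explicit error that names both values.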

Solution 2a: USER_MODEL setting

Similar to solution 2. Specify a User model via a setting, but don't mount it at django.contrib.auth.User.

Implementation

Introduce a USER_MODEL setting (not necessarily related to AUTH at all) that defaults to "auth.User".

There is a branch of django trunk that implements the basic idea:

Advantages

  • Allows any user model, potentially independent of contrib.auth entirely.
  • Existing projects require no migration if USER_MODEL isn't modified.

Optionally:

  • Split off as much as possible of auth.User into orthogonal mixins that can be reused.
  • Modify auth.User to inherit these mixins. Care must be taken to ensure that the database expression of the new User model is identical to the old User model, to ensure backwards compatibility.
  • Unrelated and third-party apps can indicate that they depend on various orthogonal mixins. For example, contrib.admin can specify that it works with auth.User out of the box, and with any model implementing PermissionsMixin if you supply your own login forms.

Exposing pieces of auth.User as mixins is optional, and potentially advantageous for any solution that allows you to define your own user models, such as solution 2.
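To make the mixin idea concrete, here is a small plain-Python sketch. Only the name PermissionsMixin comes from the text above; `IdentityMixin`, the two user classes, `check_app_requirements()` and the simplified `has_perm()` rule are hypothetical and exist only to show how an app could declare which mixins it depends on.

```python
class PermissionsMixin:
    """Hypothetical mixin carrying the permissions API the admin needs."""

    is_superuser = False

    def has_perm(self, perm):
        # Simplified rule for the sketch: only superusers hold permissions.
        return self.is_superuser


class IdentityMixin:
    """Hypothetical mixin for the display-name part of the contract."""

    display_name = ""

    def get_display_name(self):
        return self.display_name


class KioskUser(IdentityMixin):
    """A user model that opts out of permissions entirely."""


class StaffUser(IdentityMixin, PermissionsMixin):
    """A user model that the admin could work with."""


def check_app_requirements(user_model, required_mixins):
    """Return the names of required mixins the user model does not inherit."""
    return [m.__name__ for m in required_mixins if not issubclass(user_model, m)]


# An app such as the admin would declare its dependency once.
ADMIN_REQUIRES = (PermissionsMixin,)
```

With a declaration like `ADMIN_REQUIRES`, an incompatible user model can be reported at startup instead of failing on the first attribute lookup.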

Problems

  • Doesn't address the EmailField length problem for existing users. We could address this by having a User model (reflecting current field lengths) and a new SimpleUser (that reflects better defaults); then use global_settings and project template settings to define which User is the default for new vs existing projects.
  • Doesn't solve the analogous problem for any other project.
  • Existing apps need to be updated to reflect the fact that auth.User may not be the User model. All instances of ForeignKey(User) need to be updated to ForeignKey(settings.USER_MODEL). Failure modes will be unpredictable, as auth.User will still exist as a table, but won't contain any User information.
  • Still has the settings-models circular dependency problem
  • As with Solution 2, prone to unpredictable problems if USER_MODEL is modified after the initial syncdb.

Solution 3: Leverage App Refactor

Use Arthur Koziel's App Refactor patch from GSoC 2010 as a way to define a configurable auth app.

Implementation

  • Land the App Refactor patch. This introduces a number of benefits -- reliable hooks for app startup, configurable app labels, predictable module loading, amongst others -- but the one that matters for the purposes of auth.User is that it allows Apps to be treated as items that need to be configured as a runtime activity. In this case, we need to be able to specify, at a project level, which model is your "User" model in the auth app.
  • Introduce the concept of a LazyForeignKey. LazyForeignKey is a normal foreign key, with all the usual foreign key behaviors; the only difference is that the model it links to isn't specified in the model -- it's a configuration item drawn from an application configuration. So, ForeignKey('auth.User') creates a foreign key to django.contrib.auth.User; LazyForeignKey('auth.User') asks the auth app for the model that is being used as the 'User' model, and creates a foreign key to that. This can be done by slotting into the existing model reference resolution code, which is something that the app refactor cleans up.
  • Add a Meta option to models -- pluggable -- which controls whether the model can be replaced at runtime. If it can be, then the model may not be synchronised (so we don't get empty auth_user tables).
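The difference between the two kinds of foreign key can be illustrated with a toy resolver. This is plain Python and not the App Refactor API: `SHIPPED_MODELS`, `APP_CONFIG`, the `resolve()` method and the `EmailUser` class are hypothetical stand-ins for the real model-reference resolution code.

```python
class User:
    """Stand-in for the model shipped by the auth app."""


class EmailUser:
    """Stand-in for a model the project configures as its 'User'."""


# Models as shipped by their apps.
SHIPPED_MODELS = {"auth.User": User}

# Hypothetical project-level app configuration: which model fills the
# 'User' role in the auth app.
APP_CONFIG = {"auth": {"User": EmailUser}}


class ForeignKey:
    def __init__(self, to):
        self.to = to

    def resolve(self):
        # A plain foreign key always targets the shipped model.
        return SHIPPED_MODELS[self.to]


class LazyForeignKey(ForeignKey):
    def resolve(self):
        # Ask the app's configuration first; fall back to the shipped model.
        app_label, role = self.to.split(".")
        configured = APP_CONFIG.get(app_label, {})
        if role in configured:
            return configured[role]
        return SHIPPED_MODELS[self.to]
```

Both keys are declared with the same 'auth.User' reference; only the lazy variant follows the project's configuration.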

Optionally:

  • Introduce an AbstractUser, and other concrete User models, same as for Solution 2

Advantages

  • Doesn't have the circular dependency between settings and User
  • Solves the generic problem, not a specific auth problem. contrib.comments could be retrofitted to use this approach, and any other application could do the same.

Problems

  • No transparent update path -- requires that third party apps be updated to be "pluggable auth compatible". This means app authors need to convert all ForeignKey(User) into LazyForeignKey('auth.User'), and modify any usage of forms etc. This could be considered a benefit, however; migrating User models is a nontrivial step, and it should probably involve some opt-in engineering.
  • Doesn't address the immediate problem for EmailField. We could do the same User/SimpleUser conversion here; with the added benefit that we are also introducing App Refactor, so we can use the distinction between an "unconfigured" auth app and a "Django 1.5 App Refactor Configured" auth app as the point for identifying whether User or SimpleUser is in use.
  • As with Solution 2, prone to unpredictable problems if the app configuration is modified after the initial syncdb.

Solution 3a: Transparent LazyForeignKeys

As for Solution 3, but don't include an explicit LazyForeignKey class. Instead, use the "pluggable" marker in the Meta class; if you define a ForeignKey to a 'pluggable' model, you're indicating that this foreign key reference might be changed later on.

Advantages

  • As for Solution 3, but provides a transparent migration path for apps -- no need to manually change ForeignKey(User). However, depending on your perspective, this might not actually be an advantage at all, because it removes the opt-in migration path.

Problems

  • As for Solution 3.

Solution 4: contrib.newauth

  • Deprecate contrib.auth, and create a new contrib.newauth in the same way we deprecated forms into newforms.

Ian Lewis has a project that starts down this path (See https://bitbucket.org/IanLewis/django-newauth/); however:

  • It doesn't yet handle permissions, so it can't be used for contrib.admin
  • It exhibits the settings-models dependency problem
  • It introduces a new feature -- the ability to have *multiple* user models -- that isn't an obviously required improvement. Additional discussion is required before being adopted.

Advantages

  • Gives us a clean slate to examine authentication and authorisation issues.

Problems

  • There's no pret-a-porter project ready as a candidate. Rebuilding contrib.auth is a big undertaking, and doesn't yet have a clear design (or even design requirements).

Universal problems

Regardless of the final solution that is chosen, the decision to move to a pluggable User model will introduce some challenges.

The User Contract

In order for code to actually be able to adapt to swappable User models, there needs to be some common ground on what a "User" object can actually do. Django is in a position to enforce this by convention -- contrib.admin is a good baseline case.

Consensus seems to be that the 'minimal contract' for User needs to be little more than that required for basic identification -- you need to be able to authenticate using arbitrary credentials, and be able to service a request to print "Hello <user>".

Beyond this base contract, we're in true duck-typing territory. If you have an app that expects to find an 'is_admin' attribute, and the provided User model doesn't have one, then your app won't work with that User model (and the failure mode won't be predictable).

This means that the onus will be on the developer using a pluggable User model to test that their User model will work as expected. This can be helped by app developers:

  • Clearly documenting the app's User model requirements.
  • Including test suites in their app that explicitly test their User model requirements.
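An app could ship a contract test along these lines. This is a minimal sketch using only the standard library: `REQUIRED_USER_ATTRIBUTES`, `missing_user_attributes()`, the `get_display_name` attribute and the two example user classes are hypothetical, with 'is_admin' borrowed from the example above.

```python
import unittest

# Hypothetical list of attributes this app expects on the User model.
REQUIRED_USER_ATTRIBUTES = ("is_admin", "get_display_name")


def missing_user_attributes(user_model, required=REQUIRED_USER_ATTRIBUTES):
    """Return the required attribute names the user model lacks."""
    return [name for name in required if not hasattr(user_model, name)]


class MinimalUser:
    def get_display_name(self):
        return "anonymous"


class AdminCapableUser(MinimalUser):
    is_admin = False


class UserContractTest(unittest.TestCase):
    # A project would point this at its own configured User model.
    user_model = AdminCapableUser

    def test_user_model_satisfies_app_contract(self):
        self.assertEqual(missing_user_attributes(self.user_model), [])
```

The list of required attributes doubles as documentation, and the test turns an unpredictable runtime failure into a named missing attribute.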

Separation of authentication from authorisation

contrib.admin requires a permissions API on User. However, this API isn't an obvious thing to require for *all* User objects. There has been very little discussion about the possibility of an abstract API for permissions.

The best approach here may be to implement contrib.admin's permissions API as a mixin, separate from the AbstractUser base class. This would allow someone who doesn't want admin's permission model to avoid it, while making it easy to include all of Django's permission requirements if the developer wants to use contrib.admin.

Forms

If an app creates a form (or ModelForm) on User, the exact contents of that form will be unpredictable. It will be extremely easy to define a form in an app that has clean methods or widget overrides that reference fields that don't exist on the User model in use.

It may be necessary to prevent ModelForm from being used on User (or any other pluggable model in the App Refactor case), and encourage app developers to make any Forms that they have configurable. This is analogous to Django's existing configurability for LoginForm et al.
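One way an app might make its forms configurable is to accept the form class as a parameter, and to have the form check its fields against the User model up front. The sketch below is plain Python under stated assumptions, not Django's forms API: `UserForm`, `EmailOnlyUserForm`, `EmailOnlyUser` and `edit_profile_view()` are hypothetical names.

```python
class UserForm:
    """Hypothetical minimal form: lists the User fields it expects."""

    fields = ("username", "email")

    def __init__(self, user_model):
        missing = [f for f in self.fields if not hasattr(user_model, f)]
        if missing:
            # Fail early and explicitly instead of at render or clean time.
            raise TypeError(
                "%s lacks fields required by %s: %s"
                % (user_model.__name__, type(self).__name__, ", ".join(missing))
            )
        self.user_model = user_model


class EmailOnlyUser:
    """A project user model with no username field."""

    email = None


class EmailOnlyUserForm(UserForm):
    """Project-supplied form matching the project's user model."""

    fields = ("email",)


def edit_profile_view(user_model, form_class=UserForm):
    """App view whose form can be swapped by the project."""
    return form_class(user_model)
```

A project with a non-standard User model passes its own `form_class`; using the app's default form against that model fails immediately with a message naming the missing fields.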

Inheritance

If the User model can change, then models inheriting from the User model need to be able to adapt to a variable base class.

As a first iteration, it may be necessary to prohibit inheritance from a User model that is marked as pluggable.

Parallel concerns

None of these fully address the limitations with EmailField -- that the default max_length of 75 is too short to hold all email addresses. As a separate concern, we could address this problem in the same way that Solution 1 proposes to fix the User model, but focussed on the EmailField's max_length argument specifically:

  • Introduce a new ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES setting. This would be set as False by default in global_settings.py, and True by default in the project template. This means existing projects get a value of False, but can opt-in to True at their convenience; new projects get a value of True.
  • Modify the EmailField definition to use this setting to determine the max_length
  • Introduce a "MigrationWarning" that will be raised if ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES is False (or max_length isn't manually specified). Assuming this is introduced in 1.5, Django 1.5 users would get the warning if they have ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES set to False. Django 1.6 would raise an error if ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES is False. Django 1.7 would remove all references to the setting.

This would address the problem for all EmailFields, not just the one in auth.User. This fix could be used in conjunction with any of the solutions proposed here; in fact, it would make some of them simpler (since there wouldn't be a need for a SimpleUser to migrate to an improved email field length).

Recommendations

Discussion on django-developers suggests that complete consensus is unlikely; currently awaiting BDFL mandate.