Improving contrib.auth

Django's auth application has been largely unchanged since the initial release of Django. Unfortunately, the User model enforces a number of design decisions that have proven problematic over time. These include (but are not limited to):

  • Username is required, and limited to 30 characters.
  • Email has a 75 character limit.
  • Email isn't unique or required.
  • There is a "first name" and "last name" field, which is a western-centric organisation of naming and doesn't work well in other cultures.
  • There's no ability to add additional fields to the base user model.

The last point requires some elaboration. The overwhelming stated use case for adding additional fields is to avoid the need for a UserProfile-style 1-1 model on aesthetic or performance grounds. The Django core team has historically counselled users away from this technique.

However, there is also a use case for adding non-email, non-username identification marks to the base User object e.g., an authentication identifier from another login system. This requires additional fields, and needing to perform a join in order to perform the core function of a user object is wasteful.

These use cases have driven a long-lived request (#3011) to "fix" auth.User, either by removing the base limitations of the model, or making the User model user-configurable in some way.

Some django-developers discussions that cover this topic (there have been many, over many years):

A large number of solutions have also been proposed over the years. Here is a summary of the most viable candidates:

Solution 1: Superminimal update

Don't make any significant change to contrib.auth -- just make 2 changes to the User model, phased in over multiple release cycles.

  1. Increase the length of the username field to 254 characters, so it can hold an email if necessary
  2. Make the email field 254 characters long.

Implementation

  • Introduce a new USE_NEW_USER_SETTINGS setting. This would be set as False by default in global_settings.py, and True by default in the project template. This means existing projects get a value of False, but can opt-in to True at their convenience; new projects get a value of True.
  • Modify the existing User model to use this setting to determine the length/uniqueness of email and username fields
  • Introduce a "MigrationWarning" that will be raised if USE_NEW_USER_SETTINGS is False. Assuming this is introduced in 1.5, Django 1.5 users would get the warning if they have USE_NEW_USER_SETTINGS set to False. Django 1.6 would raise an error if USE_NEW_USER_SETTINGS is False. Django 1.7 would remove all references to the setting.
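The phased rollout above could be sketched roughly as follows. This is a plain-Python illustration rather than Django code: `user_field_lengths` is a hypothetical helper, the body of `MigrationWarning` is assumed, and the setting is passed in as an argument instead of being read from the project settings.

```python
import warnings


class MigrationWarning(Warning):
    """Hypothetical warning category named in the proposal."""


# Lengths from the proposal: 30/75 today, 254 after the migration.
OLD_LENGTHS = {"username": 30, "email": 75}
NEW_LENGTHS = {"username": 254, "email": 254}


def user_field_lengths(use_new_user_settings):
    """Return the max_length values the User model would use."""
    if not use_new_user_settings:
        # Existing projects keep the old schema but are told to migrate.
        warnings.warn(
            "USE_NEW_USER_SETTINGS is False; migrate the user table "
            "and set it to True.",
            MigrationWarning,
            stacklevel=2,
        )
        return dict(OLD_LENGTHS)
    return dict(NEW_LENGTHS)
```

In the release after the warning is introduced, the `warnings.warn` call would become a hard error, and the release after that would drop the branch entirely.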

Optionally:

  • We could introduce 2 settings -- one for username length, and one for email length -- to separate the two issues (and allow for people who want to introduce the email length fix, but not the username length fix).

Advantages

  • Very little code to write -- not much more than a dozen lines.
  • Solves the most common request for auth.User -- making it easy to make email addresses the username.

Problems

  • Doesn't actually fix the problem for any use case other than "email address as username"
  • Introduces a setting that immediately becomes deprecated (since it won't be needed once the migration cycle is complete)
  • Doesn't address the problem with any other usage of EmailField having a max_length of 75.

Solution 1a: Superminimal with forced migration

As for Solution 1, but don't have the setting -- require the database migration as part of the 1.5 upgrade path.

Advantages

As for Solution 1, plus:

  • Guarantees that every Django 1.5 user has the same user model

Problems

  • Is a huge backwards incompatibility: Requires that every Django user upgrading read, understand, and act upon the instructions in the release notes.
  • Poor failure modes. If users fail to read and act on the instructions in the release notes, their projects won't fail until the first user enters an email that is longer than 75 characters, or a username longer than 30 characters, at which point DatabaseErrors will be raised.

Solution 2: AUTH_USER_MODEL setting

Allow users to specify a User model via a setting. This is essentially the original #3011 proposal.

Implementation

Introduce an AUTH_USER_MODEL setting; instead of assuming that auth.User is the user model, the model found at contrib.auth.models.User is determined at runtime by reading settings. This can be facilitated through the use of a 'get_user_model()' or similar helper function.
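A minimal sketch of such a helper, assuming the setting holds an "app_label.ModelName" string. The `MODEL_REGISTRY` dict and `register_model` decorator are hypothetical stand-ins for Django's model loading, and the settings are passed as a plain dict so the example runs on its own.

```python
# Hypothetical stand-in for Django's model registry: maps
# "app_label.ModelName" strings to model classes.
MODEL_REGISTRY = {}


def register_model(label):
    """Class decorator that records a model under its dotted label."""
    def decorator(cls):
        MODEL_REGISTRY[label] = cls
        return cls
    return decorator


@register_model("auth.User")
class User:
    """Stand-in for the default contrib.auth user model."""


@register_model("myapp.EmailUser")
class EmailUser:
    """Stand-in for a project-supplied user model."""


def get_user_model(settings):
    """Resolve the user model named by settings['AUTH_USER_MODEL']."""
    label = settings.get("AUTH_USER_MODEL", "auth.User")
    try:
        return MODEL_REGISTRY[label]
    except KeyError:
        raise LookupError(
            "AUTH_USER_MODEL refers to model %r that is not installed" % label
        )
```

Apps would then call `get_user_model(...)` wherever they currently import the User class directly.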

There are several existing implementations of this approach:

Advantages

  • Allows any user model, providing it adheres to some basic contract (defined by Django, probably using contrib.admin as a baseline use case)
  • Existing apps require no migration -- references to auth.User continue to work as-is.
  • Mirrors current usage in contrib.comments

Optionally:

  • Introduce an AbstractUser base class that implements the core of the User contract, and encourage people developing User models to extend this base class.
  • Modify auth.User to extend the new AbstractUser. Care must be taken to ensure that the database expression of the new User model is identical to the old User model, to ensure backwards compatibility.
  • Introduce other concrete User models implementing common login patterns (e.g., login by email)

In order to support contrib.admin, there will be a need to include some permissions API in the User model -- this is required by the admin app, but doesn't necessarily have to be part of the AbstractUser. Care should be taken to ensure that the AbstractUser really does represent the minimal requirements for user authentication, and that authorisation concerns are kept separate.

All three points here are optional. We can introduce an AbstractUser without changing the base User. We also don't have to introduce other concrete models -- we could leave this up to the community at large to develop an ecosystem of User models.

Problems

  • Doesn't address the EmailField length problem for existing users. We could address this by having a User model (reflecting current field lengths) and a new SimpleUser (that reflects better defaults); then use global_settings and project template settings to define which User is the default for new vs existing projects.
  • Doesn't solve the analogous problem for any other project. E.g., contrib.comments already has pluggable Comments models, and has invented a bespoke solution. Other projects will have similar needs; this solution doesn't address the duplication of code.
  • Has unpredictable failure modes if a third-party app assumes that User has a certain attribute or property which the project-provided User model doesn't support (or supports in a way different to the core auth.User model).
  • Prone to unpredictable problems if AUTH_USER_MODEL is modified after the initial syncdb (i.e., someone changes the settings to change the User model, but doesn't document/alert anyone that a migration will be required). This might be able to be managed by introducing a management table to the database to track the syncdb-time value for the AUTH_USER_MODEL setting, and raising a validation error if the current setting value doesn't match the value in the table.
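The management-table idea in the last point might look something like this. It is only a sketch: `UserModelMismatch` and `check_user_model_setting` are hypothetical names, and the table lookup is reduced to a value passed in by the caller.

```python
class UserModelMismatch(Exception):
    """Hypothetical validation error for a changed AUTH_USER_MODEL."""


def check_user_model_setting(recorded_value, current_value):
    """Compare the syncdb-time value with the current setting.

    recorded_value is what the (hypothetical) management table stored
    at syncdb time, or None if nothing has been recorded yet.
    Returns the value that should be stored in the table.
    """
    if recorded_value is None:
        # First syncdb: nothing to compare against, so record the setting.
        return current_value
    if recorded_value != current_value:
        raise UserModelMismatch(
            "AUTH_USER_MODEL was %r at syncdb time but is now %r; "
            "a migration is required." % (recorded_value, current_value)
        )
    return recorded_value
```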

Solution 2a: USER_MODEL setting

Similar to solution 2. Specify a User model via a setting, but don't mount it at django.contrib.auth.User.

Implementation

Introduce a USER_MODEL setting (not necessarily related to AUTH at all) that defaults to "auth.User".

Then, suppress the auth.User model when it is not in use to make failure modes predictable and noisy.

There is a branch of django trunk that implements the basic idea:

Advantages

  • Allows any user model, potentially independent of contrib.auth entirely.
  • Existing projects require no migration if USER_MODEL isn't modified.
  • Avoids app-app circular dependencies, where apps that monkey-patch or plug into auth.User must be loaded before django.contrib.auth.models.User can be safely referenced

Optionally:

  • Split off as much as possible of auth.User into orthogonal mixins that can be reused.
  • Modify auth.User to inherit these mixins. Care must be taken to ensure that the database expression of the new User model is identical to the old User model, to ensure backwards compatibility.
  • Unrelated and third-party apps can indicate that they depend on various orthogonal mixins. For example, contrib.admin can specify that it works with auth.User out of the box, and with any model implementing PermissionsMixin if you supply your own login forms.

Exposing pieces of auth.User as mixins is optional, and potentially advantageous for any solution that allows you to define your own user models, such as solution 2.

Problems

  • Doesn't address the EmailField length problem for existing users. We could address this by having a User model (reflecting current field lengths) and a new SimpleUser (that reflects better defaults); then use global_settings and project template settings to define which User is the default for new vs existing projects.
  • Doesn't solve the analogous problem for any other project.
  • Existing apps need to be updated to reflect the fact that auth.User may not be the User model. All instances of ForeignKey(User) need to be updated to ForeignKey(settings.USER_MODEL).
  • Still has the settings-models circular dependency problem
  • As with Solution 2, prone to unpredictable problems if USER_MODEL is modified after the initial syncdb.

Solution 2b: AUTH_USER_MODEL_MIXINS setting

This is similar to solutions 2 and 2a, but is quite the opposite of solution 3.

Make every model pluggable, so that auth.User could be defined like this:

class User(models.Model):
    __mixins__ = settings.AUTH_USER_MODEL_MIXINS

Implementation

  • Change bases before creating the user-defined Model class in ModelBase:
    mixins = attrs.get('__mixins__', ())
    bases = load_available_model_mixins(mixins) + bases
    
  • Separate fields out as mixins, as in solution 2a.
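To make the base-rewriting step concrete, here is a self-contained sketch using an ordinary metaclass in place of ModelBase. `PluggableBase`, `AVAILABLE_MIXINS` and `TwitterHandleMixin` are hypothetical, and mixins are looked up by name in a dict rather than imported from a dotted path.

```python
# Hypothetical registry of mixins that settings may refer to by name.
AVAILABLE_MIXINS = {}


def load_available_model_mixins(names):
    """Return the mixin classes for the given names, in order."""
    return tuple(AVAILABLE_MIXINS[name] for name in names)


class PluggableBase(type):
    """Stand-in for ModelBase: prepends the mixins named in __mixins__."""

    def __new__(mcs, name, bases, attrs):
        mixins = attrs.get("__mixins__", ())
        bases = load_available_model_mixins(mixins) + bases
        return super().__new__(mcs, name, bases, attrs)


class TwitterHandleMixin:
    """Example project-supplied mixin adding one attribute."""
    twitter_handle = ""


AVAILABLE_MIXINS["myapp.TwitterHandleMixin"] = TwitterHandleMixin

# Would come from settings.AUTH_USER_MODEL_MIXINS in the proposal.
AUTH_USER_MODEL_MIXINS = ("myapp.TwitterHandleMixin",)


class User(metaclass=PluggableBase):
    __mixins__ = AUTH_USER_MODEL_MIXINS
    username = ""
```

Because the mixins are prepended, they come before the declared bases in the method resolution order, so a mixin can override behaviour as well as add fields.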

Advantages

  • Doesn't have the unpredictable problems of solutions 2 and 2a.
  • Existing projects require no migration, the same as solutions 2 and 2a.
  • Unlike solution 2a, no change is required for existing apps.
  • Unrelated and third-party apps can indicate that they depend on various orthogonal mixins, the same as solution 2a.

Problems

  • Built-in schema migration tools are required.
  • Doesn't address the EmailField length problem. (Can this be solved by schema migration tools?)
  • ModelForm must be more restrictive; otherwise, Django will suffer security issues similar to PHP's register_globals or Rails' mass assignment.
  • ModelForm (and any other code that introspects auth.User) should be made lazy, or else circular dependencies can result. Introspecting auth.User and plugging into auth.User from the same app or an app loaded later is a potential circular dependency. (This is not a problem, unless you put mixins in models.py?)

Solution 2c: Generic swappable models

Follows the general direction of Solution 2a, but instead of a USER_MODEL setting that only solves the problem for auth.User, sets up the infrastructure for the general concept of swappable models.

Implementation

A model that wants to declare itself as swappable adds a new Meta option:

class User(Model):
    ....
    class Meta:
        swappable = 'user'

Here, the name 'user' is an identifier that will be used to refer to this swappable model. By convention, it might be a good idea to namespace this tag (e.g., 'auth.user', instead of just 'user').

We then introduce a SWAPPABLE_MODELS setting that provides a way for users to specify which models will be overridden in this application:

SWAPPABLE_MODELS = {
    'user': 'myapp.SuperDuperUser'
}

This specifies that the 'user' model will be satisfied by SuperDuperUser in this project.

If a model identifier (i.e., 'user') is mentioned in SWAPPABLE_MODELS, then the original model (auth.User) isn't synchronised to the database, and isn't added to the App cache.

We then add a LazyForeignKey():

class Comment(Model):
    user = LazyForeignKey('user')
    ....

that will resolve the ForeignKey to the currently defined swappable model.

If an app defines a model with a ForeignKey to a swappable model, a warning is raised (since the model could potentially be swapped out); if the model is currently defined as swapped out, then an error will be raised (since the ForeignKey will be pointing at the wrong table).

The app cache gains a registry of swappable models, and a get_swappable_model(identifier) entry point, so that users can easily retrieve the model currently being used as the swappable model for the various swappable endpoints.
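A rough sketch of that registry follows. `SwappableRegistry` and its methods other than `get_swappable_model` are hypothetical names; the `swappable` argument to `register` stands in for the proposed Meta option, and the constructor argument stands in for the SWAPPABLE_MODELS setting.

```python
class SwappableRegistry:
    """Hypothetical app-cache registry of swappable models."""

    def __init__(self, swappable_models):
        # Plays the role of the SWAPPABLE_MODELS setting.
        self.overrides = dict(swappable_models)
        self.defaults = {}  # identifier -> label of the declaring model
        self.models = {}    # "app.Model" label -> class

    def register(self, label, cls, swappable=None):
        """Record a model; swappable mirrors the proposed Meta option."""
        self.models[label] = cls
        if swappable is not None:
            self.defaults[swappable] = label

    def is_swapped_out(self, identifier):
        """True if the project overrides the model behind identifier."""
        return identifier in self.overrides

    def get_swappable_model(self, identifier):
        """Return the class currently satisfying identifier."""
        label = self.overrides.get(identifier) or self.defaults[identifier]
        return self.models[label]


class User:
    """Stand-in for auth.User, declared with swappable = 'user'."""


class SuperDuperUser:
    """Stand-in for the project's replacement model."""


registry = SwappableRegistry({"user": "myapp.SuperDuperUser"})
registry.register("auth.User", User, swappable="user")
registry.register("myapp.SuperDuperUser", SuperDuperUser)
```

A LazyForeignKey('user') would call `registry.get_swappable_model('user')` when resolving its target, and syncdb would consult `is_swapped_out` before creating the original table.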

Advantages

As for Solution 2a, but:

  • Solves the general problem of swappable models, so other apps with similar problems (e.g., comments) can use the same infrastructure instead of having to reinvent the wheel.
  • Catches migration problems as warnings/errors if existing apps haven't been updated to point to the swappable model.

Problems

As for Solution 2a.

Solution 2d: MYAPP_USER_MODEL setting (a)

Similar to solution 2a, but instead of a single global USER_MODEL setting, each app has its own USER_MODEL setting.

Implementation

django.contrib.admin can introduce an ADMIN_USER_MODEL setting which defaults to None.

user = models.ForeignKey(settings.ADMIN_USER_MODEL or settings.USER_MODEL)
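The fallback in that foreign key can be shown in isolation. This is a sketch with a hypothetical helper, `resolve_user_model_label`, and settings passed as a plain dict.

```python
def resolve_user_model_label(settings, app_setting_name):
    """Return the app-specific user model label, else the global one."""
    # An unset or None app-level setting falls through to USER_MODEL.
    return settings.get(app_setting_name) or settings.get(
        "USER_MODEL", "auth.User"
    )
```

Each app would pass its own setting name, so admin could point at one user model while another app points at a different one.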

Advantages

Same as solution 2a, but:

  • Supports multiple user models.

Problems

Same as solution 2a.

Solution 2e: MYAPP_USER_MODEL setting (b)

A combination of solution 2b and 2d. Every model is pluggable and each app has its own USER_MODEL setting.

Implementation

django.contrib.admin can introduce an ADMIN_USER_MODEL setting which defaults to "auth.user".

user = models.ForeignKey(settings.ADMIN_USER_MODEL)

Advantages

  • Supports multiple user models.
  • Existing projects require no migration, the same as solutions 2 and 2a.
  • Unlike solution 2a, no change is required for existing apps, if they do not use a different user model.
  • Unrelated and third-party apps can indicate that they depend on various orthogonal mixins, the same as solution 2a.

Problems

Same as solution 2b, and:

  • As with Solution 2, prone to unpredictable problems if MYAPP_USER_MODEL is modified after the initial syncdb.

Solution 3: Leverage App Refactor

Use Arthur Koziel's App Refactor patch from GSoC 2010 as a way to define a configurable auth app.

Implementation

  • Land the App Refactor patch. This introduces a number of benefits -- reliable hooks for app startup, configurable app labels, predictable module loading, amongst others -- but the one that matters for the purposes of auth.User is that it allows Apps to be treated as items that need to be configured as a runtime activity. In this case, we need to be able to specify, at a project level, which model is your "User" model in the auth app.
  • Introduce the concept of a LazyForeignKey. LazyForeignKey is a normal foreign key, with all the usual foreign key behaviors; the only difference is that the model it links to isn't specified in the model -- it's a configuration item drawn from an application configuration. So, ForeignKey('auth.User') creates a foreign key to django.contrib.auth.User; LazyForeignKey('auth.User') asks the auth app for the model that is being used as the 'User' model, and creates a foreign key to that. This can be done by slotting into the existing model reference resolution code, which is something that the app refactor cleans up.
  • Add a Meta option to models -- pluggable -- which controls whether the model can be replaced at runtime. If it can be, then the model may not be synchronised (so we don't get empty auth_user tables).
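The difference between a normal and a lazy foreign key can be sketched without Django. Everything here is hypothetical: `AppConfig` is a toy stand-in for whatever the App Refactor patch provides, and `LazyForeignKey.resolve` only returns the target class rather than building a real relation.

```python
class AppConfig:
    """Toy app configuration: maps role names to model classes."""

    def __init__(self, label, models):
        self.label = label
        self.models = dict(models)

    def get_model(self, name):
        return self.models[name]


class LazyForeignKey:
    """Stand-in for a foreign key whose target comes from app config."""

    def __init__(self, reference):
        self.app_label, self.model_name = reference.split(".")

    def resolve(self, apps):
        """Ask the configured app which model fills the named role."""
        return apps[self.app_label].get_model(self.model_name)


class DefaultUser:
    """Stand-in for the stock auth user model."""


class EmailUser:
    """Stand-in for a project-configured replacement."""


field = LazyForeignKey("auth.User")
stock_apps = {"auth": AppConfig("auth", {"User": DefaultUser})}
custom_apps = {"auth": AppConfig("auth", {"User": EmailUser})}
```

The same field definition resolves to `DefaultUser` under `stock_apps` and to `EmailUser` under `custom_apps`, which is the behaviour the proposal wants from project-level configuration.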

Optionally:

  • Introduce an AbstractUser, and other concrete User models, same as for Solution 2

Advantages

  • Doesn't have the circular dependency between settings and User
  • Solves the generic problem, not a specific auth problem. contrib.comments could be retrofitted to use this approach, and any other application could do the same.

Problems

  • Is dependent on the App Refactor landing in trunk. This may not occur any time soon.
  • No transparent update path -- requires that third party apps be updated to be "pluggable auth compatible". This means app authors need to convert all ForeignKey(User) into LazyForeignKey('auth.User'), and modify any usage of forms etc. This could be considered a benefit, however; migrating User models is a nontrivial step, and it should probably involve some opt-in engineering.
  • Doesn't address the immediate problem for EmailField. We could do the same User/SimpleUser conversion here; with the added benefit that we are also introducing App Refactor, so we can use the distinction between an "unconfigured" auth app and a "Django 1.5 App Refactor Configured" auth app as the point for identifying whether User or SimpleUser is in use.
  • As with Solution 2, prone to unpredictable problems if the app configuration is modified after the initial syncdb.

Solution 3a: Transparent LazyForeignKeys

As for Solution 3, but don't include an explicit LazyForeignKey class. Instead, use the "pluggable" marker in the Meta class; if you define a ForeignKey to a 'pluggable' model, you're indicating that this foreign key reference might be changed later on.

Advantages

  • As for Solution 3, but provides a transparent migration path for apps -- no need to manually change ForeignKey(User). However, depending on your perspective, this might not actually be an advantage at all, because it removes the opt-in migration path.

Problems

  • As for Solution 3.

Solution 4: contrib.newauth

  • Deprecate contrib.auth, and create a new contrib.newauth in the same way we deprecated forms into newforms.

Ian Lewis has a project that starts down this path (See https://bitbucket.org/IanLewis/django-newauth/); however:

  • it doesn't yet handle permissions, so it can't be used for contrib.admin
  • It exhibits the settings-models dependency problem
  • It introduces a new feature -- the ability to have *multiple* user models -- that isn't an obviously required improvement. Additional discussion is required before it is adopted.

Advantages

  • Gives us a clean slate to examine authentication and authorisation issues.

Problems

  • There's no pret-a-porter project ready as a candidate. Rebuilding contrib.auth is a big undertaking, and doesn't yet have a clear design (or even design requirements).

Solution 5: Profile-based single user model

After reviewing proposals 1-4 on this list, Jacob indicated that he was uncomfortable with the idea of 'swappable' models (driven at least in part by lessons learned from Django's own history), and so he came up with an alternate proposal. This proposal suggests migrating towards a single, absolutely minimalist User model, with Profile objects being used to store any additional information. The full proposal is laid out in detail here https://gist.github.com/2245327 , but as a brief summary:

The user model becomes something like:

class User(models.Model):
    identifier = models.CharField(unique=True, db_index=True)
    password = models.CharField()

identifier is included because it's required by 99% of User models; password is included as an active discouragement to people reimplementing (badly) their own password authentication schemes.

Other user information -- name, staff/admin flags, permissions and so on -- is stored on profile models that are linked 1-1 with the User model. In order to make it easier to transition to the new user model, and to avoid coupling to specific profile models, the User model may also implement a delegate pattern for other user attributes. For example, the 'display name' for a user won't be defined by the User object itself; it will be serviced by one of the profile objects associated with the User, but you'll be able to request user.display_name to retrieve the name. When multiple profiles specify the same attribute, an AUTH_PROFILES setting will set the precedence order.
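The delegate pattern might be sketched like this, using plain Python objects instead of models. The profile classes are hypothetical, and the order of the `profiles` list stands in for the precedence an AUTH_PROFILES setting would define.

```python
class User:
    """Minimal user: an identifier plus delegated profile attributes."""

    def __init__(self, identifier, profiles):
        self.identifier = identifier
        # Listed in precedence order, highest first.
        self._profiles = list(profiles)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        for profile in self._profiles:
            if hasattr(profile, name):
                return getattr(profile, name)
        raise AttributeError(
            "%r has no attribute %r on itself or its profiles"
            % (type(self).__name__, name)
        )


class NameProfile:
    """Hypothetical profile supplying a display name."""
    def __init__(self, display_name):
        self.display_name = display_name


class StaffProfile:
    """Hypothetical profile supplying a staff flag and its own name."""
    def __init__(self, display_name, is_staff):
        self.display_name = display_name
        self.is_staff = is_staff


user = User(
    "alice",
    [NameProfile("Alice A."), StaffProfile("alice (staff)", True)],
)
```

Here `user.display_name` is served by `NameProfile` because it is listed first, while `user.is_staff` falls through to `StaffProfile`.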

Along the way, we'll also deprecate AUTH_PROFILE_MODULE and user.get_profile(), reflecting the fact that there isn't a single "profile" anymore.

Advantages

  • There is a single canonical user model. There's no need for a swappable anything, or Foreign keys that point to dynamic models. This means the properties of the User object are reliable and consistent between projects. Inconsistencies only arise between uses of Profile objects.
  • It makes almost no judgements about what a User should have on it. All such decisions are made by the developer by implementing an appropriate Profile object.

Problems

  • There is a *huge* migration task, since all existing User tables need to be migrated to the new profile-based format. However, we would be moving from one known format (old auth.User) to another well known format (new auth.User plus admin.UserProfile), so the migration script will be reasonably predictable. When combined with delegated attributes, it should be possible to have a migration path that has all the usual warnings etc.
  • It requires joins to get to anything other than the core User data. There is some argument as to whether this is primarily a technical limitation or a social one. Mailing list discussions have indicated a preference by some people for a monolithic User object, avoiding the need for joins when retrieving user data. Although technical reasons are usually given for this preference (usually, "joins are evil"), it isn't clear that this is a problem in practice -- at least to the extent that we should be basing the entire design around accommodating sites that are actually experiencing those problems.
  • If a new profile model is added to an existing project, new profile objects need to be instantiated for any existing User object. This is the analog of the problem in the "swappable-user" case; if the User model is modified, there is a migration task to make sure that the database User matches the code definition.

  • There are still some issues to resolve around the handling of the delegate attributes:
    • Should profiles be auto-select-related?
    • Should profiles be auto-created if they are missing? If a profile has a required field and it is auto-created, how will that field be filled in?
  • Form handling for the new User object -- especially ModelForm handling -- hasn't been fully elaborated.

Universal problems

Regardless of the final solution that is chosen, the decision to move to a pluggable User model will introduce some challenges.

The User Contract

In order for code to actually be able to adapt to swappable User models, there needs to be some common ground on what a "User" object can actually do. Django is in a position to enforce this by convention -- contrib.admin is a good baseline case.

Consensus seems to be that the 'minimal contract' for User needs to be little more than that required for basic identification -- you need to be able to authenticate using arbitrary credentials, and be able to service a request to print "Hello <user>".

Beyond this base contract, we're in true duck-typing territory. If you have an app that expects to find an 'is_admin' attribute, and the provided User model doesn't have one, then your app won't work with that User model (and the failure mode won't be predictable).

This means that the onus will be on the developer using a pluggable User model to test that their User model will work as expected. This can be helped by app developers:

  • Clearly documenting the app's User model requirements.
  • Including test suites in their app that explicitly test their User model requirements.
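An app's contract test could be as small as the following sketch. The attribute list and the `missing_user_attributes` helper are hypothetical, and the example uses plain classes; with real Django models, fields may not be visible as class attributes, so a real check would likely inspect an instance or the model's field metadata instead.

```python
# Hypothetical requirements for an app that greets users and gates an
# admin-only view; a real app would document its own list.
REQUIRED_USER_ATTRIBUTES = ("get_display_name", "is_admin")


def missing_user_attributes(user_model, required=REQUIRED_USER_ATTRIBUTES):
    """Return the required attribute names that user_model lacks."""
    return [name for name in required if not hasattr(user_model, name)]


class FullUser:
    """Satisfies the app's stated requirements."""
    is_admin = False

    def get_display_name(self):
        return "someone"


class MinimalUser:
    """Only supports the base 'Hello <user>' contract."""

    def get_display_name(self):
        return "someone"
```

A test suite shipped with the app would assert that `missing_user_attributes` returns an empty list for the project's configured User model, turning an unpredictable runtime failure into an explicit test failure.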

Separation of authentication from authorisation

contrib.admin requires a permissions API on User. However, this API isn't an obvious thing to require for *all* User objects. There has been very little discussion about the possibility of an abstract API for permissions.

The best approach here may be to implement contrib.admin's permissions API as a mixin, separate from the AbstractUser base class. This would allow someone who doesn't want admin's permission model to avoid it, while making it easy to include all of Django's permission requirements if the developer wants to use contrib.admin.
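As a sketch of that separation, using plain classes rather than models: the method names `get_display_name` and `has_perm`, and the `permissions` constructor argument, are assumptions made for illustration and are not a proposed API.

```python
class AbstractUser:
    """Authentication side only: identify and greet the user."""

    def __init__(self, identifier):
        self.identifier = identifier

    def get_display_name(self):
        return self.identifier


class PermissionsMixin:
    """Authorisation side only: checks an admin-style app might call."""

    def __init__(self, *args, permissions=(), **kwargs):
        super().__init__(*args, **kwargs)
        self._permissions = set(permissions)

    def has_perm(self, perm):
        return perm in self._permissions


class PlainUser(AbstractUser):
    """A user model for projects that don't want admin's permissions."""


class AdminUser(PermissionsMixin, AbstractUser):
    """A user model that opts in to the permissions API."""
```

`PlainUser` carries no permissions API at all, while `AdminUser` gains it by listing the mixin first, so admin can state its requirement as "any model that includes the mixin".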

Forms

If an app creates a form (or ModelForm) on User, the exact contents of that form will be unpredictable. It will be extremely easy to define a form in an app that has clean methods or widget overrides that reference fields that don't exist on the User model in use.

It may be necessary to prevent ModelForm from being used on User (or any other pluggable model in the App Refactor case), and encourage app developers to make any Forms that they have configurable. This is analogous to Django's existing configurability for LoginForm et al.

Inheritance

If the User model can change, then models inheriting from the User model need to be able to adapt to a variable base class.

As a first iteration, it may be necessary to prohibit inheritance from a User model that is marked as pluggable.

Parallel concerns

None of these fully address the limitations with EmailField -- that the default max_length of 75 is too short to hold all email addresses. As a separate concern, we could address this problem in the same way that Solution 1 proposes to fix the User model, but focussed on the EmailField's max_length argument specifically:

  • Introduce a new ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES setting. This would be set as False by default in global_settings.py, and True by default in the project template. This means existing projects get a value of False, but can opt-in to True at their convenience; new projects get a value of True.
  • Modify the EmailField definition to use this setting to determine the max_length
  • Introduce a "MigrationWarning" that will be raised if ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES is False (or max_length isn't manually specified). Assuming this is introduced in 1.5, Django 1.5 users would get the warning if they have ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES set to False. Django 1.6 would raise an error if ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES is False. Django 1.7 would remove all reference to the setting.

This would address the problem for all EmailFields, not just the one in auth.User. This fix could be used in conjunction with any of the solutions proposed here; in fact, it would make some of them simpler (since there wouldn't be a need for a SimpleUser to migrate to an improved email field length).

Recommendations

Discussion on django-developers ran long, and revealed that complete consensus is unlikely:

A BDFL decision was called for; Solution 2a was selected.

Implementation

A draft branch is available

Last modified on Jun 4, 2012, 10:13:05 AM