- Improving contrib.auth
- Solution 1: Superminimal update
- Solution 1a: Superminimal with forced migration
- Solution 2: AUTH_USER_MODEL setting
- Solution 2a: USER_MODEL setting
- Solution 2b: AUTH_USER_MODEL_MIXINS setting
- Solution 2c: Generic swappable models
- Solution 2d: MYAPP_USER_MODEL setting (a)
- Solution 2e: MYAPP_USER_MODEL setting (b)
- Solution 3: Leverage App Refactor
- Solution 3a: Transparent LazyForeignKeys
- Solution 4: contrib.newauth
- Solution 5: Profile-based single user model
- Universal problems
- Parallel concerns
- Recommendations
- Implementation
Improving contrib.auth
Django's auth application has been largely unchanged since the initial release of Django. Unfortunately, the User model enforces a number of design decisions that have proven problematic over time. These include (but are not limited to):
- Username is required, and limited to 30 characters.
- Email has a 75 character limit
- Email isn't unique or required.
- There is a "first name" and "last name" field, which is a western-centric organisation of naming and doesn't work well in other cultures.
- There's no ability to add additional fields to the base user model.
The last point requires some elaboration. The overwhelming stated use case for adding additional fields is to avoid the need for a UserProfile-style 1-1 model on aesthetic or performance grounds. The Django core team has historically counselled users away from this technique.
However, there is also a use case for adding non-email, non-username identification marks to the base User object e.g., an authentication identifier from another login system. This requires additional fields, and needing to perform a join in order to perform the core function of a user object is wasteful.
These use cases have driven a long-lived request (#3011) to "fix" auth.User, either by removing the base limitations of the model, or making the User model user-configurable in some way.
Some django-developer discussions that cover this topic (there have been many, over many years):
- http://groups.google.com/group/django-developers/browse_thread/thread/6dadb540ca5f7d9b
- http://groups.google.com/group/django-developers/browse_thread/thread/61c15301a89d88be
A large number of solutions have also been proposed over the years. Here is a summary of the most viable candidates:
Solution 1: Superminimal update
Don't make any significant changes to contrib.auth -- just make 2 changes to the User model, phased in over multiple release cycles.
- Increase the length of the username field to 254 characters, so it can hold an email if necessary
- Make the email field 254 characters long.
Implementation
- Introduce a new USE_NEW_USER_SETTINGS setting. This would be set as False by default in global_settings.py, and True by default in the project template. This means existing projects get a value of False, but can opt-in to True at their convenience; new projects get a value of True.
- Modify the existing User model to use this setting to determine the length/uniqueness of email and username fields.
- Introduce a "MigrationWarning" that will be raised if USE_NEW_USER_SETTINGS is False. Assuming this is introduced in 1.5, Django 1.5 users would get the warning if they have USE_NEW_USER_SETTINGS set to False. Django 1.6 would raise an error if USE_NEW_USER_SETTINGS is False. Django 1.7 would remove all references to the setting.
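As a rough illustration of the mechanism described above, the following sketch models the setting-driven field lengths and the warning in plain Python, without using Django itself. MigrationWarning is only named in the proposal, so the class here is a stand-in, and user_field_lengths is a hypothetical helper invented for this sketch.

```python
import warnings


class MigrationWarning(UserWarning):
    """Stand-in for the warning category named in the proposal."""


def user_field_lengths(use_new_user_settings):
    """Return the max_length values the User model would use (hypothetical helper)."""
    if use_new_user_settings:
        # New-style projects: both fields can hold a full email address.
        return {"username": 254, "email": 254}
    # Old-style projects keep the historical lengths but are nagged to migrate.
    warnings.warn(
        "USE_NEW_USER_SETTINGS is False; migrate the user table and set it to True.",
        MigrationWarning,
        stacklevel=2,
    )
    return {"username": 30, "email": 75}
```

Under the phased plan, the warning branch would become an error in the following release and disappear, along with the setting, in the release after that.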
Optionally:
- We could introduce 2 settings -- one for username length, and one for email length -- to separate the two issues (and allow for people who want to introduce the email length fix, but not the username length fix).
Advantages
- Very little code to write -- not much more than a dozen lines.
- Solves the most common request for auth.User -- making it easy to make email addresses the username.
Problems
- Doesn't actually fix the problem for any use case other than "email address as username"
- Introduces a setting that immediately becomes deprecated (since it won't be needed once the migration cycle is complete)
- Doesn't address the problem with any other usage of EmailField having a max_length of 75.
Solution 1a: Superminimal with forced migration
As for Solution 1, but don't have the setting -- require the database migration as part of the 1.5 upgrade path.
Advantages
As for Solution 1, plus:
- Guarantees that every Django 1.5 user has the same user model
Problems
- Is a huge backwards incompatibility: Requires that every Django user upgrading read, understand, and act upon the instructions in the release notes.
- Poor failure modes. If users fail to read and act on the instructions in the release notes, their projects won't fail until the first user enters an email that is longer than 75 characters, or a username longer than 30 characters, at which point DatabaseErrors will be raised.
Solution 2: AUTH_USER_MODEL setting
Allow users to specify a User model via a setting. This is essentially the original #3011 proposal.
Implementation
Introduce an AUTH_USER_MODEL setting; instead of assuming that auth.User is the user model, the model found at contrib.auth.models.User is determined at runtime by reading settings. This can be facilitated through the use of a 'get_user_model()' or similar helper function.
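A minimal sketch of such a helper, written in plain Python so it runs without Django: the settings object, the registry and the register_model decorator are all hypothetical stand-ins, and only the idea of resolving the model from a setting at call time comes from the proposal.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings in this sketch.
settings = SimpleNamespace(AUTH_USER_MODEL="auth.User")

# Hypothetical registry mapping "app_label.ModelName" to model classes.
_MODEL_REGISTRY = {}


def register_model(label):
    """Class decorator that records a model under its dotted label."""
    def decorator(cls):
        _MODEL_REGISTRY[label] = cls
        return cls
    return decorator


def get_user_model():
    """Look up the class named by AUTH_USER_MODEL each time it is called."""
    label = settings.AUTH_USER_MODEL
    try:
        return _MODEL_REGISTRY[label]
    except KeyError:
        raise LookupError("AUTH_USER_MODEL refers to unknown model %r" % label)


@register_model("auth.User")
class User:
    pass


@register_model("myapp.EmailUser")
class EmailUser:
    pass
```

Because the lookup happens on each call rather than at import time, code that calls get_user_model() never needs to import a specific User class.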
There are several examples of implementations implementing this approach:
- https://github.com/aino/django-primate Uses a monkeypatch to override Django's code at the moment, but this patch wouldn't be required if it were merged to trunk.
- https://github.com/claymation/django/tree/pluggable-auth-apps This puts the configuration at the app level (i.e., the setting is AUTH_APP, not AUTH_USER_MODEL), but uses the same principle.
Advantages
- Allows any user model, providing it adheres to some basic contract (defined by Django, probably using contrib.admin as a baseline use case)
- Existing apps require no migration -- references to auth.User continue to work as-is.
- Mirrors current usage in contrib.comments
Optionally:
- Introduce an AbstractUser base class that implements the core of the User contract, and encourage people developing User models to extend this base class.
- Modify auth.User to extend the new AbstractUser. Care must be taken to ensure that the database expression of the new User model is identical to the old User model, to ensure backwards compatibility.
- Introduce other concrete User models implementing common login patterns (e.g., login by email)
In order to support contrib.admin, there will be a need to include some permissions API in the User model -- this is required by the admin app, but doesn't necessarily have to be part of the AbstractUser. Care should be taken to ensure that the AbstractUser really does represent the minimal requirements for user authentication, and that authorisation concerns are kept separate.
All three points here are optional. We can introduce an AbstractUser without changing the base User. We also don't have to introduce other concrete models -- we could leave this up to the community at large to develop an ecosystem of User models.
Problems
- Doesn't address the EmailField length problem for existing users. We could address this by having a User model (reflecting current field lengths) and a new SimpleUser (that reflects better defaults); then use global_settings and project template settings to define which User is the default for new vs existing projects.
- Doesn't solve the analogous problem for any other project. E.g., contrib.comments already has pluggable Comments models, and has invented a bespoke solution. Other projects will have similar needs; this solution doesn't address the duplication of code.
- Has unpredictable failure modes if a third-party app assumes that User has a certain attribute or property which the project-provided User model doesn't support (or supports in a way different to the core auth.User model).
- Prone to unpredictable problems if AUTH_USER_MODEL is modified after the initial syncdb (i.e., someone changes the settings to change the User model, but doesn't document/alert anyone that a migration will be required). This might be able to be managed by introducing a management table to the database to track the syncdb-time value for the AUTH_USER_MODEL setting, and raising a validation error if the current setting value doesn't match the value in the table.
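The management-table idea in the last point could look roughly like the sketch below. A plain dict stands in for the database table, and the function and exception names are hypothetical; the proposal only describes recording the value at syncdb time and comparing it later.

```python
class UserModelMismatch(Exception):
    """Hypothetical validation error for a changed AUTH_USER_MODEL."""


def record_user_model(management_table, current_setting):
    """At syncdb time: remember which model the tables were built for."""
    # setdefault keeps the first recorded value; later syncs do not overwrite it.
    management_table.setdefault("AUTH_USER_MODEL", current_setting)


def validate_user_model(management_table, current_setting):
    """At validation time: refuse to run against tables built for another model."""
    recorded = management_table.get("AUTH_USER_MODEL")
    if recorded is not None and recorded != current_setting:
        raise UserModelMismatch(
            "Database was synchronised with %r but settings now say %r; "
            "a migration is required." % (recorded, current_setting)
        )
```

A fresh database has nothing recorded, so validation passes until the first syncdb stores a value.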
Solution 2a: USER_MODEL setting
Similar to solution 2. Specify a User model via a setting, but don't mount it at django.contrib.auth.User.
Implementation
Introduce a USER_MODEL setting (not necessarily related to AUTH at all) that defaults to "auth.User".
Then, suppress the auth.User model when it is not in use to make failure modes predictable and noisy.
There is a branch of django trunk that implements the basic idea:
- https://github.com/ogier/django/tree/auth-mixins Also refactors orthogonal authentication, permissions and profile mixins out of auth.User.
Advantages
- Allows any user model, potentially independent of contrib.auth entirely.
- Existing projects require no migration if USER_MODEL isn't modified.
- Avoids app-app circular dependencies, where apps that monkey-patch or plug into auth.User must be loaded before django.contrib.auth.models.User can be safely referenced
Optionally:
- Split off as much as possible of auth.User into orthogonal mixins that can be reused.
- Modify auth.User to inherit these mixins. Care must be taken to ensure that the database expression of the new User model is identical to the old User model, to ensure backwards compatibility.
- Unrelated and third-party apps can indicate that they depend on various orthogonal mixins. For example, contrib.admin can specify that it works with auth.User out of the box, and with any model implementing PermissionsMixin if you supply your own login forms.
Exposing pieces of auth.User as mixins is optional, and potentially advantageous for any solution that allows you to define your own user models, such as solution 2.
Problems
- Doesn't address the EmailField length problem for existing users. We could address this by having a User model (reflecting current field lengths) and a new SimpleUser (that reflects better defaults); then use global_settings and project template settings to define which User is the default for new vs existing projects.
- Doesn't solve the analogous problem for any other project.
- Existing apps need to be updated to reflect the fact that auth.User may not be the User model. All instances of ForeignKey(User) need to be updated to ForeignKey(settings.USER_MODEL).
- Still has the settings-models circular dependency problem
- As with Solution 2, prone to unpredictable problems if USER_MODEL is modified after the initial syncdb.
Solution 2b: AUTH_USER_MODEL_MIXINS setting
This is similar to solutions 2 and 2a, but is quite the opposite of solution 3. It makes every model pluggable, so that auth.User could be defined like this:
    class User(models.Model):
        __mixins__ = settings.AUTH_USER_MODEL_MIXINS
Implementation
- Change bases before creating the user-defined Model class in ModelBase:
    mixins = attrs.get('__mixins__', ())
    bases = load_available_model_mixins(mixins) + bases
- Separate fields out as mixins, as in solution 2a.
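The base-rewriting step can be sketched with an ordinary Python metaclass. This is not Django's real ModelBase; the mixin classes, the AVAILABLE_MIXINS lookup table and the settings object are assumptions made up for the example, and a real implementation would presumably import mixins from dotted paths.

```python
from types import SimpleNamespace


class EmailLoginMixin:
    login_field = "email"


class PermissionsMixin:
    def has_perm(self, perm):
        return False


# Hypothetical lookup table from mixin name to class.
AVAILABLE_MIXINS = {
    "EmailLoginMixin": EmailLoginMixin,
    "PermissionsMixin": PermissionsMixin,
}

# Stand-in for django.conf.settings.
settings = SimpleNamespace(AUTH_USER_MODEL_MIXINS=("EmailLoginMixin",))


def load_available_model_mixins(names):
    return tuple(AVAILABLE_MIXINS[name] for name in names)


class ModelBase(type):
    def __new__(mcs, name, bases, attrs):
        # Prepend the configured mixins before the class object is built.
        mixins = attrs.pop("__mixins__", ())
        bases = load_available_model_mixins(mixins) + bases
        return super().__new__(mcs, name, bases, attrs)


class Model(metaclass=ModelBase):
    pass


class User(Model):
    __mixins__ = settings.AUTH_USER_MODEL_MIXINS
```

With the setting shown, User ends up inheriting from EmailLoginMixin but not from PermissionsMixin, even though its class statement names neither.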
Advantages
- Does not have the unpredictable problems of solutions 2/2a.
- Existing projects require no migration, the same as solutions 2/2a.
- Unlike solution 2a, no change is required for existing apps.
- Unrelated and third-party apps can indicate that they depend on various orthogonal mixins, the same as solution 2a.
Problems
- Built-in schema migration tools are required.
- Doesn't address the EmailField length problem. (Can this be solved by schema migration tools?)
- ModelForm must be more restrictive; otherwise, Django will suffer security issues similar to register_globals in PHP or mass assignment in Rails.
- ModelForm (and any other code that introspects auth.User) should be made lazy, or else circular dependencies can result. Introspecting auth.User and plugging into auth.User from the same app or an app loaded later is a potential circular dependency. (This is not a problem, unless you put mixins in models.py?)
Solution 2c: Generic swappable models
Follows the general direction of Solution 2a, but instead of a USER_MODEL setting that only solves the problem for auth.User, sets up the infrastructure for the general concept of swappable models.
Implementation
A model that wants to declare itself as swappable adds a new Meta option:
    class User(Model):
        ....

        class Meta:
            swappable = 'user'
Here, the name 'user' is an identifier that will be used to refer to this swappable model. By convention, it might be a good idea to namespace this tag (e.g., 'auth.user', instead of just 'user').
We then introduce a SWAPPABLE_MODELS setting that provides a way for users to specify which models will be overridden in this project:
SWAPPABLE_MODELS = { 'user': 'myapp.SuperDuperUser' }
This specifies that the 'user' model will be satisfied by SuperDuperUser in this project.
If a model identifier (i.e., 'user') is mentioned in SWAPPABLE_MODELS, then the original model (auth.User) isn't synchronised to the database, and isn't added to the App cache.
We then add a LazyForeignKey():
    class Comment(Model):
        user = LazyForeignKey('user')
        ....
that will resolve the ForeignKey to the currently defined swappable model.
If an app defines a model with a ForeignKey to a swappable model, a warning is raised (since the model could potentially be swapped out); if the model is currently defined as swapped out, then an error will be raised (since the ForeignKey will be pointing at the wrong table).
The app cache gains a registry of swappable models, and a get_swappable_model(identifier) entry point, so that users can easily retrieve the model currently being used as the swappable model for the various swappable endpoints.
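The registry and the warning/error behaviour for plain ForeignKeys might be sketched as follows, in plain Python. Only get_swappable_model and SWAPPABLE_MODELS are named in the proposal; register, check_foreign_key and the two module-level dicts are hypothetical helpers for this example.

```python
import warnings

# Stand-in for the SWAPPABLE_MODELS setting.
SWAPPABLE_MODELS = {"user": "myapp.SuperDuperUser"}

_DEFAULTS = {}  # identifier -> label of the model that declared Meta.swappable
_MODELS = {}    # label -> model class


def register(label, cls, swappable=None):
    _MODELS[label] = cls
    if swappable is not None:
        _DEFAULTS[swappable] = label


def get_swappable_model(identifier):
    """Return the model currently filling the given swappable slot."""
    label = SWAPPABLE_MODELS.get(identifier)
    if label is None:
        label = _DEFAULTS[identifier]
    return _MODELS[label]


def check_foreign_key(target_label):
    """Mimic the proposed check for a plain ForeignKey to a swappable model."""
    for identifier, default_label in _DEFAULTS.items():
        if target_label != default_label:
            continue
        if identifier in SWAPPABLE_MODELS:
            # The key would point at a table that is never synchronised.
            raise ValueError("%s has been swapped out" % target_label)
        warnings.warn("%s is swappable; consider LazyForeignKey" % target_label)


class User:
    pass


class SuperDuperUser:
    pass


register("auth.User", User, swappable="user")
register("myapp.SuperDuperUser", SuperDuperUser)
```

With the setting above, get_swappable_model('user') returns SuperDuperUser and a plain ForeignKey to auth.User is rejected; with an empty setting it returns User and the same key only triggers a warning.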
Advantages
As for Solution 2a, but:
- Solves the general problem of swappable models, so other apps with similar problems (e.g., comments) can use the same infrastructure instead of having to reinvent the wheel.
- Catches migration problems as warnings/errors if existing apps haven't been updated to point to the swappable model.
Problems
As for Solution 2a.
Solution 2d: MYAPP_USER_MODEL setting (a)
Similar to solution 2a, but instead of a single global USER_MODEL setting, each app has its own USER_MODEL setting.
Implementation
django.contrib.admin can introduce an ADMIN_USER_MODEL setting which defaults to None.
    user = models.ForeignKey(settings.ADMIN_USER_MODEL or settings.USER_MODEL)
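The fallback in that ForeignKey can be generalised to any app prefix. The helper below is a hypothetical sketch (the name user_model_label and the COMMENTS prefix are invented), using a simple namespace in place of Django's settings.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings; ADMIN_USER_MODEL defaults to None.
settings = SimpleNamespace(USER_MODEL="auth.User", ADMIN_USER_MODEL=None)


def user_model_label(app_prefix):
    """Return the user model label for an app, e.g. app_prefix='ADMIN'."""
    # A missing or None per-app setting falls back to the global USER_MODEL.
    override = getattr(settings, "%s_USER_MODEL" % app_prefix, None)
    return override or settings.USER_MODEL
```

An app whose setting is unset behaves exactly as in solution 2a; only apps with an explicit override see a different user model.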
Advantages
Same as solution 2a, but:
- Supports multiple user models.
Problems
Same as solution 2a.
Solution 2e: MYAPP_USER_MODEL setting (b)
A combination of solutions 2b and 2d. Every model is pluggable and each app has its own USER_MODEL setting.
Implementation
django.contrib.admin can introduce an ADMIN_USER_MODEL setting which defaults to "auth.user".
    user = models.ForeignKey(settings.ADMIN_USER_MODEL)
Advantages
- Supports multiple user models.
- Existing projects require no migration, the same as solutions 2/2a.
- Unlike solution 2a, no change is required for existing apps if they do not use a different user model.
- Unrelated and third-party apps can indicate that they depend on various orthogonal mixins, the same as solution 2a.
Problems
Same as solution 2b, and:
- As with Solution 2, prone to unpredictable problems if MYAPP_USER_MODEL is modified after the initial syncdb.
Solution 3: Leverage App Refactor
Use Arthur Koziel's App Refactor patch from GSoC 2010 as a way to define a configurable auth app.
Implementation
- Land the App Refactor patch. This introduces a number of benefits -- reliable hooks for app startup, configurable app labels, predictable module loading, amongst others -- but the one that matters for the purposes of auth.User is that it allows Apps to be treated as items that need to be configured as a runtime activity. In this case, we need to be able to specify, at a project level, which model is your "User" model in the auth app.
- Introduce the concept of a LazyForeignKey. LazyForeignKey is a normal foreign key, with all the usual foreign key behaviors; the only difference is that the model it links to isn't specified in the model -- it's a configuration item drawn from an application configuration. So, ForeignKey('auth.User') creates a foreign key to django.contrib.auth.User; LazyForeignKey('auth.User') asks the auth app for the model that is being used as the 'User' model, and creates a foreign key to that. This can be done by slotting into the existing model reference resolution code, which is something that the app refactor cleans up.
- Add a Meta option to models -- pluggable -- which controls whether the model can be replaced at runtime. If it can be, then the model may not be synchronised (so we don't get empty auth_user tables).
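To illustrate how a LazyForeignKey could defer its target to the app configuration, here is a plain-Python sketch. The AppConfig class, the APPS dict and the resolve method are assumptions for the example and do not reflect the actual App Refactor patch.

```python
class AppConfig:
    """Hypothetical per-app configuration object."""

    def __init__(self, label, models):
        self.label = label
        self.models = dict(models)  # role name -> model class


# Hypothetical project-level table of configured apps.
APPS = {}


class LazyForeignKey:
    """Holds an 'app.Role' reference and resolves it only when asked."""

    def __init__(self, reference):
        self.app_label, self.role = reference.split(".")

    def resolve(self):
        # Ask the configured app which model currently plays this role.
        return APPS[self.app_label].models[self.role]


class DefaultUser:
    pass


class EmailUser:
    pass


# The key can be declared before the auth app has been configured.
author = LazyForeignKey("auth.User")

# The project decides which model plays the 'User' role.
APPS["auth"] = AppConfig("auth", {"User": EmailUser})
```

The point of the sketch is ordering: the reference is created first and the target is chosen later, at the project level.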
Optionally:
- Introduce an AbstractUser, and other concrete User models, same as for Solution 2
Advantages
- Doesn't have the circular dependency between settings and User
- Solves the generic problem, not a specific auth problem. contrib.comments could be retrofitted to use this approach, and any other application could do the same.
Problems
- Is dependent on the App Refactor landing in trunk. This may not occur any time soon.
- No transparent update path -- requires that third party apps be updated to be "pluggable auth compatible". This means app authors need to convert all ForeignKey(User) into LazyForeignKey('auth.User'), and modify any usage of forms etc. This could be considered a benefit, however; migrating User models is a nontrivial step, and it should probably involve some opt-in engineering.
- Doesn't address the immediate problem for EmailField. We could do the same User/SimpleUser conversion here; with the added benefit that we are also introducing App Refactor, so we can use the distinction between an "unconfigured" auth app and a "Django 1.5 App Refactor Configured" auth app as the point for identifying whether User or SimpleUser is in use.
- As with Solution 2, prone to unpredictable problems if the app configuration is modified after the initial syncdb.
Solution 3a: Transparent LazyForeignKeys
As for Solution 3, but don't include an explicit LazyForeignKey class. Instead, use the "pluggable" marker in the Meta class; if you define a ForeignKey to a 'pluggable' model, you're indicating that this foreign key reference might be changed later on.
Advantages
- As for Solution 3, but provides a transparent migration path for apps -- no need to manually change ForeignKey(User). However, depending on your perspective, this might not actually be an advantage at all, because it removes the opt-in migration path.
Problems
- As for Solution 3.
Solution 4: contrib.newauth
- Deprecate contrib.auth, and create a new contrib.newauth in the same way we deprecated forms into newforms.
Ian Lewis has a project that starts down this path (See https://bitbucket.org/IanLewis/django-newauth/); however:
- it doesn't yet handle permissions, so it can't be used for contrib.admin
- It exhibits the settings-models dependency problem
- It introduces a new feature -- the ability to have *multiple* user models -- that isn't an obviously required improvement. Additional discussion is required before being adopted.
Advantages
- Gives us a clean slate to examine authentication and authorisation issues.
Problems
- There's no pret-a-porter project ready as a candidate. Rebuilding contrib.auth is a big undertaking, and doesn't yet have a clear design (or even design requirements).
Solution 5: Profile-based single user model
After reviewing proposals 1-4 on this list, Jacob indicated that he was uncomfortable with the idea of 'swappable' models (driven at least in part by lessons learned from Django's own history), and so he came up with an alternate proposal. This proposal suggests migrating towards a single, absolutely minimalist User model, with Profile objects being used to store any additional information. The full proposal is laid out in detail here https://gist.github.com/2245327 , but as a brief summary:
The user model becomes something like:
    class User(models.Model):
        identifier = models.CharField(unique=True, db_index=True)
        password = models.CharField()
identifier is included because it's required by 99% of User models; password is included as an active discouragement to people reimplementing (badly) their own password authentication schemes.
Other user information -- name, staff/admin flags, permissions and so on -- is stored on profile models that are linked 1-1 with the User model. In order to make it easier to transition to the new user model, and to avoid coupling to specific profile models, the User model may also implement a delegate pattern for other user attributes. For example, the 'display name' for a user won't be defined by the User object itself; it will be serviced by one of the profile objects associated with the User, but you'll be able to request user.display_name to retrieve the name. When multiple profiles specify the same attribute, an AUTH_PROFILES setting will set the precedence order.
Along the way, we'll also deprecate AUTH_PROFILE_MODULE and user.get_profile(), reflecting the fact that there isn't a single "profile" anymore.
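The delegate pattern can be sketched in plain Python with __getattr__. The profile classes and the way profiles are attached to the user are assumptions for the example; only the AUTH_PROFILES precedence idea and the user.display_name usage come from the proposal.

```python
# Stand-in for the AUTH_PROFILES setting: earlier entries take precedence.
AUTH_PROFILES = ["staff", "public"]


class User:
    def __init__(self, identifier, profiles):
        self.identifier = identifier
        self._profiles = profiles  # profile name -> profile object

    def __getattr__(self, name):
        # Only called when normal attribute lookup on User fails.
        if name.startswith("_"):
            raise AttributeError(name)
        for profile_name in AUTH_PROFILES:
            profile = self._profiles.get(profile_name)
            if profile is not None and hasattr(profile, name):
                return getattr(profile, name)
        raise AttributeError(name)


class PublicProfile:
    display_name = "jane"
    bio = "Hello"


class StaffProfile:
    display_name = "Jane Doe (staff)"


user = User(
    "jane@example.com",
    {"public": PublicProfile(), "staff": StaffProfile()},
)
```

Here user.display_name is answered by the staff profile because it is listed first, while user.bio falls through to the public profile, the only one that defines it.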
Advantages
- There is a single canonical user model. There's no need for a swappable anything, or Foreign keys that point to dynamic models. This means the properties of the User object are reliable and consistent between projects. Inconsistencies only arise between uses of Profile objects.
- It makes almost no judgements about what a User should have on it. All such decisions are made by the developer by implementing an appropriate Profile object.
Problems
- There is a *huge* migration task, since all existing User tables need to be migrated to the new profile-based format. However, we would be moving from one known format (old auth.User) to another well known format (new auth.User plus admin.UserProfile), so the migration script will be reasonably predictable. When combined with delegated attributes, it should be possible to have a migration path that has all the usual warnings etc.
- It requires joins to get to anything other than the core User data. There is some argument as to whether this is primarily a technical limitation or a social one. Mailing list discussions have indicated a preference by some people for a monolithic User object, avoiding the need for joins when retrieving user data. Although technical reasons are usually given for this preference (usually, "joins are evil"), it isn't clear that this is a problem in practice -- at least to the extent that we should be basing the entire design around accommodating sites that are actually experiencing those problems.
- If a new profile model is added to an existing project, new profile objects need to be instantiated for any existing User object. This is the analog of the problem in the "swappable-user" case; if the User model is modified, there is a migration task to make sure that the database User matches the code definition.
- There are still some issues to resolve around the handling of the delegate attributes:
- Should profiles be auto-select-related?
- Should profiles be auto-created if they are missing? If a profile has a required field and it is auto-created, how will that field be filled in?
- Form handling for the new User object -- especially ModelForm handling -- hasn't been fully elaborated.
Universal problems
Regardless of the final solution that is chosen, the decision to move to a pluggable User model will introduce some challenges.
The User Contract
In order for code to actually be able to adapt to swappable User models, there needs to be some common ground on what a "User" object can actually do. Django is in a position to enforce this by convention -- contrib.admin is a good baseline case.
Consensus seems to be that the 'minimal contract' for User needs to be little more than that required for basic identification -- you need to be able to authenticate using arbitrary credentials, and be able to service a request to print "Hello <user>".
Beyond this base contract, we're in true duck-typing territory. If you have an app that expects to find an 'is_admin' attribute, and the provided User model doesn't have one, then your app won't work with that User model (and the failure mode won't be predictable).
This means that the onus will be on the developer using a pluggable User model to test that their User model will work as expected. This can be helped by app developers:
- Clearly documenting the app's User model requirements.
- Including test suites in their app that explicitly test their User model requirements.
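An app-supplied contract test along these lines might look like the sketch below. The attribute list, the ProjectUser stand-in and the helper name are all hypothetical; in a real project the model under test would be whatever User model the project has configured.

```python
import unittest

# Attributes this hypothetical app needs on whatever User model is in use.
REQUIRED_USER_ATTRIBUTES = ("identifier", "is_admin")


class ProjectUser:
    """Stand-in for the project's configured User model."""
    identifier = "jane"
    is_admin = False


def missing_user_attributes(user_model, required=REQUIRED_USER_ATTRIBUTES):
    """List the required attributes that the given model lacks."""
    return [name for name in required if not hasattr(user_model, name)]


class UserContractTests(unittest.TestCase):
    user_model = ProjectUser

    def test_user_model_satisfies_app_requirements(self):
        self.assertEqual(missing_user_attributes(self.user_model), [])
```

Shipping a test like this turns an unpredictable runtime failure into a named test failure that tells the project developer which attribute is missing.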
Separation of authentication from authorisation
contrib.admin requires a permissions API on User. However, this API isn't an obvious thing to require for *all* User objects. There has been very little discussion about the possibility of an abstract API for permissions.
The best approach here may be to implement contrib.admin's permissions API as a mixin, separate from the AbstractUser base class. This would allow someone who doesn't want admin's permission model to avoid it, while making it easy to include all of Django's permission requirements if the developer wants to use contrib.admin.
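A simplified sketch of that separation, in plain Python rather than real Django models: the class bodies are assumptions, and has_perm is shown only as a representative method of the permissions API, not as its full surface.

```python
class AbstractUser:
    """Authentication only: who is this, and how do we greet them?"""

    def __init__(self, identifier):
        self.identifier = identifier

    def __str__(self):
        return self.identifier


class PermissionsMixin:
    """Authorisation API, heavily simplified for this sketch."""

    permissions = frozenset()

    def has_perm(self, perm):
        return perm in self.permissions


class SiteUser(AbstractUser):
    """A user model for a project that does not use the admin."""


class AdminUser(PermissionsMixin, AbstractUser):
    """A user model that opts in to the permissions API."""
```

SiteUser carries no permissions API at all, while AdminUser gains it purely by adding the mixin to its bases.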
Forms
If an app creates a form (or ModelForm) on User, the exact contents of that form will be unpredictable. It will be extremely easy to define a form in an app that has clean methods or widget overrides that reference fields that don't exist on the User model in use.
It may be necessary to prevent ModelForm from being used on User (or any other pluggable model in the App Refactor case), and encourage app developers to make any Forms that they have configurable. This is analogous to Django's existing configurability for LoginForm et al.
Inheritance
If the User model can change, then models inheriting from the User model need to be able to adapt to a variable base class.
As a first iteration, it may be necessary to prohibit inheritance from a User model that is marked as pluggable.
Parallel concerns
None of these fully address the limitations with EmailField -- that the default max_length of 75 is too short to hold all email addresses. As a separate concern, we could address this problem in the same way that Solution 1 proposes to fix the User model, but focussed on the EmailField's max_length argument specifically:
- Introduce a new ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES setting. This would be set as False by default in global_settings.py, and True by default in the project template. This means existing projects get a value of False, but can opt-in to True at their convenience; new projects get a value of True.
- Modify the EmailField definition to use this setting to determine the max_length.
- Introduce a "MigrationWarning" that will be raised if ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES is False (or max_length isn't manually specified). Assuming this is introduced in 1.5, Django 1.5 users would get the warning if they have ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES set to False. Django 1.6 would raise an error if ALLOW_RFC_COMPLIANT_EMAIL_ADDRESSES is False. Django 1.7 would remove all references to the setting.
This would address the problem for all EmailFields, not just the one in auth.User. This fix could be used in conjunction with any of the solutions proposed here; in fact, it would make some of them simpler (since there wouldn't be a need for a SimpleUser to migrate to an improved email field length).
Recommendations
Discussion on django-developers ran long, and revealed that complete consensus is unlikely:
- https://groups.google.com/d/topic/django-developers/ba21QMpffZs/discussion
- https://groups.google.com/d/topic/django-developers/Na0AmIGSGQA/discussion
- https://groups.google.com/d/topic/django-developers/PLTW8Mon9QU/discussion
A BDFL decision was called for; Option 2a was selected.
Implementation
A draft branch is available