Opened 3 years ago

Last modified 6 weeks ago

#21837 new Bug

auth.User Email - non-RFC spec case normalization

Reported by: ross@…
Owned by:
Component: contrib.auth
Version: 1.6
Severity: Normal
Keywords: authentication, email, filter, get, error nlsprint14
Cc: FunkyBob, eromijn@…
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Anton Samarchyan)

When a user signs up with

auth.User.objects.normalize_email() saves the email as to conform with RFC.

But future lookups will return None due to != where the user continually enters because that's what is in their muscle/Chrome memory.

  1. normalize_email is applied at the ORM level, but should (also?) be applied at the Field level to help with this problem.
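The mismatch described here can be sketched in plain Python, without Django. The `normalize_email` function below is a stand-in that only mirrors the behaviour discussed in this ticket (lowercasing the domain part), and the address is hypothetical, not the one from the original report.

```python
def normalize_email(email):
    """Stand-in for the manager method: lowercase only the domain part."""
    email = (email or "").strip()
    try:
        local_part, domain = email.rsplit("@", 1)
    except ValueError:
        # No "@" present: return the input unchanged.
        return email
    return local_part + "@" + domain.lower()


# Hypothetical sign-up input (an assumption for illustration).
typed = "Alice@Example.COM"
stored = normalize_email(typed)

# An exact, case-sensitive comparison against the raw input no longer matches.
exact_match = (typed == stored)

# Normalizing the input before comparing restores the match.
normalized_match = (normalize_email(typed) == stored)
```

If the same normalization ran wherever the value is looked up, not only where the user is created, the two sides would agree again. That is the gist of the suggestion to apply it at the field level.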

Change History (6)

comment:1 Changed 3 years ago by Erik Romijn

Cc: eromijn@… added
Keywords: nlsprint14 added
Owner: changed from nobody to Erik Romijn
Status: new → assigned

comment:2 Changed 3 years ago by Aymeric Augustin

Triage Stage: Unreviewed → Accepted

Erik, did you have a plan for this?

I'm wondering why we bother with normalize_email. Can't we simply drop it?

comment:3 Changed 3 years ago by Erik Romijn

Owner: Erik Romijn deleted
Status: assigned → new

Well, in the meantime this has been discussed on the side in!msg/django-developers/7feYlp9HqKs - there are some RFC references in there too.

Technically, the correct behaviour is to keep the case of the user part intact, and ignore case in the domain part. This is what normalize_email helps with, by explicitly lowercasing the domain part. However, this is not consistently applied in e.g. UserCreationForm. With custom user models, the default behaviour for the username field (which could be an email address, but may not be) is to match case-sensitively, without first applying normalize_email.

There are two approaches:

  • The technically correct: when doing lookups on email addresses, either in Django or in third-party apps, the value should always first be passed through normalize_email. The subsequent query should then be case-sensitive. A custom user model that uses email as username could override get_by_natural_key to include normalizing. The downside is that if users accidentally enter an uppercase character in the user part, due to helpful auto-capitalisation for example, they have to use that on future entries too.
  • The more pragmatic: always match case-insensitively on email addresses. This still needs a custom get_by_natural_key for custom user models, unless we make all username fields of any kind case-insensitive by default. This is the current choice in the password reset view in django.contrib.auth. This means we can drop normalize_email. This has a very minor backwards compatibility issue if someone has a database with users whose email addresses differ only by case.
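
As a rough illustration of how the two approaches differ, here is a plain-Python sketch over an in-memory list. The helper names and addresses are hypothetical, `normalize_email` is a stand-in that lowercases only the domain part, and the case-insensitive variant is only roughly comparable to an `iexact` lookup in the ORM.

```python
def normalize_email(email):
    """Stand-in: lowercase only the domain part of the address."""
    email = (email or "").strip()
    try:
        local_part, domain = email.rsplit("@", 1)
    except ValueError:
        return email
    return local_part + "@" + domain.lower()


# Hypothetical stored address, already normalized at sign-up.
USERS = ["Alice@example.com"]


def find_normalized(email, users=USERS):
    """Approach 1: normalize the input, then match case-sensitively."""
    wanted = normalize_email(email)
    return [u for u in users if u == wanted]


def find_case_insensitive(email, users=USERS):
    """Approach 2: ignore case in the whole address."""
    wanted = email.strip().casefold()
    return [u for u in users if u.casefold() == wanted]


# Approach 1 tolerates case differences in the domain part only.
domain_case = find_normalized("Alice@EXAMPLE.com")
user_case = find_normalized("alice@example.com")

# Approach 2 tolerates case differences anywhere in the address.
any_case = find_case_insensitive("alice@EXAMPLE.com")

# The backwards compatibility issue: two rows that differ only by case
# both match under approach 2.
clash = find_case_insensitive("bob@example.com",
                              ["Bob@example.com", "bob@example.com"])
```

With approach 1, `user_case` comes back empty, which is the auto-capitalisation downside noted in the first bullet. With approach 2, `clash` contains both rows, which is the compatibility issue noted in the second.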

I do think our current inconsistencies should be resolved, but I have no strong preference for either approach. I'm a bit more inclined to option 2.

comment:4 Changed 3 years ago by Marc Tamlyn

As an aside, we have to be very careful about encouraging case-insensitive lookups because of index usage. On PostgreSQL you can make an index for this, but not in Django (for now). I'm unsure about other databases.
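
For illustration only, the index concern can be sketched with the standard-library `sqlite3` module, since SQLite (3.9 or later) accepts indexes on expressions. This is a swapped-in database, not PostgreSQL, and the table, index, and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

# An index on the lowercased value, so that a case-insensitive comparison
# written with the same expression has an index available to it.
conn.execute("CREATE INDEX users_email_lower_idx ON users (lower(email))")

# Hypothetical row.
conn.execute("INSERT INTO users (email) VALUES (?)", ("Alice@example.com",))


def find_user_ids(email):
    """Case-insensitive lookup that compares lowercased values."""
    rows = conn.execute(
        "SELECT id FROM users WHERE lower(email) = lower(?)", (email,)
    ).fetchall()
    return [row[0] for row in rows]


# The query plan shows whether the expression index is picked up; the exact
# wording of the plan differs between SQLite versions.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE lower(email) = lower(?)",
    ("placeholder@example.com",),
).fetchall()
```

A plain index on the `email` column alone would not normally serve the `lower(email)` comparison, which is why a dedicated index is needed.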

comment:5 Changed 2 months ago by Collin Anderson

Summary: auth.User Email - non-RFC spec normalization → auth.User Email - non-RFC spec case normalization

comment:6 Changed 6 weeks ago by Anton Samarchyan

Description: modified (diff)