Opened 16 months ago

Last modified 14 months ago

#21837 new Bug

auth.User Email - non-RFC spec normalization

Reported by: ross@… Owned by:
Component: contrib.auth Version: 1.6
Severity: Normal Keywords: authentication, email, filter, get, error nlsprint14
Cc: FunkyBob, eromijn@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When a user signs up with Monkey@…

auth.User.objects.normalize_email() saves the email as Monkey@… to conform with the RFC.

But future lookups will return None because BadDomain.com != baddomain.com: the user keeps entering Monkey@… because that's what is in their muscle/Chrome memory.

  1. normalize_email is applied at the ORM level, but should (also?) be applied at the Field level to help with this problem.

Change History (4)

comment:1 Changed 15 months ago by erikr

  • Cc eromijn@… added
  • Keywords nlsprint14 added
  • Needs documentation unset
  • Needs tests unset
  • Owner changed from nobody to erikr
  • Patch needs improvement unset
  • Status changed from new to assigned

comment:2 Changed 14 months ago by aaugustin

  • Triage Stage changed from Unreviewed to Accepted

Erik, did you have a plan for this?

I'm wondering why we bother with normalize_email. Can't we simply drop it?

comment:3 Changed 14 months ago by erikr

  • Owner erikr deleted
  • Status changed from assigned to new

Well, in the meantime this has been discussed on the side in https://groups.google.com/forum/#!msg/django-developers/7feYlp9HqKs - there are some RFC references in there too.

Technically, the correct behaviour is to keep the case of the user part intact, and ignore case in the domain part. This is what normalize_email helps with, by explicitly lowercasing the domain part. However, this is not consistently applied, e.g. in UserCreationForm. With custom user models, the default behaviour for the username field (which could be an email address, but may not be) is to match case-sensitively, without first applying normalize_email.
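To make the domain-only lowercasing concrete, here is a minimal standalone sketch of that behaviour. It is not Django's own code: the function name normalize_email_sketch and the address Alice@Example.COM are made up for illustration.

```python
def normalize_email_sketch(email):
    """Lowercase only the domain part; leave the user (local) part untouched.

    Hypothetical stand-in for the behaviour described above, not Django's code.
    """
    email = (email or "").strip()
    # Split on the last "@" so that an "@" inside the local part is preserved.
    local, sep, domain = email.rpartition("@")
    if not sep:
        # No "@" at all: nothing to normalize.
        return email
    return local + "@" + domain.lower()


entered = "Alice@Example.COM"               # hypothetical sign-up input
stored = normalize_email_sketch(entered)    # "Alice@example.com"

# A case-sensitive comparison against the raw input no longer matches,
# which is the failed lookup described in the ticket.
print(stored == entered)                           # False
print(stored == normalize_email_sketch(entered))   # True
```

The second comparison succeeds only because the lookup value went through the same normalization as the stored value.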

There are two approaches:

  • The technically correct: when doing lookups on email addresses, either in Django or in third-party apps, the value should always first be passed through normalize_email. The subsequent query should then be case-sensitive. If a custom user model uses email as the username, it could override get_by_natural_key to include normalization. The downside is that if users accidentally enter an uppercase character in the user part, due to helpful auto-capitalisation for example, they have to use it on future entries too.
  • The more pragmatic: always match case-insensitively on email addresses. This still needs a custom get_by_natural_key for custom user models, unless we make all username fields of any kind case-insensitive by default. This is the current choice in the password reset view in django.contrib.auth. This means we can drop normalize_email. This has a very minor backwards-compatibility issue if someone has a database with users whose email addresses differ only by case.
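The difference between the two approaches can be sketched without a database. This is only an illustration under stated assumptions: STORED stands in for the user table, and the helper names and addresses are hypothetical. In the ORM, the second approach would roughly correspond to an iexact lookup on the email field.

```python
def normalize_domain(email):
    """Lowercase the domain part only (hypothetical helper)."""
    email = email.strip()
    local, sep, domain = email.rpartition("@")
    if not sep:
        return email
    return local + "@" + domain.lower()


# Stand-in for the user table; the value was normalized at sign-up.
STORED = ["Alice@example.com"]


def lookup_normalized(entered):
    """Approach 1: normalize the input, then compare case-sensitively."""
    wanted = normalize_domain(entered)
    return [e for e in STORED if e == wanted]


def lookup_case_insensitive(entered):
    """Approach 2: compare the whole address case-insensitively."""
    wanted = entered.strip().casefold()
    return [e for e in STORED if e.casefold() == wanted]


# Domain case is forgiven by both approaches.
print(lookup_normalized("Alice@EXAMPLE.com"))        # ['Alice@example.com']
print(lookup_case_insensitive("Alice@EXAMPLE.com"))  # ['Alice@example.com']

# Case in the user part is only forgiven by approach 2.
print(lookup_normalized("alice@example.com"))        # []
print(lookup_case_insensitive("alice@example.com"))  # ['Alice@example.com']
```

The last pair shows the trade-off from the list above: approach 1 stays strict about the user part, while approach 2 would treat two stored addresses differing only by case as the same user.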

I do think our current inconsistencies should be resolved, but I have no strong preference for either approach. I'm a bit more inclined towards option 2.

comment:4 Changed 14 months ago by mjtamlyn

As an aside, we have to be very careful about encouraging case-insensitive lookups because of index usage. On PostgreSQL you can create an index for this, but not through Django (for now). I'm unsure about other databases.
