Opened 7 years ago

Last modified 6 weeks ago

#27029 assigned Cleanup/optimization

Make EmailValidator accept non-ASCII characters

Reported by: Ramin Farajpour Cami
Owned by: j-bernard
Component: Core (Other)
Version: dev
Severity: Normal
Keywords:
Cc: Florian Apolloner
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

from django.core.validators import validate_email
validate_email('うえあいお@email.com')

If you check うえあいお@address.com with the email checker at this URL, it is reported as not a valid email address.

Thanks,
Ramin

Change History (34)

comment:1 Changed 7 years ago by Claude Paroz

Resolution: duplicate
Status: new → closed

Sure thing!
Duplicate of #26423

comment:2 Changed 7 years ago by Claude Paroz

Has patch: set
Resolution: duplicate
Status: closed → new
Triage Stage: Unreviewed → Accepted
Version: 1.10 → master

Reopening as I have a patch which targets this specific issue.

comment:3 in reply to:  2 Changed 7 years ago by Ramin Farajpour Cami

Replying to claudep:

Reopening as I have a patch which targets this specific issue.

Hi,

Thanks a lot,

comment:4 Changed 7 years ago by Tim Graham

Triage Stage: Accepted → Ready for checkin

comment:5 Changed 7 years ago by Claude Paroz

I'm just wondering, is there still a use case to keep ASCII-only validation (and hence provide validate_email_ascii)?

comment:6 Changed 7 years ago by Tim Graham

Not sure, maybe you want to ask on the DevelopersMailingList. I guess usage might be a bit difficult until #25594 is fixed.

comment:7 Changed 7 years ago by Claude Paroz

Triage Stage: Ready for checkin → Accepted

I just tested Firefox and Chrome email validation, and they don't accept non-ASCII in the local part.

In any case, I think we should provide both validators (ASCII-only and Unicode). It might be a bit too soon to unconditionally allow Unicode in emails.
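
As a rough illustration of the "both validators" idea, here is a minimal sketch using only the standard library. The function name `is_valid_email`, the `allow_unicode` flag, and the simplified local-part patterns are assumptions made for this example; they are not Django's API, and Django's actual pattern also handles quoted local parts. The domain is deliberately left unchecked.

```python
import re

# Punctuation allowed in an unquoted local part (simplified for illustration).
_SPECIALS = re.escape("!#$%&'*+/=?^_`{}|~-")

# ASCII-only variant: letters and digits are restricted to [0-9A-Za-z].
_ASCII_LOCAL = re.compile(
    rf"[0-9A-Za-z{_SPECIALS}]+(?:\.[0-9A-Za-z{_SPECIALS}]+)*"
)
# Unicode variant: \w matches Unicode word characters for str patterns.
_UNICODE_LOCAL = re.compile(rf"[\w{_SPECIALS}]+(?:\.[\w{_SPECIALS}]+)*")


def is_valid_email(value: str, allow_unicode: bool = False) -> bool:
    """Hypothetical helper: checks only the local part of the address."""
    local, sep, domain = value.rpartition("@")
    if not sep or not local or not domain:
        return False
    pattern = _UNICODE_LOCAL if allow_unicode else _ASCII_LOCAL
    return pattern.fullmatch(local) is not None


print(is_valid_email("うえあいお@email.com"))                      # False
print(is_valid_email("うえあいお@email.com", allow_unicode=True))  # True
```

Keeping the ASCII-only behaviour as the default and making Unicode an explicit choice matches the caution expressed above.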

comment:8 Changed 7 years ago by Claude Paroz

Patch needs improvement: set

comment:9 Changed 7 years ago by Tim Graham

Summary: invalid email addresses input django validate_email → Make EmailVadliator accept non-ASCII characters
Type: Bug → Cleanup/optimization

See #21859 for a documentation request to clarify the state of ASCII/Unicode email addresses in Django.

comment:10 Changed 7 years ago by Tim Graham

Summary: Make EmailVadliator accept non-ASCII characters → Make EmailValidator accept non-ASCII characters

comment:11 Changed 7 years ago by Ramin Farajpour Cami

Hi Tim,

Please merge the PR.
Thanks

comment:12 Changed 7 years ago by Claude Paroz

Hey RaminFP,
I plan to come up with a new patch, with two versions of the email validator. I don't think that non-ASCII local parts of email addresses are widespread enough to set it by default. The idea is that you could easily opt in to the Unicode validator of your choice when you define the field in your models.

comment:13 in reply to:  12 Changed 7 years ago by Ramin Farajpour Cami

Replying to claudep:

Hey RaminFP,
I plan to come up with a new patch, with two versions of the email validator. I don't think that non-ASCII local parts of email addresses are widespread enough to set it by default. The idea is that you could easily opt in to the Unicode validator of your choice when you define the field in your models.

Will this be fixed in the new version, 1.11? Can I make suggestions for the fix?

comment:14 Changed 7 years ago by Claude Paroz

Yes, the plan is clearly to include that in 1.11. We still have some months ahead :-)

comment:15 in reply to:  14 Changed 7 years ago by Ramin Farajpour Cami

Replying to claudep:

Yes, the plan is clearly to include that in 1.11. We still have some months ahead :-)

Awesome. I have one question: you have a CONTRIBUTORS LIST. Can I be added to it, or should I send many more reports before I am added to the list of contributors?

comment:16 Changed 7 years ago by Claude Paroz

Yes, you are supposed to have done significant work for Django to be listed there. Of course, that's very subjective, but filing a couple of reports isn't sufficient for that.

comment:17 Changed 7 years ago by Ramin Farajpour Cami

Very good. I always like working with the Django community on security and looking for code issues to fix. I see you wrote the PR for this report, so I can't add my name to the CONTRIBUTORS LIST. :((

Thanks,

comment:18 Changed 7 years ago by Ramin Farajpour Cami

Hi,
I see the way to contribute described here, in Contributing to Django:
https://docs.djangoproject.com/en/dev/internals/contributing/
You wrote the PR that patches the issue. Does this mean my name will not be added?

comment:19 Changed 7 years ago by Tim Graham

Yes, we add to AUTHORS based on code contributions not bug reports.

comment:20 Changed 6 years ago by Wout De Puysseleir

Owner: changed from Ramin Farajpour Cami to Wout De Puysseleir
Patch needs improvement: unset
Status: new → assigned

PR

I've added a new patch for this.

comment:21 Changed 6 years ago by Florian Apolloner

Patch needs improvement: set

I am against this patch, adding more regular expressions is the wrong way to go. I'd like to propose to change the current email validator to just check if "@" is in the address and be done with it. See also https://davidcel.is/posts/stop-validating-email-addresses-with-regex/ -- I think this is something which should have a bit of discussion on the mailing list.
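
For comparison, a check along the lines proposed here could be as small as the sketch below. The function name `looks_like_email` is hypothetical, and requiring a non-empty part on each side of the "@" is an assumption added for this example; the proposal itself only asks that "@" be present.

```python
def looks_like_email(value: str) -> bool:
    """Return True if value has something on each side of its last '@'."""
    local, sep, domain = value.rpartition("@")
    return bool(local and sep and domain)


print(looks_like_email("うえあいお@email.com"))  # True
print(looks_like_email("not-an-address"))        # False
```

A check this loose accepts many strings that are not deliverable addresses; the linked article's argument is that only sending a confirmation email can establish that anyway.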

comment:22 Changed 6 years ago by Florian Apolloner

Cc: Florian Apolloner added

comment:23 Changed 6 years ago by Tim Graham

Ideas about simplification are discussed in #26423 and on the django-developers mailing list.

comment:24 Changed 5 years ago by Collin Anderson

If we do allow non-ASCII, I wonder if we should make sure the email is "printable" (not allowing hidden characters like '\u200b'): https://docs.python.org/3/library/stdtypes.html#str.isprintable
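
A small sketch of what such a check might look like with `str.isprintable()`. The helper name `has_hidden_characters` is made up for this example and is not part of Django.

```python
def has_hidden_characters(address: str) -> bool:
    """Return True if the address contains any non-printable character."""
    return not address.isprintable()


# U+200B (ZERO WIDTH SPACE) is a format character, so it is not printable.
print(has_hidden_characters("user\u200b@example.com"))  # True
# Ordinary non-ASCII letters are printable.
print(has_hidden_characters("うえあいお@email.com"))      # False
```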

comment:25 Changed 13 months ago by Mariusz Felisiak

Owner: Wout De Puysseleir deleted
Status: assigned → new

comment:26 Changed 7 months ago by j-bernard

Commenting here since #33967 has been closed as a duplicate.

Unicode in local-part is allowed by the latest standards, therefore EmailValidator is preventing valid email addresses from being used in Django. Making the current regex allow Unicode characters instead of [0-9A-Z] would do the trick.

#26423 won't solve this as HTML5 validator does not allow Unicode in local-part either.

Last edited 7 months ago by j-bernard

comment:27 Changed 6 months ago by j-bernard

I submitted this PR. I made it change as little as possible to at least get Unicode local-part valid.

comment:28 Changed 3 months ago by Jacob Walls

Patch needs improvement: unset

Improvement flag was set on prior PR proposing additional regular expressions. Current PR simplifies the existing one per comment.

comment:29 Changed 3 months ago by Mariusz Felisiak

Owner: set to j-bernard
Status: new → assigned

comment:30 Changed 3 months ago by Carlton Gibson

The new PR seems OK™ — for strings `\w` is equivalent to [a-zA-Z0-9_] with ASCII, and the unicode examples then pass.
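
The `\w` behaviour referred to here can be checked directly with the standard `re` module. This is only a demonstration of the flag's effect, not the pattern used in the PR.

```python
import re

unicode_word = re.compile(r"\w+")          # str pattern: Unicode-aware by default
ascii_word = re.compile(r"\w+", re.ASCII)  # restricted to [a-zA-Z0-9_]

print(unicode_word.fullmatch("うえあいお") is not None)  # True
print(ascii_word.fullmatch("うえあいお") is not None)    # False
print(ascii_word.fullmatch("abc_123") is not None)      # True
```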

I worry slightly about bringing in a host of lookalike address vulnerabilities. 🤔
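
To make the lookalike concern concrete, the sketch below builds two addresses that render almost identically but differ in one code point. The addresses are invented for illustration.

```python
import re
import unicodedata

latin = "paypal@example.com"
# The second letter is U+0430 (Cyrillic), which looks like the Latin "a".
mixed = "p\u0430ypal@example.com"

print(latin == mixed)               # False
print(unicodedata.name(mixed[1]))   # CYRILLIC SMALL LETTER A
# NFKC normalization does not fold the Cyrillic letter into the Latin one.
print(unicodedata.normalize("NFKC", mixed) == mixed)              # True
# A Unicode-aware \w accepts the mixed-script local part.
print(re.fullmatch(r"\w+", mixed.split("@")[0]) is not None)      # True
```

Both strings would pass a Unicode-aware local-part check, so telling them apart would need an additional policy such as rejecting mixed scripts.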

I think this needs a discussion to decide the way forward.

  1. I'm not convinced this is really a distinct issue to #26423.
  2. The mailing list discussion was essentially unanimous to radically simplify here (rather than continue to tweak).

Florian's comment:21 more or less sums it up:

...propose to change the current email validator to just check if "@" is in the address and be done with it.

We've said similar with URLValidator a number of times.

I'm not sure we shouldn't (again) mark this as a duplicate of #26423, re-purpose that to simplify the validation, make sure How to customise validation shows the way forward clearly, and then close everything else in this area as wontfix. 🤔

Last edited 3 months ago by Carlton Gibson

comment:31 Changed 3 months ago by Claude Paroz

Different projects have different requirements. What about providing different validators: a simple one where only <somechar>@<somechar> presence is checked, a more elaborate one like the current one, and an equivalent to the previous allowing unicode. The question then is to decide which would be the default.

comment:32 Changed 3 months ago by Carlton Gibson

I think that sounds quite reasonable Claude.

comment:33 Changed 3 months ago by j-bernard

Here is a little context for my use case. In general, I create my own validator whenever it's needed to override the default Django behavior but I have a particular case where django-allauth app is used and is using the EmailValidator. In that case, I cannot easily override it.
My suggestion would then be to make the default validator more permissive to get some flexibility in the kind of use case that I have. One can still include another validation layer on top of that.

If you don't mind implementing a more complex, specific validator for internationalized email addresses, it would be better to avoid using only a regex. I kept it as simple as I could in my PR because I'm aware that changing the validator is touchy.

comment:34 Changed 6 weeks ago by Carlton Gibson

Patch needs improvement: set

My suggestion would then be to make the default validator more permissive to get some flexibility in the kind of use case that I have. One can still include another validation layer on top of that.

I don't think we can just swap out the current validation for a looser one. Folks will be depending on the existing behaviour.

Maybe we can ship a couple of variants, but I'm not sure what switching method we might allow. I think we need a story there in order to proceed. 🤔

I looked at django-allauth — it's using the validate_email instance, in various, deeply-nested places — I think an issue over there, to look at making that pluggable, is needed really. (In the meantime, one could monkey patch validate_email with whatever validator you wanted to adjust that… — again, not something I think we can just swap out from beneath it.)
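
The monkey-patching route mentioned above relies on every importer sharing one validator object. The sketch below shows that mechanism with a stand-in class, so it runs without Django. `StandInEmailValidator` and its simplified patterns are assumptions for illustration only; the attribute name `user_regex` mirrors the one on Django's `EmailValidator`, but nothing here is Django's or django-allauth's actual code.

```python
import re


class StandInEmailValidator:
    """Stand-in for illustration only; not Django's EmailValidator."""

    # Simplified ASCII-only local-part pattern.
    user_regex = re.compile(r"[0-9A-Za-z.+_-]+\Z")

    def __call__(self, value):
        local, sep, _domain = value.rpartition("@")
        if not sep or not self.user_regex.match(local):
            raise ValueError("Enter a valid email address.")


# One module-level instance, shared by everything that imports it.
validate_email = StandInEmailValidator()

# A third-party module that imported the name holds the same object.
third_party_reference = validate_email

# Setting an attribute on the shared instance is visible through every
# reference; rebinding the name `validate_email` would not be.
validate_email.user_regex = re.compile(r"[\w.+-]+\Z")

third_party_reference("うえあいお@email.com")  # no longer raises
```

Such a patch changes validation for every caller in the process, which is one reason a pluggable validator in the third-party app is the cleaner fix.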
