Validation & Unicode Character 'ZERO WIDTH SPACE' (U+200B)
Reported by: Raymond Penners
Owned by: nobody
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Once in a while, users manage to enter e-mail addresses containing Unicode zero width space characters (U+200B) into the system. I am not sure how they do it; it probably happens when copying and pasting from a document of some sort. Nevertheless, form validation does not reject such e-mail addresses:
>>> from django.core.validators import validate_email
>>> email = 'firstname.lastname@example.org\u200bm'
>>> validate_email(email)
>>> # No ValidationError?
These e-mail addresses are accepted and cause trouble later on (for example, when sending mail to them or hashing them for Gravatar). Validation should either:
a) Raise a ValidationError for such e-mail addresses, or
b) Automatically strip this character
The downside of a) is that the user is most likely unaware of this invisible character. They wouldn't know which character to remove or where, even if instructed by an error message.
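To make the two options concrete, here is a minimal sketch in plain Python. The helper names `strip_zero_width_space` and `find_zero_width_space` are hypothetical and not part of Django; the sketch only shows what stripping (option b) or locating the character for an error message (option a) could look like before the value reaches `validate_email`.

```python
# Hypothetical helpers for illustration only; neither exists in Django.
ZERO_WIDTH_SPACE = "\u200b"


def strip_zero_width_space(value):
    """Option b): return the value with every U+200B removed."""
    return value.replace(ZERO_WIDTH_SPACE, "")


def find_zero_width_space(value):
    """Option a): return the indexes of every U+200B, so an error
    message could at least point at where the invisible character is."""
    return [i for i, ch in enumerate(value) if ch == ZERO_WIDTH_SPACE]


email = "firstname.lastname@example.org\u200bm"
positions = find_zero_width_space(email)
cleaned = strip_zero_width_space(email)
```

In a form, something like `strip_zero_width_space` could presumably run in a field's cleaning step before the validators are applied; where exactly it would belong in Django is an assumption left open here.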