#14301 closed (fixed)
django crashes on email address that passed validate_email() (utf8-tld)
Reported by: | harm | Owned by: | nobody |
---|---|---|---|
Component: | Core (Mail) | Version: | 1.2 |
Severity: | | Keywords: | |
Cc: | | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
description
It seems validate_email() and send_mail() in Django disagree on the validity of email addresses whenever UTF-8 is used in the TLD.
I'm not sure what is currently allowed in the domain part; there is an experimental RFC 5335, but Django should at least agree with itself on this matter.
steps to reproduce
python manage.py shell
>>> from django.core.validators import validate_email
>>> from django.core.mail import send_mail
>>> email = u'chw08820@nyc.odn.ne.j\uff43'
>>> validate_email(email)
>>> send_mail('subject','message','from@example.com',[email])
result
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/__init__.py", line 61, in send_mail
    connection=connection).send()
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 175, in send
    return self.get_connection(fail_silently).send_messages([self])
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 85, in send_messages
    sent = self._send(message)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 101, in _send
    email_message.message().as_string())
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 147, in message
    msg['To'] = ', '.join(self.to)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 86, in __setitem__
    name, val = forbid_multi_line_headers(name, val, self.encoding)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 70, in forbid_multi_line_headers
    result.append(formataddr((nm, str(addr))))
UnicodeEncodeError: 'ascii' codec can't encode character u'\uff43' in position 21: ordinal not in range(128)
expected result
Either
- validate_email() should reject the email address,
or
- send_mail() should handle the message and attempt to deliver it.
version
django 1.2
Attachments (3)
Change History (24)
comment:1 by , 14 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 14 years ago
comment:3 by , 14 years ago
comment:4 by , 14 years ago
milestone: | → 1.3 |
---|
by , 14 years ago
Attachment: | bug14301.diff added |
---|
comment:5 by , 14 years ago
Hm, I can't seem to reproduce the error, tbh. Can someone confirm the test breakage?
by , 14 years ago
Attachment: | 14301.2.diff added |
---|
comment:6 by , 14 years ago
Is the try/except even needed? If the name always gets encoded to str using the charset, why not just assume we should do this for the email address too?
comment:7 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:8 by , 14 years ago
comment:9 by , 14 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
While one part of the problem is fixed, the SMTP backend still needs to be fixed since it is not expecting non-ASCII characters.
  File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/__init__.py", line 61, in send_mail
    connection=connection).send()
  File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/message.py", line 186, in send
    return self.get_connection(fail_silently).send_messages([self])
  File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 85, in send_messages
    sent = self._send(message)
  File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 101, in _send
    email_message.message().as_string())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 703, in sendmail
    (code,resp)=self.rcpt(each, rcpt_options)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 457, in rcpt
    self.putcmd("rcpt","TO:%s%s" % (quoteaddr(recip),optionlist))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 318, in putcmd
    self.send(str)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 305, in send
    self.sock.sendall(str)
  File "<string>", line 1, in sendall
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 16: ordinal not in range(128)
comment:10 by , 14 years ago
Without specific details about what actually spawned this traceback, I don't see a way to determine the cause. Providing a test case would be the best way.
comment:11 by , 14 years ago
The exact same steps to reproduce as listed by the original poster will spawn this traceback *if* you are using the SMTP backend. The email tests are probably not using it, which is why the error went unnoticed.
So the following example would show the problem:
>>> email = u'foo@bär.com'
>>> send_mail('subject','message','from@example.com',[email])
comment:12 by , 14 years ago
Maybe we need something like this in the SMTP backend:
>>> email = u'foo@bär.com'
>>> name, domain = email.split('@', 1)
>>> email = '@'.join([name, domain.encode('idna')])
>>> email
<<< u'foo@xn--br-eda5w.com'
But I'm not very familiar with IDN and the depths of SMTP, so this approach could be wrong.
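A minimal Python 3 sketch of the idea in this comment, for illustration only. The helper name idna_encode_address is hypothetical and not part of Django; it assumes the standard library's built-in "idna" codec is acceptable for the domain part and leaves the local part untouched. It splits on the last "@" because a quoted local part may itself contain one.

```python
def idna_encode_address(address):
    """Return `address` with only its domain part IDNA-encoded.

    Hypothetical helper for illustration; not Django's implementation.
    """
    # Split on the LAST "@": everything before it is the local part.
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No "@" at all: nothing that looks like a domain, return unchanged.
        return address
    # The stdlib "idna" codec converts each label to its ASCII ("xn--") form.
    ascii_domain = domain.encode("idna").decode("ascii")
    return local + sep + ascii_domain


print(idna_encode_address("foo@bücher.example"))  # foo@xn--bcher-kva.example
print(idna_encode_address("from@example.com"))    # unchanged, already ASCII
```

A non-ASCII local part is not handled here; that would need separate treatment.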
follow-up: 16 comment:13 by , 14 years ago
The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.
Hence, andialbrecht's approach seems to be absolutely correct.
comment:14 by , 14 years ago
@philomat is right. Even Gmail won't accept IDN domain names like: test@façonnable.com
comment:15 by , 14 years ago
One problem that this ticket exposes is that now we allow unicode email addresses for user accounts (#9764) but those people will never be able to receive email through Django - until it supports send() to IDN email addresses. I almost wonder if #9764 should be undone with regards to the email address until Django can send to IDN email addresses.
follow-up: 17 comment:16 by , 14 years ago
Patch needs improvement: | set |
---|
Replying to philomat:
The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.
Hence, andialbrecht's approach seems to be absolutely correct.
Indeed, and the actual culprit seems to be that the to, cc and bcc attributes of an EmailMessage are only IDNA-encoded when its message() method is called, while the SMTP backend passes the addresses through as-is via the recipients() method (see 1). In other words, the IDNA encoding needs to happen earlier in the life of an EmailMessage, say in __init__, where the different recipient attributes are handled anyway (see 2).
1: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L173
2: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L121
comment:17 by , 14 years ago
Replying to jezdez:
the encoding with idna needs to happen earlier in the life of an EmailMessage, say in the __init__ when the different recipient attributes are handled anyway (see 2).
I think encoding during __init__ is a bad idea since you might want to change those object variables at a later time. Maybe set private object vars during __init__ and make to, cc, bcc properties delivering encoded values?
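The property idea could look roughly like the sketch below (Python 3). SketchMessage, encode_domain and _encoded_recipients are hypothetical names for illustration and are not Django's EmailMessage or its API; the sketch assumes the stdlib "idna" codec is used for the domain part.

```python
def encode_domain(address):
    """IDNA-encode only the domain part of an address (illustrative helper)."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address
    return local + sep + domain.encode("idna").decode("ascii")


def _encoded_recipients(private_name):
    """Build a property that stores raw addresses and returns encoded ones."""
    def getter(self):
        # Encode on every read, so later changes to the raw list are honoured.
        return [encode_domain(a) for a in getattr(self, private_name)]

    def setter(self, value):
        # Keep the caller's addresses unmodified in the private attribute.
        setattr(self, private_name, list(value))

    return property(getter, setter)


class SketchMessage:
    """Stand-in for an email message class; NOT Django's EmailMessage."""
    to = _encoded_recipients("_to")
    cc = _encoded_recipients("_cc")
    bcc = _encoded_recipients("_bcc")

    def __init__(self, to=(), cc=(), bcc=()):
        self.to, self.cc, self.bcc = to, cc, bcc


msg = SketchMessage(to=["foo@bücher.example"])
print(msg.to)                 # encoded view
msg.to = ["bar@example.com"]  # reassigning later still works
print(msg.to)
```

One trade-off of this design: each read returns a fresh list, so msg.to.append(...) would silently have no effect on the message; callers would have to reassign the attribute instead.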
comment:18 by , 14 years ago
Has patch: | set |
---|---|
Patch needs improvement: | unset |
Just attached a patch using the idea in comment 12 from andialbrecht. I chose the encode-at-the-last-moment approach. Note that only the addresses actually used by SMTP to send the messages are IDNA-encoded, not the addresses in the message headers.
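The encode-at-the-last-moment approach might be sketched as follows (Python 3). This is not the attached patch: send_with_encoded_envelope, to_envelope_address and RecordingConnection are hypothetical names, and the fake connection only mimics the (from_addr, to_addrs, msg) argument order of smtplib.SMTP.sendmail so the example runs without a mail server.

```python
def to_envelope_address(address):
    """IDNA-encode the domain part for use in the SMTP envelope (illustrative)."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address
    return local + sep + domain.encode("idna").decode("ascii")


class RecordingConnection:
    """Fake connection that records calls instead of talking to a server."""
    def __init__(self):
        self.calls = []

    def sendmail(self, from_addr, to_addrs, msg):
        self.calls.append((from_addr, list(to_addrs), msg))


def send_with_encoded_envelope(connection, from_addr, recipients, message_text):
    """Encode addresses only at the point they are handed to the connection."""
    connection.sendmail(
        to_envelope_address(from_addr),
        [to_envelope_address(r) for r in recipients],
        message_text,  # headers/body are passed through untouched
    )


conn = RecordingConnection()
recipients = ["foo@bücher.example"]
send_with_encoded_envelope(conn, "from@example.com", recipients, "Subject: hi\n\nbody")
print(conn.calls[0][1])  # envelope recipients, ASCII only
print(recipients)        # caller's list is left as it was
```

Because the encoding happens at send time, the recipient lists on the message object stay editable and unmodified until the moment of delivery.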
comment:19 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
I added such a test to regressiontests/mail.py and it worked fine.