#14301 closed (fixed)
django crashes on email address that passed validate_email() (utf8-tld)
| Reported by: | harm | Owned by: | nobody |
|---|---|---|---|
| Component: | Core (Mail) | Version: | 1.2 |
| Severity: | | Keywords: | |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
description
It seems validate_email() and send_mail() in Django don't agree on the validity of email addresses whenever UTF-8 is used in the TLD.
I'm not sure what is currently allowed in the domain part; there is an experimental RFC 5335, but at least Django should agree with itself on this matter.
steps to reproduce
python manage.py shell
>>> from django.core.validators import validate_email
>>> from django.core.mail import send_mail
>>> email = u'chw08820@nyc.odn.ne.j\uff43'
>>> validate_email(email)
>>> send_mail('subject','message','from@example.com',[email])
result
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/__init__.py", line 61, in send_mail
connection=connection).send()
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 175, in send
return self.get_connection(fail_silently).send_messages([self])
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 85, in send_messages
sent = self._send(message)
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 101, in _send
email_message.message().as_string())
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 147, in message
msg['To'] = ', '.join(self.to)
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 86, in __setitem__
name, val = forbid_multi_line_headers(name, val, self.encoding)
File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 70, in forbid_multi_line_headers
result.append(formataddr((nm, str(addr))))
UnicodeEncodeError: 'ascii' codec can't encode character u'\uff43' in position 21: ordinal not in range(128)
expected result
Either
- validate_email() should reject the email address,
or
- send_mail() should handle the message and attempt to deliver it.
version
django 1.2
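The root cause in the traceback above can be reproduced without Django. The following is a minimal Python 3 sketch (the ticket itself ran on Python 2, where `str(addr)` performed the implicit ASCII encode); `first_non_ascii` is a hypothetical helper introduced only for illustration.

```python
# Python 3 sketch: make the implicit ASCII encode from the traceback explicit.
email = 'chw08820@nyc.odn.ne.j\uff43'  # last character is U+FF43, a fullwidth "c"


def first_non_ascii(address):
    """Return the index of the first character that cannot be ASCII-encoded, or None."""
    try:
        address.encode('ascii')
    except UnicodeEncodeError as exc:
        # exc.start is the offset reported as "position N" in the error message.
        return exc.start
    return None


print(first_non_ascii(email))
```

The reported index corresponds to the "position 21" in the UnicodeEncodeError above, i.e. the single non-ASCII character at the end of the TLD.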
Attachments (3)
Change History (24)
comment:1 by , 15 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:2 by , 15 years ago
comment:3 by , 15 years ago
comment:4 by , 15 years ago
| milestone: | → 1.3 |
|---|
by , 15 years ago
| Attachment: | bug14301.diff added |
|---|
comment:5 by , 15 years ago
Hm, I can't seem to reproduce the error, tbh. Can someone confirm the test breakage?
by , 15 years ago
| Attachment: | 14301.2.diff added |
|---|
comment:6 by , 15 years ago
Is the try/except even needed? If the name always gets encoded to str using the charset, why not just assume we should do this for the email address too?
comment:7 by , 15 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
comment:8 by , 15 years ago
comment:9 by , 15 years ago
| Resolution: | fixed |
|---|---|
| Status: | closed → reopened |
While one part of the problem is fixed, the SMTP backend still needs to be fixed since it is not expecting non-ASCII characters.
File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/__init__.py", line 61, in send_mail
connection=connection).send()
File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/message.py", line 186, in send
return self.get_connection(fail_silently).send_messages([self])
File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 85, in send_messages
sent = self._send(message)
File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 101, in _send
email_message.message().as_string())
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 703, in sendmail
(code,resp)=self.rcpt(each, rcpt_options)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 457, in rcpt
self.putcmd("rcpt","TO:%s%s" % (quoteaddr(recip),optionlist))
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 318, in putcmd
self.send(str)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 305, in send
self.sock.sendall(str)
File "<string>", line 1, in sendall
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 16: ordinal not in range(128)
comment:10 by , 15 years ago
Without specific details about what actually triggered this traceback, I can't determine the cause. Providing a test case would be the best way forward.
comment:11 by , 15 years ago
The exact same steps to reproduce as listed by the original poster will spawn this traceback *if* you are using the SMTP backend. The email tests are probably not using it, which is why the error went unnoticed.
So, the following example would show the problem:
>>> email = u'foo@bär.com'
>>> send_mail('subject','message','from@example.com',[email])
comment:12 by , 15 years ago
Maybe we need something like this in the SMTP backend:
>>> email = u'foo@bär.com'
>>> name, domain = email.split('@', 1)
>>> email = '@'.join([name, domain.encode('idna')])
>>> email
<<< u'foo@xn--br-eda5w.com'
But I'm not very familiar with IDN and the depths of SMTP, so this approach could be wrong.
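The idea in this comment can be sketched as a small standalone function. This is a Python 3 sketch, not the code from any attached patch; `idna_encode_address` is a hypothetical name, and it only handles the domain part, so a non-ASCII local part would still fail later.

```python
def idna_encode_address(address):
    """Hypothetical helper: return `address` with its domain part IDNA-encoded.

    Only the part after the last "@" is converted; the local part is left
    untouched. Addresses without an "@" are returned unchanged.
    """
    if '@' not in address:
        return address
    # rsplit rather than split, as an assumption: a quoted local part may
    # itself contain "@", so only the last one separates the domain.
    local_part, domain = address.rsplit('@', 1)
    # The built-in "idna" codec returns bytes in Python 3; decode back to str.
    ascii_domain = domain.encode('idna').decode('ascii')
    return local_part + '@' + ascii_domain


print(idna_encode_address('foo@bär.com'))
print(idna_encode_address('from@example.com'))
```

A plain ASCII domain passes through unchanged, so the helper can be applied to every recipient without first checking whether it is needed.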
follow-up: 16 comment:13 by , 15 years ago
The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.
Hence, andialbrecht's approach seems to be absolutely correct.
comment:14 by , 15 years ago
@philomat is right. Even in gmail, they won't accept IDN domain names like: test@façonnable.com
comment:15 by , 15 years ago
One problem that this ticket exposes is that we now allow unicode email addresses for user accounts (#9764), but those people will never be able to receive email through Django until it supports send() to IDN email addresses. I almost wonder if #9764 should be undone with regard to the email address until Django can send to IDN email addresses.
follow-up: 17 comment:16 by , 15 years ago
| Patch needs improvement: | set |
|---|
Replying to philomat:
The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.
Hence, andialbrecht's approach seems to be absolutely correct.
Indeed, and the actual culprit seems to be that the to, cc and bcc attributes of an EmailMessage are only IDNA-encoded when its message() method is called, whereas the SMTP backend obtains the addresses through the recipients() method and passes them on unencoded (see 1). In other words, the IDNA encoding needs to happen earlier in the life of an EmailMessage, say in __init__, where the different recipient attributes are handled anyway (see 2).
1: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L173
2: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L121
comment:17 by , 15 years ago
Replying to jezdez:
the encoding with idna needs to happen earlier in the life of an EmailMessage, say in the __init__ when the different recipient attributes are handled anyway (see 2).
I think encoding during __init__ is a bad idea since you might want to change those object variables at a later time. Maybe set private object vars during __init__ and make to, cc, bcc properties delivering encoded values?
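The property-based alternative suggested here might look roughly like the following Python 3 sketch. `RecipientHolder` and `_idna` are hypothetical names for illustration; this is not Django's EmailMessage and only the `to` attribute is shown.

```python
def _idna(address):
    """Hypothetical helper: IDNA-encode the domain part of an address."""
    if '@' not in address:
        return address
    local_part, domain = address.rsplit('@', 1)
    return local_part + '@' + domain.encode('idna').decode('ascii')


class RecipientHolder:
    """Hypothetical stand-in for a message class; stores raw addresses privately."""

    def __init__(self, to=None):
        self._to = list(to or [])

    @property
    def to(self):
        # Encode on every read, so later assignments are always reflected.
        return [_idna(address) for address in self._to]

    @to.setter
    def to(self, value):
        self._to = list(value)


holder = RecipientHolder(to=['foo@bär.com'])
print(holder.to)
holder.to = ['someone@example.com']
print(holder.to)
```

One drawback of this design is that the getter returns a fresh list, so `holder.to.append(...)` would silently modify a copy rather than the stored recipients; callers would have to reassign the attribute instead.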
comment:18 by , 15 years ago
| Has patch: | set |
|---|---|
| Patch needs improvement: | unset |
Just attached a patch using the idea in comment 12 from andialbrecht. I chose to use the encode-at-the-last-moment approach. Note that only the addresses actually used by SMTP to send the messages are IDNA-encoded, not the addresses in the message headers.
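The encode-at-the-last-moment approach can be illustrated with the Python 3 sketch below. It is not the attached patch: `RecordingConnection`, `send_at_last_moment` and `idna_encode_address` are hypothetical names, and the connection is a stub that records its arguments instead of talking to an SMTP server. Its `sendmail()` takes the same three leading arguments as `smtplib.SMTP.sendmail`.

```python
def idna_encode_address(address):
    """Hypothetical helper: IDNA-encode the domain part of an address."""
    if '@' not in address:
        return address
    local_part, domain = address.rsplit('@', 1)
    return local_part + '@' + domain.encode('idna').decode('ascii')


class RecordingConnection:
    """Stub connection: records sendmail() arguments instead of sending anything."""

    def __init__(self):
        self.calls = []

    def sendmail(self, from_addr, to_addrs, msg):
        self.calls.append((from_addr, list(to_addrs), msg))


def send_at_last_moment(connection, from_addr, recipients, message_text):
    """Encode only the envelope addresses, right before handing them to SMTP."""
    envelope_from = idna_encode_address(from_addr)
    envelope_to = [idna_encode_address(address) for address in recipients]
    # The message text (including its headers) is passed through untouched.
    connection.sendmail(envelope_from, envelope_to, message_text)
    return envelope_to


connection = RecordingConnection()
recipients = ['foo@bär.com']
send_at_last_moment(connection, 'from@example.com', recipients, 'Subject: subject\n\nmessage')
print(connection.calls[0][1])
print(recipients)
```

Because the conversion happens on a copy at send time, the caller's recipient list keeps its original unicode form and can still be modified between sends.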
comment:19 by , 15 years ago
| Resolution: | → fixed |
|---|---|
| Status: | reopened → closed |
I added such a test to regressiontests/mail.py and it worked fine.