Opened 14 years ago

Closed 14 years ago

Last modified 13 years ago

#14301 closed (fixed)

django crashes on email address that passed validate_email() (utf8-tld)

Reported by: harm
Owned by: nobody
Component: Core (Mail)
Version: 1.2
Severity:
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

description

It seems validate_email() and send_mail() in Django don't agree on the validity of email addresses whenever non-ASCII (UTF-8) characters are used in the TLD.

I'm not sure what is currently allowed in the domain part (there is an experimental RFC 5335), but at least Django should agree with itself on this matter.

steps to reproduce

python manage.py shell
>>> from django.core.validators import validate_email
>>> from django.core.mail import send_mail
>>> email = u'chw08820@nyc.odn.ne.j\uff43'
>>> validate_email(email)
>>> send_mail('subject','message','from@example.com',[email])

result

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/__init__.py", line 61, in send_mail
    connection=connection).send()
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 175, in send
    return self.get_connection(fail_silently).send_messages([self])
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 85, in send_messages
    sent = self._send(message)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 101, in _send
    email_message.message().as_string())
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 147, in message
    msg['To'] = ', '.join(self.to)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 86, in __setitem__
    name, val = forbid_multi_line_headers(name, val, self.encoding)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 70, in forbid_multi_line_headers
    result.append(formataddr((nm, str(addr))))
UnicodeEncodeError: 'ascii' codec can't encode character u'\uff43' in position 21: ordinal not in range(128)
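
For readers on Python 3, the failing step can be approximated as follows. This is only a sketch, not the Django code: it assumes that Python 2's str(addr) on a unicode object amounts to an implicit ASCII encode, and it reproduces that encode explicitly. The variable name failure is made up for the illustration.

```python
# Python 3 sketch of the failing step. In Python 2, str(addr) on a unicode
# object performed an implicit ASCII encode; here the encode is explicit.
email = 'chw08820@nyc.odn.ne.j\uff43'  # last character is FULLWIDTH LATIN SMALL LETTER C

try:
    email.encode('ascii')
except UnicodeEncodeError as exc:
    # Record the codec name, the offending position and the character itself.
    failure = (exc.encoding, exc.start, exc.object[exc.start])
else:
    failure = None

print(failure)
```

The reported position matches the traceback: the only character outside the ASCII range is the final one, at index 21.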

expected result

Either

  1. validate_email() should reject the email address,

or

  2. send_mail() should handle the message and attempt to deliver it.

version

django 1.2

Attachments (3)

bug14301.diff (1.9 KB ) - added by Andi Albrecht 14 years ago.
14301.2.diff (1.7 KB ) - added by Jannis Leidel 14 years ago.
idn_smtp.diff (2.3 KB ) - added by Claude Paroz 14 years ago.
IDN encode in smtp backend


Change History (24)

comment:1 by Luke Plant, 14 years ago

Triage Stage: Unreviewed → Accepted

comment:2 by anonymous, 14 years ago

I added such a test to regressiontests/mail.py and it worked fine.

comment:3 by Andi Albrecht, 14 years ago

For the record, support for IDN in validate_email() was added in r12474 (#9764).

comment:4 by Jannis Leidel, 14 years ago

milestone: 1.3

by Andi Albrecht, 14 years ago

Attachment: bug14301.diff added

comment:5 by Jannis Leidel, 14 years ago

Hm, I can't seem to reproduce the error, tbh. Can someone confirm the test breakage?

by Jannis Leidel, 14 years ago

Attachment: 14301.2.diff added

comment:6 by Chris Beaven, 14 years ago

Is the try/except even needed? If the name always gets encoded to str using the charset, why not just assume we should do this for the email address too?

comment:7 by Jannis Leidel, 14 years ago

Resolution: fixed
Status: new → closed

(In [14216]) Fixed #14301 -- Handle email validation gracefully with email addresses containing non-ASCII characters. Thanks, Andi Albrecht.

comment:8 by Jannis Leidel, 14 years ago

(In [14217]) [1.2.X] Fixed #14301 -- Handle email validation gracefully with email addresses containing non-ASCII characters. Thanks, Andi Albrecht.

Backport from trunk (r14216).

comment:9 by philomat, 14 years ago

Resolution: fixed
Status: closed → reopened

While one part of the problem is fixed, the SMTP backend still needs to be fixed since it is not expecting non-ASCII characters.

      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/__init__.py", line 61, in send_mail
        connection=connection).send()
      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/message.py", line 186, in send
        return self.get_connection(fail_silently).send_messages([self])
      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 85, in send_messages
        sent = self._send(message)
      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 101, in _send
        email_message.message().as_string())
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 703, in sendmail
        (code,resp)=self.rcpt(each, rcpt_options)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 457, in rcpt
        self.putcmd("rcpt","TO:%s%s" % (quoteaddr(recip),optionlist))
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 318, in putcmd
        self.send(str)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 305, in send
        self.sock.sendall(str)
      File "<string>", line 1, in sendall
    
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 16: ordinal not in range(128)

comment:10 by Jannis Leidel, 14 years ago

Without giving specific details what actually spawned this traceback, I don't see a way to see the reason. Providing a test case would be the best way.

comment:11 by philomat, 14 years ago

The exact same steps to reproduce as listed by the original poster will spawn this traceback *if* you are using the SMTP backend. The email tests are probably not using it, which is why the error went unnoticed.

So the following example would show the problem:

   >>> email = u'foo@bär.com'
   >>> send_mail('subject','message','from@example.com',[email])

comment:12 by Andi Albrecht, 14 years ago

Maybe we need something like this in the SMTP backend:

>>> email = u'foo@bär.com'
>>> name, domain = email.split('@', 1)
>>> email = '@'.join([name, domain.encode('idna')])
>>> email
<<< u'foo@xn--br-via.com'

But I'm not very familiar with IDN and the depths of SMTP, so this approach could be wrong.
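
A Python 3 rendering of the same idea, offered as a sketch rather than as the eventual patch. It assumes the standard library's idna codec, which returns bytes in Python 3, so the result is decoded back to text before being rejoined. The names ascii_domain and encoded are illustrative. For a correctly decoded 'bär.com' the ASCII form is xn--br-via.com.

```python
# Python 3 sketch: IDNA-encode only the domain part of an address.
email = 'foo@b\xe4r.com'  # 'foo@bär.com', written with an escape to avoid source-encoding issues

name, domain = email.split('@', 1)
# The idna codec returns bytes in Python 3; decode to get an ASCII-only str.
ascii_domain = domain.encode('idna').decode('ascii')
encoded = '@'.join([name, ascii_domain])

print(encoded)
```

Only the domain is touched here; a non-ASCII local part would still be a problem, since IDNA applies to domain names only.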

comment:13 by philomat, 14 years ago

The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.

Hence, andialbrecht's approach seems to be absolutely correct.

comment:14 by Adam Nelson, 14 years ago

@philomat is right. Even in gmail, they won't accept IDN domain names like: test@façonnable.com

comment:15 by Adam Nelson, 14 years ago

One problem that this ticket exposes is that we now allow unicode email addresses for user accounts (#9764), but those people will never be able to receive email through Django until it supports send() to IDN email addresses. I almost wonder if #9764 should be undone with regard to the email address until Django can send to IDN email addresses.

comment:16 by Jannis Leidel, 14 years ago (in reply to: comment 13)

Patch needs improvement: set

Replying to philomat:

The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.

Hence, andialbrecht's approach seems to be absolutely correct.

Indeed, and the actual culprit seems to be that the to, cc and bcc attributes of an EmailMessage are only IDNA-encoded when its message() method is called, whereas the SMTP backend obtains the addresses through the recipients() method and passes them on unencoded (see 1). In other words, the IDNA encoding needs to happen earlier in the life of an EmailMessage, say in __init__, where the different recipient attributes are handled anyway (see 2).

1: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L173

2: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L121

comment:17 by philomat, 14 years ago (in reply to: comment 16)

Replying to jezdez:

the encoding with idna needs to happen earlier in the life of an EmailMessage, say in the __init__ when the different recipient attributes are handled anyway (see 2).

I think encoding during __init__ is a bad idea since you might want to change those object variables at a later time. Maybe set private object vars during __init__ and make to, cc, bcc properties delivering encoded values?
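
One way the property idea could look, as a rough Python 3 sketch. The class LazyRecipients and its _encode helper are hypothetical and are not part of Django's EmailMessage; only the to attribute is shown, and cc and bcc would follow the same pattern.

```python
class LazyRecipients:
    """Hypothetical stand-in for a message class; not Django's EmailMessage."""

    def __init__(self, to=None):
        # Keep the raw, possibly non-ASCII addresses in a private attribute.
        self._to = list(to or [])

    @staticmethod
    def _encode(addr):
        # Leave strings without an '@' untouched; otherwise encode the domain only.
        if '@' not in addr:
            return addr
        local, domain = addr.rsplit('@', 1)
        return local + '@' + domain.encode('idna').decode('ascii')

    @property
    def to(self):
        # Encode on every read, so later reassignment is always honoured.
        return [self._encode(addr) for addr in self._to]

    @to.setter
    def to(self, value):
        self._to = list(value)


msg = LazyRecipients(to=['foo@b\xe4r.com'])
first = msg.to
msg.to = ['info@example.com']  # changed after construction
second = msg.to
print(first, second)
```

A trade-off of this design is that the property returns a fresh list each time, so in-place calls such as msg.to.append(...) would be silently lost; callers would have to reassign the attribute instead.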

by Claude Paroz, 14 years ago

Attachment: idn_smtp.diff added

IDN encode in smtp backend

comment:18 by Claude Paroz, 14 years ago

Has patch: set
Patch needs improvement: unset

Just attached a patch using the idea in comment 12 from andialbrecht. I chose the encode-at-the-last-moment approach. Note that only the addresses actually used by SMTP to send the messages are IDNA-encoded, not the addresses in the message headers.
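
The encode-at-the-last-moment idea can be illustrated with a self-contained Python 3 sketch. It is not the attached patch: idna_encode_address, RecordingConnection and send are hypothetical names, and the stub only mimics the (from_addr, to_addrs, msg) arguments of smtplib.SMTP.sendmail() so that nothing is actually sent.

```python
def idna_encode_address(addr):
    """Return addr with only its domain part IDNA-encoded (hypothetical helper)."""
    if '@' not in addr:
        return addr
    local, domain = addr.rsplit('@', 1)
    return local + '@' + domain.encode('idna').decode('ascii')


class RecordingConnection:
    """Stub standing in for smtplib.SMTP; it only records sendmail() arguments."""

    def __init__(self):
        self.calls = []

    def sendmail(self, from_addr, to_addrs, msg):
        self.calls.append((from_addr, list(to_addrs), msg))


def send(connection, from_addr, recipients, raw_message):
    # Encode the envelope addresses just before they reach the SMTP layer.
    # raw_message (headers and body) is passed through unchanged.
    envelope_from = idna_encode_address(from_addr)
    envelope_to = [idna_encode_address(r) for r in recipients]
    connection.sendmail(envelope_from, envelope_to, raw_message)


conn = RecordingConnection()
send(conn, 'from@example.com', ['foo@b\xe4r.com'],
     'To: foo@b\xe4r.com\n\nmessage')
print(conn.calls[0][1])
```

Because the encoding happens inside send(), the recipient list held by the caller stays in its original form and can still be changed freely up to the moment of sending.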

comment:19 by Jannis Leidel, 14 years ago

Resolution: fixed
Status: reopened → closed

(In [15006]) Fixed #14301 -- Further refine changes made in r14216 to support non-ASCII characters in email addresses. Thanks, Claude Peroz and Andi Albrecht.

comment:20 by Jannis Leidel, 14 years ago

(In [15007]) [1.2.X] Fixed #14301 -- Further refine changes made in r14217 to support non-ASCII characters in email addresses. Thanks, Claude Peroz and Andi Albrecht.

Backport from trunk (r15006).

comment:21 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted
