Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#14301 closed (fixed)

django crashes on email address that passed validate_email() (utf8-tld)

Reported by: harm
Owned by: nobody
Component: Core (Mail)
Version: 1.2
Severity:
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

description

It seems validate_email() and send_mail() in django don't agree on the validity of email addresses whenever utf8 is used in the tld.

I'm not sure what is currently allowed in the domain part (there is an experimental RFC 5335), but at least django should agree with itself on this matter.

steps to reproduce

python manage.py shell
>>> from django.core.validators import validate_email
>>> from django.core.mail import send_mail
>>> email = u'chw08820@nyc.odn.ne.j\uff43'
>>> validate_email(email)
>>> send_mail('subject','message','from@example.com',[email])

result

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/__init__.py", line 61, in send_mail
    connection=connection).send()
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 175, in send
    return self.get_connection(fail_silently).send_messages([self])
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 85, in send_messages
    sent = self._send(message)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/backends/smtp.py", line 101, in _send
    email_message.message().as_string())
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 147, in message
    msg['To'] = ', '.join(self.to)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 86, in __setitem__
    name, val = forbid_multi_line_headers(name, val, self.encoding)
  File "/opt/python2.5/lib/python2.5/site-packages/django/core/mail/message.py", line 70, in forbid_multi_line_headers
    result.append(formataddr((nm, str(addr))))
UnicodeEncodeError: 'ascii' codec can't encode character u'\uff43' in position 21: ordinal not in range(128)
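
The last frame points at the cause: under Python 2, str(addr) implicitly encodes the unicode address as ASCII. A minimal sketch of the same failure, written in Python 3 syntax with an explicit encode. The helper name ascii_failure_position is hypothetical and not part of Django.

```python
def ascii_failure_position(address):
    """Return the index of the first character that cannot be encoded
    as ASCII, or None if the whole address is ASCII-safe."""
    try:
        address.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.start is the offset reported as "position N" in the traceback.
        return exc.start
    return None


email = "chw08820@nyc.odn.ne.j\uff43"
print(ascii_failure_position(email))  # 21
```

The offset matches the "position 21" in the traceback: the fullwidth character is the last one in the address.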

expected result

Either

  1. validate_email() should reject the email address,

or

  2. send_mail() should handle the message and attempt to deliver it.

version

django 1.2

Attachments (3)

bug14301.diff (1.9 KB) - added by andialbrecht 4 years ago.
14301.2.diff (1.7 KB) - added by jezdez 4 years ago.
idn_smtp.diff (2.3 KB) - added by claudep 4 years ago.
IDN encode in smtp backend

Change History (24)

comment:1 Changed 4 years ago by lukeplant

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 4 years ago by anonymous

I added such a test to regressiontests/mail.py and it worked fine.

comment:3 Changed 4 years ago by andialbrecht

For the record, support for IDN in validate_email() was added in r12474 (#9764).

comment:4 Changed 4 years ago by jezdez

  • milestone set to 1.3

Changed 4 years ago by andialbrecht

comment:5 Changed 4 years ago by jezdez

Hm, I can't seem to reproduce the error, tbh. Can someone confirm the test breakage?

Changed 4 years ago by jezdez

comment:6 Changed 4 years ago by SmileyChris

Is the try/except even needed? If the name always gets encoded to str using the charset, why not just assume we should do the same for the email address too?

comment:7 Changed 4 years ago by jezdez

  • Resolution set to fixed
  • Status changed from new to closed

(In [14216]) Fixed #14301 -- Handle email validation gracefully with email addresses containing non-ASCII characters. Thanks, Andi Albrecht.

comment:8 Changed 4 years ago by jezdez

(In [14217]) [1.2.X] Fixed #14301 -- Handle email validation gracefully with email addresses containing non-ASCII characters. Thanks, Andi Albrecht.

Backport from trunk (r14216).

comment:9 Changed 4 years ago by philomat

  • Resolution fixed deleted
  • Status changed from closed to reopened

While one part of the problem is fixed, the SMTP backend still needs to be fixed since it is not expecting non-ASCII characters.

      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/__init__.py", line 61, in send_mail
        connection=connection).send()
      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/message.py", line 186, in send
        return self.get_connection(fail_silently).send_messages([self])
      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 85, in send_messages
        sent = self._send(message)
      File "/Sites/vendor/Django-1.3-alpha-1/django/core/mail/backends/smtp.py", line 101, in _send
        email_message.message().as_string())
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 703, in sendmail
        (code,resp)=self.rcpt(each, rcpt_options)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 457, in rcpt
        self.putcmd("rcpt","TO:%s%s" % (quoteaddr(recip),optionlist))
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 318, in putcmd
        self.send(str)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/smtplib.py", line 305, in send
        self.sock.sendall(str)
      File "<string>", line 1, in sendall
    
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 16: ordinal not in range(128)

comment:10 Changed 4 years ago by jezdez

Without specific details about what actually triggered this traceback, I can't determine the cause. Providing a test case would be the best way forward.

comment:11 Changed 4 years ago by philomat

The exact same steps to reproduce as listed by the original poster will spawn this traceback *if* you are using the SMTP backend. The email tests are probably not using it, which is why the error went unnoticed.

So the following example would show the problem:

   >>> email = u'foo@bär.com'
   >>> send_mail('subject','message','from@example.com',[email])

comment:12 Changed 4 years ago by andialbrecht

Maybe we need something like this in the SMTP backend:

>>> email = u'foo@bär.com'
>>> name, domain = email.split('@', 1)
>>> email = '@'.join([name, domain.encode('idna')])
>>> email
<<< u'foo@xn--br-eda5w.com'

But I'm not very familiar with IDN and the depths of SMTP, so this approach could be wrong.
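
For reference, the same idea in Python 3 syntax, as a sketch only. The helper name idna_encode_address is hypothetical, and it assumes the standard library's "idna" codec (IDNA 2003) is acceptable for the domain part. A non-ASCII local part is left untouched and would still fail later.

```python
def idna_encode_address(address):
    """IDNA-encode only the domain part of an email address.

    Addresses without an '@' are returned unchanged.
    """
    if "@" not in address:
        return address
    local_part, domain = address.rsplit("@", 1)
    # The codec returns bytes; decode so the result is an ASCII-only str.
    return local_part + "@" + domain.encode("idna").decode("ascii")


print(idna_encode_address("foo@bär.com"))  # foo@xn--br-via.com
```

With the standard codec, bär.com comes out as xn--br-via.com. The xn--br-eda5w.com shown in the session above is consistent with the terminal having passed ä as its two UTF-8 bytes, so that output probably reflects a console encoding issue rather than the approach itself.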

comment:13 follow-up: Changed 4 years ago by philomat

The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.

Hence, andialbrecht's approach seems to be absolutely correct.

comment:14 Changed 4 years ago by adamnelson

@philomat is right. Even Gmail won't accept IDN domain names such as test@façonnable.com

comment:15 Changed 4 years ago by adamnelson

One problem this ticket exposes is that we now allow unicode email addresses for user accounts (#9764), but those people will never be able to receive email through Django until it supports send() to IDN email addresses. I almost wonder if #9764 should be undone with regard to the email address until Django can send to IDN email addresses.

comment:16 in reply to: ↑ 13 ; follow-up: Changed 4 years ago by jezdez

  • Patch needs improvement set

Replying to philomat:

The relevant Wikipedia article (German only, http://de.wikipedia.org/wiki/E-Mail-Adresse#Der_Dom.C3.A4nenteil_.28Domain_Part.29) says that with the introduction of IDN, nothing changes technically with regard to the SMTP protocol: Characters above ASCII code #127 are illegal. It is the client's responsibility to convert to an IDNA string.

Hence, andialbrecht's approach seems to be absolutely correct.

Indeed, and the actual culprit seems to be that the to, cc and bcc attributes of an EmailMessage are only IDNA-encoded when its message() method is called; the SMTP backend receives them unencoded through the recipients() method (see 1). In other words, the IDNA encoding needs to happen earlier in the life of an EmailMessage, say in __init__, where the different recipient attributes are handled anyway (see 2).

1: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L173

2: http://code.djangoproject.com/browser/django/trunk/django/core/mail/message.py?rev=14216#L121

comment:17 in reply to: ↑ 16 Changed 4 years ago by philomat

Replying to jezdez:

the encoding with idna needs to happen earlier in the life of an EmailMessage, say in the __init__ when the different recipient attributes are handled anyway (see 2).

I think encoding during __init__ is a bad idea since you might want to change those object variables at a later time. Maybe set private object vars during __init__ and make to, cc, bcc properties delivering encoded values?
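
A rough sketch of that property idea, in Python 3 syntax. The class name Recipients and the helper _idna are hypothetical; this is not how EmailMessage is implemented, only an illustration of keeping the raw values private while exposing encoded ones.

```python
def _idna(address):
    # Hypothetical helper: IDNA-encode the domain part only.
    local_part, sep, domain = address.rpartition("@")
    if not sep:
        return address
    return local_part + "@" + domain.encode("idna").decode("ascii")


class Recipients:
    """Stores addresses as given and encodes them on every read."""

    def __init__(self, to=None):
        self._to = list(to or [])

    @property
    def to(self):
        # Encoding happens at access time, so later changes are honoured.
        return [_idna(addr) for addr in self._to]

    @to.setter
    def to(self, addresses):
        self._to = list(addresses)


msg = Recipients(to=["from@example.com"])
msg.to = ["foo@bär.com"]  # changed after construction
print(msg.to)
```

One drawback of this shape: the property returns a fresh list, so msg.to.append(...) would modify a temporary copy rather than the stored recipients.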

Changed 4 years ago by claudep

IDN encode in smtp backend

comment:18 Changed 4 years ago by claudep

  • Has patch set
  • Patch needs improvement unset

Just attached a patch using the idea in comment 12 from andialbrecht. I chose to use the encode-at-the-last-moment approach. Note that only the addresses actually used by SMTP to send the messages are IDNA-encoded, not the addresses in the message headers.
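
A sketch of what encoding at the last moment could look like, in Python 3 syntax. This is not the attached patch: send_with_idna, encode_envelope_address and RecordingConnection are hypothetical names, and the connection is a stand-in that only mimics the sendmail(from_addr, to_addrs, msg) call of smtplib.SMTP.

```python
def encode_envelope_address(address):
    # Hypothetical helper: IDNA-encode the domain part only.
    local_part, sep, domain = address.rpartition("@")
    if not sep:
        return address
    return local_part + "@" + domain.encode("idna").decode("ascii")


def send_with_idna(connection, from_email, recipients, message_text):
    """Encode the envelope addresses right before handing them to SMTP.

    The message text, including its headers, is passed through unchanged.
    """
    envelope_from = encode_envelope_address(from_email)
    envelope_to = [encode_envelope_address(addr) for addr in recipients]
    connection.sendmail(envelope_from, envelope_to, message_text)


class RecordingConnection:
    """Stand-in for an SMTP connection that records sendmail() calls."""

    def __init__(self):
        self.calls = []

    def sendmail(self, from_addr, to_addrs, msg):
        self.calls.append((from_addr, to_addrs, msg))


conn = RecordingConnection()
send_with_idna(conn, "from@example.com", ["foo@bär.com"],
               "To: foo@bär.com\n\nmessage")
print(conn.calls[0][1])
```

Because the message object itself is never modified, its recipient lists can still be changed freely up to the moment of sending.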

comment:19 Changed 4 years ago by jezdez

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [15006]) Fixed #14301 -- Further refine changes made in r14216 to support non-ASCII characters in email addresses. Thanks, Claude Peroz and Andi Albrecht.

comment:20 Changed 4 years ago by jezdez

(In [15007]) [1.2.X] Fixed #14301 -- Further refine changes made in r14217 to support non-ASCII characters in email addresses. Thanks, Claude Peroz and Andi Albrecht.

Backport from trunk (r15006).

comment:21 Changed 3 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted
