Opened 8 years ago

Closed 8 years ago

#25986 closed Bug (fixed)

Django crashes on unicode characters in the local part of an e-mail address

Reported by: Sergei Maertens Owned by: Sergei Maertens
Component: Core (Mail) Version: 1.9
Severity: Normal Keywords:
Cc: george@…, martin.pajuste@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

With Python 3.5 and Django 1.9 I'm running into trouble with internationalized e-mail addresses. According to RFC 6532 it is possible to have unicode characters in the e-mail address: https://tools.ietf.org/html/rfc6532.html and this RFC *should* be supported in Python 3.5: https://docs.python.org/3/whatsnew/3.5.html#email. Steps to reproduce are at the bottom of the ticket.

Now, for validating e-mail addresses in various places, Django calls the stdlib email.utils.formataddr, which doesn't seem to respect this RFC, not even if you explicitly set EmailPolicy.utf8 (see https://docs.python.org/3/library/email.policy.html#email.policy.EmailPolicy.utf8), as I see no reference to that policy in its source code.

This function in the stdlib blatantly calls address.encode('ascii'). Luckily, it's quite short, and I would suggest rolling 'our own' formataddr function (for the time being). I'll bring this issue up on the Python bug tracker as well. I think it's possible, as it's a relatively simple function of only 40 LoC, of which 12 lines are docstring.
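
The stdlib behaviour described above can be checked without Django. This is a minimal sketch, assuming Python 3; the ASCII-only address juan.lopez@abc.com is an illustrative value that does not appear in the original report:

```python
from email.utils import formataddr

# An ASCII-only address is formatted without complaint.
ascii_result = formataddr(("dummy", "juan.lopez@abc.com"))

# A non-ASCII local part trips the ASCII check inside formataddr.
try:
    formataddr(("dummy", "juan.lópez@abc.com"))
except UnicodeEncodeError as exc:
    error = exc
else:
    error = None

print(ascii_result)
print(type(error).__name__)
```

The exception comes from the address check itself, so it is raised regardless of the display name or of any email policy configured elsewhere.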

Steps to reproduce
Basically shell output:

mkdir bug_email
cd bug_email
mkvirtualenv bug_email -p python3.5
(bug_email)  bug_email  python --version
Python 3.5.1
(bug_email)  bug_email  pip install Django==1.9
Collecting Django==1.9
  Using cached Django-1.9-py2.py3-none-any.whl
Installing collected packages: Django
Successfully installed Django-1.9
(bug_email)  bug_email  django-admin.py startproject bug_email .

# regular shell is enough to test
(bug_email)  bug_email  ./manage.py shell

Shell session

>>> from django.core.mail.message import sanitize_address
>>> sanitize_address(('dummy', u'juan.lópez@abc.com'), 'utf8')
Traceback (most recent call last):
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 69, in handle
    self.run_shell(shell=options['interface'])
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 61, in run_shell
    raise ImportError
ImportError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/mail/message.py", line 118, in sanitize_address
    return formataddr((nm, addr))
  File "/usr/lib64/python3.5/email/utils.py", line 91, in formataddr
    address.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xf3' in position 6: ordinal not in range(128)
>>> 
>>> sanitize_address(('dummy', u'juan.lópez@abc.com'), 'idna')
Traceback (most recent call last):
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 69, in handle
    self.run_shell(shell=options['interface'])
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 61, in run_shell
    raise ImportError
ImportError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/mail/message.py", line 118, in sanitize_address
    return formataddr((nm, addr))
  File "/usr/lib64/python3.5/email/utils.py", line 91, in formataddr
    address.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xf3' in position 6: ordinal not in range(128)
>>> 

Change History (11)

comment:1 by Claude Paroz, 8 years ago

Triage Stage: Unreviewed → Accepted

Reporting the issue in Python would be the first step. Could you please report here the ticket number as soon as it is done?

comment:2 by Sergei Maertens, 8 years ago

Python bug tracker issue: http://bugs.python.org/issue25955

comment:3 by Sergei Maertens, 8 years ago

The upstream ticket has been closed, with the following comment:

formataddr is part of the legacy interface and has no knowledge of the current policy. So it doesn't support RFC 6532. For that you need to use the new API: just assign your address to the appropriate field, or create a headerregistry.Address object.

I'm in the process of rewriting the docs to make all of this clear, but, well, I'm slow...

Looks like this will have to be solved on Django's end after all. I'm not too familiar with the e-mail library yet, but I could look into it when I get some more spare time.
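
For reference, the "new API" mentioned in the upstream comment might be used roughly as follows. This is a sketch, not the change that was eventually made in Django; the choice of policy.SMTPUTF8 is an assumption made here to illustrate RFC 6532 style headers:

```python
from email import policy
from email.headerregistry import Address
from email.message import EmailMessage

# Build the address from its parts instead of formatting a string by hand.
addr = Address(display_name="dummy", username="juan.lópez", domain="abc.com")

# Assumption: policy.SMTPUTF8 (utf8=True) is used so that non-ASCII
# header content is allowed as-is, in the spirit of RFC 6532.
msg = EmailMessage(policy=policy.SMTPUTF8)
msg["To"] = addr

formatted = str(addr)
print(formatted)
print(addr.addr_spec)
```

Building the Address object does not perform the ASCII check that formataddr does. Whether such a message can actually be delivered still depends on the receiving mail server accepting UTF-8 headers.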


comment:4 by George Marshall, 8 years ago

Cc: george@… added

comment:5 by Martin Pajuste, 8 years ago

Cc: martin.pajuste@… added

comment:6 by Sergei Maertens, 8 years ago

Owner: changed from nobody to Sergei Maertens
Status: new → assigned

comment:7 by Sergei Maertens, 8 years ago

Has patch: set

As I was implementing this, some extra information:

  • The RFC 6532 is a red herring. The actual issue was that the unicode characters were not properly MIME word-encoded. To support said RFC, an extra mail-server plugin must be enabled, so I went the safe way for Django where that's not needed.
  • There are major differences between Python 2 and 3. This is the case in:
    • the FakeSMTPServer in the testcases, where the mailfrom variable might not be MIME-word-encoded all the way.
    • the usage of Header(<string>, <encoding>), where the str representation calls the encode method on Python 2 and on Python 3 it simply returns the initial <string> that was passed in.

Potential issue:
Because of the difference in str representation, the code has been altered to always call the encode method. This causes simple ASCII local parts to look garbled, for instance: to@example.com becomes =?utf-8?q?to?=@example.com. When Django users test the e-mail messages generated by Django, their tests may fail because they're not expecting the encoded version. I have not personally confirmed this yet, though.
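
To illustrate the encoding concern, here is a rough sketch using the stdlib Header class. The helper encode_local_part is hypothetical (it is not the function from the patch), and encoding only when the local part is not pure ASCII is an assumption about one way to avoid the garbled output:

```python
from email.header import Header


def encode_local_part(address, encoding="utf-8"):
    """Hypothetical helper: MIME word-encode only a non-ASCII local part."""
    local, sep, domain = address.rpartition("@")
    try:
        local.encode("ascii")
    except UnicodeEncodeError:
        # Only reached when the local part contains non-ASCII characters.
        local = Header(local, encoding).encode()
    return local + sep + domain


# Always calling encode() rewrites even a plain ASCII local part.
always_encoded = Header("to", "utf-8").encode() + "@example.com"

# Encoding conditionally leaves plain ASCII addresses untouched.
unchanged = encode_local_part("to@example.com")
encoded = encode_local_part("juan.lópez@abc.com")

print(always_encoded)
print(unchanged)
print(encoded)
```

With the conditional variant, tests that compare against plain ASCII addresses keep passing, while non-ASCII local parts are still turned into an ASCII-safe encoded word.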

comment:8 by Sergei Maertens, 8 years ago

Summary: RFC 6532 support for e-mail → Django crashes on unicode characters in the local part of an e-mail address

comment:9 by Tim Graham, 8 years ago

Patch needs improvement: set

Left comments for improvement on the PR.

comment:10 by Sergei Maertens, 8 years ago

Patch needs improvement: unset

Remarks were processed, up for review again (if the build succeeds).

comment:11 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: assigned → closed

In ec009ef1:

Fixed #25986 -- Fixed crash sending email with non-ASCII in local part of the address.

On Python 3, sending emails failed for addresses containing non-ASCII
characters due to the usage of the legacy Python email.utils.formataddr()
function. This is fixed by using the proper Address object on Python 3.
