Opened 8 years ago

Closed 8 years ago

#25986 closed Bug (fixed)

Django crashes on unicode characters in the local part of an e-mail address

Reported by: Sergei Maertens
Owned by: Sergei Maertens
Component: Core (Mail)
Version: 1.9
Severity: Normal
Keywords:
Cc: george@…, martin.pajuste@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

With Python 3.5 and Django 1.9 I'm running into trouble with internationalized e-mail addresses. According to RFC 6532 it is possible to have unicode characters in the e-mail address: https://tools.ietf.org/html/rfc6532.html and this RFC *should* be supported in Python 3.5: https://docs.python.org/3/whatsnew/3.5.html#email. Steps to reproduce are at the bottom of the ticket.

Now, for validating e-mail addresses in various places, Django calls the stdlib email.utils.formataddr, which doesn't seem to respect this RFC, not even if you explicitly set EmailPolicy.utf8 (see https://docs.python.org/3/library/email.policy.html#email.policy.EmailPolicy.utf8), as I see no reference to that in the source code.

This function in the stdlib blatantly calls address.encode('ascii'). Luckily, it's quite short, and I would suggest rolling 'our own' formataddr function (for the time being). I'll bring this issue up on the Python bug tracker as well. I think it's possible, as it's a relatively simple function of only 40 LoC, of which 12 lines are docstring.
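
The failure can also be reproduced with the standard library alone, without Django. The following is a minimal sketch, assuming a Python 3 version in which email.utils.formataddr still performs the ASCII check described above; the ASCII-only address is an illustrative addition for contrast.

```python
from email.utils import formataddr

# ASCII-only address: formatted as expected.
ascii_result = formataddr(('dummy', 'juan.lopez@abc.com'))

# Non-ASCII local part: formataddr() runs address.encode('ascii') and raises.
try:
    formataddr(('dummy', 'juan.lópez@abc.com'))
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(ascii_result)
print(type(error).__name__)
```

The exception is raised before any formatting happens, so no charset argument passed to formataddr can work around it.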

Steps to reproduce
Basically shell output:

mkdir bug_email
cd bug_email
mkvirtualenv bug_email -p python3.5
(bug_email)  bug_email  python --version
Python 3.5.1
(bug_email)  bug_email  pip install Django==1.9
Collecting Django==1.9
  Using cached Django-1.9-py2.py3-none-any.whl
Installing collected packages: Django
Successfully installed Django-1.9
(bug_email)  bug_email  django-admin.py startproject bug_email .

# regular shell is enough to test
(bug_email)  bug_email  ./manage.py shell

Shell session

>>> from django.core.mail.message import sanitize_address
>>> sanitize_address(('dummy', u'juan.lópez@abc.com'), 'utf8')
Traceback (most recent call last):
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 69, in handle
    self.run_shell(shell=options['interface'])
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 61, in run_shell
    raise ImportError
ImportError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/mail/message.py", line 118, in sanitize_address
    return formataddr((nm, addr))
  File "/usr/lib64/python3.5/email/utils.py", line 91, in formataddr
    address.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xf3' in position 6: ordinal not in range(128)
>>> 
>>> sanitize_address(('dummy', u'juan.lópez@abc.com'), 'idna')
Traceback (most recent call last):
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 69, in handle
    self.run_shell(shell=options['interface'])
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/management/commands/shell.py", line 61, in run_shell
    raise ImportError
ImportError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/bbt/coding/.virtualenvs/bug_email/lib/python3.5/site-packages/django/core/mail/message.py", line 118, in sanitize_address
    return formataddr((nm, addr))
  File "/usr/lib64/python3.5/email/utils.py", line 91, in formataddr
    address.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xf3' in position 6: ordinal not in range(128)
>>> 

Change History (11)

comment:1 by Claude Paroz, 8 years ago

Triage Stage: Unreviewed → Accepted

Reporting the issue in Python would be the first step. Could you please report here the ticket number as soon as it is done?

comment:2 by Sergei Maertens, 8 years ago

Python bug tracker issue: http://bugs.python.org/issue25955

comment:3 by Sergei Maertens, 8 years ago

The upstream ticket has been closed, with the following comment:

formataddr is part of the legacy interface and has no knowledge of the current policy. So it doesn't support RFC 6532. For that you need to use the new API: just assign your address to the appropriate field, or create a headerregistry.Address object.

Looks like this will have to be solved on Django's end after all. I'm not too familiar with the e-mail library yet, but I could look into it when I get some more spare time.
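
For illustration, the "new API" mentioned in the upstream comment might be used roughly as follows. This is only a sketch of email.headerregistry.Address with the address from this ticket split into its parts by hand; it is not the code of the eventual Django patch.

```python
from email.headerregistry import Address

# Build the address from its parts instead of passing one string to formataddr().
address = Address(display_name='dummy', username='juan.lópez', domain='abc.com')

print(address.addr_spec)
print(str(address))
```

Unlike formataddr, constructing and printing the Address object does not attempt an ASCII encode of the local part.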


comment:4 by George Marshall, 8 years ago

Cc: george@… added

comment:5 by Martin Pajuste, 8 years ago

Cc: martin.pajuste@… added

comment:6 by Sergei Maertens, 8 years ago

Owner: changed from nobody to Sergei Maertens
Status: new → assigned

comment:7 by Sergei Maertens, 8 years ago

Has patch: set

As I was implementing this, some extra information:

  • The RFC 6532 is a red herring. The actual issue was that the unicode characters were not properly MIME word-encoded. To support said RFC, an extra mail-server plugin must be enabled, so I went the safe way for Django where that's not needed.
  • There are major differences between Python 2 and 3. This is the case in:
    • the FakeSMTPServer in the testcases, where the mailfrom variable might not be MIME-word-encoded all the way.
    • the usage of Header(<string>, <encoding>), where the str representation calls the encode method on Python 2, while on Python 3 it simply returns the initial <string> that was passed in.

Potential issue:
Because of this difference in str representation, the code has been altered to always call the encode method. This causes simple ASCII local parts to look garbled, for instance: to@example.com becomes =?utf-8?q?to?=@example.com. When Django users test the e-mail messages generated by Django, they may have failing tests because they're not expecting the encoded version. I have not personally confirmed this yet though.
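
The effect of always calling encode, and one way to sidestep it, can be sketched with email.header.Header on Python 3. The helper encode_local_part below is hypothetical and only illustrates the idea of encoding when the local part is not plain ASCII; it is not taken from the patch.

```python
from email.header import Header, decode_header


def encode_local_part(local_part, encoding='utf-8'):
    """Hypothetical helper: MIME word-encode only when the text is not ASCII."""
    try:
        local_part.encode('ascii')
    except UnicodeEncodeError:
        return Header(local_part, encoding).encode()
    return local_part


# Always encoding turns even a plain ASCII local part into an encoded word.
always_encoded = Header('to', 'utf-8').encode()

# Encoding conditionally leaves ASCII untouched and encodes the rest.
plain = encode_local_part('to')
encoded = encode_local_part('juan.lópez')

# The encoded word is pure ASCII and decodes back to the original text.
raw_bytes, charset = decode_header(encoded)[0]

print(always_encoded)
print(plain)
print(raw_bytes.decode(charset))
```

With a check like this, existing tests that compare against to@example.com would presumably keep passing, since only addresses that previously crashed would change form.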

comment:8 by Sergei Maertens, 8 years ago

Summary: RFC 6532 support for e-mail → Django crashes on unicode characters in the local part of an e-mail address

comment:9 by Tim Graham, 8 years ago

Patch needs improvement: set

Left comments for improvement on the PR.

comment:10 by Sergei Maertens, 8 years ago

Patch needs improvement: unset

Remarks were processed, up for review again (if the build succeeds).

comment:11 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: assigned → closed

In ec009ef1:

Fixed #25986 -- Fixed crash sending email with non-ASCII in local part of the address.

On Python 3, sending emails failed for addresses containing non-ASCII
characters due to the usage of the legacy Python email.utils.formataddr()
function. This is fixed by using the proper Address object on Python 3.
