Opened 2 years ago

Closed 2 years ago

Last modified 4 months ago

#33969 closed New feature (needsinfo)

Improve django.core.mail.messages EAI processing

Reported by: j-bernard
Owned by: nobody
Component: Core (Mail)
Version: dev
Severity: Normal
Keywords: EAI IDNA RFC
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

This ticket is the third and last in a series aiming to bring Email Address Internationalization (EAI) compliance to Django by supporting Internationalized Domain Names (IDN) according to the latest standard (IDNA 2008) and by fixing some processing of internationalized domains and email addresses.
Previous tickets: #33967, #33968

sanitize_address converts the email address domain to Punycode regardless of whether the email server supports EAI.

The main issue is that the conversion uses the deprecated IDNA 2003 standard instead of IDNA 2008 (see the previous ticket for more information). In addition, the conversion should be skipped by default, for consistency with the user input, and performed only if the server does not support Unicode email addresses.

The logic (with the backend using Python smtplib) would then be:

  • try to send the message, regardless of the presence of Unicode in the address (Python smtplib will add the right options if you use the send_message method else the SMTPUTF8 option should be provided)
  • if smtplib.SMTPNotSupportedError is raised
    • if the local-part is ASCII only, convert the domain to A-Label using an IDNA 2008-compliant library and retry
    • else return failure
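
The steps above could be sketched roughly as follows. This is only an illustration of the proposed control flow, not Django code: `send` and `to_alabel` are hypothetical callables supplied by the caller (the first wrapping the SMTP call, the second wrapping whichever IDNA library is chosen), and the function name is made up for this example.

```python
import smtplib


def send_with_alabel_fallback(send, from_addr, to_addr, to_alabel):
    """Try SMTPUTF8 first; fall back to an A-label domain when possible.

    `send(from_addr, to_addr, use_smtputf8)` and `to_alabel(domain)` are
    hypothetical callables supplied by the caller; they are not Django APIs.
    """
    try:
        # First attempt: pass the address through untouched and request SMTPUTF8.
        return send(from_addr, to_addr, True)
    except smtplib.SMTPNotSupportedError:
        local_part, _, domain = to_addr.rpartition("@")
        if not local_part.isascii():
            # A non-ASCII local-part cannot be downgraded, so report failure.
            raise
        # ASCII local-part: only the domain needs converting before the retry.
        return send(from_addr, f"{local_part}@{to_alabel(domain)}", False)
```

With smtplib, `send` would presumably wrap `SMTP.sendmail()` and pass `mail_options=["SMTPUTF8"]` only when `use_smtputf8` is true. Keeping the domain converter injectable leaves the choice of IDNA version open.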


Change History (4)

comment:1 by Florian Apolloner, 2 years ago

try to send the message, regardless of the presence of Unicode in the address (Python smtplib will add the right options if you use the send_message method else the SMTPUTF8 option should be provided)

Can you link the code in question?

if smtplib.SMTPNotSupportedError is raised

What does this error tell us? Given a standard setup your "mail server" is often a simple relay running on localhost which probably doesn't give you errors often. Whether or how the server it relays to does domain internationalization is unknown to it.

I am not deep into mail standards, so any details you could provide here would be helpful. I also do think that we have some leeway when changing stuff here because international domain names are not used often. We just should make clear to not introduce any security issues (like sending mail to a different domain all of a sudden -- though I am not sure if we can prevent that realistically because for instance ß encodes differently in the standards iirc).

comment:2 by Mariusz Felisiak, 2 years ago

Resolution: needsinfo
Status: new → closed
Type: Uncategorized → New feature
Version: 4.0 → dev

in reply to:  1 comment:3 by j-bernard, 2 years ago

Replying to Florian Apolloner:

try to send the message, regardless of the presence of Unicode in the address (Python smtplib will add the right options if you use the send_message method else the SMTPUTF8 option should be provided)

Can you link the code in question?

This line transforms the domain to A-Label.

if smtplib.SMTPNotSupportedError is raised

What does this error tell us? Given a standard setup your "mail server" is often a simple relay running on localhost which probably doesn't give you errors often. Whether or how the server it relays to does domain internationalization is unknown to it.

This error means the mail server does not support the SMTPUTF8 option and won't be able to process a recipient address containing Unicode.

I am not deep into mail standards, so any details you could provide here would be helpful. I also do think that we have some leeway when changing stuff here because international domain names are not used often. We just should make clear to not introduce any security issues (like sending mail to a different domain all of a sudden -- though I am not sure if we can prevent that realistically because for instance ß encodes differently in the standards iirc).

IDNA 2008 was created to mitigate some security issues in the old IDNA 2003 standard. And as you mentioned, keeping both standards active creates even more issues. Many have already moved to the new standard, and Django could clearly make a difference by also being compliant. Without that kind of move, IDNs will keep being "not used often".

comment:4 by Mike Edmunds, 4 months ago

I looked into this earlier today as part of ticket #35581, and was surprised to find that IDNA 2003 is probably still the correct choice for sending email. Documenting my findings here.

The problem is for domains containing one of the deviation characters where the two IDNA versions differ. For instance:

IDNA 2003: otto@faß.example → otto@fass.example
IDNA 2008: otto@faß.example → otto@xn--fa-hia.example

If those two domains are owned by different people, and Django uses a different version of IDNA than Otto expects, Otto's email could go to the wrong person. Big problem.
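
The difference can be reproduced with the Python standard library alone. This is a small sketch: the built-in "idna" codec implements IDNA 2003, and because the standard library has no IDNA 2008 codec, the second result is built from raw Punycode on the single label (skipping all IDNA 2008 validation) purely to show the A-label form. The variable names are illustrative.

```python
# Python's built-in "idna" codec implements IDNA 2003 (RFC 3490 with Nameprep),
# which maps the deviation character "ß" to "ss" before encoding.
domain = "faß.example"
idna2003 = domain.encode("idna").decode("ascii")

# No IDNA 2008 codec ships with Python. Raw Punycode on the first label is
# used here only to show what the IDNA 2008 A-label looks like; a real
# implementation would use an IDNA 2008 library with proper validation.
label, rest = domain.split(".", 1)
idna2008_like = "xn--" + label.encode("punycode").decode("ascii") + "." + rest

print(idna2003)       # fass.example
print(idna2008_like)  # xn--fa-hia.example
```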

So the question is, what version of IDNA does Otto expect? Browsers have all updated to IDNA 2008: if you enter http://faß.example, you will end up at http://xn--fa-hia.example, not http://fass.example. (You can try this with the .de equivalents to those domains, which are currently parked at different registrars.)

I had assumed email should match the browsers, and be using IDNA 2008 by now. (And I was thinking that Django's not using it for email addresses was a serious security issue.) I was wrong.

In testing earlier today, I found both Gmail and Outlook.com are still using IDNA 2003 for domains in address headers: both treat otto@faß.example as otto@fass.example. (They might be using IDNA 2008, but with UTS #46 "transitional processing" enabled, which retains the IDNA 2003 encoding for the deviation characters.)

Bottom line: we wouldn't want to switch Django's sanitize_address() to use IDNA 2008 encoding (at least not without transitional processing), because that would actually introduce a security issue, by sending Otto's email to an unexpected domain.

Also, if I'm understanding correctly, part of the request here is to be able to get Django's EmailMessage.message().as_string() to generate a message that hasn't had any encoding applied to the addresses, for use with SMTPUTF8. (That is, To: jörg@faß.example should stay just like that, not turn into To: =?utf-8?q?j=C3=B6rg?=@fass.example.) I'm hoping to address that as part of #35581, if `email.policy.SMTPUTF8` is used for EmailMessage.message().
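
For reference, here is roughly what that looks like with only the standard library email package. This sketch uses Python's own email.message.EmailMessage, not Django's EmailMessage, and the sender address, subject, and body are placeholders.

```python
from email import policy
from email.message import EmailMessage

# Standard-library message (not Django's EmailMessage) built with the
# SMTPUTF8 policy, which allows raw UTF-8 in headers when serializing.
msg = EmailMessage(policy=policy.SMTPUTF8)
msg["From"] = "sender@example.com"  # placeholder sender
msg["To"] = "jörg@faß.example"
msg["Subject"] = "Test"
msg.set_content("Hello")

# The To header is written out as-is, without RFC 2047 or IDNA encoding.
serialized = msg.as_string()
print(serialized)
```

A message serialized this way is only deliverable if the SMTP session also negotiates SMTPUTF8.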

Note that Django's SMTP EmailBackend doesn't currently support SMTPUTF8. That's probably best handled as a separate new feature request. (Or could also be implemented by a third-party custom EmailBackend.)
