Opened 6 years ago

Closed 4 years ago

Last modified 4 years ago

#11212 closed Cleanup/optimization (fixed)

Don't encode emails with base64/qp

Reported by: phr Owned by: nobody
Component: Core (Mail) Version: master
Severity: Normal Keywords: send_mail
Cc: petr.hroudny@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX:

Description

#3472 introduced a patch, which replaces base64 encoding with quoted-printable.

Anyway, there's even better way to do get rid of unnecessary encoding:

from email import Charset 
Charset.add_charset('utf-8',Charset.SHORTEST,None,'utf-8')

The above will set Content-Transfer-Encoding to 7bit or 8bit as needed - exactly as Mutt or Thunderbird do
and completely solve the problem even for non-latin characters.

Attachments (2)

0001-Ticket-11212-default-to-7bit-email.patch (6.6 KB) - added by gisle 6 years ago.
Patch with adjusted tests
11212.1.diff (4.9 KB) - added by ramiro 5 years ago.
Patch updated to trunk status as of now, also added test for #3472/r5143.

Download all attachments as: .zip

Change History (20)

comment:1 Changed 6 years ago by phr

  • Cc petr.hroudny@… added; phr removed
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 6 years ago by russellm

  • Needs tests set

Tests aren't optional. If you're changing code, you need tests - or a good excuse why a test isn't possible.

comment:3 Changed 6 years ago by phr

I'm not quite sure what to test here. Only a single parameter needs to be changed
in file django/core/mail.py at line 23:

-Charset.add_charset('utf-8', Charset.SHORTEST, Charset.QP, 'utf-8')
+Charset.add_charset('utf-8', Charset.SHORTEST, None, 'utf-8')

comment:4 Changed 6 years ago by lukeplant

You need to update and add to the tests in tests/regressiontests/mail/tests.py, which fail with the above change. You should add tests that show what happens with 7bit and 8bit data.

comment:5 Changed 6 years ago by phr

Yes, sure those tests fail. They test for:

Content-Transfer-Encoding: quoted-printable

at multiple places, which after the proposed change will not pass.
If the input is pure ascii, that header will become:

Content-Transfer-Encoding: 7bit

If the input contains some accentuated character - e.g. 'áéí', there will be 8bit instead.

comment:6 Changed 6 years ago by russellm

  • milestone set to 1.2
  • Triage Stage changed from Unreviewed to Accepted

comment:7 Changed 6 years ago by gisle

  • Owner changed from nobody to gisle

Changed 6 years ago by gisle

Patch with adjusted tests

comment:8 Changed 6 years ago by gisle

  • Needs tests unset
  • Owner changed from gisle to nobody
  • Patch needs improvement set

I've included a patch that makes the tests pass, but as demonstrated by the patch this make us start generating email with "Content-Transfer-Encoding: 8bit" when passing in Unicode strings as subject or content. I think that should be avoided, so I don't recommend this patch as is.

comment:9 Changed 6 years ago by ubernostrum

  • milestone 1.2 deleted

1.2 is feature-frozen, moving this feature request off the milestone.

comment:10 Changed 6 years ago by phr

Replying to gisle:

I've included a patch that makes the tests pass, but as demonstrated by the patch this make us start generating email with "Content-Transfer-Encoding: 8bit" when passing in Unicode strings as subject or content. I think that should be avoided, so I don't recommend this patch as is.


The use of 8bit encoding is exactly the aim of this ticket, so the patch does the right thing IMHO.

Except for a few legacy paths, SMTP is mostly 8bit today - and where not, it's the MTA's role to downconvert to 7bit.

Many email clients are generating 8bit-encoded messages (Thunderbird, Mutt, etc) and people routinely use 8bit email every day. Avoiding it makes no longer any sense, and the current practice of using quoted-printable does not work right for non-latin aplhabets.

Please note that other software already implemented my proposal - see e.g. http://trac.edgewall.org/ticket/8252

comment:11 Changed 6 years ago by phr

  • Patch needs improvement unset

comment:12 Changed 5 years ago by andrewbadr

  • milestone set to 1.3

comment:13 Changed 5 years ago by ramiro

  • Patch needs improvement set

Tests need to be converted to unit tests.

Changed 5 years ago by ramiro

Patch updated to trunk status as of now, also added test for #3472/r5143.

comment:14 Changed 5 years ago by ramiro

  • Patch needs improvement unset

comment:15 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Cleanup/optimization

comment:16 Changed 4 years ago by jezdez

  • Easy pickings unset
  • Triage Stage changed from Accepted to Ready for checkin

comment:17 Changed 4 years ago by jezdez

  • Resolution set to fixed
  • Status changed from new to closed

In [16178]:

Fixed #11212 -- Stopped using quoted-printable encoding for mails with non-ASCII characters but rely on 8bit encoding instead. Thanks, phr, gisle and Ramiro Morales.

comment:18 Changed 4 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

Note: See TracTickets for help on using tickets.
Back to Top