Opened 8 years ago

Closed 6 years ago

Last modified 5 years ago

#11212 closed Cleanup/optimization (fixed)

Don't encode emails with base64/qp

Reported by: phr Owned by: nobody
Component: Core (Mail) Version: master
Severity: Normal Keywords: send_mail
Cc: petr.hroudny@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX:

Description

#3472 introduced a patch, which replaces base64 encoding with quoted-printable.

Anyway, there's even better way to do get rid of unnecessary encoding:

from email import Charset 
Charset.add_charset('utf-8',Charset.SHORTEST,None,'utf-8')

The above will set Content-Transfer-Encoding to 7bit or 8bit as needed - exactly as Mutt or Thunderbird do
and completely solve the problem even for non-latin characters.

Attachments (2)

0001-Ticket-11212-default-to-7bit-email.patch (6.6 KB) - added by gisle 7 years ago.
Patch with adjusted tests
11212.1.diff (4.9 KB) - added by Ramiro Morales 6 years ago.
Patch updated to trunk status as of now, also added test for #3472/r5143.

Download all attachments as: .zip

Change History (20)

comment:1 Changed 8 years ago by phr

Cc: petr.hroudny@… added; phr removed

comment:2 Changed 8 years ago by Russell Keith-Magee

Needs tests: set

Tests aren't optional. If you're changing code, you need tests - or a good excuse why a test isn't possible.

comment:3 Changed 8 years ago by phr

I'm not quite sure what to test here. Only a single parameter needs to be changed
in file django/core/mail.py at line 23:

-Charset.add_charset('utf-8', Charset.SHORTEST, Charset.QP, 'utf-8')
+Charset.add_charset('utf-8', Charset.SHORTEST, None, 'utf-8')

comment:4 Changed 7 years ago by Luke Plant

You need to update and add to the tests in tests/regressiontests/mail/tests.py, which fail with the above change. You should add tests that show what happens with 7bit and 8bit data.

comment:5 Changed 7 years ago by phr

Yes, sure those tests fail. They test for:

Content-Transfer-Encoding: quoted-printable

at multiple places, which after the proposed change will not pass.
If the input is pure ascii, that header will become:

Content-Transfer-Encoding: 7bit

If the input contains some accentuated character - e.g. 'áéí', there will be 8bit instead.

comment:6 Changed 7 years ago by Russell Keith-Magee

milestone: 1.2
Triage Stage: UnreviewedAccepted

comment:7 Changed 7 years ago by gisle

Owner: changed from nobody to gisle

Changed 7 years ago by gisle

Patch with adjusted tests

comment:8 Changed 7 years ago by gisle

Needs tests: unset
Owner: changed from gisle to nobody
Patch needs improvement: set

I've included a patch that makes the tests pass, but as demonstrated by the patch this make us start generating email with "Content-Transfer-Encoding: 8bit" when passing in Unicode strings as subject or content. I think that should be avoided, so I don't recommend this patch as is.

comment:9 Changed 7 years ago by James Bennett

milestone: 1.2

1.2 is feature-frozen, moving this feature request off the milestone.

comment:10 Changed 7 years ago by phr

Replying to gisle:

I've included a patch that makes the tests pass, but as demonstrated by the patch this make us start generating email with "Content-Transfer-Encoding: 8bit" when passing in Unicode strings as subject or content. I think that should be avoided, so I don't recommend this patch as is.


The use of 8bit encoding is exactly the aim of this ticket, so the patch does the right thing IMHO.

Except for a few legacy paths, SMTP is mostly 8bit today - and where not, it's the MTA's role to downconvert to 7bit.

Many email clients are generating 8bit-encoded messages (Thunderbird, Mutt, etc) and people routinely use 8bit email every day. Avoiding it makes no longer any sense, and the current practice of using quoted-printable does not work right for non-latin aplhabets.

Please note that other software already implemented my proposal - see e.g. http://trac.edgewall.org/ticket/8252

comment:11 Changed 7 years ago by phr

Patch needs improvement: unset

comment:12 Changed 6 years ago by Andrew Badr

milestone: 1.3

comment:13 Changed 6 years ago by Ramiro Morales

Patch needs improvement: set

Tests need to be converted to unit tests.

Changed 6 years ago by Ramiro Morales

Attachment: 11212.1.diff added

Patch updated to trunk status as of now, also added test for #3472/r5143.

comment:14 Changed 6 years ago by Ramiro Morales

Patch needs improvement: unset

comment:15 Changed 6 years ago by Julien Phalip

Severity: Normal
Type: Cleanup/optimization

comment:16 Changed 6 years ago by Jannis Leidel

Easy pickings: unset
Triage Stage: AcceptedReady for checkin

comment:17 Changed 6 years ago by Jannis Leidel

Resolution: fixed
Status: newclosed

In [16178]:

Fixed #11212 -- Stopped using quoted-printable encoding for mails with non-ASCII characters but rely on 8bit encoding instead. Thanks, phr, gisle and Ramiro Morales.

comment:18 Changed 5 years ago by Jacob

milestone: 1.3

Milestone 1.3 deleted

Note: See TracTickets for help on using tickets.
Back to Top