#26227 closed Bug (invalid)
Unicode attachment filename displays incorrectly in some clients
Reported by: | Sergey Gornostaev | Owned by: | nobody |
---|---|---|---|
Component: | Core (Mail) | Version: | 1.9 |
Severity: | Normal | Keywords: | email attachment, filenames, i18n |
Cc: | Thomi Richards, milosu, Pablo Castellano | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
When attaching a file whose name contains non-ASCII characters, GMail displays the attachment as "noname" and Zimbra 8.0.2 shows the name percent-encoded.
    from django.template.loader import get_template
    from django.core.mail import send_mail, EmailMultiAlternatives

    txt_msg_body = get_template('email.txt').render({})
    html_msg_body = get_template('email.html').render({})
    msg = EmailMultiAlternatives('Test', txt_msg_body, 'robot@somedomain.ru', ['sputterspark@gmail.com'])
    msg.attach_alternative(html_msg_body, "text/html")
    with open('test.pdf', 'rb') as fh:
        data = fh.read()
    msg.attach(u'Имя файла', data, 'application/pdf')
    msg.send()
Attachments (3)
Change History (18)
by , 9 years ago
Attachment: | GMailAndZimbra.png added |
---|
follow-up: 2 comment:1 by , 9 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Originally, being able to have unicode in attachment file names was added in ticket #14964.
I tested this:
    from django.core.mail import EmailMultiAlternatives

    msg = EmailMultiAlternatives('Test', 'email body\nend', 'from@example.com', ['to@example.com'])
    msg.attach_alternative('<html><body>email body<br>end</body></html>', 'text/html')
    msg.attach(u'fíle_with_ünicöde_çhårs', b'foobar', 'application/octet-stream')
    msg.send()
and got the following email body:
    Content-Type: multipart/mixed; boundary="===============5134186686965449755=="
    MIME-Version: 1.0
    Subject: Test
    From: from@example.com
    To: to@example.com
    Date: Wed, 17 Feb 2016 07:17:41 -0000
    Message-ID: <some_number@myhost>

    --===============5134186686965449755==
    Content-Type: multipart/alternative; boundary="===============0773237926637752706=="
    MIME-Version: 1.0

    --===============0773237926637752706==
    MIME-Version: 1.0
    Content-Type: text/plain; charset="utf-8"
    Content-Transfer-Encoding: 7bit

    email body
    end
    --===============0773237926637752706==
    MIME-Version: 1.0
    Content-Type: text/html; charset="utf-8"
    Content-Transfer-Encoding: 7bit

    <html><body>email body<br>end</body></html>
    --===============0773237926637752706==--
    --===============5134186686965449755==
    Content-Type: application/octet-stream
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename*="utf-8''f%C3%ADle_with_%C3%BCnic%C3%B6de_%C3%A7h%C3%A5rs"

    Zm9vYmFy
    --===============5134186686965449755==--
According to RFC 2231 that is encoded correctly. My MUA also displays it correctly, so it seems to be an error with Google and Zimbra.
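For readers who want to check the encoding without sending mail, here is a minimal sketch using only Python 3's standard `email` package (no Django involved). It assumes Python 3, where a non-ASCII parameter value passed to `add_header()` is written as an unquoted RFC 2231 extended parameter; the variable names are illustrative only.

```python
from email.mime.base import MIMEBase

# Bare attachment part built with the standard library only (no Django).
part = MIMEBase("application", "octet-stream")

# On Python 3, a non-ASCII parameter value is emitted as an RFC 2231
# extended parameter: filename*=charset'language'percent-encoded-bytes
filename = "fíle_with_ünicöde_çhårs"
part.add_header("Content-Disposition", "attachment", filename=filename)

raw_header = part["Content-Disposition"]
print(raw_header)

# The same library decodes the extended parameter back to the original name.
decoded = part.get_filename()
print(decoded)
```

The percent-encoded value matches the one in the dump above; whether a given mail client decodes it is a separate question.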
comment:2 by , 8 years ago
Resolution: | invalid |
---|---|
Status: | closed → new |
Is this filename encoding really correct? I have the same problem, and none of the MUAs I tried (including Gmail, Outlook and Kmail) shows the filenames correctly.
by , 8 years ago
Attachment: | Screenshot_20160915_125209.png added |
---|
Django unicode filenames in Kmail
follow-up: 5 comment:3 by , 8 years ago
Just tested now on Thunderbird (Linux), Gmail and Roundcube, and the filename is displaying fine.
comment:4 by , 8 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Summary: | Unicode attachment filename → Unicode attachment filename displays incorrectly in some clients |
@gasinvein, please tell us where the bug is in Django if it's an issue.
comment:5 by , 8 years ago
Just tested now on Thunderbird (Linux), Gmail and Roundcube, and the filename is displaying fine.
So what am I doing wrong? I tried your example in comment:1 with Django 1.10.1.
comment:6 by , 6 years ago
Resolution: | invalid |
---|---|
Status: | closed → new |
Hi,
I came across this issue in django 1.11.11 - using the EmailMessage class, attachments with non-ascii characters in their filenames render as 'noname' in GMail.
I'm no expert in MIME - I've read RFC2231 and RFC2047, which seem to be on-topic for this case, but the exact "correct" behaviour here isn't obvious to me. However, I was able to fix the issue like so:
    from email.header import Header

    from django.core.mail import EmailMessage


    class EmailMessageWithAttachmentEncoding(EmailMessage):
        def _create_attachment(self, filename, content, mimetype=None):
            attachment = self._create_mime_attachment(content, mimetype)
            if filename:
                try:
                    parameters = {
                        'filename': filename.encode('ascii'),
                    }
                except UnicodeEncodeError:
                    # Include both parameters manually because Python's implementation
                    # only adheres to RFC2231 and not RFC2047 which breaks some clients
                    # such as GMail.
                    filename = Header(filename, 'utf-8').encode()
                    parameters = {
                        'filename*': filename,  # RFC2231
                        'filename': filename,  # RFC2047
                    }
                attachment.add_header('Content-Disposition', 'attachment', **parameters)
            return attachment
I'm not sure if the django project would accept this as a patch, especially since it seems to me like the correct behaviour here is somewhat undefined (perhaps there's a MIME expert willing to testify?). In any case, this solution has worked for me, and might help others who stumble across this page while trying to debug the same issue.
I've re-opened the issue, since it seems like we probably want django's email features to work with GMail, even if the fix differs from what I've pasted above.
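The idea behind the workaround in this comment (send both a legacy `filename` and an extended `filename*` parameter when the name is not ASCII) can be sketched independently of Django. The following is a hedged Python 3 sketch using only the standard library; `set_attachment_disposition` is a hypothetical helper, not part of Django or of the patch above, and it has not been verified against GMail. Note that RFC 2047 formally does not allow encoded words inside a quoted parameter value, so the legacy form relies on clients being lenient.

```python
from email.header import Header
from email.mime.base import MIMEBase
from email.utils import encode_rfc2231


def set_attachment_disposition(part, filename):
    """Hypothetical helper: set Content-Disposition with both filename forms."""
    try:
        filename.encode("ascii")
    except UnicodeEncodeError:
        # maxlinelen=0 asks Header not to fold the encoded word across lines.
        legacy = Header(filename, "utf-8").encode(maxlinelen=0)
        # RFC 2231 form: charset'language'percent-encoded-bytes
        extended = encode_rfc2231(filename, "utf-8")
        part["Content-Disposition"] = (
            'attachment; filename="%s"; filename*=%s' % (legacy, extended)
        )
    else:
        # Plain ASCII names need no special encoding.
        part.add_header("Content-Disposition", "attachment", filename=filename)


part = MIMEBase("application", "pdf")
set_attachment_disposition(part, "Имя файла")
print(part["Content-Disposition"])

ascii_part = MIMEBase("application", "pdf")
set_attachment_disposition(ascii_part, "report.pdf")
print(ascii_part["Content-Disposition"])
```

Unlike the snippet in the comment, this sketch puts a genuine RFC 2231 value in `filename*` and the RFC 2047 encoded word only in `filename`; which combination a particular client prefers is an assumption to test.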
comment:7 by , 6 years ago
Cc: | added |
---|
comment:8 by , 6 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
What are the steps to reproduce the issue? I tried the steps in the ticket description and the attachment name looks fine. Also, please test with Django master (or at least Django 2.1 beta) rather than Django 1.11 which is quite old at this point.
comment:9 by , 6 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
I managed to reproduce by sending an email to a @gmail.com address with an attachment containing non-ASCII characters on master.
    from django.core.mail import EmailMultiAlternatives

    msg = EmailMultiAlternatives('Subject', 'Body', '...@gmail.com', ['...@gmail.com'])
    msg.attach('Имя файла', b'data', 'text/plain')
    msg.send()
The issue seems to be that GMail ignores RFC2231 header parameters (e.g. filename*=) and only accepts RFC2047 ones (filename=?UTF...).
The code changes suggested by Thomi include both parameters if the attachment name is not ASCII encodable.
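To make the two encodings in this comment concrete, here is a small sketch that encodes the same filename both ways with Python 3's standard library. It only illustrates what the two forms look like on the wire; it says nothing about which form a given client accepts.

```python
from email.header import Header, decode_header
from email.utils import encode_rfc2231

filename = "Имя файла"

# RFC 2231: charset'language'percent-encoded-bytes, used with "filename*=".
rfc2231_value = encode_rfc2231(filename, "utf-8")

# RFC 2047: an "encoded word" of the form =?charset?encoding?data?=,
# which some clients accept in a plain "filename=" parameter.
rfc2047_value = Header(filename, "utf-8").encode()

print("filename*=" + rfc2231_value)
print('filename="' + rfc2047_value + '"')

# Decoding the RFC 2047 word gives back the original name.
raw_bytes, charset = decode_header(rfc2047_value)[0]
roundtrip = raw_bytes.decode(charset)
print(roundtrip)
```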
comment:10 by , 6 years ago
The name appears fine on the web version of gmail I'm using. I'll attach a screenshot with what I see.
by , 6 years ago
Attachment: | gmail-screenshot.png added |
---|
comment:11 by , 6 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
It looks like I can't reproduce against master anymore, as the issue only manifests itself on Python 2. Sorry for the false alarm, Tim.
Here's how the attachment is sent on Python 2:
    MIME-Version: 1.0
    Content-Type: text/plain; charset="utf-8"
    Content-Transfer-Encoding: 7bit
    Content-Disposition: attachment; filename*="utf-8''%D0%98%D0%BC%D1%8F%20%D1%84%D0%B0%D0%B9%D0%BB%D0%B0"

    data
And on Python 3:
    Content-Type: text/plain; charset="utf-8"
    MIME-Version: 1.0
    Content-Transfer-Encoding: 7bit
    Content-Disposition: attachment; filename*=utf-8''%D0%98%D0%BC%D1%8F%20%D1%84%D0%B0%D0%B9%D0%BB%D0%B0

    data
Notice that both use an RFC 2231 filename*= parameter, but the value is within double quotes on Python 2 while it isn't on Python 3. That seems to be the reason why GMail rejects the encoded value.
This was changed in Python 3.1 dfd7eb and detailed in CPython#1693546
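A rough way to tell the two forms apart, and to see that Python's own parser is lenient about the quotes, is sketched below. The header values are copied from the two dumps in this comment; `is_quoted_extended` and `decoded_filename` are hypothetical helper names, and the regular expression is a simple heuristic rather than a full header parser.

```python
import re
from email import message_from_string

ENCODED = "utf-8''%D0%98%D0%BC%D1%8F%20%D1%84%D0%B0%D0%B9%D0%BB%D0%B0"
QUOTED = 'attachment; filename*="' + ENCODED + '"'   # Python 2 style
UNQUOTED = "attachment; filename*=" + ENCODED        # Python 3 style

# Heuristic: an extended parameter whose value starts with a double quote.
QUOTED_EXTENDED_PARAM = re.compile(r'filename\*\s*=\s*"')


def is_quoted_extended(disposition):
    """Return True if the filename* value is wrapped in double quotes."""
    return bool(QUOTED_EXTENDED_PARAM.search(disposition))


def decoded_filename(disposition):
    """Parse a one-header message and let the stdlib decode the filename."""
    msg = message_from_string("Content-Disposition: %s\n\ndata" % disposition)
    return msg.get_filename()


for value in (QUOTED, UNQUOTED):
    print(is_quoted_extended(value), decoded_filename(value))
```

That the standard library decodes both forms suggests the difference only matters to stricter receivers, which is consistent with the observation about GMail here.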
comment:12 by , 5 years ago
To be honest, I'm still having the "noname" problem in GMail, when some utf-8 characters are present in the filename.
I can see that the e-mail I'm sending does not have the double quotes around filename.
But my application is behind two Microsoft SMTP Servers (internal and outbound) and it looks like one of them will silently add the double quotes before sending the message to Google Gmail.
That being said, what works for me is really the patch to _create_attachment method as proposed by Thomi in Comment No. 6.
With his patch applied, the raw e-mail when received by Google looks like:
    --===============0157380707==
    Content-Type: application/pdf
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename="gartner_ěščřšěřšěčéříš909.pdf"

    --===============0157380707==--
and the filename will be displayed correctly.
Without the patch, GMail displays the attachment as "noname", and the headers received by Google look like:
    --===============2001112103==
    Content-Type: application/pdf
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename*="utf-8''gartner_%C4%9B%C5%A1%C4%8D%C5%99%C5%A1%C4%9B%C5%99%C5%A1%C4%9B%C4%8D%C3%A9%C5%99%C3%AD%C5%A1909.pdf"
While the raw e-mail when generated by Django without the patch looks like:
    --===============0135089781==
    Content-Type: application/pdf
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename*=utf-8''gartner_%C4%9B%C5%A1%C4%8D%C5%99%C5%A1%C4%9B%C5%99%C5%A1%C4%9B%C4%8D%C3%A9%C5%99%C3%AD%C5%A1909.pdf
So somehow Microsoft SMTP Server or some third-party filter adds the double quotes during processing.
The error does not happen when sending the same e-mail via Postfix. It looks like a tricky interoperability problem indeed.
Thank you Thomi anyway..
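When a relay is suspected of rewriting headers, as described in this comment, one way to confirm it is to compare the Content-Disposition headers of the message as generated with those of the message as received (for example from the recipient's raw message view). A minimal sketch, assuming both raw messages are available as strings; the helper names and the shortened sample filename are made up for illustration.

```python
from email import message_from_string


def attachment_dispositions(raw_message):
    """Return the raw Content-Disposition value of every part that has one."""
    msg = message_from_string(raw_message)
    return [
        part["Content-Disposition"]
        for part in msg.walk()
        if part["Content-Disposition"] is not None
    ]


def dispositions_differ(sent_raw, received_raw):
    """True if any Content-Disposition header changed between the two copies."""
    return attachment_dispositions(sent_raw) != attachment_dispositions(received_raw)


# Shortened, made-up sample headers in the style of the dumps above.
SENT = (
    "Content-Type: application/pdf\n"
    "MIME-Version: 1.0\n"
    "Content-Disposition: attachment; filename*=utf-8''report_%C4%9B.pdf\n"
    "\n"
)
RECEIVED = (
    "Content-Type: application/pdf\n"
    "MIME-Version: 1.0\n"
    "Content-Disposition: attachment; filename*=\"utf-8''report_%C4%9B.pdf\"\n"
    "\n"
)

print(attachment_dispositions(SENT))
print(attachment_dispositions(RECEIVED))
print(dispositions_differ(SENT, RECEIVED))
```

A difference that appears only after the message has passed through a particular server points at that hop rather than at Django.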
comment:13 by , 5 years ago
Cc: | added |
---|
comment:14 by , 5 years ago
comment:15 by , 5 years ago
Cc: | added |
---|
Emails screenshot