Opened 10 years ago
Closed 10 years ago
#22971 closed Bug (fixed)
Can't receive file with non-ascii filename according to rfc2388
Reported by: | homm | Owned by: | nobody |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Requests, a popular Python library, has since version 2.0 sent files with non-ASCII characters in the filename in full compliance with RFC 2388:
The original local file name may be supplied as well, either as a
"filename" parameter either of the "content-disposition: form-data"
header or, in the case of multiple files, in a "content-disposition:
file" header of the subpart. The sending application MAY supply a
file name; if the file name of the sender's operating system is not
in US-ASCII, the file name might be approximated, or encoded using
the method of RFC 2231.
RFC 2231 defines extended parameters whose names end with a * character, and requests uses such a parameter name (filename*) to send non-ASCII file names.

    # requests 1.2.3
    >>> requests.post('http://ya.ru', files={'file': (u'файл', '123')}).request.body
    --cb90e5c32429403b99966534716cda56
    Content-Disposition: form-data; name="file"; filename="файл"
    Content-Type: application/octet-stream

    123
    --cb90e5c32429403b99966534716cda56--

    # requests 2.0
    >>> requests.post('http://ya.ru', files={'file': (u'файл', '123')}).request.body
    --40f2f1873ec843598773fe150b4f783a
    Content-Disposition: form-data; name="file"; filename*=utf-8''%D1%84%D0%B0%D0%B9%D0%BB

    123
    --40f2f1873ec843598773fe150b4f783a--
But Django doesn't recognize such files: it puts the raw file content in request.POST instead of populating request.FILES.
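For illustration, here is a minimal sketch of how a `filename*` parameter in the form shown above could be decoded. It is not Django's parser. The helper names `decode_extended_value` and `parse_filename` are hypothetical, the header is split naively on `;`, and RFC 2231 continuations (`filename*0*`, `filename*1*`) are not handled.

```python
from urllib.parse import unquote


def decode_extended_value(value):
    """Decode an RFC 2231 extended value: charset'language'percent-encoded-text."""
    parts = value.split("'", 2)
    if len(parts) != 3:
        # Not in extended form; return the value unchanged.
        return value
    charset, _language, encoded = parts
    # Assumption: fall back to UTF-8 when the charset part is empty.
    return unquote(encoded, encoding=charset or "utf-8", errors="replace")


def parse_filename(content_disposition):
    """Return the filename from a Content-Disposition header value, or None.

    Simplified on purpose: a naive split on ';' does not cope with
    semicolons inside quoted strings.
    """
    filename = None
    for part in content_disposition.split(";")[1:]:
        name, sep, value = part.strip().partition("=")
        if not sep:
            continue
        name = name.lower()
        if name == "filename*":
            filename = decode_extended_value(value)
        elif name == "filename":
            filename = value.strip('"')
    return filename


header = "form-data; name=\"file\"; filename*=utf-8''%D1%84%D0%B0%D0%B9%D0%BB"
print(parse_filename(header))  # файл
```

With a parser along these lines, the part sent by requests 2.0 would be recognized as a file upload rather than as an ordinary form field.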
Attachments (1)
Change History (10)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 10 years ago
Thanks, claudep, it's great!
I want to note that, according to RFC 5987 (http://tools.ietf.org/html/rfc5987#section-4.2), the Unicode value should be preferred over the ASCII one.
In this case, the sender provides an ASCII version of the title for
legacy recipients, but also includes an internationalized version for
recipients understanding this specification -- the latter obviously
ought to prefer the new syntax over the old one.
I still don't know whether RFC 5987 is applicable to multipart headers, though.
comment:5 by , 10 years ago
In the current implementation, the last one always wins. I'm not sure it's worth complicating the implementation if we don't even know whether some user agents are indeed using this feature (and with the ASCII version appearing after the encoded one).
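The two policies under discussion can be compared with a small sketch. This is an illustration only, not the code from the patch: all function names are hypothetical, and the header is split naively on `;`.

```python
from urllib.parse import unquote


def split_params(header_value):
    """Return (name, value) pairs from a header value (naive ';' split)."""
    pairs = []
    for part in header_value.split(";")[1:]:
        name, sep, value = part.strip().partition("=")
        if sep:
            pairs.append((name.lower(), value))
    return pairs


def decode(name, value):
    """Decode a plain or RFC 2231 extended (name ends with '*') value."""
    if name.endswith("*"):
        charset, _language, encoded = value.split("'", 2)
        return unquote(encoded, encoding=charset or "utf-8")
    return value.strip('"')


def filename_last_wins(header_value):
    """Whichever of filename / filename* appears last is used."""
    result = None
    for name, value in split_params(header_value):
        if name in ("filename", "filename*"):
            result = decode(name, value)
    return result


def filename_prefer_extended(header_value):
    """filename* is used whenever present, regardless of position."""
    plain = extended = None
    for name, value in split_params(header_value):
        if name == "filename*":
            extended = decode(name, value)
        elif name == "filename":
            plain = decode(name, value)
    return extended if extended is not None else plain


# Hypothetical header with the ASCII version after the encoded one.
header = (
    "form-data; name=\"file\"; "
    "filename*=utf-8''%D1%84%D0%B0%D0%B9%D0%BB; filename=\"file\""
)
print(filename_last_wins(header))        # file
print(filename_prefer_extended(header))  # файл
```

The two functions differ only when both parameters are present and the plain `filename` comes last, which is the case questioned above.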
comment:6 by , 10 years ago
RFC 5987 isn't relevant here, as it covers the server sending files to the client, not the reverse.
comment:8 by , 10 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:9 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Related requests ticket: https://github.com/kennethreitz/requests/issues/2117