Opened 9 months ago

Closed 8 months ago

#23397 closed Bug (fixed)

Multipart base64 file decoding of fails with large files when the encoded string contains newlines.

Reported by: jhobbs Owned by: nobody
Component: HTTP handling Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by jhobbs)

Large files are files larger than the chunk size in MultiPartParser.parse().

parse() tries to process base64 encoded files in chunks with lengths a multiple of 4. The base64 encoded string can contain newline characters, which aren't significant in base64 and should be ignored, but are counted toward's a chunk's length in parse(). This means when whitespace is stripped from the string, the count of base64 encoded characters may not be a multiple of 4, leading to an "Incorrect padding" error.

This pull request has a testcase for producing the failure and a proposed patch to fix the issue.

https://github.com/django/django/pull/3151/files

Stacktrace:

Traceback (most recent call last):
  File "/home/jason/canonical/code/django/tests/file_uploads/tests.py", line 109, in test_big_base64_newlines_upload
    "Big data" * 68000, encode=base64.encodestring)
  File "/home/jason/canonical/code/django/tests/file_uploads/tests.py", line 96, in _test_base64_upload
    response = self.client.request(**r)
  File "/home/jason/canonical/code/django/django/test/client.py", line 443, in request
    six.reraise(*exc_info)
  File "/home/jason/canonical/code/django/django/core/handlers/base.py", line 121, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/home/jason/canonical/code/django/tests/file_uploads/views.py", line 103, in file_upload_echo_content
    r = dict((k, f.read().decode('utf-8')) for k, f in request.FILES.items())
  File "/home/jason/canonical/code/django/django/core/handlers/wsgi.py", line 152, in _get_files
    self._load_post_and_files()
  File "/home/jason/canonical/code/django/django/http/request.py", line 249, in _load_post_and_files
    self._post, self._files = self.parse_file_upload(self.META, data)
  File "/home/jason/canonical/code/django/django/http/request.py", line 214, in parse_file_upload
    return parser.parse()
  File "/home/jason/canonical/code/django/django/http/multipartparser.py", line 220, in parse
    six.reraise(MultiPartParserError, MultiPartParserError(msg), sys.exc_info()[2])
  File "/home/jason/canonical/code/django/django/http/multipartparser.py", line 216, in parse
    chunk = base64.b64decode(chunk)
  File "/usr/lib/python2.7/base64.py", line 76, in b64decode
    raise TypeError(msg)
MultiPartParserError: Could not decode base64 data: TypeError(Error('Incorrect padding',),)

Change History (8)

comment:1 Changed 9 months ago by jhobbs

  • Description modified (diff)
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 9 months ago by jhobbs

  • Description modified (diff)

comment:3 Changed 9 months ago by jhobbs

  • Description modified (diff)
  • Has patch set

comment:4 Changed 9 months ago by bmispelon

  • Needs documentation set
  • Patch needs improvement set
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.6 to master

Hi,

Indeed, there seem to be a bug here.

I can't say whether your fix is the right approach, but it does seem to fix the issue and the testcase you added currently fails which is a good sign.
The testcase fails on python3 however, and the PR is missing some documentation (a mention in the release notes for 1.8).

I'm checking patch needs improvement. Please uncheck it when you've fixed the two points above.

Thanks

comment:5 Changed 9 months ago by jhobbs

  • Patch needs improvement unset

I've fixed the python3 test failures and added documentation to the release note.

comment:6 Changed 9 months ago by timgraham

  • Needs documentation unset

This looks like more of a bug fix than a "feature" so the release notes probably aren't required. Could you give some background how you encountered this issue so I can better understand it?

comment:7 Changed 9 months ago by jhobbs

I hit this working on MAAS - http://launchpad.net/maas - which uses Django for its web interface and API. We recently added an API call for uploading large files, which is where I ran into this.

MAAS's cli uses python stdlib's Email.generator to build multipart requests (http://bazaar.launchpad.net/~maas-maintainers/maas/trunk/view/head:/src/apiclient/multipart.py).

Here's the original bug on launchpad:
https://bugs.launchpad.net/maas/+bug/1363348

comment:8 Changed 8 months ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In e1424b237003723c12400b5e1cfd95c87650e62b:

Fixed #23397 -- Stripped whitespace from base64 during chunking

This insures the actual base64 content has a length a multiple of 4.
Also added a test case for the failure.

Note: See TracTickets for help on using tickets.
Back to Top