Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#30669 closed Bug (fixed)

Can FILE_UPLOAD_MAX_MEMORY_SIZE be set to None?

Reported by: Lincoln Owned by: nobody
Component: HTTP handling Version: dev
Severity: Normal Keywords: FILE_UPLOAD_MAX_MEMORY_SIZE, None, settings
Cc: Andrew Godwin Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The settings reference for FILE_UPLOAD_MAX_MEMORY_SIZE doesn't say anything about if this setting can be set to None.

The first usages of it in asgi.py does a is None check on it, suggesting it can be None.

But the second usage of it in uploadhandler.py does a <= comparison which raises a TypeError when it is None, suggesting it shouldn't be None

I would be happy to submit a patch if someone with more knowledge of Django can answer the question.

Change History (12)

comment:1 by Claude Paroz, 5 years ago

Triage Stage: UnreviewedAccepted

I don't think there is a use case for a None value. To deactivate memory-based file uploads, I think that the official method is to remove the MemoryFileUploadHandler from upload handlers.

In my opinion, the usage by asgi.py might be a bug. It should used DATA_UPLOAD_MAX_MEMORY_SIZE, and mimic the MultiPartParser.parse() code to raise RequestDataTooBig when it is exceeded. To be confirmed…

comment:2 by Mariusz Felisiak, 5 years ago

Cc: Andrew Godwin added

comment:3 by Carlton Gibson, 5 years ago

Nice spot. This is a good one...

The purpose here is to provide a buffer for the request body prior to it being read() by the client of the request object itself.
Otherwise, we read the body into memory for each request, regardless of whether it is consumed, or how big it is, or... — which is a DoS waiting to happen.

RequestDataTooBig is raised normally if and when body_file is consumed during request processing, so we don't want to raise it here. The check is only being used for "how much memory should I consume before using the disk?"

In one sense, it's a bit naughty re-using the existing setting this way...

  • Do we even need the is None branch, i.e. can't we just always used the SpooledTemporaryFile?
  • It's not 100% why either FILE_UPLOAD_MAX_MEMORY_SIZE or DATA_UPLOAD_MAX_MEMORY_SIZE is the right value for the roll-over to disk.
  • I wonder if a hook on ASGIHandler for get_body_file() or similar is worth adding (yet)?
    • Such would enable me to set the roll-over as I wanted without a setting.
    • And, if I ever get the bandwidth to actually work on it (sigh), there's an idea (on Channels repo) to wrap the recieve awaitable in a file-like, to avoid prematurely reading the body at all. It would be easy to plugin/try-out such a method with a hook in place.
    • Equally, I might say, "Always use memory", and any other strategy I like...
Last edited 5 years ago by Carlton Gibson (previous) (diff)

in reply to:  3 ; comment:4 by Claude Paroz, 5 years ago

Replying to Carlton Gibson:

  • Do we even need the is None branch, i.e. can't we just always used the SpooledTemporaryFile?

+1, receiving in a BytesIO() without upper limiting looks like a simple DOS attack vector.
I'm not sure about your explanation of later size check. What if a request has "unlimited" (i.e. *very* big) length?

comment:5 by Andrew Godwin, 5 years ago

I agree that there's no situation it should be None - the "is None" statement is likely a leftover from my testing and development.

DATA_UPLOAD_MAX_MEMORY_SIZE may well be the right thing to use - I would appreciate other people's opinion on this one since it's not totally clear to me how the failure/enforcement modes of those two settings are defined.

in reply to:  4 comment:6 by Carlton Gibson, 5 years ago

Component: File uploads/storageHTTP handling

Replying to Claude Paroz:

I'm not sure about your explanation of later size check. What if a request has "unlimited" (i.e. *very* big) length?

OK, so this is where it get's tricky. 🙂

As it stands if the request is "unlimited" we'd end up using a lot™ of disk space. Whatever frontend server is in play should put an absolute limit on request sizes, and we could document the need for that in the deployment checklist, but we should have some kind of ceiling here.

DATA_UPLOAD_MAX_MEMORY_SIZE is not really the correct measure. At this point we want to limit the total size of the request. DATA_UPLOAD_MAX_MEMORY_SIZE is applied to only to the POST data, with FILES being handled separately (in MultipartParser, as you point to Claude.) The danger is we limit total request size to DATA_UPLOAD_MAX_MEMORY_SIZE and in so doing stop people handling large file uploads.

So, total request size should be limited to (something like) the amount of POST data I want to handle in memory (DATA_UPLOAD_MAX_MEMORY_SIZE) plus the maximum file size I want to allow to be uploaded (which isn't FILE_UPLOAD_MAX_MEMORY_SIZE either).

Whilst we need to read in the request here, I guess we can decide a generous but sensible default, and allow folks to set their own value. Personally I'd rather avoid a setting and let people pass a parameter in their asgi.py when instantiating the application (or use an ASGIHandler subclass which specifies it) but what do others think?

comment:7 by Carlton Gibson, 5 years ago

As it stands if the request is "unlimited" we'd end up using a lot™ of disk space.

But this applies equally to the existing WSGI code right? Django doesn't have a limit on the total file data that can be uploaded. (Does it? — I've always set a sensible value at the nginx level...)

The only difference here is we have this extra buffer layer. (So we read the bytes, presumably to disk, twice, for the length of the request.)

Update: A quick survey of the landscape suggests the nginx-like layer is ‘’where this is handled’’. Neither werkzeug, nor gunicorn, nor Daphne, nor uvicorn enforce total request size, it seems because they don’t buffer the whole request to be able to check. Like Django, Flask requires you to consume the request before you can check the size. (Issue on Werkzeug summarising some discussions.)

Comment from Tom Christie when I asked his opinion:

I think the right answer to this is: middleware for app-level protection. Nginx or whatever platform-level gateway you're on for added robustness.

Last edited 5 years ago by Carlton Gibson (previous) (diff)

comment:8 by Carlton Gibson, 5 years ago

PR removing the `is None` branch.

Not sure currently if/what we want to do about the rest of the discussion.

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In eea0bf7b:

Refs #30669 -- Removed incorrect branch in ASGIHander.read_body().

None is not valid for settings.FILE_UPLOAD_MAX_MEMORY_SIZE.
Always use SpooledTemporaryFile.

comment:10 by Claude Paroz, 5 years ago

Thanks for the code update and the explorations. I think we can close this ticket.
I'll let you decide if you think the request size limits stuff would need a doc ticket.

comment:11 by Mariusz Felisiak, 5 years ago

Resolution: fixed
Status: newclosed

Agreed, we removed incorrect branch so this question is not valid anymore.

comment:12 by Carlton Gibson, 5 years ago

I think Leave if for now, but we're aware of it. i.e. Lets see how the async work rolls out. (Most likely it's not an issue...)

Note: See TracTickets for help on using tickets.
Back to Top