
Opened 5 years ago

Last modified 3 years ago

#10850 new Bug

Impossible to stop a large file upload mid-stream

Reported by: legutierr Owned by: nobody
Component: File uploads/storage Version: master
Severity: Normal Keywords: upload, StopUpload
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

As described in this post on the django-users group, raising django.core.files.uploadhandler.StopUpload(connection_reset=True) in order to cut off a too-large file upload does not work as documented. Nor does it work as specified in comments to the code, nor according to a discussion regarding the implementation of this feature.

The connection reset functionality does work correctly using the development server, but not using mod_python or fastcgi on lighttpd or Apache. It seems that in both cases the webserver pre-loads the entire file, regardless of its size, without Django taking the opportunity to interrupt the stream, even if StopUpload(connection_reset=True) is raised by receive_data_chunk() inside a subclass of FileUploadHandler.
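A minimal sketch of the kind of handler described above, for illustration only. To keep it runnable without Django installed, `StopUpload` here is a local stand-in for `django.core.files.uploadhandler.StopUpload`, and `QuotaUploadHandler` and `quota_bytes` are hypothetical names; in a real project the handler would subclass `FileUploadHandler` and import the real exception.

```python
class StopUpload(Exception):
    """Local stand-in (assumption) for django.core.files.uploadhandler.StopUpload."""

    def __init__(self, connection_reset=False):
        super().__init__()
        self.connection_reset = connection_reset


class QuotaUploadHandler:
    """Hypothetical handler; in Django this would subclass FileUploadHandler."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes  # hypothetical cap on the upload size
        self.total = 0                  # bytes seen so far

    def receive_data_chunk(self, raw_data, start):
        # Count every chunk and abort as soon as the quota is exceeded.
        self.total += len(raw_data)
        if self.total > self.quota_bytes:
            raise StopUpload(connection_reset=True)
        # Returning the data passes the chunk on to the next handler.
        return raw_data

    def file_complete(self, file_size):
        # Returning None leaves creation of the file object to a later handler.
        return None
```

With the development server, raising the exception this way cuts the upload off; behind mod_python or fastcgi the whole body is apparently read by the webserver regardless, which is the defect reported here.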

The most egregious side effect of this defect is that while the file is uploading to the server, the server experiences significant slow-downs.

In addition, without a fix, users will not learn of any file-size restriction until they have spent many minutes uploading their unacceptable file, a user-interaction failure that should be avoidable.

Attachments (0)

Change History (11)

comment:1 Changed 5 years ago by grahamd

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

You do understand that even if you can generate the failure page, you will in general not actually stop the browser from sending the POST data? You may stop Django from reading it, but the client has still sent it.

This is because, except for Opera, browsers don't implement proper 100-continue processing. The 100-continue processing option for rejecting a request without data being sent probably only works for Apache/mod_wsgi in embedded mode. Other options such as Apache/mod_wsgi in daemon mode, fastcgi, mod_proxy, etc., will trigger reading of data from the client even before your application gets to do anything, thereby negating the 100-continue feature. Even then, as I said, 100-continue only has a chance of working with Opera.

The best you would achieve is to use mod_wsgi in either embedded or daemon mode and use the LimitRequestBody directive of Apache. The mod_wsgi module will honour that and ensure an error response is returned even before the application gets to run. LimitRequestBody in mod_python doesn't work. I can't remember if fastcgi and mod_proxy play nice with it, but I think not.

Anyway, doesn't necessarily help you, but gives a bit of background of the greater problems related to doing this sort of stuff.

comment:2 Changed 5 years ago by legutierr

  • Cc grahamd added

I think it might be most useful to contrast the behavior of manage.py runserver with the behavior of Apache + fastcgi that I'm seeing using Firefox as a client. It seems that manage.py runserver interrupts the connection on the socket level mid-stream once StopUpload(connection_reset=True) is raised, and Firefox responds by immediately terminating the upload.

In the case of Apache + fastcgi, however, Apache seems to maintain the connection with the browser regardless of what Django does. (In addition to throwing StopUpload() inside my upload handler, I have tried returning HttpResponse objects with a variety of success/error codes from within my view, to no avail.)

If I understand you correctly that Firefox does not implement 100-continue, this difference in behavior would not seem to have anything to do with 100-continue processing, because if it did you wouldn't see proper (or, rather, as-documented) behavior when throwing StopUpload() from within the manage.py runserver daemon.

Is there any way for fastcgi (or mod_python for that matter) to signal to Apache that a connection that has already been established be terminated, as it is terminated by manage.py runserver? This might be a way to have fastcgi work in the manner documented. Or, if there is some kind of HttpResponse that could be returned to Apache, or exception thrown from inside my view, that could cause Apache to terminate the connection, that would be a solution.

Also, am I understanding you correctly that the LimitRequestBody is not compatible with mod_fastcgi? The LimitRequestBody directive would, in fact, solve this problem, but I am, unfortunately, tied to fastcgi, so if LimitRequestBody is not an option I will have to look for another. I am far from being an expert in Apache, which is why I was planning on relying on the documented functionality of the upload handlers to control limiting the size of uploaded files in my app.

comment:3 Changed 5 years ago by legutierr

  • Cc grahamd removed

comment:4 Changed 5 years ago by grahamd

All I can say is that fastcgi likely complicates it. In the case of runserver, the browser is connected directly to the process that Django is running in. In the case of fastcgi, the browser only connects to the Apache server child worker process, which is only acting as a proxy via the fastcgi protocol to the process that Django runs in. Thus you are at the mercy of how the fastcgi virtual connection for that request is closed down and how that action is propagated back to the Apache server child worker process. Also, in the case of runserver it can just close the connection. In the case of fastcgi, depending on how it is implemented, it may be necessary to still read in and discard the request input. Normally this shouldn't be the case for an error response, only for a 200 success response when keep-alive is on, but it could be working differently for some reason. Anyway, if it is a fastcgi issue, there is possibly not much you can do about it.
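To illustrate the "it can just close the connection" case, here is a rough sketch using only the standard library: a peer that owns the socket stops reading and closes it once a limit is reached, and the sender's next write fails. The names `LIMIT`, `CHUNK` and `upload_until_closed` are made up for this example, and a local socket pair stands in for the browser-to-server connection; it says nothing about how runserver or fastcgi are actually implemented.

```python
import socket

LIMIT = 4096   # hypothetical cap on accepted bytes
CHUNK = 1024   # size of each simulated upload chunk


def upload_until_closed(total_chunks):
    """Send chunks to a peer that closes its socket once LIMIT bytes arrive.

    Returns how many chunks were sent before the closed connection
    surfaced as an error on the sending side.
    """
    client, server = socket.socketpair()
    sent = received = 0
    try:
        for _ in range(total_chunks):
            try:
                client.sendall(b"x" * CHUNK)
            except OSError:
                # BrokenPipeError or ConnectionResetError: the peer is gone.
                break
            sent += 1
            if server is not None:
                received += len(server.recv(CHUNK))
                if received >= LIMIT:
                    # A process that owns the socket can simply close it.
                    server.close()
                    server = None
    finally:
        client.close()
        if server is not None:
            server.close()
    return sent
```

When a proxy sits in between, the sender talks to the proxy's socket instead, so a close on the backend side only reaches the sender if the proxy chooses to propagate it.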

comment:5 Changed 5 years ago by kmtracey

#10850 was a dupe with a sample upload handler.

comment:6 Changed 5 years ago by ramiro

Karen meant to link to #10902

comment:7 Changed 4 years ago by russellm

  • Triage Stage changed from Unreviewed to Accepted

comment:8 Changed 4 years ago by robin

Replying to grahamd:

The best you would achieve is to use mod_wsgi in either embedded or daemon mode and use LimitRequestBody directive of Apache. The mod_wsgi module will honour that and ensure error response returned even before application gets to run.

However, all web browsers (except Opera) will give a 'connection fail' error before the proper error response page can be reached, right? That is according to my chat with you here: http://groups.google.com/group/modwsgi/browse_thread/thread/aa5b21632f99bc70

Any news since then?

comment:9 Changed 3 years ago by SmileyChris

  • Severity set to Normal
  • Type set to Bug

comment:10 Changed 2 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:11 Changed 2 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.
