Opened 18 years ago

Closed 16 years ago

Last modified 8 years ago

#2070 closed enhancement (fixed)

Large streaming uploads

Reported by: oyvind.saltvik@… Owned by: Jacob
Component: HTTP handling Version: dev
Severity: normal Keywords: streaming, upload, large, sprintsept14, feature
Cc: dsalvetti@…, matthias@…, oliver@…, Maniac@…, nesh@…, aenor.realm@…, gary.wilson@…, serialx.net@…, wonlay@…, lurker86@…, shaun@…, antonio.mele@…, herbert.poul@…, gabor@…, koebbe@…, axiak@…, jm.bugtracking@…, moritz.angermann@…, django@…, brosner@…, root.lastnode@…, akaihola+djtick@…, sam@…, trey@…, klaus.blindert@…, ville@…, reza@…, registrations@…, calvin@…, works@…, andrewbadr.etc@…, uptimebox@…, drfarina@…, beau@…, giuliani.v@…, prigun@…, zerok@…, johannes.beigel@…, david@…, django@…, mocksoul@…, schlaber@…, danpoe@…, django@…, django-2070@…, clintecker@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Based on code from ticket 1448.

Test for large file upload

from django.db import models

# Create your models here.
class FileList(models.Model):
    name = models.CharField(maxlength=255)
    email = models.EmailField()

    class Admin:
        pass

class AFile(models.Model):
    descr = models.CharField(maxlength=255,core=True)
    file = models.FileField(upload_to='files')
    inlist = models.ForeignKey(FileList,edit_inline=models.STACKED)

And two 500 megabyte files for upload.

Works with the patch with +20 mb above average memory and minimal cpu load usage by the httpd thread, file upload successful.

Without the patch httpd rages to +140 mb above average memory usage and over 60% cpu usage and fails miserably in the end.

Attachments (64)

3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id.diff (16.7 KB ) - added by [530] 18 years ago.
Now using X_PROGRESS_ID instead of X-Progress-Id, accepts any lower/uppercase/ - / _ /prefix variant
3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id_windowsfix.diff (16.9 KB ) - added by [Cha0S] 18 years ago.
windows temporary file locking fix
modpyton-ok-needs-fcgi-testing.3.diff (28.6 KB ) - added by Øyvind Saltvik <oyvind.saltvik@…> 17 years ago.
did not work without middleware, fixed
4459-streaming-file-upload.diff (25.9 KB ) - added by Joakim Sernbrant <serbaut@…> 17 years ago.
Simplified streaming uploads
4459-streaming-file-upload.2.diff (26.7 KB ) - added by Joakim Sernbrant <serbaut@…> 17 years ago.
Added FILE_UPLOAD_MIN_SIZE (default 100kb) to define minimum request size for streaming to disk. Propage exeptions. I'm not too happy with the names of the settings anymore :/
5065-streaming_file_upload_with_shutils.diff (26.6 KB ) - added by axiak@… 17 years ago.
I've updated the patch to include the shutils command and to work with [5065]. Please check to see if it works.
5065-streaming_file_upload_with_shutils_2.diff (27.5 KB ) - added by axiak@… 17 years ago.
Works in [5065], renamed settings variable, uses global settings, defaults STREAMING_MIN_POST_SIZE to .5MB (please test though!)
5070-streaming-file-upload.diff (26.7 KB ) - added by Øyvind Saltvik <oyvind@…> 17 years ago.
Updated to trunk, without changes
5078_streaming_file_upload_with_shutils_and_fallbacks.diff (28.1 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Usese shutils, but falls back.
5078_streaming_file_upload_with_shutils_and_fallbacks_2.diff (28.2 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Usese shutils, but falls back, this time it deletes the tmp file even when it falls back.
5078-streaming_file_upload_with_shutils_and_chunked_fallbacks.diff (28.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
This file uses python fallbacks, but using chunks to avoid loading the entire file into memory.
5079-streaming_file_upload_with_safe_file_move.diff (29.1 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Cleaned it up a bit. Moved file_move_safe into django.utils in case it should be used in future endeavors.
5089-streaming_file_upload_with_safe_file_move.diff (28.0 KB ) - added by Øyvind Saltvik <oyvind@…> 17 years ago.
Added missing else
5089-streaming_file_upload_with_safe_file_move.2.diff (29.1 KB ) - added by Øyvind Saltvik <oyvind@…> 17 years ago.
Forgot to add django/utils/file.py
5099_patch_for_streaming_uploads.diff (30.8 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Uses multiple mechanisms for determining the progress id.
5100_upload_request_meta.diff (33.8 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
This isn't important enough to get its own property in request...request.META now contains 'UPLOAD_PROGRESS_ID'
5100_forgot_to_revert_base.diff (32.5 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
I included a bit much in that last patch.
5100_file_upload_core.diff (34.1 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Meant to be cleaner, especially in light of #4165
5100_file_upload_core_with_middleware_hooks.diff (37.7 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Added middleware hooks
5100_file_upload_core_with_middleware_hooks_2.diff (37.7 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Added middleware hooks...this is better.
5100_file_upload_core_with_middleware_hooks_3.diff (41.8 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Apparently I don't know how not to lose files.
5100_file_upload_core_with_middleware_hooks_4.diff (38.9 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Apparently I don't know how not to lose files.
5113_file_upload_core_with_middleware_hooks.diff (39.1 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Fixes WSGI bug.
5116_streaming_upload_fixed_middleware_append.diff (35.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Fixed bug where .append(0, func) was called.
5116_streaming_upload_fixed_middleware_append_2.diff (42.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Fixed bug where .append(0, func) was called with the files this time.
5126_file_upload_wsgi_tested.diff (41.3 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Tested with WSGI and made a few changes.
5126_file_uploads_latest_patch.diff (39.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Added 'state':'starting' to be more like mod_uploadprogress.
5127_file_uploads_no_streaming_fixed.diff (39.2 KB ) - added by Chris Beaven 17 years ago.
There was an error uploading large files with streaming turned off
5313_updated_file_progress.diff (36.7 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Removed some unneeded things. No file progress tracking by default.
5343_1_streaming_file_upload.diff (38.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Newer, cleaner version of file upload script
5343_streaming_file_upload_no_assert.diff (39.3 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Newer, cleaner version of file upload script (doh! no random commented assert)
5343_streaming_file_upload_best.diff (38.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Sorry about that -- this one is the better one.
5343_cleaned_streaming_file_upload.diff (37.4 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
It's amazing what Trac helps you see.
5343_cleaned_streaming_file_upload_tweaked.diff (37.3 KB ) - added by Michael Axiak <axiak@…> 17 years ago.
Made a subtle fix after testing with #4165.
5343_cleaned_streaming_file_upload_tweaked2.diff (37.3 KB ) - added by klaus.blindert@… 17 years ago.
Fixed microscopic bug in an error path
5722.diff (26.4 KB ) - added by simonbun <simonbun@…> 17 years ago.
Updated patch against r5722
5722.2.diff (22.2 KB ) - added by simonbun <simonbun@…> 17 years ago.
Sorry, that last one patch also included #4165. This is the correct one.
5724_streaming_file_upload.diff (37.8 KB ) - added by Brian Rosner <brosner@…> 17 years ago.
updated to r5724 (previous patch was missing files too)
5820_streaming_file_upload.diff (37.8 KB ) - added by Brian Rosner <brosner@…> 17 years ago.
updated to r5820
6469_streaming_file_upload.diff (38.5 KB ) - added by Faheem Mitha <faheem@…> 17 years ago.
6525_all_tests_pass.diff (39.6 KB ) - added by Øyvind Saltvik <oyvind@…> 17 years ago.
Fixed unicode in POST issues, added django.http.utils, moved str_to_unicode there
6603_all_tests_pass_uploadedfile_wrapper_fixed.diff (42.1 KB ) - added by Øyvind Saltvik <oyvind@…> 16 years ago.
Fixed a problem with UploadedFile wrapper and making sure content is not read in Fie/ImageField
6652_isValidImage_using_tmpfilename.diff (42.7 KB ) - added by Øyvind Saltvik <oyvind@…> 16 years ago.
made the isValidImage validator try the tempfile before reading content
6654_fixed_tests_and_file_clean.diff (45.0 KB ) - added by Øyvind Saltvik <oyvind@…> 16 years ago.
diff did not apply cleanly, fixed
streaming.6710.patch (45.5 KB ) - added by Faheem Mitha <faheem@…> 16 years ago.
Updated streaming patch to trunk rev 6710.
streaming.7092.patch (44.8 KB ) - added by egon 16 years ago.
Made it apply cleanly to rev. 7092
streaming.7092.patch.partial_tests_fix (48.6 KB ) - added by Faheem Mitha 16 years ago.
Slightly modified version of streaming.7092.patch with more tests passing.
streaming.7106.patch (48.0 KB ) - added by egon 16 years ago.
Passes al tests.
ticket2070_rev7277.diff (48.1 KB ) - added by Michael Axiak 16 years ago.
Same patch applied to @7277
2070_revision7339_uploadhandling.diff (38.8 KB ) - added by Michael Axiak 16 years ago.
NEW Upload handling for revision 7339
2070_revision7339_formhandling.diff (11.4 KB ) - added by Michael Axiak 16 years ago.
NEW Form handling for uploaded files for revision 7339
2070_revision7351_latest.diff (67.7 KB ) - added by Michael Axiak 16 years ago.
Latest patch for 2070...includes everything (see comment)
2070_latest7354.diff (68.8 KB ) - added by Michael Axiak 16 years ago.
The latest #2070 patch. Small changes...see comment.
2070_revision7359.diff (76.9 KB ) - added by Michael Axiak 16 years ago.
New diff...some style changes and new documentation.
2070_revision7363.diff (93.2 KB ) - added by Michael Axiak 16 years ago.
Altered documentation to be more approachable.
2070_revision7363.2.diff (94.0 KB ) - added by Michael Axiak 16 years ago.
Slightly newer...handles exhaustion a little bit differently.
2070_revision7377.diff (95.3 KB ) - added by Michael Axiak 16 years ago.
New 2070 patch...includes fixes to the parser in addition to the newforms fix caught by EricB as well as better docs.
2070_revision7396.diff (98.6 KB ) - added by Michael Axiak 16 years ago.
Better patch...not sure how the newforms code in the previous patch was the way it was.
2070_revision7388.diff (98.6 KB ) - added by Michael Axiak 16 years ago.
new patch that fixes error with instantiating the upload handlers, thanks Eric!
2070_revision7484.diff (99.2 KB ) - added by Michael Axiak 16 years ago.
Updated diff per Jacob's requests and useful feedback.
2070_revision7484.2.diff (99.3 KB ) - added by Michael Axiak 16 years ago.
Fixed file_size bug with memory handling.
2070_revision7599.diff (45.6 KB ) - added by Leah Culver 16 years ago.
Updated patch to r7599
2070-r7695.patch (100.6 KB ) - added by Jacob 16 years ago.
Streaming uploads patch updated to [7695] and reviewed.
2070-r7728.patch (114.5 KB ) - added by Jacob 16 years ago.
"release candidate" patch

Change History (376)

comment:1 by oyvind.saltvik@…, 18 years ago

Severity: normalmajor

Should there be a setting to enable/disable this?

Needs testing on python 2.3.

comment:2 by James Bennett, 18 years ago

priority: highnormal
Resolution: duplicate
Severity: majornormal
Status: newclosed

This is a duplicate of #1484.

comment:3 by James Bennett, 18 years ago

Resolution: duplicate
Status: closedreopened

Or is it? Leaving it open until someone who knows more about it can sort this out.

comment:4 by [530], 18 years ago

Detailed description of patch.

Based of the #1448 patch that i experienced trouble with using edit inline and large files.

Changes from trunk:

  • Added STREAMING_UPLOADS=False to global_settings
  • Added parse_streaming_file_upload it http/__init__.py
  • Made changes to modpython and wsgi in handlers to use parse_streaming_file_upload
  • Subclassed cgi.FieldStorage
    • overriding of make_file to use NamedTemporaryFile
    • overriding of bouandry parsers to make readline read in 64k chunks to avoid reading extremely long lines into memory
  • Subclassed rfc822.Message
    • overriding of bouandry parsers to make readline read in 64k chunks to avoid reading extremely long lines into memory
  • Made FileField validate upload using tempfile size
  • Made save_file in db/models/fields/__init__.py pass temporary file name to _save_FIELD_file instead of file contents
  • Made _save_FIELD_file copy temporary file instead of write file from memory

comment:5 by [530], 18 years ago

Error in previous post.

Based of the #1448 patch that i experienced trouble with using edit inline and large files.

Should be.

Based of the #1484 patch that i experienced trouble with using edit inline and large files.

comment:6 by Matias Hermanrud Fjeld <mhf@…>, 18 years ago

This patch uses the rfc822 module. The Python docs says the following about rfc822:

  Deprecated since release 2.3. The email package should be used in preference to the rfc822 module.
  This module is present only to maintain backward compatibility.

comment:7 by Home, 18 years ago

Type: defect

comment:8 by [530], 18 years ago

Type: defect

The problem with the email package is that it reads the entire file into memory. Django supports python 2.3 anyway.

comment:9 by [530], 18 years ago

to use monitor from a view

def progress_view(request):
    
    import pickle

    sessionid = request.COOKIES.get('sessionid', None)
    progress = pickle.load(open('/tmp/'+sessionid, 'r'))

Accessible variables

progress.fullength - full request byte lenght
progress.at_status - current bytes read
progress.at_file - current file read
progress.file_status - current file bytes read


comment:10 by bnewbold@…, 18 years ago

Had a problem on FreeBSD with my /tmp being on a seperate filesystem: os.rename threw an "Invalid cross-device link" error.
The fix was to edit django/db/models/base.py, "import shutils" at the top and change the one instance of "os.rename" to "shutils.move". Works fine now... shutils came default with python2.4 I don't know about other releases, or if it works on other OSs

comment:11 by Adrian Holovaty, 18 years ago

In reference to bnewbold's comment above, it's the shutil module, and it looks like it's been around since at least Python 2.3. (Django supports Python 2.3 and up.)

comment:12 by [Cha0S], 18 years ago

to make it work on windows make the following changes in UploadStateKeeper class:

class UploadStateKeeper:

    def __init__(self, identifier, fullength=None):

        self.identifier = identifier

        if fullength:
            self.state = UploadState(self.identifier, fullength)

        else:
            import pickle
            self.state = pickle.load(self.get_status_path(), 'r')

    def get_status_path(self):
        from tempfile import gettempdir
        return os.path.join(gettempdir(), self.identifier)

    def set_status(self, filename, addlen):
        self.state.set_status(filename, addlen)
        import pickle
        pickle.dump(self.state,open(self.get_status_path(), 'w'))

    def get_state(self):
        return self.state

comment:13 by Jacob, 18 years ago

Can anyone tell me the differences between the latest patch on this ticket and the latest one on #1484? Are these two tickets/patches duplicates or complements?

comment:14 by bahamut@…, 18 years ago

Type: defectenhancement

This ticket is the continuation of #1484.

comment:15 by Jacob, 18 years ago

Sorry if I'm being dense, but does that mean that #1484 should be closed in favor of this one? I'm trying to figure out which patch(es) need to be applied here.

comment:16 by bahamut@…, 18 years ago

Keywords: timezone time zone UTC added

In my opinion, yes #1484 should be closed. The only patch you have to apply is the latest one in this ticket.

It may help to svn up -r 3358 before you apply the patch, or you can try at the current HEAD and see what goes.

comment:17 by Jacob, 18 years ago

Thanks.

comment:18 by Jacob, 18 years ago

Owner: changed from Adrian Holovaty to Jacob
Status: reopenednew

Right, I've finally had a chance to play with this patch. It's a big one, so I want to make sure I understand what's going on before I check it in. So apologies if I'm being dense, but I gotta understand this stuff:

  • I'm confused about why you needed to override Message.readline} just to change the argument to self.fp.readline() (django/http/init.py, line 105). Is the Python implementation broken in some way? What happens if you take that bit out?


  • Pretty much the same question for the FieldStorage subclass. This is a *lot* of duplicated code from the builtin cgi module, and it gives me pause. Can you explain why you need to do this (and perhaps think about if there's another way?)


  • The upload status info stored in /tmp is pretty awesome. However, it doesn't seem to work all the time for me; while uploading a big file if I keep cat'ing that /tmp file it's often just empty. Seems something keeps emptying it out? Also, using the session ID seems like it would cause issues if I tried to upload multiple files at the same time (a very real possibility if I'm uploading really big-ass things).


  • Related to that, setting os.envrion['SESSIONID'] (done in both the WSGI and the mod_python handler) seems very leaky. We've had bugs before with this type of env monkeybusiness... Could you use a threadlocal instead?


I should end, however, by saying that the behavior of this patch rocks. Being able to upload a 2 GB file without locking my server is super-nice, and the status stuff absolutely kicks ass (I think I already said that). I'm pretty sure I can clean up all the stuff I just bitched about, but I've got a pretty long todo.txt and if you can take care of these problems and/or answer my questions, I'll check this in PDQ.

comment:19 by Jacob, 18 years ago

Status: newassigned

comment:20 by [530], 18 years ago

  • If that bit is taken out memory usage soars because most browsers do not use newlines when sending files so it actually reads to EOF. Is it possible to override self.fp.readline() to call self.fp.readline(64000)? Then most of this code would be removed.
  • The upload status is a bit tricy, I need to override FieldStorage to store the status. But since FieldStorage is not a extended class of object there is no super(FieldStorage, self).read_lines_to_outerboundary() causing repetition of code.
  • Don't know how multiple uploads can be solved since it would require multiple id's, maybe use sessions instead if possible. Never experienced the emptiying if the temp file. Does this happen with both mod_python and wsgi?
  • About setting os.environ, the mod_python-handler does not set it, but wsgi does, need to fix that one instead of messing with threadlocal , this may be why the file is emptied if it only happens with wsgi.

comment:21 by anonymous, 18 years ago

Cc: Maniac@… added

comment:22 by anonymous, 18 years ago

Jacob, after reading your questions I'd like to share here my knowledge as an author of original patch in #1484.

  • About readline(N). Actually browsers do send newlines when sending binary files because newlines (byte 0x0A) are present in binary files also. In theory 0x0A in a binary body can be rare enough for readline to read very large chunks in memory. However in practice 0x0A are present often enough (I'm talking from the experience of two services working mainly with big jpegs, mp3 and zips). This is, btw, why my original patch in #1484 didn't include the boundary in readline. The downside of replacing readlines in FieldStorage's code with readline(N) is that the base class doesn't provide this for overriding so you have to actually rewrite one-to-one pretty large chunks of code.
  • To my liking storing upload info in files is not very nice. I mentioned my idea in django-developers about this that it's better be a hook for the user code (in middleware style) that is called from the handler accepting uploaded data. This is more flexible because a user can store this progress not in /tmp but wherever he likes and do more interesting stuff then just measuring progress. For example closing the upload if the file is too big for a server.
  • One more thing that we've already discussed with Oyvind in an old ticket that his patch alters both HTTP handlers and admin app. The old patch was only for HTTP and my point is that these things should be separate because it's much simpler to review and apply.

Hope my comment was useful!

comment:23 by anonymous, 18 years ago

One more nit... Oyvin what was the reason to use rfc822? It's a new thing in your variant of the old patch and I just don't get why it's needed :-)

comment:24 by [530], 18 years ago

  • When uploading two 500mb iso's, it broke down constantly with a memoryerror when not setting a buffer size, everything worked fine when a bufsize was added.
  • About upload storage I do agree about using a hook instead.
  • Rfc822 is still used by FieldStorage, now hidden from view. I just don not think of setting bufsize in make_file instead of passing it to readline (if that works). Have not had time to test yet.
  • Agree that reviewing is easier. But it is nice for easy testing via the admin to see why django is the right tool for the massive job. :)

comment:25 by anonymous, 18 years ago

Ignore my patch 3358-streaming-file_with_settings_and_progressmonitor.3.diff, adding bufsize does not affect readline. Also it seems chunked reads are required for storing the state. Don't see any easy way to make this less code yet. I'll add a new patch with just the WSGI fix.

comment:26 by [530], 18 years ago

I was planning to make a middleware for a progressmonitor hook, but not quite sure when middleware is applied and how to pass this hook from the middleware to FieldStorage. Any help to clarify/fix this would be great.

Also if somenoe could test this successfully using WSGI it would be great.

comment:27 by Maniac <Maniac@…>, 18 years ago

My idea of a hook is not fit into current middleware, I just wanted to retain the style. So to clarify I see it like this.

This can be a separate set of hooks similar to MIDDLEWARE_CLASSES or it can be done in middleware classes themselves. To do the latter we will specifiy that in addition to process_request, _response, _view and _exception methods a middleware can define a new kind of method: process_upload(request, data).

parse_file_upload then would search for these middleware methods just as http requests do in modpython.py and wsgi.py. The list of methods is passed to a FieldStorage decendant that will call them each time it's about to write a chunk of data to the target stream.

This is it... Sorry I can't come with the actual code: pretty busy rolling out my current project.

comment:28 by Todd O'Bryan, 18 years ago

I figured the last attachment was supposed to be a diff and tried to upload it with the right extension. Unfortunately, '.py' got added and now I don't know how to get rid of the attachment.

Sorry for the clutter.

comment:29 by [530], 18 years ago

Seems wsgi can't do streaming uploads this way, but ligthttpd 1.5 will chunk the uploads and has a mod_uploadprogress that serializes to JSON. Will try adapt this patch to serialize the same way so it wont matter if you use this streaming uploads patch or mod_uploadprogress.

comment:30 by Maniac@…, 18 years ago

Why do you say wsgi can't do this? It gives environwsgi.input to the handler that is a file-like object with uploading data. Handler then can read it in chunks (it actually does this in parse_file_upload).

comment:31 by [530], 18 years ago

WSGI may, but it seems like lighthttpd does not pass anything to wsgi before the request is done. I may be wrong, but I have no way of debugging properly.

comment:32 by [530], 18 years ago

Bug in last patch, did not manage to override readline to use a buffer size. If this cannot be done, code must be repeated.

comment:33 by [530], 18 years ago

Wsgi users using lighttpd should use mod_uploadprogress instead. There is some kind of delay before the upload state is updated using wsgi, the same does not happen on mod_python. Is this a issue with my code, the fastcgi/scgi spec or lighttpd's implementation? Can someone test this with fastcgi/scgi on apache?

comment:34 by [530], 18 years ago

GET identifer for upload state.

http://foo.com?bar saves json data in /tmp/bar

For javascript, making a /progress/ url and view see http://blog.lighttpd.net/articles/2006/08/01/mod_uploadprogress-is-back

Makes it easy to add a admin js option to a model to enable upload state in admin.

comment:35 by Todd O'Bryan, 18 years ago

I just provided a patch which steals liberally from Oyvind, but decouples the progress monitor from the rest of the file upload behavior.

In a nutshell, if you don't want to read uploads into memory, you set request.save_file_class to the class of the kind of file-like object you'd like uploads routed to. A sample middleware class is provided which uses a temp file.

I haven't looked at Oyvind's latest code, but my guess is that we could easily move the upload monitor from the guts of Django by providing a file-like object class that keeps track of how much stuff it has received from the POST data stream.

Comments welcome. (But be nice; it's my first patch.)

comment:36 by [530], 18 years ago

To toddobryan, I made some fatal mistakes before 3358-streaming-file_with_settings_and_progressmonitor_fixed.diff + the wsgi fix, see this patch, you may have fixed it but i'm not sure. the latest patch I would say is really important for multiple uploads. There are some more upload progress issues, the json encoder gives two value sets but i can't find out why. Posting my latest diff now.

Btw I like one more developer helping me.

comment:37 by Todd O'Bryan, 18 years ago

OK, you don't have to be very nice. It turns out that I missed perhaps the most important thing, which is wrapping the stream from the POST in some kind of object which prevents lines from being too long and filling up memory.

Oyvind's PostStream seems like the kind of thing to do, but I'd suggest wrapping the stream before we pass it into the FieldStorage object, not inside. I was misled by the fact that the read_binary() method inside there has a fixed 8k buffer.

comment:38 by [530], 18 years ago

Toddobryan I agree. Making it more modular is great.

comment:39 by Todd O'Bryan, 18 years ago

If you could concentrate on getting your progress monitor to work from a file-like object instead of FieldStorage, I'll integrate the PostStream idea into my code so that the readline()s in the FieldStorage class don't read too much.

comment:40 by [530], 18 years ago

Thinking of making the upload monitor as a hook in streaming_upload_middleware.

Then make a upload monitor middleware that hooks onto streaming_upload_middleware by loading before streaming_upload_middleware.

Upload monitor will act exacly like mod_uploadprogress so wsgi users just load streaming_upload_middleware.

comment:41 by Malcolm Tredinnick, 18 years ago

By the way, when you guys think you've addressed all the problems and simplified the code as much as you can, etc, just add a comment pointing to the right patch that needs another review (and mention that you think it's "done").

comment:42 by [530], 18 years ago

To test 'svn up -r 3518', apply patch and add:

django.middleware.upload.UploadStateMiddleware and django.middleware.upload.StreamingUploadMiddleware to MIDDLEWARE_CLASSES.

Uploadstate test currently builtin to admin change form. 'tail /tmp/test123' for upload state as json.

comment:43 by anonymous, 18 years ago

Regular uploads seem to fail if the middlewares are disabled. Backtrace:

Traceback (most recent call last):
  File "/opt/local/lib/python2.4/site-packages/django/core/servers/basehttp.py", line 272, in run
    self.result = application(self.environ, self.start_response)
  File "/opt/local/lib/python2.4/site-packages/django/core/servers/basehttp.py", line 611, in __call__
    return self.application(environ, start_response)
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 158, in __call__
    response = self.get_response(request.path, request)
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/base.py", line 102, in get_response
    return self.get_technical_error_response(request)
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/base.py", line 134, in get_technical_error_response
    return debug.technical_500_response(request, *sys.exc_info())
  File "/opt/local/lib/python2.4/site-packages/django/views/debug.py", line 131, in technical_500_response
    return HttpResponseServerError(t.render(c), mimetype='text/html')
  File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 155, in render
    return self.nodelist.render(context)
  File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 688, in render
    bits.append(self.render_node(node, context))
  File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 716, in render_node
    raise wrapped
TemplateSyntaxError: Caught an exception while rendering: 'NoneType' object has no attribute 'items'

Original Traceback (most recent call last):
  File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 706, in render_node
    result = node.render(context)
  File "/opt/local/lib/python2.4/site-packages/django/template/defaulttags.py", line 189, in render
    value = bool_expr.resolve(context, True)
  File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 548, in resolve
    obj = resolve_variable(self.var, context)
  File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 657, in resolve_variable
    raise VariableDoesNotExist, "Failed lookup for key [%s] in %r" % (bits[0], current) # missing attribute
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 61, in __repr__
    return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 97, in _get_post
    self._load_post_and_files()
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 75, in _load_post_and_files
    self._post, self._files = http.parse_file_upload(self)
  File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 83, in parse_file_upload
    return default_parse_file_upload(req)
  File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 50, in default_parse_file_upload
    raw_message = '\r\n'.join(['%s:%s' % pair for pair in req.header_dict.items()])
AttributeError: 'NoneType' object has no attribute 'items'

comment:44 by Djordjevic Nebojsa <nesh at studioquattro co yu>, 18 years ago

Cc: nesh@… added

comment:45 by wiz, 18 years ago

Cc: aenor.realm@… added

comment:46 by bahamut@…, 18 years ago

This is anecdotal evidence at best, but  rapid testing of this patch seems to indicate it does not work with the development server.

I basically applied it, updated to head, added the 2 middlewares, tried one of my upload forms, and it just sat there forever, never reaching the Django view code.

comment:47 by [530], 18 years ago

I would not expect this patch to work with the dev server, since it is a single thread you cannot do multiple requests at the same time.

comment:48 by anonymous, 18 years ago

I may be mistaken about my last comment, but the dev server uses the wsgi handler. So it may be the delay problem, or that stdin is not passed before the request is done. So since wsgi using lighttpd works, the dev server probably handle things differently. Don't really know alot about the dev server.

comment:49 by anonymous, 18 years ago

Cc: gary.wilson@… added

comment:50 by hruske, 18 years ago

This doesn't work for me at all. With Opera 9 and Firefox 1.5, I get KeyError on X-Progress-Id, which is supposed to be HTTP_X_PROGRESS_ID? (Webserver is Lighttpd, distro Debian unstable)

Also, isn't this line a bit insecure, given the get_temp_file does an "os.path.join":

content = open(get_temp_file(request.header_dict['X-Progress-Id']), 'r').read()

I've replaced my code to use cache framework instead. But mind you need server-wide cache, per process will probably not be enough.

And then, you probably don't want to open a new TCP connection every time you make a xml request, since this will effectively DOS your server.

I was also having problems with JS - the browser did not want to do a POST at all. It started polling server for progress, but it was not posting at all.

comment:51 by [530], 18 years ago

Lighttpd has a delay before it saves data and progress info.

Supposed to be HTTP_X_PROGRESS_ID?

I did it this way to be compatible with mod_uploadprogress that you should use instead.

Is there another way than opening another tcp connection using xmlhttp?

comment:52 by hruske, 18 years ago

How much should that delay usually be? Since there is no POST for Django, progress bar does not work. This probably means I need to use mod_uploadprogress to get it working, but the new mod_uploadprogress has not been released yet officially.

Yes, HTTP_X_PROGRESS_ID. When I printed contents of request.header_dict I could see a key named HTTP_X_PROGRESS_ID which corresponds to X-Progress-Id.

Sure, one could open a TCP connection and server periodically sends info and client reads from it. But I see that you aim for mod_uploadprogress compatibility.

comment:53 by hruske, 18 years ago

Also, it's X-Progress-ID in lighttpd, with capital ID.

by [530], 18 years ago

Now using X_PROGRESS_ID instead of X-Progress-Id, accepts any lower/uppercase/ - / _ /prefix variant

comment:54 by bahamut@…, 18 years ago

Keywords: streaming upload large added; timezone time zone UTC removed

The patch does not work with Apache 2 + mod_fcgid.

I'm running on Mac OS X 10.4.7 with a pure MacPorts installation (save Django and flup -- both are a svn HEAD).

Apache 2 configuration:

{{{<IfModule mod_fcgid.c>

AddHandler fcgid-script .fcgi


IdleTimeout 600
ProcessLifeTime 3600
MaxProcessCount 8
DefaultMinClassProcessCount 3
DefaultMaxClassProcessCount 3
IPCConnectTimeout 8
IPCCommTimeout 48

</IfModule>

ScriptAlias /myapp "cgi-bin/myapp.fcgi/"}}}

myapp.fcgi:

{{{#!/opt/local/bin/python
# -*- coding: utf-8 -*-
import sys, os

# Add a custom Python path.
sys.path.insert(0, "/opt/local/www")

# Switch to the directory of your project. (Optional.)
os.chdir("/opt/local/www/myapp")

# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environDJANGO_SETTINGS_MODULE = "myapp.settings"

from django.core.servers.fastcgi import runfastcgi
runfastcgi(["method=threaded", "daemonize=false"])}}}

comment:55 by anonymous, 18 years ago

Repost with proper formatting :/

Apache 2 configuration:

<IfModule mod_fcgid.c>
    AddHandler fcgid-script .fcgi
    
    IdleTimeout 600
    ProcessLifeTime 3600
    MaxProcessCount 8
    DefaultMinClassProcessCount 3
    DefaultMaxClassProcessCount 3
    IPCConnectTimeout 8
    IPCCommTimeout 48
</IfModule>

ScriptAlias /myapp "cgi-bin/myapp.fcgi/"

myapp.fcgi:

#!/opt/local/bin/python
# -*- coding: utf-8 -*-
import sys, os

# Add a custom Python path.
sys.path.insert(0, "/opt/local/www")

# Switch to the directory of your project. (Optional.)
os.chdir("/opt/local/www/myapp")

# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environ['DJANGO_SETTINGS_MODULE'] = "myapp.settings"

from django.core.servers.fastcgi import runfastcgi
runfastcgi(["method=threaded", "daemonize=false"])

comment:56 by bahamut@…, 18 years ago

Cc: bahamut@… added

comment:57 by bahamut@…, 18 years ago

I get the following error, again with my mod_fcgid setup, when using the admin interface with the progress status monitor js.

Unhandled exception in thread started by <bound method ThreadPool._worker of <flup.server.threadpool.ThreadPool object at 0x715270>>Traceback (most recent call last):  File "/opt/local/www/data/flup/server/threadpool.py", line 103, in _worker    job.run()  File "/opt/local/www/data/flup/server/fcgi_base.py", line 642, in run    self.process_input()  File "/opt/local/www/data/flup/server/fcgi_base.py", line 678, in process_input    self._do_params(rec)  File "/opt/local/www/data/flup/server/fcgi_base.py", line 777, in _do_params    self._start_request(req)  File "/opt/local/www/data/flup/server/fcgi_base.py", line 761, in _start_request    req.run()  File "/opt/local/www/data/flup/server/fcgi_base.py", line 563, in run    self.server.error(self)  File "/opt/local/www/data/flup/server/fcgi_base.py", line 1157, in error    req.stdout.write('Content-Type: text/html\r\n\r\n' +  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/cgitb.py", line 158, in html    dump.append('%s&nbsp;= %s' % (name, pydoc.html.repr(value)))  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 360, in repr    return Repr.repr(self, object)  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/repr.py", line 24, in repr    return self.repr1(x, self.maxlevel)  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 367, in repr1    return self.escape(cram(stripid(repr(x)), self.maxother))  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 65, in __repr__    return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 101, in _get_post    self._load_post_and_files()  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 79, in _load_post_and_files    self._post, self._files = http.parse_file_upload(self)  File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 82, in parse_file_upload    return req.parse_file_upload(req)  File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 83, in parse_streaming_file_upload    upload_state = req.upload_state(req)  File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 126, in __init__    self.state = {'size': int(req.header_dict.get('content-length')),TypeError: int() argument must be a string or a number

comment:58 by bahamut@…, 18 years ago

Something ate line breaks, apparently.

Unhandled exception in thread started by <bound method ThreadPool._worker of <flup.server.threadpool.ThreadPool object at 0x715270>>
Traceback (most recent call last):
  File "/opt/local/www/data/flup/server/threadpool.py", line 103, in _worker
    job.run()
  File "/opt/local/www/data/flup/server/fcgi_base.py", line 642, in run
    self.process_input()
  File "/opt/local/www/data/flup/server/fcgi_base.py", line 678, in process_input
    self._do_params(rec)
  File "/opt/local/www/data/flup/server/fcgi_base.py", line 777, in _do_params
    self._start_request(req)
  File "/opt/local/www/data/flup/server/fcgi_base.py", line 761, in _start_request
    req.run()
  File "/opt/local/www/data/flup/server/fcgi_base.py", line 563, in run
    self.server.error(self)
  File "/opt/local/www/data/flup/server/fcgi_base.py", line 1157, in error
    req.stdout.write('Content-Type: text/html\r\n\r\n' +
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/cgitb.py", line 158, in html
    dump.append('%s&nbsp;= %s' % (name, pydoc.html.repr(value)))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 360, in repr
    return Repr.repr(self, object)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/repr.py", line 24, in repr
    return self.repr1(x, self.maxlevel)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 367, in repr1
    return self.escape(cram(stripid(repr(x)), self.maxother))
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 65, in __repr__
    return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 101, in _get_post
    self._load_post_and_files()
  File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 79, in _load_post_and_files
    self._post, self._files = http.parse_file_upload(self)
  File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 82, in parse_file_upload
    return req.parse_file_upload(req)
  File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 83, in parse_streaming_file_upload
    upload_state = req.upload_state(req)
  File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 126, in __init__
    self.state = {'size': int(req.header_dict.get('content-length')),
TypeError: int() argument must be a string or a number

comment:59 by bahamut@…, 18 years ago

I have just tested the patch with Apache 2 and mod_python. It works correctly, however I had to make 2 small modifications because my Django application isn't running at the root context of my server.

Basically, the admin JavaScript code hardcodes a request to "/progress/". This needs to be changed appropriately for each install, e.g. in my case "/django/progress/".

Similarly, the UploadStateMiddleware checks that the request path is exactly "/progress/", which needs to be changed to match.

comment:60 by anonymous, 18 years ago

I svn up'ed, applied 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id.diff, deleted the Django directory under site-packages, reinstalled Django with 'python setup.py install' and restarted apache/modpython but Django appears to be behaving as before. My memory usage still climbs beyond what I'd expect and I don't see any files in /tmp like I'd expect from tempfile.NamedTemporaryFile().

I added these lines to my settings.py file:

STREAMING_UPLOADS=True

UPLOAD_BUFFER_SIZE=2048

Am I missing something?

Thanks!

comment:61 by oyvind@…, 18 years ago

It is 2 middleware now.

django.middleware.upload.UploadStateMiddleware and django.middleware.upload.StreamingUploadMiddleware

Order is important

comment:62 by anonymous, 18 years ago

Thanks for the info!

I changed middleware_classes to look like this:

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.doc.XViewMiddleware',
    'django.middleware.upload.UploadStateMiddleware',
    'django.middleware.upload.StreamingUploadMiddleware',
)

I was able to upload a couple large test files and tempfiles were showing up as expected in /tmp, but I'm intermittently receiving this exception when uploading:

Traceback (most recent call last):
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/base.py" in get_response
  74. response = callback(request, *callback_args, **callback_kwargs)
File "/home/misjxm/dp1/file/app/views.py" in createmessage
  275. if request.POST:
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _get_post
  51. self._load_post_and_files()
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _load_post_and_files
  32. self._post, self._files = http.parse_file_upload(self)
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/http/__init__.py" in parse_file_upload
  81. return req.parse_file_upload(req)
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/middleware/upload.py" in parse_streaming_file_upload
  104. FILES.appendlist(key, FileDict({
File "/usr/lib/python2.4/cgi.py" in __getattr__
  540. raise AttributeError, name

  AttributeError at /file/compose/
  tmp_name

Have you seen this before?

Thanks!

comment:63 by anonymous, 18 years ago

And I only seem to get that exception when uploading text files such as M3U playlists, bashrc files, etc.

comment:64 by anonymous, 18 years ago

Text files in general only cause the problem intermittently. I have one M3U file that consistantly causes the exception to be thrown. This is what local vars looks like when I upload that file:

name      'tmp_name'
self      FieldStorage('file1', 'klipschtest.m3u', "MP3\\2pac\\Greatest Hits (Disc 1)\\1-05 Hail Mary.mp3\r\nFLAC\\James Blunt\\Back to Bedlam\\03-Wisemen.flac\r\nMP3\\Jimi_Hendrix\\UNKNOWN\\Little wing (extended instrumental).mp3\r\nFLAC\\Led Zeppelin\\Remasters (disc II)\\04-No Quarter.flac\r\nFLAC\\Pink Floyd\\The Wall - Disc 1\\03-Another Brick in the Wall. Part 1.flac\r\nFLAC\\Pink Floyd\\The Wall - Disc 1\\04-The Happiest Days of Our Lives.flac\r\nFLAC\\Natalie Merchant\\Live In Concert\\05-Carnival.flac\r\nMP3\\Metallica\\...And Justice For All\\08-To Live Is To Die.mp3\r\nMP3\\Red Hot Chili Peppers\\Blood Sugar Sex Magik\\10-Blood Sugar Sex Magik.mp3\r\nFLAC\\Sade\\Lovers Rock\\03-King Of Sorrow.flac\r\nFLAC\\Stone, Joss\\The Soul Sessions\\05-Dirty Man.flac\r\nFLAC\\U.G.K\\Ridin' Dirty\\03-Murder.flac\r\nMP3\\Weezer\\Weezer (Green Album)\\04-Weezer _ Island In The Sun.mp3\r\nMP3\\Weezer\\Weezer (Green Album)\\03-Weezer _ Hash Pipe.mp3\r\n")

comment:65 by anonymous, 18 years ago

I'm getting an error about 'read only accepts 1 arguement (2 given)", from the following line in cgi.py:

self.fp.read(self.length)

comment:66 by anonymous, 18 years ago

This is the error I'm getting:

C:\web\fxedit>python manage.py runserver
Validating models...
0 errors found.

Django version 0.95, using settings 'fxedit.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
[16/Sep/2006 02:55:31] "GET /file/ HTTP/1.1" 200 1188
[16/Sep/2006 02:55:31] "GET /media/stylesheets/style.css HTTP/1.1" 304 0
[16/Sep/2006 02:55:31] "GET /media/stylesheets/print.css HTTP/1.1" 304 0
Traceback (most recent call last):

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\servers\

basehttp.py", line 272, in run

self.result = application(self.environ, self.start_response)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\servers\

basehttp.py", line 615, in call

return self.application(environ, start_response)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers

\wsgi.py", line 156, in call

response = self.get_response(request.path, request)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers

\base.py", line 102, in get_response

return self.get_technical_error_response(request)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers

\base.py", line 134, in get_technical_error_response

return debug.technical_500_response(request, *sys.exc_info())

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\views\debug.p

y", line 131, in technical_500_response

return HttpResponseServerError(t.render(c), mimetype='text/html')

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in

it.py", line 155, in render

return self.nodelist.render(context)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in

it.py", line 688, in render

bits.append(self.render_node(node, context))

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in

it.py", line 716, in render_node

raise wrapped

TemplateSyntaxError: Caught an exception while rendering: read() takes exactly 1

argument (2 given)

Original Traceback (most recent call last):

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in

it.py", line 706, in render_node

result = node.render(context)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\defa

ulttags.py", line 189, in render

value = bool_expr.resolve(context, True)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in

it.py", line 548, in resolve

obj = resolve_variable(self.var, context)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in

it.py", line 657, in resolve_variable

raise VariableDoesNotExist, "Failed lookup for key [%s] in %r" % (bits[0], c

urrent) # missing attribute

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers

\wsgi.py", line 61, in repr

return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers

\wsgi.py", line 97, in _get_post

self._load_post_and_files()

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers

\wsgi.py", line 75, in _load_post_and_files

self._post, self._files = http.parse_file_upload(self)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\http\init

.py", line 80, in parse_file_upload

return req.parse_file_upload(req)

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\middleware\up

load.py", line 87, in parse_streaming_file_upload

fs = FieldStorage(req.raw_request, environ=req.META, headers=req.header_dict

, upload_state=upload_state )

File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\middleware\up

load.py", line 71, in init

keep_blank_values=keep_blank_values, strict_parsing=strict_parsing)

File "C:\Python24\lib\cgi.py", line 524, in init

self.read_urlencoded()

File "C:\Python24\lib\cgi.py", line 629, in read_urlencoded

qs = self.fp.read(self.length)

TypeError: read() takes exactly 1 argument (2 given)

[16/Sep/2006 02:55:53] "POST /file/stats/ HTTP/1.1" 500 3721

by [Cha0S], 18 years ago

windows temporary file locking fix

comment:67 by Chris Beaven, 18 years ago

I'm getting the same error as anonymous above (and am also using Windows).

comment:68 by oyvind@…, 18 years ago

Idea for fixing the m3u upload problem. May need a copy of this util.py to make it work with wsgi.

But I have almost given up on wsgi for this patch, uless anyone are able to shed som light on why wsgi does not like chunked reading from wsgi.input.

http://svn.apache.org/viewvc/httpd/mod_python/trunk/lib/python/mod_python/util.py?view=markup&pathrev=451861

comment:69 by [Cha0S], 17 years ago

There is a problem using this patch with Nginx (and maybe some other webservers). Nginx doesn't pass content-type and content-length headers and the result of this is that FieldStorage can't parse request correctly and nothing is uploaded. A workaround I've found is modification of cgi.FieldStorage to use HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH. But I don't think this is elegant solution of this problem. Does anyone have other suggestions on how to solve it ?

comment:70 by Sung-Jin Hong <serialx.net@…>, 17 years ago

Worked for me(Apache mod_python). Working like charm. :) But heres some instructions..

  1. Do the patch (cd django; patch -Np0 < file.diff)
  2. Add two middlewares, modify the model like below:
    class FileList(models.Model):
        name = models.CharField(maxlength=255)
        email = models.EmailField()
    
        class Admin:
            js = ['js/UploadProgress.js']
    
    class AFile(models.Model):
        descr = models.CharField(maxlength=255,core=True)
        file = models.FileField(upload_to='files')
        inlist = models.ForeignKey(FileList,edit_inline=models.STACKED)
    
  1. You have progressbar in admin. :)

Submitting patch to the latest trunk.

comment:71 by Sung-Jin Hong <serialx.net@…>, 17 years ago

Cc: serialx.net@… added

comment:72 by [Cha0S], 17 years ago

There are some issues with this patch. I have it working on my local Linux and Windows boxes. But i can't get to work it completely correct on my server with the same setup as my local linux box (python 2.5, latest Django).
The issue is that when uploading a file WSGIRequest . _load_post_and_files is called only after all data is uploaded. So no streaming occurs and no progress is displayed and I didn't find a solution for this yet..

comment:73 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

[Cha0S] using lighttpd or apache, same lighttpd/apache version on both servers?
I found out lighttpd pre 1.4.0 supported streaming, so that has to be fixed before this patch will work on lighttpd.

comment:74 by anonymous, 17 years ago

This worked great on my system [Apache/2.0.58 (Win32) mod_python/3.2.8 Python/2.4.3]

Following up on Sung-Jin Hong's good instructions with a few tips for newbies like me:

  1. When you apply the patch, make sure you get django/middleware/upload.py. I found that when I did an SVN update to Rev 4065 and applied 4065-streaming_uploads_and_uploadprogress_middleware_x_progress_id_windowsfix.diff I did not get the file. I had to apply 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id_windowsfix.diff to get it.
  1. Make sure js/UploadProgress.js in accessible. The Admin page's path to it will be in the media directory, e.g. http://yoursite.com/media/js/UploadProgress.js. I had to copy UploadProgress.js from django/contrib/admin/media/js to my local media/js directory.
  1. Make sure the Apache httpd.conf file section for your site allows paths starting with "/progress", i.e. has a line like '<Location ~ "/(admin|shop|progress)">' in it, because the JS code is doing a "req.open("GET", "/progress/", 1)" repeatedly to track progress.

I find that the Upload progress box displays fine in Firefox and Opera, but it is at the very bottom left of the absolute page in IE6 (i.e. you need to scroll to the very bottom to see it). I'm still working on this problem.

Thank you to all who contributed this!

comment:75 by [Cha0S], 17 years ago

[Cha0S] using lighttpd or apache, same lighttpd/apache version on both servers? I found out lighttpd pre 1.4.0 supported streaming, so that has to be fixed before this patch will work on lighttpd.

I've tested with lighttpd and nginx web-servers. same versions on both servers. with apache it works fine on windows.

comment:76 by jacobm3 A T gmail.com, 17 years ago

Component: Core frameworkContrib apps
Type: enhancementdefect

Has anyone come up with a fix for the AttributeError exceptions when uploading text files? I'm still getting exceptions like this:

AttributeError at /file/compose/
tmp_name
Request Method: POST
Request URL: https://myserver/file/compose/
Exception Type: AttributeError
Exception Value: tmp_name
Exception Location: /usr/lib/python2.4/cgi.py in getattr, line 540

Traceback (most recent call last):
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/base.py" in get_response

  1. response = callback(request, *callback_args, callback_kwargs)

File "/files/lib/file/app/views.py" in createmessage

  1. if request.POST:

File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _get_post

  1. self._load_post_and_files()

File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _load_post_and_files

  1. self._post, self._files = http.parse_file_upload(self)

File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/http/init.py" in parse_file_upload

  1. return req.parse_file_upload(req)

File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/middleware/upload.py" in parse_streaming_file_upload

  1. FILES.appendlist(key, FileDict({

File "/usr/lib/python2.4/cgi.py" in getattr

  1. raise AttributeError, name

AttributeError at /file/compose/
tmp_name

comment:77 by anonymous, 17 years ago

Type: defectenhancement

oops, didn't mean to change the type on this ticket... changing back to enhancement.

comment:78 by anonymous, 17 years ago

Component: Contrib appsCore framework

comment:79 by wonlay@…, 17 years ago

Cc: wonlay@… added

Hi, jacob
The 'AttributeError' occurred, because FieldStorge does not make_temp_file for small files, and the tmp_name is not defined.
The progress middleware has problem too. You will notice that your upload speed will down to a significant lower rate, if you use this middleware. The performance bottleneck is that, every single 'read' from request reads only a few bytes, even if you set a large buffer_size, and the serialization and save to disc of 'UploadState' is significant.

The attachments is my working copy of code.

comment:80 by anonymous, 17 years ago

Cc: lurker86@… added

comment:81 by Simon G. <dev@…>, 17 years ago

Needs documentation: set
Patch needs improvement: set
Triage Stage: UnreviewedAccepted

Can someone who has used this give us a progress report on this - which of the 3000 patches above are needed? can these be merged into one patch that works?

comment:82 by anonymous, 17 years ago

The one that i can vouch for is 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id.diff

But it has problems with m3u uploads's for some reason i can't figure out.

Was thinking about the possibility of using email.FeedParser instead of rfc822 but that is not available in python 2.3

This patch does not work with lighttpd, since lighttpd can't stream requests.

comment:83 by Ivan Sagalaev <Maniac@…>, 17 years ago

This patch does not work with lighttpd, since lighttpd can't stream requests.

This is not true. When Django works with Lighttpd over FastCGI lighttpd immediately offloads received data to Django process and it can stream it wherever it wants to. In fact I use one of my services that uploads zipped mp3 albums under lighttpd and streaming works (with my patch in #1484).

comment:84 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

I think you are correct. I guess the problem with this patch and fcgi has to do with not doing blocking reads with a buffer size. But a buffer size is needed if the browser sends a big file without eol's, and for the upload progress.

comment:85 by Ivan Sagalaev <Maniac@…>, 17 years ago

I'm not sure I understand what you're saying... I was replying solely to a claim that lighttpd can't do streaming. It certainly does streaming to a FCGI app. Whatever the app does then (streaming by blocks, by eols or no streaming at all) is a whole another question.

comment:86 by shaunc <shaun@…>, 17 years ago

Cc: shaun@… added

comment:87 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

Making a attempt at using email.FeedParser

http://fivethreeo.dynalias.org/repos/streaming
login: public, password: public

Trac http://fivethreeo.dynalias.org/trac/browser/streaming

comment:88 by Øyvind Saltvik <oyvind@…>, 17 years ago

Need to override FeedParser._parsegen to write the payload directly to a file instead of a in-memory string.

Probably involves copying FeedParser._parsegen and do some modificatioms.

comment:89 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

I am pleased to announce, its alive!!

Just some minor work left ( validators, refactoring, bugfixing ) volunteers?

FeedParser should work with python2.3 if it included with django.

comment:90 by anonymous, 17 years ago

Cc: bahamut@… removed

comment:91 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

To get the most recent diff, now with validator

svn co http://fivethreeo.dynalias.org/repos/streaming

then

cd streaming

svn diff -r371

by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

did not work without middleware, fixed

comment:92 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

Anyone able to help out on fastcgi testing?

comment:93 by anonymous, 17 years ago

Cc: antonio.mele@… added

will this be added to django 1.0?

comment:94 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

Tested on lighttpd/fastcgi without errors or memory hogging.

Should feedparser be included with django so python2.3 users can use this feature too?

The way the feedparser works there is no way to not duplicate the entire _parsegen function as I see it.

comment:95 by anonymous, 17 years ago

Cc: herbert.poul@… added

comment:96 by Joakim Sernbrant <serbaut@…>, 17 years ago

Streaming file uploads, take #3

I would really like to see something like this committed to
trunk. However I think the previous patches has been too complicated
and tried to do too much. Lack of documentation and tests has probably
been a factor as well.

I have rewritten the file upload to make it simpler and added some
documentation and tests.

  • Simplified parsing by using a specialized RFC 2388 parser. The

previous patches used different RFC 2822 parsers which were not suited
for the task. The new parser is very efficient both cpu and memory
wise. It uses mx.TextTools for boudary searches if available.

  • Removed AJAX code and progress meter. While I think it is a good

thing I think it hinders the commitment of this patch. We should open
a new ticket for a progress meter in the Admin interface. The
X-Progress-ID header is still supported and a pickled progress file is
written if supplied.

  • Added error handling.
  • Added Documentation.
  • Added tests.

Tested with mod_python and lighttpd.

You need to specify the FILE_UPLOAD_DIR setting to enable streaming (read forms doc).

Patch attached (4459-streaming-file-upload.diff)

by Joakim Sernbrant <serbaut@…>, 17 years ago

Simplified streaming uploads

comment:97 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

Nice work. Agree with making ajax progressmeter a own ticket.

comment:98 by Ivan Sagalaev <Maniac@…>, 17 years ago

A couple of comments from quick reading of 4459-streaming-file-upload.diff

  • It streams uploads to files only if FILE_UPLOAD_DIR is set. But it looks that it stores files in temp directory anyway. What's the point of FILE_UPLOAD_DIR then?
  • Instead of FILE_UPLOAD_DIR it would be nice to have a setting like MAX_FILE_UPLOAD_BUFFER with the maximum number of bytes stored in memory during an upload. And only if the file is bigger it is streamed to disc. This will have a nice effect of not touching disc when dealing with small files which is faster
  • It looks like any Exception during the parsing is swallowed and returned as a dict of errors that is then stored in a POST data. This actually breaks the whole exceptions paradigm :-).
  • All the save_FIELD_file now accept raw_field. This is backward-incompatible but worse it limits this method to work only with uploaded files while now save_FIELD_file can accept any content not matter how it was created in a program. I understand that just renaming streamed upload is a fast feature but I believe it deserves a new method. Like 'save_FIELD_from_file' or something...

in reply to:  98 ; comment:99 by Joakim Sernbrant <serbaut@…>, 17 years ago

Replying to Ivan Sagalaev <Maniac@SoftwareManiacs.Org>:

A couple of comments from quick reading of 4459-streaming-file-upload.diff

  • It streams uploads to files only if FILE_UPLOAD_DIR is set. But it looks that it stores files in temp directory anyway. What's the point of FILE_UPLOAD_DIR then?

A StringIO() buffer is used in that case so no files are used.

  • Instead of FILE_UPLOAD_DIR it would be nice to have a setting like MAX_FILE_UPLOAD_BUFFER with the maximum number of bytes stored in memory during an upload. And only if the file is bigger it is streamed to disc. This will have a nice effect of not touching disc when dealing with small files which is faster

The problem is that you do not know in advance how large a file is. However you know how large the POST is in total so you could enable file streaming based on that. I did not want to create too many settings variables but I agree that it might be a good idea. Another option would be to have a setting where you name the form fields what you want to have streamed.

  • It looks like any Exception during the parsing is swallowed and returned as a dict of errors that is then stored in a POST data. This actually breaks the whole exceptions paradigm :-).

Yes, I did that to avoid too many changes in wsgi.py/modpython.py since you have to take care not to try to read the input stream again in case of an exception. Meaning that I always wanted to return something. Bad choice maybe but I tried to make as few changes as possible so that the powers that be could start looking at the patch without too much confusion ;)

  • All the save_FIELD_file now accept raw_field. This is backward-incompatible but worse it limits this method to work only with uploaded files while now save_FIELD_file can accept any content not matter how it was created in a program. I understand that just renaming streamed upload is a fast feature but I believe it deserves a new method. Like 'save_FIELD_from_file' or something...

Aren't those methods invoked automatically for you when you do .save()? I didnt think they were public but I dont understand exacly how they work so I take your word for it...

by Joakim Sernbrant <serbaut@…>, 17 years ago

Added FILE_UPLOAD_MIN_SIZE (default 100kb) to define minimum request size for streaming to disk. Propage exeptions. I'm not too happy with the names of the settings anymore :/

comment:100 by Adrian Holovaty, 17 years ago

(I've removed many attachments, as per a request by oyvind.saltvik@… on django-developers.)

in reply to:  99 comment:101 by Ivan Sagalaev <Maniac@…>, 17 years ago

The problem is that you do not know in advance how large a file is. However you know how large the POST is in total so you could enable file streaming based on that.

Yeah that what I meant... The whole point is to not make Django eat everything in memory so a settings topping this amount seems most logical to me.

I saw you made this setting. Here are couple of nits:

file_upload_min_size = getattr(settings, 'FILE_UPLOAD_MIN_SIZE', 100000)

  1. There is a convention in Django to put a setting with its default value into django/conf/global_settings.py. This keeps all defaults in one place and ensures that settings are always present so you don't have to fail-safe them with getattr, just use settings.SETTING_NAME
  1. FILE_UPLOAD_MIN_SIZE sounds confusing, like we don't allow users to upload small files :-). I still like an idea of calling it a buffer size somehow (and actually use it as a buffer size when reading chunks from input). What do you (and everyone) think?
  1. 100 KB is more like 100 * 1024. But I'd make it about 512 * 1024 or 1024 * 1024. This is small enough to not even show on memory stats but big enough to handle most profile photos uploaded in practice without touching disk most of the times.

Yes, I did that to avoid too many changes in wsgi.py/modpython.py since you have to take care not to try to read the input stream again in case of an exception.

Well... An exception ensures just this: the program won't run normally in case of errors, return values on the other hand are easy to forget to check. I see that you have removed swallowing the Exception. Why not MultiPartParserError then? As I understand this will happen only when the input is really malformed (like not created by a browser) so it's very little a user application can do about it. And I think it shouldn't. I think we should catch this exception in handlers/base.py and unconditionally return '400 Bad Request' on it.

Aren't those methods invoked automatically for you when you do .save()? I didnt think they were public but I dont understand exacly how they work so I take your word for it...

They are called from save(), yes. But they are also intended as a public API and documented (it's in model's API doc, too lazy to get a link :-) ).

comment:102 by Gábor Farkas <gabor@…>, 17 years ago

Cc: gabor@… added

comment:103 by David Cramer <dcramer@…>, 17 years ago

Cc: dcramer@… added

comment:104 by Nebojša Đorđević <nesh@…>, 17 years ago

Cc: nesh@… added; nesh@… removed

in reply to:  102 comment:105 by michell.jo@…, 17 years ago

Replying to Gábor Farkas <gabor@nekomancer.net>:

Sorry to clutter the thread.

Can this patch be used with the newforms framework, or does it only work with oldforms?

comment:106 by Øyvind Saltvik <oyvind.saltvik@…>, 17 years ago

Can be used in newforms if you handle the incoming data, form_for_model has no automatic filefields yet.

in reply to:  11 comment:107 by michell.jo@…, 17 years ago

Replying to adrian:

In reference to bnewbold's comment above, it's the shutil module, and it looks like it's been around since at least Python 2.3. (Django supports Python 2.3 and up.)

os.rename('old', 'new') does result in an 'invalid cross-link device' exception when 'old' and 'new' are on different filesystems. It doesn't make sense to do this, as once a large file has been streamed to disk, copying to another filesystem is a pointless additional overhead. However, the code could catch the error, and do a copy and then delete operation (and generate a warning that the system can be configured more efficiently).

Or it could use shutil as suggested by a previous poster.

comment:108 by michell.jo@…, 17 years ago

Is it possible that this patch could be updated so that it works with the current source version? At the moment if one checks out revision 4459, patches it and then executes svn update, clashes arise at several points in the code.

comment:109 by Brian Koebbe <koebbe@…>, 17 years ago

Cc: koebbe@… added

by axiak@…, 17 years ago

I've updated the patch to include the shutils command and to work with [5065]. Please check to see if it works.

by axiak@…, 17 years ago

Works in [5065], renamed settings variable, uses global settings, defaults STREAMING_MIN_POST_SIZE to .5MB (please test though!)

comment:110 by Michael Axiak <axiak@…>, 17 years ago

Cc: axiak@… added

by Øyvind Saltvik <oyvind@…>, 17 years ago

Updated to trunk, without changes

comment:111 by Øyvind Saltvik <oyvind@…>, 17 years ago

Axiak

But does this not limit streaming to gnu os'es by using shutils. Should it not try shutils first, dont import shutils at the top of the .py Then try rename, and finally just write from filecontent if that fails.

by Michael Axiak <axiak@…>, 17 years ago

Usese shutils, but falls back.

by Michael Axiak <axiak@…>, 17 years ago

Usese shutils, but falls back, this time it deletes the tmp file even when it falls back.

in reply to:  111 comment:112 by Michael Axiak <axiak@…>, 17 years ago

Replying to Øyvind Saltvik <oyvind@saltvik.no>:

Axiak

But does this not limit streaming to gnu os'es by using shutils. Should it not try shutils first, dont import shutils at the top of the .py Then try rename, and finally just write from filecontent if that fails.

I did not know it only worked in GNU os's. Also, AFAIK, there's no OS-independent way to copy files in python, and if file streaming was enabled, file_fieldcontent will not have the desired content. Thus, the only sure-fire way to make it work is to do a file.write file.read combination. Luckily it probably won't have to do this, since it will try both shutils and os.rename before it gets to that point.

comment:113 by Øyvind Saltvik <oyvind@…>, 17 years ago

One more thing, the last fallback should read and write incrementally, to avoid reading the whole file before it is written to another.

by Michael Axiak <axiak@…>, 17 years ago

This file uses python fallbacks, but using chunks to avoid loading the entire file into memory.

by Michael Axiak <axiak@…>, 17 years ago

Cleaned it up a bit. Moved file_move_safe into django.utils in case it should be used in future endeavors.

comment:114 by Michael Axiak <axiak@…>, 17 years ago

Needs documentation: unset
Summary: [patch] Large streaming uploadsLarge streaming uploads
Version: SVN

This last patch seems clean. If people wanted to start testing, now would be a good time :). It works in [5079] and should work on all platforms while considering all the ideas posted here.
It also has documentation. I will not remove the `Patch needs improvement' flag to let others review this patch with a finer comb than I have.

Please post here if you have additional concerns, comments, or enhancements.

comment:115 by Joakim Sernbrant <serbaut@…>, 17 years ago

The raise at http/init__.py:114 looks broken, else: missing.

comment:116 by anonymous, 17 years ago

How to use x-progress-bar ??

by Øyvind Saltvik <oyvind@…>, 17 years ago

Added missing else

by Øyvind Saltvik <oyvind@…>, 17 years ago

Forgot to add django/utils/file.py

comment:117 by anonymous, 17 years ago

Hi,

Is x-progress-bar is a header that come with apache only or it is a start header?

Is it work on fcgi, if not, can I change x-progress-bar to hidden form?

Thank.

comment:118 by Øyvind Saltvik <oyvind@…>, 17 years ago

It is just a header that has to be sent along with the request.

X-Progress-ID with a value

then FILE_UPLOAD_DIR + the value of the header X-Progress-ID is a pickled python object.

That can be used by any view and displayed using javascript.

comment:119 by anonymous, 17 years ago

Thank for your replying,

I have tested this code by modified old javascript (middleware enable). I never seen X-Progress-ID in header, so, a temp file with that id never created.

Can I have example usage?

Thank again

comment:120 by Øyvind Saltvik <oyvind@…>, 17 years ago

Actually, this patch does this wrong, the header is only used for getting progress status, the id should be the querystring.

comment:121 by Michael Axiak <axiak@…>, 17 years ago

As this site states:
http://search.cpan.org/~ceeshek/Apache2-UploadProgress-0.2/lib/Apache2/UploadProgress.pm#HANDLERS

We should be using multiple mechanisms to get this progress ID.
After I mull over this some more and understand it better, I'll make a new patch.

by Michael Axiak <axiak@…>, 17 years ago

Uses multiple mechanisms for determining the progress id.

comment:122 by Michael Axiak <axiak@…>, 17 years ago

The patch has now been merged into [5099] and I added a new property to request: progress_id. This may simplify JavaScript Middleware (we should make a new ticket for that) as it only has to do request.progress_id. If it's None, there was no progress ID sent. Otherwise, there was.

comment:123 by Michael Axiak <axiak@…>, 17 years ago

This ticket is only for making sure file uploads work. I have created ticket #4165 which s specifically for the progress bar enhancements.

in reply to:  117 comment:124 by anonymous, 17 years ago

Replying to anonymous:

Hi,

Is x-progress-bar is a header that come with apache only or it is a start header?

Is it work on fcgi, if not, can I change x-progress-bar to hidden form?

Thank.

X-Progress-ID is used to tell the server between loads "This is the same post". I'm actually not sure how the headers are supposed to be populated, only that we're supposed to "support them". After reading a perl module for quite some time, I've determined that they use javascript to append a random 32-digit hex at the end of their form action ("?progress_id=abcdef0123456789abcdef0123456789") right before the form submits.

If you do this, it will work.

NB: Using a hidden POST field will NOT work. Why? Because it's the POST upload we're trying to track!

by Michael Axiak <axiak@…>, 17 years ago

This isn't important enough to get its own property in request...request.META now contains 'UPLOAD_PROGRESS_ID'

by Michael Axiak <axiak@…>, 17 years ago

I included a bit much in that last patch.

by Michael Axiak <axiak@…>, 17 years ago

Attachment: 5100_file_upload_core.diff added

Meant to be cleaner, especially in light of #4165

by Michael Axiak <axiak@…>, 17 years ago

Added middleware hooks

by Michael Axiak <axiak@…>, 17 years ago

Added middleware hooks...this is better.

in reply to:  27 comment:125 by anonymous, 17 years ago

Replying to Maniac <Maniac@SoftwareManiacs.Org>:

My idea of a hook is not fit into current middleware, I just wanted to retain the style. So to clarify I see it like this.

This can be a separate set of hooks similar to MIDDLEWARE_CLASSES or it can be done in middleware classes themselves. To do the latter we will specifiy that in addition to process_request, _response, _view and _exception methods a middleware can define a new kind of method: process_upload(request, data).

parse_file_upload then would search for these middleware methods just as http requests do in modpython.py and wsgi.py. The list of methods is passed to a FieldStorage decendant that will call them each time it's about to write a chunk of data to the target stream.

This is it... Sorry I can't come with the actual code: pretty busy rolling out my current project.

After reviewing all the comments, this one struck me in particular. I have added a process_upload(UploadException) method to MiddleWare.
I'm sure I might get heat for this recent addition, but I think it's the cleanest way to allow extensibility.

Essentially, you can create a descriptor
for the file_progress attribute of request. It gets treated like a dictionary so it will be passed things like {'received': 1414,'size': 100000} and is expected to return it when its asked for. You are given an exception to raise if something should cancel the file upload, so I could imagine someone writing a descriptor that would compare received and some target size and cancel the upload if it's more than that.

For more details, please look at django/http/file_descriptor.py for the fallback descriptor. If you named that exact class process_upload and stuck it in a middleware, it would be used (and would work). I'll come up with another example soon, and stick it in #4165.

by Michael Axiak <axiak@…>, 17 years ago

Apparently I don't know how not to lose files.

by Michael Axiak <axiak@…>, 17 years ago

Apparently I don't know how not to lose files.

by Michael Axiak <axiak@…>, 17 years ago

Fixes WSGI bug.

by Michael Axiak <axiak@…>, 17 years ago

Fixed bug where .append(0, func) was called.

by Michael Axiak <axiak@…>, 17 years ago

Fixed bug where .append(0, func) was called with the files this time.

comment:126 by anonymous, 17 years ago

Break!

Traceback (most recent call last):

File "C:\Python25\lib\site-packages\django\core\servers\basehttp.py", line 272, in run

self.result = application(self.environ, self.start_response)

File "C:\Python25\lib\site-packages\django\core\servers\basehttp.py", line 614, in call

return self.application(environ, start_response)

File "C:\Python25\lib\site-packages\django\core\handlers\wsgi.py", line 207, in call

request = WSGIRequest(environ)

File "C:\Python25\lib\site-packages\django\core\handlers\wsgi.py", line 78, in init

self.METAUPLOAD_PROGRESS_ID = self._get_file_progress_id()

File "C:\Python25\lib\site-packages\django\core\handlers\wsgi.py", line 187, in _get_file_progress_id

self._req.args)

AttributeError: 'WSGIRequest' object has no attribute '_req'

by Michael Axiak <axiak@…>, 17 years ago

Tested with WSGI and made a few changes.

comment:127 by Michael Axiak <axiak@…>, 17 years ago

Alright, I tested/tweaked the patch on a few different platforms. Here are my notes:

  • Apache2/mod_python: Works well. Works with #4165 and generates pretty progress bar
  • Apache2/mod_fastcgi: Works well. Works with #4165 and generates pretty progress bar
  • Apache2/mod_scgi: Untested.
  • LighTTPD/mod_fastcgi: Does NOT work. However, version 1.5 will have mod_uploadprogress which will provide a JSON interface just like the one in #4165. Thus, using mod_upgradeprogress to generate a progress bar should be a breeze. I don't have lighttpd 1.5 to test, though.
  • manage.py development server: World well. Progress bar does not work AFAIK, the development server is singlethreaded and it will stay that way. The progress bar is an inherently multithreaded process (one to serve progress information, one to receive the uploaded information).

Let me know if you have anymore problems.

by Michael Axiak <axiak@…>, 17 years ago

Added 'state':'starting' to be more like mod_uploadprogress.

comment:128 by anonymous, 17 years ago

Hi,

I have tested under window xp, apache 2.2.4, mod_python 3.3. When I upload a large file via admin interface.

A temp file with named by session generated by javascript client is created. However, a file with random name and '.upload' extension is also created. File stream is append to '.upload' version instead of session id one.

Any idea??

in reply to:  128 comment:129 by Michael Axiak <axiak@…>, 17 years ago

Replying to anonymous:

Hi,

I have tested under window xp, apache 2.2.4, mod_python 3.3. When I upload a large file via admin interface.

A temp file with named by session generated by javascript client is created. However, a file with random name and '.upload' extension is also created. File stream is append to '.upload' version instead of session id one.

Any idea??

Yes, this is how it is supposed to work, which is why there are hooks for middleware (see illustration middleware in #4165).

You see, we don't need to keep track of the actual file that's being uploaded, because it's always being uploaded (by the same thread, same loop, etc.). We do need to be able to look for the 'progress' of that file upload, which has to be stored in some persistent manner, keyed by the session id. Since the file upload code has no model, can't rely on cache, and does have a safe place to put temp files, it by default puts that state information in that location.

If you install the patch 5126_file_upload_contrib_app.diff and add 'django.contrib.uploadprogress.middleware.FileProgressCached' to your MIDDLEWARE_CLASSES setting, the cache framework will be used, and no session_id file will be created.

Alternatively, you can use that patch and add 'django.contrib.uploadprogress.middleware.FileProgressDB' to your MIDDLEWARE_CLASSES and add 'django.contrib.uploadprogress' to your INSTALLED_APPS (and run a syncdb...).

Or, you are welcome to look at the code in django.contrib.uploadprogress.middleware.FileProgressCached and write your own method of storing this file progress information (it's in a dictionary format).

Sounds like it's working though, thanks for the feedback.

comment:130 by Michael Axiak <axiak@…>, 17 years ago

I should also explain the rationale for the naming of the files:

We use the X-Progress-ID value for the name of the file that stores the progress because in subsequent requests we're going to need to get the file by that same value.

We use a (python) temporary file name for the actual file that's being uploaded because it guarantees uniqueness. Since X-Progress-ID is provided by the client, we can't guarantee someone won't start auto-generating requests and cause a collision on the files themselves. I have thought of putting the X-Progress-ID in the session. This can be done but it relies on the session app -- which is not a prereq for Django to work -- being activated and file uploading is a core feature.

It doesn't seem like too much of an issue if the progress tracking has a small chance of getting munged. The actual files, OTOH, would be a bit worrisome.

That being said, if you really don't like the naming, and have a better suggestion, by all means suggest it.

in reply to:  127 ; comment:131 by anonymous, 17 years ago

Replying to Michael Axiak <axiak@mit.edu>:

Alright, I tested/tweaked the patch on a few different platforms. Here are my notes:
...

My new report:

  • Apache2/mod_python: Works well. Works with #4165 and generates pretty progress bar
  • Apache2/mod_fastcgi: Works well. Works with #4165 and generates pretty progress bar
  • Apache2/mod_scgi: Untested.
  • LighTTPD/mod_fastcgi: I stand corrected (Thanks SmileyChris). It does help with LighTTPD inasmuch as it will make the interaction between lighttpd and python not eat all the memory and CPU. However, it will not show a progress bar (nor are other features in #4165 supported), as the streamed 'upload' only exists between lighttpd and python, not between the web client and the web server.
  • manage.py development server: World well. Progress bar does not work AFAIK, the development server is singlethreaded and it will stay that way. The progress bar is an inherently multithreaded process (one to serve progress information, one to receive the uploaded information).

Also, when uploading a 4.5 GB file, the content length was reported incorrectly. This happened with both FastCGI and mod_python on apache2.

in reply to:  131 comment:132 by anonymous, 17 years ago

Replying to anonymous:

...

Also, when uploading a 4.5 GB file, the content length was reported incorrectly. This happened with both FastCGI and mod_python on apache2.

It turns out this happens when you don't have Large File support (on either the client or the server?).

by Chris Beaven, 17 years ago

There was an error uploading large files with streaming turned off

comment:133 by Chris Beaven, 17 years ago

I also tidied up the code a bit (just white space tweaking mostly).

I noticed that currently, you need to set an upload directory even if you are using a middleware which overrides the FileProgressDescriptor of request because of calls like this in http.__init__:

            if self._progress_filename:
                self._request.file_progress = {'state': 'done'}

I'm not sure if relying on _progress_filename is the best way to do this.

comment:134 by Malcolm Tredinnick, 17 years ago

Can we have a moratorium on uploading yet more changes to this ticket for a few days, please?

I've started reading through a few of the patches carefully with an eye to working out what parts look like they're ready to be committed. However, this isn't going to work if every day there is yet another slight tweak to something. It's going to take a few days to review all this carefully and think about all the issues raised in the comments, so everybody just take a deep breath and back away from the keyboards for a bit.

comment:135 by lukeman@…, 17 years ago

I've yet to test large files with the patch applied, but I've noticed my files are being created with 600 permissions instead of the 644 they received before patching. Not sure if it's just my setup, but I can't recall any other changes over the past day or so that would have affected it.

in reply to:  135 comment:136 by Michael Axiak <axiak@…>, 17 years ago

Replying to lukeman@gmail.com:

I've yet to test large files with the patch applied, but I've noticed my files are being created with 600 permissions instead of the 644 they received before patching. Not sure if it's just my setup, but I can't recall any other changes over the past day or so that would have affected it.

It should inherit the permissions from the temporary directory. I can't think of a better setup than this, but if people think this is unexpected/problematic maybe we can change it.

If people understand streaming files go to the temp folder then to the other folder, then I think inheriting the permissions from the temp folder would be expected behavior.

-Mike

comment:137 by anonymous, 17 years ago

I have tested multiple upload at the same time (difference browser instance). If there are more than two upload request, when a first request
is sent, it work properly. If I make a new call, an old stop immediately and a new one not create a session and temp file anymore.

Django is great in advance, please keep it simple and work in common task.

in reply to:  137 comment:138 by Michael Axiak <axiak@…>, 17 years ago

Replying to anonymous:

I have tested multiple upload at the same time (difference browser instance). If there are more than two upload request, when a first request
is sent, it work properly. If I make a new call, an old stop immediately and a new one not create a session and temp file anymore.

Django is great in advance, please keep it simple and work in common task.

Hi,

I have tested it with simultaneous file uploads and it worked fine. Could you please include more details? For instance, I don't think there would be any chance it would work in the development server. It may break in some FastCGI configurations, too. (The reason being that accepting multiple file uploads requires some multithreadedness.)

-Mike

comment:139 by anonymous, 17 years ago

Cc: rudolph.froger@… added

comment:140 by anonymous, 17 years ago

Hi

If I stop upload process by clicking a "Stop" button. It look like sever see a same upload request and never generate process id.

How to solve this problem?

Thank

comment:141 by anonymous, 17 years ago

Cc: jm.bugtracking@… added

in reply to:  140 comment:142 by Michael Axiak <axiak@…>, 17 years ago

Replying to anonymous:

Hi

If I stop upload process by clicking a "Stop" button. It look like sever see a same upload request and never generate process id.

How to solve this problem?

Thank

I'm not sure why you'd be seeing this. If you're using #4165 then the javascript should create a new progress id on *every* upload. If you're not using #4165, then progress id is irrelevant. While testing I was able to stop and re-upload in FF 2, 1.5, IE 7, and IE 6 without a problem. Could you please describe your environment with more detail?

-Mike

by Michael Axiak <axiak@…>, 17 years ago

Removed some unneeded things. No file progress tracking by default.

comment:143 by Michael Axiak <axiak@…>, 17 years ago

I've cleaned the FileProgressDescriptor architecture a little bit. Three changes for this patch:

  • using request.__class__ = descriptor rather than using a weird descriptor wrapper.
  • defaulting to a NullFileProgressDescriptor (which does nothing) versus one that stored the information in temp upload directory.
  • making safe_file_move do nothing if the source and destination are the same string.

If you want to track file upload progress you need to install the stuff from #4165. This ticket just contains the hooks to get #4165 there with just a minimal of contrib. This patch makes that even clearer by removing the last remnants of functional file progress tracking.

I *think* this is probably where we want to leave this ticket (in terms of what should be cut out of it, what should be put in). Although, I can see several other separation (take care of form stuff in another ticket, for instance).

It is worth noting (because someone had issue with this in IRC), that if you try to use this save_FIELD_file(name, content) should now be given a request.FILE item, rather than the actual content of the file.

For example:

    object.save_video_file(request.FILE['video']['filename'], request.FILE['video']['content'])

is bad, since the mere invocation request.FILE['video']['content'] will load all the content into memory, exactly what we're trying to avoid.

The new version looks like:

    object.save_video_file(request.FILE['video']['filename'], request.FILE['video'])

In reply to SmileyChris, we have to have an upload directory anyway, since the FileProgressDescriptor only changes how file *progress* is tracked, not the actual uploading of files (that is, it has to go into a directory, or else this entire thing is disabled). We could rename and clean up if people prefer it changed.

in reply to:  143 ; comment:144 by anonymous, 17 years ago

Replying to Michael Axiak <axiak@mit.edu>:

The new version looks like:

    object.save_video_file(request.FILE['video']['filename'], request.FILE['video'])

Why not skip request.FILEvideofilename at all? This is provided by request.FILEvideo anyway.
IMHO having filename as an separate parameter makes only sense if this would be used somehow directly, but currently the filename/path where the file gets saved is changed by the field (field.get_directory_name() and field.get_filename(filename)).

in reply to:  144 comment:145 by Michael Axiak <axiak@…>, 17 years ago

Replying to anonymous:

Why not skip request.FILEvideofilename at all? This is provided by request.FILEvideo anyway.
IMHO having filename as an separate parameter makes only sense if this would be used somehow directly, but currently the filename/path where the file gets saved is changed by the field (field.get_directory_name() and field.get_filename(filename)).

Well, save_video_file() is meant to allow someone to save a file. Unless that file is specifically provided in a post (it's not necessarily, and in many of my applications it's not...generated images come to mind).

We *can* have it accept both sets of arguments, but then it's too nonorthogonal. Perhaps we should just create a new one? move_video_file()??

by Michael Axiak <axiak@…>, 17 years ago

Newer, cleaner version of file upload script

by Michael Axiak <axiak@…>, 17 years ago

Newer, cleaner version of file upload script (doh! no random commented assert)

by Michael Axiak <axiak@…>, 17 years ago

Sorry about that -- this one is the better one.

comment:146 by Michael Axiak <axiak@…>, 17 years ago

A list of changes in this new version:

  • I removed the file_progress descriptor handling and middleware -- if you need to track file progress, you can use process_request to achieve what you need.
  • I cleaned up a lot of things, removed many catch-all excepts and replaced with except OSError etc.
  • Included cross-platform locking for saving/moving files manually (django/utils/file_locks.py)
  • Made move_file_safe(src,dest) to care if a file already exists and throw an error if one does by default.
  • Moved stuff out of http/__init__.py into various locations.
  • In reply to SmileyChris' comment earlier, no longer uses self._progress_file to determine if a file is being tracked. Now using self._track_progress (a boolean)
  • save_field_file can now optionally not take a filename (if filename = None). To avoid confusion, there is an additional function, instance.move_field_file which takes one parameter: the request.FILES value. Code examples should use move_field_file to distinguish this new functionality.
  • Stay tuned for updates to #4165 to reflect these and other changes!

Example of new move_video_file Model instance function:

#from this
foo.save_video_file(request.FILES['video']['filename'], request.FILES['video'])

#to this
foo.move_video_file(request.FILES['video'])

by Michael Axiak <axiak@…>, 17 years ago

It's amazing what Trac helps you see.

by Michael Axiak <axiak@…>, 17 years ago

Made a subtle fix after testing with #4165.

comment:147 by anonymous, 17 years ago

def init(self, headers, input, request, file_upload_dir=None, streaming_min_post_size=None, chunk_size=1024*64):

parser = MultiPartParser(headers, input, request, file_upload_dir, streaming_min_post_size)

Is chunk_size is configurable? Do you hard code, aren't you.

Thank

in reply to:  147 comment:148 by Michael Axiak <axiak@…>, 17 years ago

Replying to anonymous:

def init(self, headers, input, request, file_upload_dir=None, streaming_min_post_size=None, chunk_size=1024*64):

parser = MultiPartParser(headers, input, request, file_upload_dir, streaming_min_post_size)

Is chunk_size is configurable? Do you hard code, aren't you.

Thank

Actually, that's a great point. I am going to change a few things (shift around save_FIELD_file and move_FIELD_file to make things more explicit) and on my list for the next upload would be to allow chunk sizes to be a settings variable. (I was actually discussing this with Øyvind in IRC, http://simon.bofh.ms/logger/django/2007/05/26/15/25 .)

Thanks for the comment, though. Feel free to post any other suggestions!

comment:149 by bricetebbs@…, 17 years ago

Works for me. Windows/Apache2.2.4/ModPython.
My page actually uploads 4 large files at once so I added 'filename' : self._filename to the file_progress dict created in multparser.py then I can display the current file in my javascript.

comment:150 by anonymous, 17 years ago

Cc: oliver@… added

comment:151 by djangoproject.com@…, 17 years ago

5343_cleaned_streaming_file_upload_tweaked.diff works perfect for me (Linux/Apache 2.0.59/modPython). Uploading large zip files (10 to 100 MB each) to my photo gallery with no problems ... memory usage is dead straight at 65MB ... where as it explored to 600 MB per apache process before. Good work Michael!

comment:152 by anonymous, 17 years ago

111 def parse(self):
112 try:
113 self._parse()
114 finally:
115 if self._track_progress:
116 self._request.file_progress = {'state': 'done'}
117 return self._post, self._files

When user click 'stop' while uploading file, server return {'state':'done'} even file is not completely uploaded.
This cause unexpected behaviors under different browser. I think we should return actual file size and received size to make
client to be able to check if file is completely uploaded.

Thank

comment:153 by anonymous, 17 years ago

def parse(self):

try:

self._parse()

finally:

if self._progress_filename:

if self._received == self._size:

self._request.file_progress = {'received':self._received,

'size':self._size,
'state': 'done'}

else:

self._request.file_progress = {'received':self._received,

'size':self._size,
'state': 'abort'}

How about this?

comment:154 by anonymous, 17 years ago

Cc: moritz.angermann@… added

by klaus.blindert@…, 17 years ago

Fixed microscopic bug in an error path

comment:155 by simonbun <simonbun@…>, 17 years ago

Cc: django@… added

comment:156 by anonymous, 17 years ago

Cc: paltman@… added

by simonbun <simonbun@…>, 17 years ago

Attachment: 5722.diff added

Updated patch against r5722

by simonbun <simonbun@…>, 17 years ago

Attachment: 5722.2.diff added

Sorry, that last one patch also included #4165. This is the correct one.

by Brian Rosner <brosner@…>, 17 years ago

updated to r5724 (previous patch was missing files too)

comment:157 by Brian Rosner <brosner@…>, 17 years ago

Cc: brosner@… added

comment:158 by Simon G. <dev@…>, 17 years ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

Reading over this, I think the latest version here looks good (5724_streaming_file_upload.diff). Is everyone happy with it?

comment:159 by simonbun <simonbun@…>, 17 years ago

Yes, I've been using it intensively for the last couple of days and it works beautifully.

comment:160 by Brian Rosner <brosner@…>, 17 years ago

I have been using this patch for a little while now and it does work very well. Haven't seen anything wrong with it.

comment:161 by eibaan@…, 17 years ago

I manually applied patch 5724... to the latest trunk version and noticed that there's a syntax error in line 254 of multipartparser.py. Instead of

except OSError, IOError:

it must be

except (OSError, IOError), e:

Besides that, the patch seems to work quite nicely and I hope it will be added to trunk ASAP.

comment:162 by simonbun <simonbun@…>, 17 years ago

Ran into a very small issue:
Files uploads that are streamed end up having 0600 permissions, while files that are uploaded normally have 0644 permissions. I can't put my finger on where this happens at a glance, will look into it tomorrow.

(In case anyone's wondering, I need read access to be able to scan newly uploaded files with clamav.)

in reply to:  162 ; comment:163 by anonymous, 17 years ago

Please read 136. This is standard Unix permissions. For example, if someone (heaven forbid) used PHP to upload, the files would inherit the permissions from the temporary upload directory.

-Mike

Replying to simonbun <simonbun@versea.be>:

Ran into a very small issue:
Files uploads that are streamed end up having 0600 permissions, while files that are uploaded normally have 0644 permissions. I can't put my finger on where this happens at a glance, will look into it tomorrow.

(In case anyone's wondering, I need read access to be able to scan newly uploaded files with clamav.)

comment:164 by simonbun <simonbun@…>, 17 years ago

Well the strange thing is my temporary dir and my uploads dir both have 777 permissions. I just checked again and the temporary .upload file has 600 permissions.

I'm using apache2 + mod_python + python2.5 if that would make a difference

comment:165 by anonymous, 17 years ago

Cc: matthias@… added

in reply to:  163 comment:166 by Arthur Debert, 17 years ago

Replying to anonymous:

I'll second that. This one has bitten me too.
While setting permissions to 600 is sound, maybe a two liner on the documentation would help make that more obvious. It took me a while to understand what as going on.

Asides from that, thanks for the patch. I've been using it for a while (mod_python at webfaction's setup, post unicode merge) and it works well.

Cheers
Arthur

comment:167 by anonymous, 17 years ago

Cc: root.lastnode@… added

comment:168 by akaihola <antti.kaihola@…>, 17 years ago

Cc: akaihola+djtick@… added

by Brian Rosner <brosner@…>, 17 years ago

updated to r5820

comment:169 by anonymous, 17 years ago

Cc: sam@… added

comment:170 by Trey <trey@…>, 17 years ago

Cc: trey@… added

comment:171 by anonymous, 17 years ago

Cc: klaus.blindert@… added

comment:172 by Øyvind Saltvik <oyvind@…>, 17 years ago

Patch for r5820 does not work with forms.FileField and forms.ImageField, raw_field is now a StrAndUnicode object not a dict containing the raw data.

comment:173 by Øyvind Saltvik <oyvind@…>, 17 years ago

comment:174 by Michael Axiak, 17 years ago

Keywords: sprint added
Owner: changed from nobody to Michael Axiak
Status: assignednew
Triage Stage: Ready for checkinAccepted

I will take this on to as full extent as I can over the upcoming sprint.
If anyone has any more issues, please post them on here before Friday.

comment:175 by anonymous, 17 years ago

I would like to see the filename be part of the progress returned for when a form has multiple files:

self._request.file_progress = {'received': self._received,
                                'size':     self._size,
                                'filename' : self._filename,
                                'state':    'uploading'}

comment:176 by Matthias Urlichs <smurf@…>, 17 years ago

The checks after an image upload (as seen in http://dpaste.com/hold/18468/) can be improved somewhat; at first glance it is entirely unclear what's supposed to happen, and the error message is repeated.

See http://dpaste.com/hold/19212/ (untested but looks OK).

comment:177 by klaus.blindert@…, 17 years ago

Uploading empty files is treated as an error, why?
I would also like to see a file propgress tracker example implementation with unit tests and a stress test for uploading
large files.

On request I can provide an example of such a stress test programm.

comment:178 by Matthias Urlichs <smurf@…>, 17 years ago

I frankly fail to see the use case for uploading an empty file (as opposed to simply not uploading it in the first place).

If you have one, feel free to (a) tell us why you think it makes sense, (b) provide a patch which optionally allows that.

in reply to:  178 comment:179 by anonymous, 17 years ago

Replying to Matthias Urlichs <smurf@smurf.noris.de>:

I frankly fail to see the use case for uploading an empty file (as opposed to simply not uploading it in the first place).

There are several situations where one might want to upload empty files.

  1. It could be that one does not know the file is empty
  2. Maybe one actually needs to have an empty file, for example a ini.py in python indicates that a directory is a package

There is nothing special about an empty file, it is a file like any other file, it only contains a single character, the End Of File character. Why would you want to treat this file in any different way? If you do so it only means that your code is not generic enough.

One more note:

The reason that this ticket is languishing like this is that it is filled with tacit assumptions, unusual behavior (like the one I'm commenting on) and extra functionality that is not relevant to the actual problem it needs to solve first - streaming the uploads.

comment:180 by Matthias Urlichs <smurf@…>, 17 years ago

I'd argue that case 1 is a prime candidate for keeping the test, while case 2 can be handled by uploading a file with a comment in it.

In any case: if you think that's the reason, then please post either a list of (other) problems you have with this patch, or an improved patch.

Anythig else is not helpful. IMHO.

comment:181 by Philippe Raoult, 17 years ago

Keywords: sprintsept14 feature added; sprint removed

comment:182 by Ville Säävuori, 17 years ago

Cc: ville@… added

in reply to:  180 comment:183 by anonymous, 17 years ago

Replying to Matthias Urlichs <smurf@smurf.noris.de>:

while case 2 can be handled by uploading a file with a comment in it.

I would be willing to elaborate but this commment of yours indicates to me you have your mind set and don't actually want to understand the point that I was making: That is sometimes a file have to be empty.

You can't manually start adding bytes into files just because someone decided that uploading an empty file is an error. All you need is think about it for a second. As I said before an empty file is still a valid file, it has a name, a modification date ... even a size!. Maybe it is used to indicate a state, maybe it is log file that has yet to be written to etc. there are countless usecases that require the presence of a file that may not contain anything yet.

In your design if one were to simultaneously upload 10 files one of which is empty, the upload should fail once processing reaches the empty file ...

comment:184 by Matthias Urlichs, 17 years ago

I do admit that you have a point; I admit that my response was a bit knee-jerk.

My core point, now that I think about it, is not that I'm dead set against uploading empty files. That test is easily removed, or made optional.

The real point is: What's holding up the change? Did axiak just not have enough time?

This is not the only patch in this bug tracker which fixes a real-world problem. We're carrying ~30 of these in our own repository and, frankly, I'd like that number to go down, one of these days. :-/

comment:185 by oliver@…, 17 years ago

I know this is maybe a little late to chime in, but as opinion on this seems to be heavily divided as to whether empty files should be allowed, would it not just be best all-round to add a way to define whether empty files should be allowed or not?

Surely this wouldn't be too hard (I'd be willing to work on this if I get a few moments here-and-there).

Just a thought'''

comment:186 by Trey <trey@…>, 17 years ago

I am going to have to go with +1 for not distinguishing between an empty file or a non-empty file.

Technically there is no difference between and empty and non-empty file so I don't see why django should distinguish between the two. It's definitely a user defined preference at that point, which should be possible after the file has been uploaded. Not wanting and empty file is an assumption, a popular one yes, but still an assumption about the upload.

comment:187 by Michael Axiak <axiak@…>, 17 years ago

Hey,

As well as Trey, I am also +1 for not making this distinction. If people need the validation, it belongs in another layer than the core http processing. (Perhaps the newforms filefield)

-Mike

comment:188 by Sung-jin Hong, 17 years ago

I too +1 for not making that distinction.

-SJ

comment:189 by Trey <trey@…>, 17 years ago

What do we need to do to get some movement on this patch?

comment:190 by Øyvind Saltvik <oyvind@…>, 17 years ago

Make it work with current trunk first

comment:191 by Reza Mohammadi, 17 years ago

Cc: reza@… added

comment:192 by Faheem Mitha <faheem@…>, 17 years ago

I'm getting

Ran 155 tests in 55.010s

FAILED (failures=1, errors=1)

when I run the tests on 5820 with 5820_streaming_file_upload.diff applied.

They look like

1) Traceback (most recent call last):

File "/usr/local/src/django/django-trunk.hg.5820+streaming/tests/modeltests/test_client/models.py", line 91, in test_post_file_view

response = self.client.post('/test_client/post_file_view/', post_data)

File "/usr/local/lib/python2.4/site-packages/django/test/client.py", line 222, in post

return self.request(r)

File "/usr/local/lib/python2.4/site-packages/django/core/handlers/base.py", line 77, in get_response

response = callback(request, *callback_args, callback_kwargs)

File "/usr/local/src/django/django-trunk.hg.5820+streaming/tests/modeltests/test_client/views.py", line 52, in post_file_view

c = Context({'file': request.FILESfile_file})

File "/usr/local/lib/python2.4/site-packages/django/utils/datastructures.py", line 143, in getitem

raise MultiValueDictKeyError, "Key %r not found in %r" % (key, self)

MultiValueDictKeyError: 'Key \'file_file\' not found in <MultiValueDict: {\'file\': [{\'content\': \'# coding: utf-8
n"""
n38. Testing usi
[...]

2) FAIL: test_simple_upload (regressiontests.test_client_regress.models.AssertFileUploadTests)

Traceback (most recent call last):

File "/usr/local/src/django/django-trunk.hg.5820+streaming/tests/regressiontests/test_client_regress/models.py", line 215, in test_simple_upload

self.assertEqual(response.status_code, 200)

AssertionError: 500 != 200

Am I doing something wrong here?

by Faheem Mitha <faheem@…>, 17 years ago

comment:193 by Faheem Mitha <faheem@…>, 17 years ago

Uploaded new diff 6469_streaming_file_upload.diff, applying against trunk.

Currently one unit test is failing, traceback follows. There were two tests failing in the earlier version (5820), one of which was this one, but [560] helped me fix one of them on IRC.

[560] thinks this is an unicode error. If this version works for people, I'd like to hear about it. It would really nice to get this merged into trunk.

Thanks, Faheem.

*
FAIL: test_simple_upload (regressiontests.test_client_regress.models.FileUploadTests)


Traceback (most recent call last):

File "/usr/local/src/django/django-trunk.hg+mystreaming/tests/regressiontests/test_client_regress/models.py", line 244, in test_simple_upload

self.assertEqual(response.status_code, 200)

AssertionError: 500 != 200


Ran 205 tests in 85.939s

FAILED (failures=1)

comment:194 by anonymous, 17 years ago

Okay, so if I want to test this with a clean trunk, which of the above do I use? Trying to figure out if I need the middleware (if so which one), or just 6469 or what? I am needed to upload a 500M+ file on a weekly basis and would love to know HOW to use this.

in reply to:  194 ; comment:195 by Faheem Mitha <faheem@…>, 17 years ago

Replying to anonymous:

Okay, so if I want to test this with a clean trunk, which of the above do I use? Trying to > figure out if I need the middleware (if so which one), or just 6469 or what? I am needed to > upload a 500M+ file on a weekly basis and would love to know HOW to use this.

6469_streaming_file_upload.diff should apply cleanly against trunk (hasn't been that long). Just tested it (patched against 6469 trunk) and it appears to work fine. You don't need anything else if you just want the streaming upload functionality. Note that apparently there are some issues with FileField/ImageField in forms/newforms, but I have no direct experience of this.

Would be interested to hear how it works for you.

Faheem.

in reply to:  195 comment:196 by anonymous, 17 years ago

Isn't that just the way. I wrote this post, re-re-rechecked out the trunk and applied the patch. Works marvelously. Combined with a crontab for file conversion and everything is working wonderfully.

Thanks!

Replying to Faheem Mitha <faheem@email.unc.edu>:

Replying to anonymous:

Okay, so if I want to test this with a clean trunk, which of the above do I use? Trying to > figure out if I need the middleware (if so which one), or just 6469 or what? I am needed to > upload a 500M+ file on a weekly basis and would love to know HOW to use this.

6469_streaming_file_upload.diff should apply cleanly against trunk (hasn't been that long). Just tested it (patched against 6469 trunk) and it appears to work fine. You don't need anything else if you just want the streaming upload functionality. Note that apparently there are some issues with FileField/ImageField in forms/newforms, but I have no direct experience of this.

Would be interested to hear how it works for you.

Faheem.

comment:197 by eibaan, 17 years ago

I noticed a problem with an older patch that seems to be still in 6469. Uploading a FORM with both a file and text INPUT and the input text containing umlauts or other non-ascii characters, you get an exception. I fixed it by replacing line 272 of multipartparser.py with self._post.appendlist(self._fieldname, unicode(self._field.getvalue(), "utf-8")).

by Øyvind Saltvik <oyvind@…>, 17 years ago

Attachment: 6525_all_tests_pass.diff added

Fixed unicode in POST issues, added django.http.utils, moved str_to_unicode there

comment:198 by Øyvind Saltvik <oyvind@…>, 17 years ago

Added a new patch, fixed unicode in POST, moved str_to_unicode

by Øyvind Saltvik <oyvind@…>, 16 years ago

Fixed a problem with UploadedFile wrapper and making sure content is not read in Fie/ImageField

comment:199 by Øyvind Saltvik <oyvind@…>, 16 years ago

Needs tests: set

Everything seems to work now, just needs some form tests.

comment:200 by Evgenii Morozov, 16 years ago

I've installed the latest trunk, applied this patch (6603_all_tests_pass_uploadedfile_wrapper_fixed.diff) and discovered two problems. Under mod_python 3.3.1 uploading 12Mb image results in apache consuming all available memory and dying. This is probably a mod_python error as it doesn't occur when using development server or mod_wsgi.

When I'm trying to upload 600Mb file under mod_wsgi, though, I get another problem:

Traceback (most recent call last):
File "/var/lib/python-support/python2.5/django/core/handlers/base.py" in _real_get_response
  81. response = callback(request, *callback_args, **callback_kwargs)
File "/var/lib/python-support/python2.5/django/contrib/admin/views/decorators.py" in _checklogin
  55. return view_func(request, *args, **kwargs)
File "/var/lib/python-support/python2.5/django/views/decorators/cache.py" in _wrapped_view_func
  39. response = view_func(request, *args, **kwargs)
File "/var/lib/python-support/python2.5/django/contrib/admin/views/main.py" in change_stage
  332. errors = manipulator.get_validation_errors(new_data)
File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in get_validation_errors
  61. errors.update(field.get_validation_errors(new_data))
File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in get_validation_errors
  378. self.run_validator(new_data, validator)
File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in run_validator
  368. validator(new_data.get(self.field_name, ''), new_data)
File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in isValidImage
  713. validators.isValidImage(field_data, all_data)
File "/var/lib/python-support/python2.5/django/core/validators.py" in isValidImage
  180. content = field_data['content']

  KeyError at /admin/account/pokerroom/5/
  'content'

Seems that patch is already outdated. Where can I get a newer version?

Eugene

comment:201 by Evgenii Morozov, 16 years ago

Cc: registrations@… added

comment:202 by Evgenii Morozov, 16 years ago

BTW, it seems that problem was caused by insufficient space on /tmp partition I'm using for uploads. Here's additional info:

POST
Variable 	Value
_file_upload_error 	u'Failed to write to temporary file.'

It's a bug anyways.

by Øyvind Saltvik <oyvind@…>, 16 years ago

made the isValidImage validator try the tempfile before reading content

comment:203 by Faheem Mitha <faheem@…>, 16 years ago

Hi Oyvind,

Just wondering what problem your most recent patch fixes. I tried using newforms and the patch breaks with it. (See traceback below), Does your patch fix this problem? I've heard of some problems related to newforms. Is this what it is?

Thanks, Faheem.

*
from models import FileUpload, FolderUpload
from django.newforms import form_for_model

FileUploadForm = form_for_model(FileUpload)
FolderUploadForm = form_for_model(FolderUpload)
*

*

Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/core/handlers/base.py" in _real_get_response

  1. response = callback(request, *callback_args, callback_kwargs)

File "/usr/local/lib/python2.4/site-packages/django/contrib/auth/decorators.py" in _checklogin

  1. return view_func(request, *args, kwargs)

File "/usr/local/lib/python2.4/site-packages/bixfile_dev/views.py" in file_upload

  1. success = form.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py" in save

  1. return save_instance(self, model(), fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py" in save_instance

  1. f.save_form_data(instance, cleaned_data[f.name])

File "/usr/local/lib/python2.4/site-packages/django/db/models/fields/init.py" in save_form_data

  1. getattr(instance, "save_%s_file" % self.name)(data.filename, data.content, save=False)

File "/usr/local/lib/python2.4/site-packages/django/db/models/fields/init.py" in

  1. setattr(cls, 'save_%s_file' % self.name, lambda instance, filename, raw_field, save=True: instance._save_FIELD_file(self,

filename, raw_field, save))
File "/usr/local/lib/python2.4/site-packages/django/db/models/base.py" in _save_FIELD_file

  1. if raw_field.has_key('tmpfilename'):

AttributeError at /proto_dev/bixfile_dev/
'str' object has no attribute 'has_key'

comment:204 by Øyvind Saltvik <oyvind@…>, 16 years ago

Just changed the oldforms image validator to use the tempfile. The problem you have was fixed in the 6603 patch.

comment:205 by Faheem Mitha <faheem@…>, 16 years ago

Hi oyvind,

I'm getting some test breakage with your patch.

Ran 211 tests in 90.369s

FAILED (failures=3)

The most common error is:

AttributeError: 'UploadedFile' object has no attribute 'content'.

by Øyvind Saltvik <oyvind@…>, 16 years ago

diff did not apply cleanly, fixed

comment:206 by Øyvind Saltvik <oyvind@…>, 16 years ago

All tests run fine now

comment:207 by Øyvind Saltvik <oyvind@…>, 16 years ago

Django patched with the latest streaming patch can now be checked out using mercurial.

hg clone http://streaming.cylon.no/streaming django_src

comment:208 by Bastian Kleineidam <calvin@…>, 16 years ago

Cc: calvin@… added

comment:209 by anonymous, 16 years ago

will this get into trunk any time soon?

comment:210 by anonymous, 16 years ago

6654_fixed_tests_and_file_clean.diff fails against 6671.

comment:211 by anonymous, 16 years ago

Cc: rudolph.froger@… removed

comment:212 by anonymous, 16 years ago

Cc: works@… added

comment:213 by Andrew Badr <andrewbadr.etc@…>, 16 years ago

Cc: andrewbadr.etc@… added

comment:214 by Joshua Works, 16 years ago

Could someone post updated directions to get the latest incarnation of this patch working? I'm using Oyvind's mercurial source, but I'm not sure if I still need to modify the middleware, change my models at all, add settings to settings.py, etc.

Also, is the progress meter still included in this patch, and what do I need to see it in the Admin?

There hasn't been much activity here lately -- is this thing close to making it into the trunk and getting some proper documentation?

in reply to:  214 comment:215 by anonymous, 16 years ago

Replying to Joshua Works:

Could someone post updated directions to get the latest incarnation of this patch working? I'm using Oyvind's mercurial source, but I'm not sure if I still need to modify the middleware, change my models at all, add settings to settings.py, etc.

It works out of the box for me, at least it does for the last version I used it for, which corresponds to Oyvinds last patch. However, apparently this patch does not apply cleanly to current trunk. Oyvind asked me to do a merge, but I've been busy with other things. I'll do my best to get to this in the next day or two.

Also, is the progress meter still included in this patch, and what do I need to see it in the Admin?

No, that is in another patch http://code.djangoproject.com/ticket/4165, and is buggy.
See my comments there. Apart from a backend problem with the Python code, the JS code has
problems too. It should be easy to rewrite for someone who knows JS, though. I may end up doing it myself if nobody else does, but I'll probably use a library like PrototypeJS or Jquery.

There hasn't been much activity here lately -- is this thing close to making it into the trunk and getting some proper documentation?

I think it will need a bunch of people to give it a push. I've been intending to try and coordinate something, but I've been busy with other things.

If you don't hear anything, feel free to ping me via private email. I'm quite keen that the patch continue working till it is merged into trunk. I need it for some stuff myself.

Faheem.

comment:216 by Malcolm Tredinnick, 16 years ago

It absolutely doesn't need "a bunch of people to give it a push". We know this ticket is here and at some point one of the maintainers will sit down and review it again to see if all the problems from the last review have been fixed and no new ones introduced.

by Faheem Mitha <faheem@…>, 16 years ago

Attachment: streaming.6710.patch added

Updated streaming patch to trunk rev 6710.

comment:217 by Faheem Mitha <faheem@…>, 16 years ago

Added streaming.6710.patch against trunk rev 6710. This passes all Django tests. I haven't tested it extensively, but it appears to work. Some quick notes on how I generated this patch in case anyone is interested. I used mercurial and kdiff3 for merging (OS Debian etch).

Note that the only file that required manual intervention was django/http/utils.py. This file could probably use a little cleanup.
Everything else was handled by kdiff3.

See Section 12.8 of the hgbook for a recipe for doing this http://hgbook.red-bean.com/hgbookch12.html#x16-28700012.8 and also this wiki page,
http://www.selenic.com/mercurial/wiki/index.cgi/MqMerge

Note that the method described in the wiki is currently less likely to give problems than the method in the hgbook (otherwise similar).

See this issue for further discussion, http://www.selenic.com/mercurial/bts/issue844

comment:218 by anonymous, 16 years ago

Successfully patched trunk rev 7018 with streaming.6710.patch. Works well.

What are the main issues blocking trunk integration? This functionality is of high importance to many, as one could argue that without this patch the current FileField and ImageField implementation is both buggy and exposes a security risk (denial of service by uploading large files as Apache is eating up memory and peaking CPU usage + large file upload plainly fails in the end without the patch).

Thus strong +1 for integration ASAP.

comment:219 by ipse <ipse.reg@…>, 16 years ago

Cc: ipse.reg@… added

comment:220 by Soeren Sonnenburg <bugreports@…>, 16 years ago

I also urgently need that feature ... any chance to see this merged soon?

comment:221 by James Bennett, 16 years ago

Please do not ask "when will this be merged" questions in the ticket tracker; all it does is generate noise. If you're genuinely interested in helping to get a specific patch merged, consider asking on the developers list about why it hasn't yet been committed or, better yet, searching the archive of that list to see if there's a reason given.

comment:222 by anonymous, 16 years ago

Cc: uptimebox@… added

by egon, 16 years ago

Attachment: streaming.7092.patch added

Made it apply cleanly to rev. 7092

comment:223 by Faheem Mitha, 16 years ago

streaming.7092.patch gives some errors. As usual, I don't understand
what is going on well enough to fix them. See below.

*
======================================================================
FAIL: Doctest: modeltests.model_forms.models.test.API_TESTS


Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 2180, in runTest

raise self.failureException(self.format_failure(new.getvalue()))

AssertionError: Failed doctest test for modeltests.model_forms.models.test.API_TESTS

File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line unknown line number, in API_TESTS


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

type(f.cleaned_datafile)

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[198]>", line 1, in ?

type(f.cleaned_datafile)

AttributeError: 'TextFileForm' object has no attribute 'cleaned_data'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[199]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.file

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[200]>", line 1, in ?

instance.file

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f = TextFileForm(data={'description': u'Assistance'}, instance=instance)

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[201]>", line 1, in ?

f = TextFileForm(data={'description': u'Assistance'}, instance=instance)

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.cleaned_datafile

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[203]>", line 1, in ?

f.cleaned_datafile

AttributeError: 'TextFileForm' object has no attribute 'cleaned_data'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[204]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.file

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[205]>", line 1, in ?

instance.file

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

os.unlink(instance.get_file_filename())

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[206]>", line 1, in ?

os.unlink(instance.get_file_filename())

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f = TextFileForm(data={'description': u'Assistance'}, files={'file': {'filename': 'test2.txt', 'content': 'hello world'}}, instance=instance)

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[207]>", line 1, in ?

f = TextFileForm(data={'description': u'Assistance'}, files={'file': {'filename': 'test2.txt', 'content': 'hello world'}}, instance=instance)

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[209]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.file

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[210]>", line 1, in ?

instance.file

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.delete()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[211]>", line 1, in ?

instance.delete()

NameError: name 'instance' is not defined


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[219]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be changed because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.file

Expected:

u'.../test3.txt'

Got:



File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

type(f.cleaned_dataimage)

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[226]>", line 1, in ?

type(f.cleaned_dataimage)

AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[227]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The ImageFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[228]>", line 1, in ?

instance.image

AttributeError: 'TextFile' object has no attribute 'image'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.cleaned_dataimage

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[231]>", line 1, in ?

f.cleaned_dataimage

AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[232]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[233]>", line 1, in ?

instance.image

AttributeError: 'TextFile' object has no attribute 'image'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

os.unlink(instance.get_image_filename())

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[234]>", line 1, in ?

os.unlink(instance.get_image_filename())

AttributeError: 'TextFile' object has no attribute 'get_image_filename'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[237]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[238]>", line 1, in ?

instance.image

AttributeError: 'TextFile' object has no attribute 'image'


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.delete()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[239]>", line 1, in ?

instance.delete()

File "/usr/local/lib/python2.4/site-packages/django/db/models/base.py", line 324, in delete

assert self._get_pk_val() is not None, "%s object can't be deleted because its %s attribute is set to None." % (self._meta.object_name, self._meta.pk.attname)

AssertionError: TextFile object can't be deleted because its id attribute is set to None.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[247]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The ImageFile could not be changed because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Expected:

u'.../test3.png'

Got:


======================================================================
FAIL: Doctest: regressiontests.forms.tests.test.fields_tests


Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 2180, in runTest

raise self.failureException(self.format_failure(new.getvalue()))

AssertionError: Failed doctest test for regressiontests.forms.tests.test.fields_tests

File "/usr/local/src/django/django-trunk.patched/tests/regressiontests/forms/tests.py", line unknown line number, in fields_tests


File "/usr/local/src/django/django-trunk.patched/tests/regressiontests/forms/tests.py", line ?, in regressiontests.forms.tests.test.fields_tests
Failed example:

type(f.clean({'filename': 'name', 'content': 'Some File Content'}, 'files/test4.pdf'))

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest regressiontests.forms.tests.test.fields_tests[278]>", line 1, in ?

type(f.clean({'filename': 'name', 'content': 'Some File Content'}, 'files/test4.pdf'))

File "/usr/local/lib/python2.4/site-packages/django/newforms/fields.py", line 453, in clean

raise ValidationError(self.error_messagesempty)

ValidationError: [u'The submitted file is empty.']


Ran 234 tests in 63.274s

FAILED (failures=2)

by Faheem Mitha, 16 years ago

Slightly modified version of streaming.7092.patch with more tests passing.

comment:225 by Faheem Mitha, 16 years ago

With the patch I just uploaded (basically the changes are liberal use of 'content-length' as suggested by Oyvind, the errors are now reduced
to
======================================================================
FAIL: Doctest: modeltests.model_forms.models.test.API_TESTS


Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 2180, in runTest

raise self.failureException(self.format_failure(new.getvalue()))

AssertionError: Failed doctest test for modeltests.model_forms.models.test.API_TESTS

File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line unknown line number, in API_TESTS


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

type(f.cleaned_dataimage)

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[226]>", line 1, in ?

type(f.cleaned_dataimage)

AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[227]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The ImageFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[228]>", line 1, in ?

instance.image

AttributeError: 'TextFile' object has no attribute 'image'


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.cleaned_dataimage

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[231]>", line 1, in ?

f.cleaned_dataimage

AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[232]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[233]>", line 1, in ?

instance.image

AttributeError: 'TextFile' object has no attribute 'image'


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

os.unlink(instance.get_image_filename())

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[234]>", line 1, in ?

os.unlink(instance.get_image_filename())

AttributeError: 'TextFile' object has no attribute 'get_image_filename'


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

f.is_valid()

Expected:

True

Got:

False


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance = f.save()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[237]>", line 1, in ?

instance = f.save()

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save

return save_instance(self, self.instance, self._meta.fields, fail_message, commit)

File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance

raise ValueError("The %s could not be %s because the data didn't"

ValueError: The TextFile could not be created because the data didn't validate.


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.image

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[238]>", line 1, in ?

instance.image

AttributeError: 'TextFile' object has no attribute 'image'


File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:

instance.delete()

Exception raised:

Traceback (most recent call last):

File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run

compileflags, 1) in test.globs

File "<doctest modeltests.model_forms.models.test.API_TESTS[239]>", line 1, in ?

instance.delete()

File "/usr/local/lib/python2.4/site-packages/django/db/models/base.py", line 324, in delete

assert self._get_pk_val() is not None, "%s object can't be deleted because its %s attribute is set to None." % (self._meta.object_name, self._meta.pk.attname)

AssertionError: TextFile object can't be deleted because its id attribute is set to None.


Ran 234 tests in 63.014s

FAILED (failures=1)

comment:226 by anonymous, 16 years ago

content-length is in the wrong dict in ImageField tests and probably some others too.

>>> f = ImageFileForm(data={'description': u'An image'}, files={'image': {'filename': 'test.png', 'content': image_data}, 'content-length':len(image_data)})

should be

>>> f = ImageFileForm(data={'description': u'An image'}, files={'image': {'filename': 'test.png', 'content': image_data, 'content-length':len(image_data)}})

by egon, 16 years ago

Attachment: streaming.7106.patch added

Passes al tests.

comment:227 by anonymous, 16 years ago

Cc: drfarina@… added

comment:228 by beau, 16 years ago

Cc: beau@… added

comment:229 by anonymous, 16 years ago

Cc: giuliani.v@… added

comment:230 by fdr, 16 years ago

I've been rewriting the multipart parsing to be lazier and found something that seems to me like a DDOS exploit in this code.

Consider line 231-233 in streaming.7106.patch. What happens here is:

  1. A multipart boundary has been seen recently, so we are in the HEADER state
  2. We are looking to match the end of of the header
  3. If there's not enough data we simply break and add more bytes to our unmatched bytes from the network and try matching again later

Is there any reason why a "bad" client could not send a multipart request that *never* properly sends the header termination bytes and thus forces the multipart parser to load as much of the stream as possible (up to Content-Length bytes) into memory? As the amount of bytes increases towards the maximum Content-Length the call to .find() itself will become incredibly expensive. A few malformed requests might also cause the server to swap into oblivion or run out of address space on 32 bit systems.

comment:231 by fdr, 16 years ago

Bad wording in the prior comment.

First sentence should be read as follows:

This patch appears to introduce a DOS attack vector into Django.

comment:232 by egon, 16 years ago

I've tested this with:

  • Lighttpd 1.4 with fastcgi/scgi: doesn't work. As somebody else said above it looks like it isn't streaming the upload at all, or maybe lighttpd waits for the POST to be over before streaming it to the fcgi app.
  • Apache 2.2 with scgi: works fine so far.

comment:233 by avain <avainitri@…>, 16 years ago

Component: Core frameworkHTTP handling
Has patch: unset
Needs tests: unset
Patch needs improvement: set

I've tested streaming.7106.patch. It works for the upload file with 860MB. However I tested it with 1.5G. It fails. The message is below.

Request Method:  	POST
Request URL: 	http://192.168.1.4/imagecity/upload/
Exception Type: 	MemoryError
Exception Value: 	out of memory
Exception Location: 	/usr/lib/python2.5/site-packages/django/http/multipartparser.py in _read, line 285
/usr/lib/python2.5/site-packages/django/http/multipartparser.py in _read
 285. self._file.write(data[start:end]) ...

comment:234 by Roman Tolkachyov <rt@…>, 16 years ago

save_FOO_file() works wrong with streaming.7106.patch
If raises

Traceback (most recent call last):
  ...
  File "/usr/lib/python2.5/site-packages/django/db/models/fields/__init__.py", line 783, in <lambda>
    setattr(cls, 'save_%s_file' % self.name, lambda instance, filename, raw_field, save=True: instance._save_FIELD_file(self, filename, raw_field, save))
  File "/usr/lib/python2.5/site-packages/django/db/models/base.py", line 414, in _save_FIELD_file
    if raw_field.has_key('tmpfilename'):
AttributeError: 'str' object has no attribute 'has_key'

comment:235 by fdr, 16 years ago

I worked on a lazier (i.e. you can read from the stream without loading the entire thing onto disk) and hopefully better patch for this, in case anyone is getting antsy. I haven't been able to return to it for a week or two, but all the annoying parts are pretty much done.

It does, however, gut the current implementation. At that time I also tested mod_wsgi2's buffering behavior and found that it does not do aggressive buffering (a good thing), so sites running behind mod_wsgi2 can do flow control on clients. This was done by augmenting the current patch with .sleep() calls between blocks and then posting a large file.

This is particularly important if you have to send the material to another server and would prefer (or require) that it not be held on disk in full, if the stream is of indeterminate length, or if you wanted to hold the request in full to begin with (for example: filtering a dataset as it goes by and populating the database with a structured representation)

comment:236 by Michael Axiak, 16 years ago

Status: newassigned

Alright I'm going to be working on this ticket this week. Before I do any work, I'm uploading the current diff updated to work against HEAD largely untouched. After reviewing the comments, before we can consider this ready for checking or close to that, I think that the following is needed:

  1. Updating this to work with the new-ish FileField backends.
  2. Pull the MultiPartParser logic apart from the actual dealing with the chunked data. (I.e. a user should be able to do whatever they want with the data.)
  3. Make the MultiPartParser lazier (i.e. not look for boundaries, but let them come when available).
  4. Go through the bugs that people have reported (thanks for doing so!) and ensuring that they no longer exist.

-Mike

by Michael Axiak, 16 years ago

Attachment: ticket2070_rev7277.diff added

Same patch applied to @7277

comment:237 by Malcolm Tredinnick, 16 years ago

I think the general design