#2070 closed enhancement (fixed)
Large streaming uploads
Reported by: | Owned by: | Jacob | |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | normal | Keywords: | streaming, upload, large, sprintsept14, feature |
Cc: | dsalvetti@…, matthias@…, oliver@…, Maniac@…, nesh@…, aenor.realm@…, gary.wilson@…, serialx.net@…, wonlay@…, lurker86@…, shaun@…, antonio.mele@…, herbert.poul@…, gabor@…, koebbe@…, axiak@…, jm.bugtracking@…, moritz.angermann@…, django@…, brosner@…, root.lastnode@…, akaihola+djtick@…, sam@…, trey@…, klaus.blindert@…, ville@…, reza@…, registrations@…, calvin@…, works@…, andrewbadr.etc@…, uptimebox@…, drfarina@…, beau@…, giuliani.v@…, prigun@…, zerok@…, johannes.beigel@…, david@…, django@…, mocksoul@…, schlaber@…, danpoe@…, django@…, django-2070@…, clintecker@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Based on code from ticket 1448.
Test for large file upload
from django.db import models # Create your models here. class FileList(models.Model): name = models.CharField(maxlength=255) email = models.EmailField() class Admin: pass class AFile(models.Model): descr = models.CharField(maxlength=255,core=True) file = models.FileField(upload_to='files') inlist = models.ForeignKey(FileList,edit_inline=models.STACKED)
And two 500 megabyte files for upload.
Works with the patch with +20 mb above average memory and minimal cpu load usage by the httpd thread, file upload successful.
Without the patch httpd rages to +140 mb above average memory usage and over 60% cpu usage and fails miserably in the end.
Attachments (64)
Change History (376)
comment:1 by , 18 years ago
Severity: | normal → major |
---|
comment:2 by , 18 years ago
priority: | high → normal |
---|---|
Resolution: | → duplicate |
Severity: | major → normal |
Status: | new → closed |
This is a duplicate of #1484.
comment:3 by , 18 years ago
Resolution: | duplicate |
---|---|
Status: | closed → reopened |
Or is it? Leaving it open until someone who knows more about it can sort this out.
comment:4 by , 18 years ago
Detailed description of patch.
Based of the #1448 patch that i experienced trouble with using edit inline and large files.
Changes from trunk:
- Added STREAMING_UPLOADS=False to global_settings
- Added parse_streaming_file_upload it http/__init__.py
- Made changes to modpython and wsgi in handlers to use parse_streaming_file_upload
- Subclassed cgi.FieldStorage
- overriding of make_file to use NamedTemporaryFile
- overriding of bouandry parsers to make readline read in 64k chunks to avoid reading extremely long lines into memory
- Subclassed rfc822.Message
- overriding of bouandry parsers to make readline read in 64k chunks to avoid reading extremely long lines into memory
- Made FileField validate upload using tempfile size
- Made save_file in db/models/fields/__init__.py pass temporary file name to _save_FIELD_file instead of file contents
- Made _save_FIELD_file copy temporary file instead of write file from memory
comment:5 by , 18 years ago
comment:6 by , 18 years ago
This patch uses the rfc822 module. The Python docs says the following about rfc822:
Deprecated since release 2.3. The email package should be used in preference to the rfc822 module. This module is present only to maintain backward compatibility.
comment:7 by , 18 years ago
Type: | defect |
---|
comment:8 by , 18 years ago
Type: | → defect |
---|
The problem with the email package is that it reads the entire file into memory. Django supports python 2.3 anyway.
comment:9 by , 18 years ago
to use monitor from a view
def progress_view(request): import pickle sessionid = request.COOKIES.get('sessionid', None) progress = pickle.load(open('/tmp/'+sessionid, 'r'))
Accessible variables
progress.fullength - full request byte lenght
progress.at_status - current bytes read
progress.at_file - current file read
progress.file_status - current file bytes read
comment:10 by , 18 years ago
Had a problem on FreeBSD with my /tmp being on a seperate filesystem: os.rename threw an "Invalid cross-device link" error.
The fix was to edit django/db/models/base.py, "import shutils" at the top and change the one instance of "os.rename" to "shutils.move". Works fine now... shutils came default with python2.4 I don't know about other releases, or if it works on other OSs
follow-up: 107 comment:11 by , 18 years ago
In reference to bnewbold's comment above, it's the shutil module, and it looks like it's been around since at least Python 2.3. (Django supports Python 2.3 and up.)
comment:12 by , 18 years ago
to make it work on windows make the following changes in UploadStateKeeper class:
class UploadStateKeeper: def __init__(self, identifier, fullength=None): self.identifier = identifier if fullength: self.state = UploadState(self.identifier, fullength) else: import pickle self.state = pickle.load(self.get_status_path(), 'r') def get_status_path(self): from tempfile import gettempdir return os.path.join(gettempdir(), self.identifier) def set_status(self, filename, addlen): self.state.set_status(filename, addlen) import pickle pickle.dump(self.state,open(self.get_status_path(), 'w')) def get_state(self): return self.state
comment:13 by , 18 years ago
Can anyone tell me the differences between the latest patch on this ticket and the latest one on #1484? Are these two tickets/patches duplicates or complements?
comment:15 by , 18 years ago
Sorry if I'm being dense, but does that mean that #1484 should be closed in favor of this one? I'm trying to figure out which patch(es) need to be applied here.
comment:16 by , 18 years ago
Keywords: | timezone time zone UTC added |
---|
In my opinion, yes #1484 should be closed. The only patch you have to apply is the latest one in this ticket.
It may help to svn up -r 3358 before you apply the patch, or you can try at the current HEAD and see what goes.
comment:18 by , 18 years ago
Owner: | changed from | to
---|---|
Status: | reopened → new |
Right, I've finally had a chance to play with this patch. It's a big one, so I want to make sure I understand what's going on before I check it in. So apologies if I'm being dense, but I gotta understand this stuff:
- I'm confused about why you needed to override
Message.readline
} just to change the argument toself.fp.readline()
(django/http/init.py, line 105). Is the Python implementation broken in some way? What happens if you take that bit out?
- Pretty much the same question for the FieldStorage subclass. This is a *lot* of duplicated code from the builtin cgi module, and it gives me pause. Can you explain why you need to do this (and perhaps think about if there's another way?)
- The upload status info stored in /tmp is pretty awesome. However, it doesn't seem to work all the time for me; while uploading a big file if I keep cat'ing that /tmp file it's often just empty. Seems something keeps emptying it out? Also, using the session ID seems like it would cause issues if I tried to upload multiple files at the same time (a very real possibility if I'm uploading really big-ass things).
- Related to that, setting
os.envrion['SESSIONID']
(done in both the WSGI and the mod_python handler) seems very leaky. We've had bugs before with this type of env monkeybusiness... Could you use a threadlocal instead?
I should end, however, by saying that the behavior of this patch rocks. Being able to upload a 2 GB file without locking my server is super-nice, and the status stuff absolutely kicks ass (I think I already said that). I'm pretty sure I can clean up all the stuff I just bitched about, but I've got a pretty long todo.txt and if you can take care of these problems and/or answer my questions, I'll check this in PDQ.
comment:19 by , 18 years ago
Status: | new → assigned |
---|
comment:20 by , 18 years ago
- If that bit is taken out memory usage soars because most browsers do not use newlines when sending files so it actually reads to EOF. Is it possible to override self.fp.readline() to call self.fp.readline(64000)? Then most of this code would be removed.
- The upload status is a bit tricy, I need to override FieldStorage to store the status. But since FieldStorage is not a extended class of object there is no super(FieldStorage, self).read_lines_to_outerboundary() causing repetition of code.
- Don't know how multiple uploads can be solved since it would require multiple id's, maybe use sessions instead if possible. Never experienced the emptiying if the temp file. Does this happen with both mod_python and wsgi?
- About setting os.environ, the mod_python-handler does not set it, but wsgi does, need to fix that one instead of messing with threadlocal , this may be why the file is emptied if it only happens with wsgi.
comment:21 by , 18 years ago
Cc: | added |
---|
comment:22 by , 18 years ago
Jacob, after reading your questions I'd like to share here my knowledge as an author of original patch in #1484.
- About readline(N). Actually browsers do send newlines when sending binary files because newlines (byte 0x0A) are present in binary files also. In theory 0x0A in a binary body can be rare enough for readline to read very large chunks in memory. However in practice 0x0A are present often enough (I'm talking from the experience of two services working mainly with big jpegs, mp3 and zips). This is, btw, why my original patch in #1484 didn't include the boundary in readline. The downside of replacing readlines in FieldStorage's code with readline(N) is that the base class doesn't provide this for overriding so you have to actually rewrite one-to-one pretty large chunks of code.
- To my liking storing upload info in files is not very nice. I mentioned my idea in django-developers about this that it's better be a hook for the user code (in middleware style) that is called from the handler accepting uploaded data. This is more flexible because a user can store this progress not in /tmp but wherever he likes and do more interesting stuff then just measuring progress. For example closing the upload if the file is too big for a server.
- One more thing that we've already discussed with Oyvind in an old ticket that his patch alters both HTTP handlers and admin app. The old patch was only for HTTP and my point is that these things should be separate because it's much simpler to review and apply.
Hope my comment was useful!
comment:23 by , 18 years ago
One more nit... Oyvin what was the reason to use rfc822? It's a new thing in your variant of the old patch and I just don't get why it's needed :-)
comment:24 by , 18 years ago
- When uploading two 500mb iso's, it broke down constantly with a memoryerror when not setting a buffer size, everything worked fine when a bufsize was added.
- About upload storage I do agree about using a hook instead.
- Rfc822 is still used by FieldStorage, now hidden from view. I just don not think of setting bufsize in make_file instead of passing it to readline (if that works). Have not had time to test yet.
- Agree that reviewing is easier. But it is nice for easy testing via the admin to see why django is the right tool for the massive job. :)
comment:25 by , 18 years ago
Ignore my patch 3358-streaming-file_with_settings_and_progressmonitor.3.diff, adding bufsize does not affect readline. Also it seems chunked reads are required for storing the state. Don't see any easy way to make this less code yet. I'll add a new patch with just the WSGI fix.
comment:26 by , 18 years ago
I was planning to make a middleware for a progressmonitor hook, but not quite sure when middleware is applied and how to pass this hook from the middleware to FieldStorage. Any help to clarify/fix this would be great.
Also if somenoe could test this successfully using WSGI it would be great.
follow-up: 125 comment:27 by , 18 years ago
My idea of a hook is not fit into current middleware, I just wanted to retain the style. So to clarify I see it like this.
This can be a separate set of hooks similar to MIDDLEWARE_CLASSES or it can be done in middleware classes themselves. To do the latter we will specifiy that in addition to process_request, _response, _view and _exception methods a middleware can define a new kind of method: process_upload(request, data).
parse_file_upload then would search for these middleware methods just as http requests do in modpython.py and wsgi.py. The list of methods is passed to a FieldStorage decendant that will call them each time it's about to write a chunk of data to the target stream.
This is it... Sorry I can't come with the actual code: pretty busy rolling out my current project.
comment:28 by , 18 years ago
I figured the last attachment was supposed to be a diff and tried to upload it with the right extension. Unfortunately, '.py' got added and now I don't know how to get rid of the attachment.
Sorry for the clutter.
comment:29 by , 18 years ago
Seems wsgi can't do streaming uploads this way, but ligthttpd 1.5 will chunk the uploads and has a mod_uploadprogress that serializes to JSON. Will try adapt this patch to serialize the same way so it wont matter if you use this streaming uploads patch or mod_uploadprogress.
comment:30 by , 18 years ago
Why do you say wsgi can't do this? It gives environwsgi.input to the handler that is a file-like object with uploading data. Handler then can read it in chunks (it actually does this in parse_file_upload).
comment:31 by , 18 years ago
WSGI may, but it seems like lighthttpd does not pass anything to wsgi before the request is done. I may be wrong, but I have no way of debugging properly.
comment:32 by , 18 years ago
Bug in last patch, did not manage to override readline to use a buffer size. If this cannot be done, code must be repeated.
comment:33 by , 18 years ago
Wsgi users using lighttpd should use mod_uploadprogress instead. There is some kind of delay before the upload state is updated using wsgi, the same does not happen on mod_python. Is this a issue with my code, the fastcgi/scgi spec or lighttpd's implementation? Can someone test this with fastcgi/scgi on apache?
comment:34 by , 18 years ago
GET identifer for upload state.
http://foo.com?bar saves json data in /tmp/bar
For javascript, making a /progress/ url and view see http://blog.lighttpd.net/articles/2006/08/01/mod_uploadprogress-is-back
Makes it easy to add a admin js option to a model to enable upload state in admin.
comment:35 by , 18 years ago
I just provided a patch which steals liberally from Oyvind, but decouples the progress monitor from the rest of the file upload behavior.
In a nutshell, if you don't want to read uploads into memory, you set request.save_file_class to the class of the kind of file-like object you'd like uploads routed to. A sample middleware class is provided which uses a temp file.
I haven't looked at Oyvind's latest code, but my guess is that we could easily move the upload monitor from the guts of Django by providing a file-like object class that keeps track of how much stuff it has received from the POST data stream.
Comments welcome. (But be nice; it's my first patch.)
comment:36 by , 18 years ago
To toddobryan, I made some fatal mistakes before 3358-streaming-file_with_settings_and_progressmonitor_fixed.diff + the wsgi fix, see this patch, you may have fixed it but i'm not sure. the latest patch I would say is really important for multiple uploads. There are some more upload progress issues, the json encoder gives two value sets but i can't find out why. Posting my latest diff now.
Btw I like one more developer helping me.
comment:37 by , 18 years ago
OK, you don't have to be very nice. It turns out that I missed perhaps the most important thing, which is wrapping the stream from the POST in some kind of object which prevents lines from being too long and filling up memory.
Oyvind's PostStream seems like the kind of thing to do, but I'd suggest wrapping the stream before we pass it into the FieldStorage object, not inside. I was misled by the fact that the read_binary() method inside there has a fixed 8k buffer.
comment:39 by , 18 years ago
If you could concentrate on getting your progress monitor to work from a file-like object instead of FieldStorage, I'll integrate the PostStream idea into my code so that the readline()s in the FieldStorage class don't read too much.
comment:40 by , 18 years ago
Thinking of making the upload monitor as a hook in streaming_upload_middleware.
Then make a upload monitor middleware that hooks onto streaming_upload_middleware by loading before streaming_upload_middleware.
Upload monitor will act exacly like mod_uploadprogress so wsgi users just load streaming_upload_middleware.
comment:41 by , 18 years ago
By the way, when you guys think you've addressed all the problems and simplified the code as much as you can, etc, just add a comment pointing to the right patch that needs another review (and mention that you think it's "done").
comment:42 by , 18 years ago
To test 'svn up -r 3518', apply patch and add:
django.middleware.upload.UploadStateMiddleware and django.middleware.upload.StreamingUploadMiddleware to MIDDLEWARE_CLASSES.
Uploadstate test currently builtin to admin change form. 'tail /tmp/test123' for upload state as json.
comment:43 by , 18 years ago
Regular uploads seem to fail if the middlewares are disabled. Backtrace:
Traceback (most recent call last): File "/opt/local/lib/python2.4/site-packages/django/core/servers/basehttp.py", line 272, in run self.result = application(self.environ, self.start_response) File "/opt/local/lib/python2.4/site-packages/django/core/servers/basehttp.py", line 611, in __call__ return self.application(environ, start_response) File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 158, in __call__ response = self.get_response(request.path, request) File "/opt/local/lib/python2.4/site-packages/django/core/handlers/base.py", line 102, in get_response return self.get_technical_error_response(request) File "/opt/local/lib/python2.4/site-packages/django/core/handlers/base.py", line 134, in get_technical_error_response return debug.technical_500_response(request, *sys.exc_info()) File "/opt/local/lib/python2.4/site-packages/django/views/debug.py", line 131, in technical_500_response return HttpResponseServerError(t.render(c), mimetype='text/html') File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 155, in render return self.nodelist.render(context) File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 688, in render bits.append(self.render_node(node, context)) File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 716, in render_node raise wrapped TemplateSyntaxError: Caught an exception while rendering: 'NoneType' object has no attribute 'items' Original Traceback (most recent call last): File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 706, in render_node result = node.render(context) File "/opt/local/lib/python2.4/site-packages/django/template/defaulttags.py", line 189, in render value = bool_expr.resolve(context, True) File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 548, in resolve obj = resolve_variable(self.var, context) File "/opt/local/lib/python2.4/site-packages/django/template/__init__.py", line 657, in resolve_variable raise VariableDoesNotExist, "Failed lookup for key [%s] in %r" % (bits[0], current) # missing attribute File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 61, in __repr__ return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \ File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 97, in _get_post self._load_post_and_files() File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 75, in _load_post_and_files self._post, self._files = http.parse_file_upload(self) File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 83, in parse_file_upload return default_parse_file_upload(req) File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 50, in default_parse_file_upload raw_message = '\r\n'.join(['%s:%s' % pair for pair in req.header_dict.items()]) AttributeError: 'NoneType' object has no attribute 'items'
comment:44 by , 18 years ago
Cc: | added |
---|
comment:45 by , 18 years ago
Cc: | added |
---|
comment:46 by , 18 years ago
This is anecdotal evidence at best, but rapid testing of this patch seems to indicate it does not work with the development server.
I basically applied it, updated to head, added the 2 middlewares, tried one of my upload forms, and it just sat there forever, never reaching the Django view code.
comment:47 by , 18 years ago
I would not expect this patch to work with the dev server, since it is a single thread you cannot do multiple requests at the same time.
comment:48 by , 18 years ago
I may be mistaken about my last comment, but the dev server uses the wsgi handler. So it may be the delay problem, or that stdin is not passed before the request is done. So since wsgi using lighttpd works, the dev server probably handle things differently. Don't really know alot about the dev server.
comment:49 by , 18 years ago
Cc: | added |
---|
comment:50 by , 18 years ago
This doesn't work for me at all. With Opera 9 and Firefox 1.5, I get KeyError on X-Progress-Id, which is supposed to be HTTP_X_PROGRESS_ID? (Webserver is Lighttpd, distro Debian unstable)
Also, isn't this line a bit insecure, given the get_temp_file does an "os.path.join":
content = open(get_temp_file(request.header_dict['X-Progress-Id']), 'r').read()
I've replaced my code to use cache framework instead. But mind you need server-wide cache, per process will probably not be enough.
And then, you probably don't want to open a new TCP connection every time you make a xml request, since this will effectively DOS your server.
I was also having problems with JS - the browser did not want to do a POST at all. It started polling server for progress, but it was not posting at all.
comment:51 by , 18 years ago
Lighttpd has a delay before it saves data and progress info.
Supposed to be HTTP_X_PROGRESS_ID?
I did it this way to be compatible with mod_uploadprogress that you should use instead.
Is there another way than opening another tcp connection using xmlhttp?
comment:52 by , 18 years ago
How much should that delay usually be? Since there is no POST for Django, progress bar does not work. This probably means I need to use mod_uploadprogress to get it working, but the new mod_uploadprogress has not been released yet officially.
Yes, HTTP_X_PROGRESS_ID. When I printed contents of request.header_dict I could see a key named HTTP_X_PROGRESS_ID which corresponds to X-Progress-Id.
Sure, one could open a TCP connection and server periodically sends info and client reads from it. But I see that you aim for mod_uploadprogress compatibility.
by , 18 years ago
Attachment: | 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id.diff added |
---|
Now using X_PROGRESS_ID instead of X-Progress-Id, accepts any lower/uppercase/ - / _ /prefix variant
comment:54 by , 18 years ago
Keywords: | streaming upload large added; timezone time zone UTC removed |
---|
The patch does not work with Apache 2 + mod_fcgid.
I'm running on Mac OS X 10.4.7 with a pure MacPorts installation (save Django and flup -- both are a svn HEAD).
Apache 2 configuration:
{{{<IfModule mod_fcgid.c>
AddHandler fcgid-script .fcgi
IdleTimeout 600
ProcessLifeTime 3600
MaxProcessCount 8
DefaultMinClassProcessCount 3
DefaultMaxClassProcessCount 3
IPCConnectTimeout 8
IPCCommTimeout 48
</IfModule>
ScriptAlias /myapp "cgi-bin/myapp.fcgi/"}}}
myapp.fcgi:
{{{#!/opt/local/bin/python
# -*- coding: utf-8 -*-
import sys, os
# Add a custom Python path.
sys.path.insert(0, "/opt/local/www")
# Switch to the directory of your project. (Optional.)
os.chdir("/opt/local/www/myapp")
# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environDJANGO_SETTINGS_MODULE = "myapp.settings"
from django.core.servers.fastcgi import runfastcgi
runfastcgi(["method=threaded", "daemonize=false"])}}}
comment:55 by , 18 years ago
Repost with proper formatting :/
Apache 2 configuration:
<IfModule mod_fcgid.c> AddHandler fcgid-script .fcgi IdleTimeout 600 ProcessLifeTime 3600 MaxProcessCount 8 DefaultMinClassProcessCount 3 DefaultMaxClassProcessCount 3 IPCConnectTimeout 8 IPCCommTimeout 48 </IfModule> ScriptAlias /myapp "cgi-bin/myapp.fcgi/"
myapp.fcgi:
#!/opt/local/bin/python # -*- coding: utf-8 -*- import sys, os # Add a custom Python path. sys.path.insert(0, "/opt/local/www") # Switch to the directory of your project. (Optional.) os.chdir("/opt/local/www/myapp") # Set the DJANGO_SETTINGS_MODULE environment variable. os.environ['DJANGO_SETTINGS_MODULE'] = "myapp.settings" from django.core.servers.fastcgi import runfastcgi runfastcgi(["method=threaded", "daemonize=false"])
comment:56 by , 18 years ago
Cc: | added |
---|
comment:57 by , 18 years ago
I get the following error, again with my mod_fcgid setup, when using the admin interface with the progress status monitor js.
Unhandled exception in thread started by <bound method ThreadPool._worker of <flup.server.threadpool.ThreadPool object at 0x715270>>Traceback (most recent call last): File "/opt/local/www/data/flup/server/threadpool.py", line 103, in _worker job.run() File "/opt/local/www/data/flup/server/fcgi_base.py", line 642, in run self.process_input() File "/opt/local/www/data/flup/server/fcgi_base.py", line 678, in process_input self._do_params(rec) File "/opt/local/www/data/flup/server/fcgi_base.py", line 777, in _do_params self._start_request(req) File "/opt/local/www/data/flup/server/fcgi_base.py", line 761, in _start_request req.run() File "/opt/local/www/data/flup/server/fcgi_base.py", line 563, in run self.server.error(self) File "/opt/local/www/data/flup/server/fcgi_base.py", line 1157, in error req.stdout.write('Content-Type: text/html\r\n\r\n' + File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/cgitb.py", line 158, in html dump.append('%s = %s' % (name, pydoc.html.repr(value))) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 360, in repr return Repr.repr(self, object) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/repr.py", line 24, in repr return self.repr1(x, self.maxlevel) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 367, in repr1 return self.escape(cram(stripid(repr(x)), self.maxother)) File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 65, in __repr__ return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \ File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 101, in _get_post self._load_post_and_files() File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 79, in _load_post_and_files self._post, self._files = http.parse_file_upload(self) File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 82, in parse_file_upload return req.parse_file_upload(req) File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 83, in parse_streaming_file_upload upload_state = req.upload_state(req) File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 126, in __init__ self.state = {'size': int(req.header_dict.get('content-length')),TypeError: int() argument must be a string or a number
comment:58 by , 18 years ago
Something ate line breaks, apparently.
Unhandled exception in thread started by <bound method ThreadPool._worker of <flup.server.threadpool.ThreadPool object at 0x715270>> Traceback (most recent call last): File "/opt/local/www/data/flup/server/threadpool.py", line 103, in _worker job.run() File "/opt/local/www/data/flup/server/fcgi_base.py", line 642, in run self.process_input() File "/opt/local/www/data/flup/server/fcgi_base.py", line 678, in process_input self._do_params(rec) File "/opt/local/www/data/flup/server/fcgi_base.py", line 777, in _do_params self._start_request(req) File "/opt/local/www/data/flup/server/fcgi_base.py", line 761, in _start_request req.run() File "/opt/local/www/data/flup/server/fcgi_base.py", line 563, in run self.server.error(self) File "/opt/local/www/data/flup/server/fcgi_base.py", line 1157, in error req.stdout.write('Content-Type: text/html\r\n\r\n' + File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/cgitb.py", line 158, in html dump.append('%s = %s' % (name, pydoc.html.repr(value))) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 360, in repr return Repr.repr(self, object) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/repr.py", line 24, in repr return self.repr1(x, self.maxlevel) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/pydoc.py", line 367, in repr1 return self.escape(cram(stripid(repr(x)), self.maxother)) File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 65, in __repr__ return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \ File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 101, in _get_post self._load_post_and_files() File "/opt/local/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 79, in _load_post_and_files self._post, self._files = http.parse_file_upload(self) File "/opt/local/lib/python2.4/site-packages/django/http/__init__.py", line 82, in parse_file_upload return req.parse_file_upload(req) File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 83, in parse_streaming_file_upload upload_state = req.upload_state(req) File "/opt/local/lib/python2.4/site-packages/django/middleware/upload.py", line 126, in __init__ self.state = {'size': int(req.header_dict.get('content-length')), TypeError: int() argument must be a string or a number
comment:59 by , 18 years ago
I have just tested the patch with Apache 2 and mod_python. It works correctly, however I had to make 2 small modifications because my Django application isn't running at the root context of my server.
Basically, the admin JavaScript code hardcodes a request to "/progress/". This needs to be changed appropriately for each install, e.g. in my case "/django/progress/".
Similarly, the UploadStateMiddleware checks that the request path is exactly "/progress/", which needs to be changed to match.
comment:60 by , 18 years ago
I svn up'ed, applied 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id.diff, deleted the Django directory under site-packages, reinstalled Django with 'python setup.py install' and restarted apache/modpython but Django appears to be behaving as before. My memory usage still climbs beyond what I'd expect and I don't see any files in /tmp like I'd expect from tempfile.NamedTemporaryFile().
I added these lines to my settings.py file:
STREAMING_UPLOADS=True
UPLOAD_BUFFER_SIZE=2048
Am I missing something?
Thanks!
comment:61 by , 18 years ago
It is 2 middleware now.
django.middleware.upload.UploadStateMiddleware and django.middleware.upload.StreamingUploadMiddleware
Order is important
comment:62 by , 18 years ago
Thanks for the info!
I changed middleware_classes to look like this:
MIDDLEWARE_CLASSES = ( 'django.middleware.common.CommonMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.middleware.doc.XViewMiddleware', 'django.middleware.upload.UploadStateMiddleware', 'django.middleware.upload.StreamingUploadMiddleware', )
I was able to upload a couple large test files and tempfiles were showing up as expected in /tmp, but I'm intermittently receiving this exception when uploading:
Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/base.py" in get_response 74. response = callback(request, *callback_args, **callback_kwargs) File "/home/misjxm/dp1/file/app/views.py" in createmessage 275. if request.POST: File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _get_post 51. self._load_post_and_files() File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _load_post_and_files 32. self._post, self._files = http.parse_file_upload(self) File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/http/__init__.py" in parse_file_upload 81. return req.parse_file_upload(req) File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/middleware/upload.py" in parse_streaming_file_upload 104. FILES.appendlist(key, FileDict({ File "/usr/lib/python2.4/cgi.py" in __getattr__ 540. raise AttributeError, name AttributeError at /file/compose/ tmp_name
Have you seen this before?
Thanks!
comment:63 by , 18 years ago
And I only seem to get that exception when uploading text files such as M3U playlists, bashrc files, etc.
comment:64 by , 18 years ago
Text files in general only cause the problem intermittently. I have one M3U file that consistantly causes the exception to be thrown. This is what local vars looks like when I upload that file:
name 'tmp_name' self FieldStorage('file1', 'klipschtest.m3u', "MP3\\2pac\\Greatest Hits (Disc 1)\\1-05 Hail Mary.mp3\r\nFLAC\\James Blunt\\Back to Bedlam\\03-Wisemen.flac\r\nMP3\\Jimi_Hendrix\\UNKNOWN\\Little wing (extended instrumental).mp3\r\nFLAC\\Led Zeppelin\\Remasters (disc II)\\04-No Quarter.flac\r\nFLAC\\Pink Floyd\\The Wall - Disc 1\\03-Another Brick in the Wall. Part 1.flac\r\nFLAC\\Pink Floyd\\The Wall - Disc 1\\04-The Happiest Days of Our Lives.flac\r\nFLAC\\Natalie Merchant\\Live In Concert\\05-Carnival.flac\r\nMP3\\Metallica\\...And Justice For All\\08-To Live Is To Die.mp3\r\nMP3\\Red Hot Chili Peppers\\Blood Sugar Sex Magik\\10-Blood Sugar Sex Magik.mp3\r\nFLAC\\Sade\\Lovers Rock\\03-King Of Sorrow.flac\r\nFLAC\\Stone, Joss\\The Soul Sessions\\05-Dirty Man.flac\r\nFLAC\\U.G.K\\Ridin' Dirty\\03-Murder.flac\r\nMP3\\Weezer\\Weezer (Green Album)\\04-Weezer _ Island In The Sun.mp3\r\nMP3\\Weezer\\Weezer (Green Album)\\03-Weezer _ Hash Pipe.mp3\r\n")
comment:65 by , 18 years ago
I'm getting an error about 'read only accepts 1 arguement (2 given)", from the following line in cgi.py:
self.fp.read(self.length)
comment:66 by , 18 years ago
This is the error I'm getting:
C:\web\fxedit>python manage.py runserver
Validating models...
0 errors found.
Django version 0.95, using settings 'fxedit.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
[16/Sep/2006 02:55:31] "GET /file/ HTTP/1.1" 200 1188
[16/Sep/2006 02:55:31] "GET /media/stylesheets/style.css HTTP/1.1" 304 0
[16/Sep/2006 02:55:31] "GET /media/stylesheets/print.css HTTP/1.1" 304 0
Traceback (most recent call last):
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\servers\
basehttp.py", line 272, in run
self.result = application(self.environ, self.start_response)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\servers\
basehttp.py", line 615, in call
return self.application(environ, start_response)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers
\wsgi.py", line 156, in call
response = self.get_response(request.path, request)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers
\base.py", line 102, in get_response
return self.get_technical_error_response(request)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers
\base.py", line 134, in get_technical_error_response
return debug.technical_500_response(request, *sys.exc_info())
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\views\debug.p
y", line 131, in technical_500_response
return HttpResponseServerError(t.render(c), mimetype='text/html')
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in
it.py", line 155, in render
return self.nodelist.render(context)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in
it.py", line 688, in render
bits.append(self.render_node(node, context))
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in
it.py", line 716, in render_node
raise wrapped
TemplateSyntaxError: Caught an exception while rendering: read() takes exactly 1
argument (2 given)
Original Traceback (most recent call last):
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in
it.py", line 706, in render_node
result = node.render(context)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\defa
ulttags.py", line 189, in render
value = bool_expr.resolve(context, True)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in
it.py", line 548, in resolve
obj = resolve_variable(self.var, context)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\template\in
it.py", line 657, in resolve_variable
raise VariableDoesNotExist, "Failed lookup for key [%s] in %r" % (bits[0], c
urrent) # missing attribute
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers
\wsgi.py", line 61, in repr
return '<WSGIRequest\nGET:%s,\nPOST:%s,\nCOOKIES:%s,\nMETA:%s>' % \
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers
\wsgi.py", line 97, in _get_post
self._load_post_and_files()
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\core\handlers
\wsgi.py", line 75, in _load_post_and_files
self._post, self._files = http.parse_file_upload(self)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\http\init
.py", line 80, in parse_file_upload
return req.parse_file_upload(req)
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\middleware\up
load.py", line 87, in parse_streaming_file_upload
fs = FieldStorage(req.raw_request, environ=req.META, headers=req.header_dict
, upload_state=upload_state )
File "C:\Python24\lib\site-packages\django-0.95-py2.4.egg\django\middleware\up
load.py", line 71, in init
keep_blank_values=keep_blank_values, strict_parsing=strict_parsing)
File "C:\Python24\lib\cgi.py", line 524, in init
self.read_urlencoded()
File "C:\Python24\lib\cgi.py", line 629, in read_urlencoded
qs = self.fp.read(self.length)
TypeError: read() takes exactly 1 argument (2 given)
[16/Sep/2006 02:55:53] "POST /file/stats/ HTTP/1.1" 500 3721
by , 18 years ago
Attachment: | 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id_windowsfix.diff added |
---|
windows temporary file locking fix
comment:67 by , 18 years ago
I'm getting the same error as anonymous above (and am also using Windows).
comment:68 by , 18 years ago
Idea for fixing the m3u upload problem. May need a copy of this util.py to make it work with wsgi.
But I have almost given up on wsgi for this patch, uless anyone are able to shed som light on why wsgi does not like chunked reading from wsgi.input.
comment:69 by , 18 years ago
There is a problem using this patch with Nginx (and maybe some other webservers). Nginx doesn't pass content-type and content-length headers and the result of this is that FieldStorage can't parse request correctly and nothing is uploaded. A workaround I've found is modification of cgi.FieldStorage to use HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH. But I don't think this is elegant solution of this problem. Does anyone have other suggestions on how to solve it ?
comment:70 by , 18 years ago
Worked for me(Apache mod_python). Working like charm. :) But heres some instructions..
- Do the patch (cd django; patch -Np0 < file.diff)
- Add two middlewares, modify the model like below:
class FileList(models.Model): name = models.CharField(maxlength=255) email = models.EmailField() class Admin: js = ['js/UploadProgress.js'] class AFile(models.Model): descr = models.CharField(maxlength=255,core=True) file = models.FileField(upload_to='files') inlist = models.ForeignKey(FileList,edit_inline=models.STACKED)
- You have progressbar in admin. :)
Submitting patch to the latest trunk.
comment:71 by , 18 years ago
Cc: | added |
---|
comment:72 by , 18 years ago
There are some issues with this patch. I have it working on my local Linux and Windows boxes. But i can't get to work it completely correct on my server with the same setup as my local linux box (python 2.5, latest Django).
The issue is that when uploading a file WSGIRequest . _load_post_and_files is called only after all data is uploaded. So no streaming occurs and no progress is displayed and I didn't find a solution for this yet..
comment:73 by , 18 years ago
[Cha0S] using lighttpd or apache, same lighttpd/apache version on both servers?
I found out lighttpd pre 1.4.0 supported streaming, so that has to be fixed before this patch will work on lighttpd.
comment:74 by , 18 years ago
This worked great on my system [Apache/2.0.58 (Win32) mod_python/3.2.8 Python/2.4.3]
Following up on Sung-Jin Hong's good instructions with a few tips for newbies like me:
- When you apply the patch, make sure you get django/middleware/upload.py. I found that when I did an SVN update to Rev 4065 and applied 4065-streaming_uploads_and_uploadprogress_middleware_x_progress_id_windowsfix.diff I did not get the file. I had to apply 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id_windowsfix.diff to get it.
- Make sure js/UploadProgress.js in accessible. The Admin page's path to it will be in the media directory, e.g. http://yoursite.com/media/js/UploadProgress.js. I had to copy UploadProgress.js from django/contrib/admin/media/js to my local media/js directory.
- Make sure the Apache httpd.conf file section for your site allows paths starting with "/progress", i.e. has a line like '<Location ~ "/(admin|shop|progress)">' in it, because the JS code is doing a "req.open("GET", "/progress/", 1)" repeatedly to track progress.
I find that the Upload progress box displays fine in Firefox and Opera, but it is at the very bottom left of the absolute page in IE6 (i.e. you need to scroll to the very bottom to see it). I'm still working on this problem.
Thank you to all who contributed this!
comment:75 by , 18 years ago
[Cha0S] using lighttpd or apache, same lighttpd/apache version on both servers? I found out lighttpd pre 1.4.0 supported streaming, so that has to be fixed before this patch will work on lighttpd.
I've tested with lighttpd and nginx web-servers. same versions on both servers. with apache it works fine on windows.
comment:76 by , 18 years ago
Component: | Core framework → Contrib apps |
---|---|
Type: | enhancement → defect |
Has anyone come up with a fix for the AttributeError exceptions when uploading text files? I'm still getting exceptions like this:
AttributeError at /file/compose/
tmp_name
Request Method: POST
Request URL: https://myserver/file/compose/
Exception Type: AttributeError
Exception Value: tmp_name
Exception Location: /usr/lib/python2.4/cgi.py in getattr, line 540
Traceback (most recent call last):
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/base.py" in get_response
- response = callback(request, *callback_args, callback_kwargs)
File "/files/lib/file/app/views.py" in createmessage
- if request.POST:
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _get_post
- self._load_post_and_files()
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/modpython.py" in _load_post_and_files
- self._post, self._files = http.parse_file_upload(self)
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/http/init.py" in parse_file_upload
- return req.parse_file_upload(req)
File "/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/middleware/upload.py" in parse_streaming_file_upload
- FILES.appendlist(key, FileDict({
File "/usr/lib/python2.4/cgi.py" in getattr
- raise AttributeError, name
AttributeError at /file/compose/
tmp_name
comment:77 by , 18 years ago
Type: | defect → enhancement |
---|
oops, didn't mean to change the type on this ticket... changing back to enhancement.
comment:78 by , 18 years ago
Component: | Contrib apps → Core framework |
---|
comment:79 by , 18 years ago
Cc: | added |
---|
Hi, jacob
The 'AttributeError' occurred, because FieldStorge does not make_temp_file for small files, and the tmp_name is not defined.
The progress middleware has problem too. You will notice that your upload speed will down to a significant lower rate, if you use this middleware. The performance bottleneck is that, every single 'read' from request reads only a few bytes, even if you set a large buffer_size, and the serialization and save to disc of 'UploadState' is significant.
The attachments is my working copy of code.
comment:80 by , 18 years ago
Cc: | added |
---|
comment:81 by , 18 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Triage Stage: | Unreviewed → Accepted |
Can someone who has used this give us a progress report on this - which of the 3000 patches above are needed? can these be merged into one patch that works?
comment:82 by , 18 years ago
The one that i can vouch for is 3581-streaming_uploads_and_uploadprogress_middleware_x_progress_id.diff
But it has problems with m3u uploads's for some reason i can't figure out.
Was thinking about the possibility of using email.FeedParser instead of rfc822 but that is not available in python 2.3
This patch does not work with lighttpd, since lighttpd can't stream requests.
comment:83 by , 18 years ago
This patch does not work with lighttpd, since lighttpd can't stream requests.
This is not true. When Django works with Lighttpd over FastCGI lighttpd immediately offloads received data to Django process and it can stream it wherever it wants to. In fact I use one of my services that uploads zipped mp3 albums under lighttpd and streaming works (with my patch in #1484).
comment:84 by , 18 years ago
I think you are correct. I guess the problem with this patch and fcgi has to do with not doing blocking reads with a buffer size. But a buffer size is needed if the browser sends a big file without eol's, and for the upload progress.
comment:85 by , 18 years ago
I'm not sure I understand what you're saying... I was replying solely to a claim that lighttpd can't do streaming. It certainly does streaming to a FCGI app. Whatever the app does then (streaming by blocks, by eols or no streaming at all) is a whole another question.
comment:86 by , 18 years ago
Cc: | added |
---|
comment:87 by , 18 years ago
Making a attempt at using email.FeedParser
http://fivethreeo.dynalias.org/repos/streaming
login: public, password: public
comment:88 by , 18 years ago
Need to override FeedParser._parsegen to write the payload directly to a file instead of a in-memory string.
Probably involves copying FeedParser._parsegen and do some modificatioms.
comment:89 by , 18 years ago
I am pleased to announce, its alive!!
Just some minor work left ( validators, refactoring, bugfixing ) volunteers?
FeedParser should work with python2.3 if it included with django.
comment:90 by , 18 years ago
Cc: | removed |
---|
comment:91 by , 18 years ago
To get the most recent diff, now with validator
svn co http://fivethreeo.dynalias.org/repos/streaming
then
cd streaming
svn diff -r371
by , 18 years ago
Attachment: | modpyton-ok-needs-fcgi-testing.3.diff added |
---|
did not work without middleware, fixed
comment:94 by , 18 years ago
Tested on lighttpd/fastcgi without errors or memory hogging.
Should feedparser be included with django so python2.3 users can use this feature too?
The way the feedparser works there is no way to not duplicate the entire _parsegen function as I see it.
comment:95 by , 18 years ago
Cc: | added |
---|
comment:96 by , 18 years ago
Streaming file uploads, take #3
I would really like to see something like this committed to
trunk. However I think the previous patches has been too complicated
and tried to do too much. Lack of documentation and tests has probably
been a factor as well.
I have rewritten the file upload to make it simpler and added some
documentation and tests.
- Simplified parsing by using a specialized RFC 2388 parser. The
previous patches used different RFC 2822 parsers which were not suited
for the task. The new parser is very efficient both cpu and memory
wise. It uses mx.TextTools for boudary searches if available.
- Removed AJAX code and progress meter. While I think it is a good
thing I think it hinders the commitment of this patch. We should open
a new ticket for a progress meter in the Admin interface. The
X-Progress-ID header is still supported and a pickled progress file is
written if supplied.
- Added error handling.
- Added Documentation.
- Added tests.
Tested with mod_python and lighttpd.
You need to specify the FILE_UPLOAD_DIR setting to enable streaming (read forms doc).
Patch attached (4459-streaming-file-upload.diff)
follow-up: 99 comment:98 by , 18 years ago
A couple of comments from quick reading of 4459-streaming-file-upload.diff
- It streams uploads to files only if FILE_UPLOAD_DIR is set. But it looks that it stores files in temp directory anyway. What's the point of FILE_UPLOAD_DIR then?
- Instead of FILE_UPLOAD_DIR it would be nice to have a setting like MAX_FILE_UPLOAD_BUFFER with the maximum number of bytes stored in memory during an upload. And only if the file is bigger it is streamed to disc. This will have a nice effect of not touching disc when dealing with small files which is faster
- It looks like any Exception during the parsing is swallowed and returned as a dict of errors that is then stored in a POST data. This actually breaks the whole exceptions paradigm :-).
- All the save_FIELD_file now accept raw_field. This is backward-incompatible but worse it limits this method to work only with uploaded files while now save_FIELD_file can accept any content not matter how it was created in a program. I understand that just renaming streamed upload is a fast feature but I believe it deserves a new method. Like 'save_FIELD_from_file' or something...
follow-up: 101 comment:99 by , 18 years ago
Replying to Ivan Sagalaev <Maniac@SoftwareManiacs.Org>:
A couple of comments from quick reading of 4459-streaming-file-upload.diff
- It streams uploads to files only if FILE_UPLOAD_DIR is set. But it looks that it stores files in temp directory anyway. What's the point of FILE_UPLOAD_DIR then?
A StringIO() buffer is used in that case so no files are used.
- Instead of FILE_UPLOAD_DIR it would be nice to have a setting like MAX_FILE_UPLOAD_BUFFER with the maximum number of bytes stored in memory during an upload. And only if the file is bigger it is streamed to disc. This will have a nice effect of not touching disc when dealing with small files which is faster
The problem is that you do not know in advance how large a file is. However you know how large the POST is in total so you could enable file streaming based on that. I did not want to create too many settings variables but I agree that it might be a good idea. Another option would be to have a setting where you name the form fields what you want to have streamed.
- It looks like any Exception during the parsing is swallowed and returned as a dict of errors that is then stored in a POST data. This actually breaks the whole exceptions paradigm :-).
Yes, I did that to avoid too many changes in wsgi.py/modpython.py since you have to take care not to try to read the input stream again in case of an exception. Meaning that I always wanted to return something. Bad choice maybe but I tried to make as few changes as possible so that the powers that be could start looking at the patch without too much confusion ;)
- All the save_FIELD_file now accept raw_field. This is backward-incompatible but worse it limits this method to work only with uploaded files while now save_FIELD_file can accept any content not matter how it was created in a program. I understand that just renaming streamed upload is a fast feature but I believe it deserves a new method. Like 'save_FIELD_from_file' or something...
Aren't those methods invoked automatically for you when you do .save()? I didnt think they were public but I dont understand exacly how they work so I take your word for it...
by , 18 years ago
Attachment: | 4459-streaming-file-upload.2.diff added |
---|
Added FILE_UPLOAD_MIN_SIZE (default 100kb) to define minimum request size for streaming to disk. Propage exeptions. I'm not too happy with the names of the settings anymore :/
comment:100 by , 18 years ago
(I've removed many attachments, as per a request by oyvind.saltvik@… on django-developers.)
comment:101 by , 18 years ago
The problem is that you do not know in advance how large a file is. However you know how large the POST is in total so you could enable file streaming based on that.
Yeah that what I meant... The whole point is to not make Django eat everything in memory so a settings topping this amount seems most logical to me.
I saw you made this setting. Here are couple of nits:
file_upload_min_size = getattr(settings, 'FILE_UPLOAD_MIN_SIZE', 100000)
- There is a convention in Django to put a setting with its default value into django/conf/global_settings.py. This keeps all defaults in one place and ensures that settings are always present so you don't have to fail-safe them with getattr, just use settings.SETTING_NAME
- FILE_UPLOAD_MIN_SIZE sounds confusing, like we don't allow users to upload small files :-). I still like an idea of calling it a buffer size somehow (and actually use it as a buffer size when reading chunks from input). What do you (and everyone) think?
- 100 KB is more like 100 * 1024. But I'd make it about 512 * 1024 or 1024 * 1024. This is small enough to not even show on memory stats but big enough to handle most profile photos uploaded in practice without touching disk most of the times.
Yes, I did that to avoid too many changes in wsgi.py/modpython.py since you have to take care not to try to read the input stream again in case of an exception.
Well... An exception ensures just this: the program won't run normally in case of errors, return values on the other hand are easy to forget to check. I see that you have removed swallowing the Exception. Why not MultiPartParserError then? As I understand this will happen only when the input is really malformed (like not created by a browser) so it's very little a user application can do about it. And I think it shouldn't. I think we should catch this exception in handlers/base.py and unconditionally return '400 Bad Request' on it.
Aren't those methods invoked automatically for you when you do .save()? I didnt think they were public but I dont understand exacly how they work so I take your word for it...
They are called from save(), yes. But they are also intended as a public API and documented (it's in model's API doc, too lazy to get a link :-) ).
follow-up: 105 comment:102 by , 18 years ago
Cc: | added |
---|
comment:103 by , 18 years ago
Cc: | added |
---|
comment:104 by , 17 years ago
Cc: | added; removed |
---|
comment:105 by , 17 years ago
Replying to Gábor Farkas <gabor@nekomancer.net>:
Sorry to clutter the thread.
Can this patch be used with the newforms framework, or does it only work with oldforms?
comment:106 by , 17 years ago
Can be used in newforms if you handle the incoming data, form_for_model has no automatic filefields yet.
comment:107 by , 17 years ago
Replying to adrian:
In reference to bnewbold's comment above, it's the shutil module, and it looks like it's been around since at least Python 2.3. (Django supports Python 2.3 and up.)
os.rename('old', 'new') does result in an 'invalid cross-link device' exception when 'old' and 'new' are on different filesystems. It doesn't make sense to do this, as once a large file has been streamed to disk, copying to another filesystem is a pointless additional overhead. However, the code could catch the error, and do a copy and then delete operation (and generate a warning that the system can be configured more efficiently).
Or it could use shutil as suggested by a previous poster.
comment:108 by , 17 years ago
Is it possible that this patch could be updated so that it works with the current source version? At the moment if one checks out revision 4459, patches it and then executes svn update, clashes arise at several points in the code.
comment:109 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | 5065-streaming_file_upload_with_shutils.diff added |
---|
I've updated the patch to include the shutils
command and to work with [5065]. Please check to see if it works.
by , 17 years ago
Attachment: | 5065-streaming_file_upload_with_shutils_2.diff added |
---|
Works in [5065], renamed settings variable, uses global settings, defaults STREAMING_MIN_POST_SIZE to .5MB (please test though!)
comment:110 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | 5070-streaming-file-upload.diff added |
---|
Updated to trunk, without changes
follow-up: 112 comment:111 by , 17 years ago
Axiak
But does this not limit streaming to gnu os'es by using shutils. Should it not try shutils first, dont import shutils at the top of the .py Then try rename, and finally just write from filecontent if that fails.
by , 17 years ago
Attachment: | 5078_streaming_file_upload_with_shutils_and_fallbacks.diff added |
---|
Usese shutils
, but falls back.
by , 17 years ago
Attachment: | 5078_streaming_file_upload_with_shutils_and_fallbacks_2.diff added |
---|
Usese shutils
, but falls back, this time it deletes the tmp file even when it falls back.
comment:112 by , 17 years ago
Replying to Øyvind Saltvik <oyvind@saltvik.no>:
Axiak
But does this not limit streaming to gnu os'es by using shutils. Should it not try shutils first, dont import shutils at the top of the .py Then try rename, and finally just write from filecontent if that fails.
I did not know it only worked in GNU os's. Also, AFAIK, there's no OS-independent way to copy files in python, and if file streaming was enabled, file_fieldcontent will not have the desired content. Thus, the only sure-fire way to make it work is to do a file.write file.read combination. Luckily it probably won't have to do this, since it will try both shutils
and
os.rename
before it gets to that point.
comment:113 by , 17 years ago
One more thing, the last fallback should read and write incrementally, to avoid reading the whole file before it is written to another.
by , 17 years ago
Attachment: | 5078-streaming_file_upload_with_shutils_and_chunked_fallbacks.diff added |
---|
This file uses python fallbacks, but using chunks to avoid loading the entire file into memory.
by , 17 years ago
Attachment: | 5079-streaming_file_upload_with_safe_file_move.diff added |
---|
Cleaned it up a bit. Moved file_move_safe into django.utils in case it should be used in future endeavors.
comment:114 by , 17 years ago
Needs documentation: | unset |
---|---|
Summary: | [patch] Large streaming uploads → Large streaming uploads |
Version: | → SVN |
This last patch seems clean. If people wanted to start testing, now would be a good time :). It works in [5079] and should work on all platforms while considering all the ideas posted here.
It also has documentation. I will not remove the `Patch needs improvement' flag to let others review this patch with a finer comb than I have.
Please post here if you have additional concerns, comments, or enhancements.
by , 17 years ago
Attachment: | 5089-streaming_file_upload_with_safe_file_move.diff added |
---|
Added missing else
by , 17 years ago
Attachment: | 5089-streaming_file_upload_with_safe_file_move.2.diff added |
---|
Forgot to add django/utils/file.py
follow-up: 124 comment:117 by , 17 years ago
Hi,
Is x-progress-bar is a header that come with apache only or it is a start header?
Is it work on fcgi, if not, can I change x-progress-bar to hidden form?
Thank.
comment:118 by , 17 years ago
It is just a header that has to be sent along with the request.
X-Progress-ID with a value
then FILE_UPLOAD_DIR + the value of the header X-Progress-ID is a pickled python object.
That can be used by any view and displayed using javascript.
comment:119 by , 17 years ago
Thank for your replying,
I have tested this code by modified old javascript (middleware enable). I never seen X-Progress-ID in header, so, a temp file with that id never created.
Can I have example usage?
Thank again
comment:120 by , 17 years ago
Actually, this patch does this wrong, the header is only used for getting progress status, the id should be the querystring.
comment:121 by , 17 years ago
As this site states:
http://search.cpan.org/~ceeshek/Apache2-UploadProgress-0.2/lib/Apache2/UploadProgress.pm#HANDLERS
We should be using multiple mechanisms to get this progress ID.
After I mull over this some more and understand it better, I'll make a new patch.
by , 17 years ago
Attachment: | 5099_patch_for_streaming_uploads.diff added |
---|
Uses multiple mechanisms for determining the progress id.
comment:122 by , 17 years ago
The patch has now been merged into [5099] and I added a new property to request
: progress_id
. This may simplify JavaScript Middleware (we should make a new ticket for that) as it only has to do request.progress_id
. If it's None
, there was no progress ID sent. Otherwise, there was.
comment:123 by , 17 years ago
This ticket is only for making sure file uploads work. I have created ticket #4165 which s specifically for the progress bar enhancements.
comment:124 by , 17 years ago
Replying to anonymous:
Hi,
Is x-progress-bar is a header that come with apache only or it is a start header?
Is it work on fcgi, if not, can I change x-progress-bar to hidden form?
Thank.
X-Progress-ID
is used to tell the server between loads "This is the same post". I'm actually not sure how the headers are supposed to be populated, only that we're supposed to "support them". After reading a perl module for quite some time, I've determined that they use javascript to append a random 32-digit hex at the end of their form action ("?progress_id=abcdef0123456789abcdef0123456789") right before the form submits.
If you do this, it will work.
NB: Using a hidden POST field will NOT work. Why? Because it's the POST upload we're trying to track!
by , 17 years ago
Attachment: | 5100_upload_request_meta.diff added |
---|
This isn't important enough to get its own property in request...request.META now contains 'UPLOAD_PROGRESS_ID'
by , 17 years ago
Attachment: | 5100_forgot_to_revert_base.diff added |
---|
I included a bit much in that last patch.
by , 17 years ago
Attachment: | 5100_file_upload_core.diff added |
---|
Meant to be cleaner, especially in light of #4165
by , 17 years ago
Attachment: | 5100_file_upload_core_with_middleware_hooks.diff added |
---|
Added middleware hooks
by , 17 years ago
Attachment: | 5100_file_upload_core_with_middleware_hooks_2.diff added |
---|
Added middleware hooks...this is better.
comment:125 by , 17 years ago
Replying to Maniac <Maniac@SoftwareManiacs.Org>:
My idea of a hook is not fit into current middleware, I just wanted to retain the style. So to clarify I see it like this.
This can be a separate set of hooks similar to MIDDLEWARE_CLASSES or it can be done in middleware classes themselves. To do the latter we will specifiy that in addition to process_request, _response, _view and _exception methods a middleware can define a new kind of method: process_upload(request, data).
parse_file_upload then would search for these middleware methods just as http requests do in modpython.py and wsgi.py. The list of methods is passed to a FieldStorage decendant that will call them each time it's about to write a chunk of data to the target stream.
This is it... Sorry I can't come with the actual code: pretty busy rolling out my current project.
After reviewing all the comments, this one struck me in particular. I have added a process_upload(UploadException)
method to MiddleWare.
I'm sure I might get heat for this recent addition, but I think it's the cleanest way to allow extensibility.
Essentially, you can create a descriptor
for the file_progress
attribute of request
. It gets treated like a dictionary so it will be passed things like {'received': 1414,'size': 100000
} and is expected to return it when its asked for. You are given an exception to raise if something should cancel the file upload, so I could imagine someone writing a descriptor that would compare received
and some target size and cancel the upload if it's more than that.
For more details, please look at django/http/file_descriptor.py
for the fallback descriptor. If you named that exact class process_upload
and stuck it in a middleware, it would be used (and would work). I'll come up with another example soon, and stick it in #4165.
by , 17 years ago
Attachment: | 5100_file_upload_core_with_middleware_hooks_3.diff added |
---|
Apparently I don't know how not to lose files.
by , 17 years ago
Attachment: | 5100_file_upload_core_with_middleware_hooks_4.diff added |
---|
Apparently I don't know how not to lose files.
by , 17 years ago
Attachment: | 5113_file_upload_core_with_middleware_hooks.diff added |
---|
Fixes WSGI bug.
by , 17 years ago
Attachment: | 5116_streaming_upload_fixed_middleware_append.diff added |
---|
Fixed bug where .append(0, func)
was called.
by , 17 years ago
Attachment: | 5116_streaming_upload_fixed_middleware_append_2.diff added |
---|
Fixed bug where .append(0, func)
was called with the files this time.
comment:126 by , 17 years ago
Break!
Traceback (most recent call last):
File "C:\Python25\lib\site-packages\django\core\servers\basehttp.py", line 272, in run
self.result = application(self.environ, self.start_response)
File "C:\Python25\lib\site-packages\django\core\servers\basehttp.py", line 614, in call
return self.application(environ, start_response)
File "C:\Python25\lib\site-packages\django\core\handlers\wsgi.py", line 207, in call
request = WSGIRequest(environ)
File "C:\Python25\lib\site-packages\django\core\handlers\wsgi.py", line 78, in init
self.METAUPLOAD_PROGRESS_ID = self._get_file_progress_id()
File "C:\Python25\lib\site-packages\django\core\handlers\wsgi.py", line 187, in _get_file_progress_id
self._req.args)
AttributeError: 'WSGIRequest' object has no attribute '_req'
by , 17 years ago
Attachment: | 5126_file_upload_wsgi_tested.diff added |
---|
Tested with WSGI and made a few changes.
follow-up: 131 comment:127 by , 17 years ago
Alright, I tested/tweaked the patch on a few different platforms. Here are my notes:
- Apache2/mod_python: Works well. Works with #4165 and generates pretty progress bar
- Apache2/mod_fastcgi: Works well. Works with #4165 and generates pretty progress bar
- Apache2/mod_scgi: Untested.
- LighTTPD/mod_fastcgi: Does NOT work. However, version 1.5 will have
mod_uploadprogress
which will provide a JSON interface just like the one in #4165. Thus, usingmod_upgradeprogress
to generate a progress bar should be a breeze. I don't have lighttpd 1.5 to test, though.
- manage.py development server: World well. Progress bar does not work AFAIK, the development server is singlethreaded and it will stay that way. The progress bar is an inherently multithreaded process (one to serve progress information, one to receive the uploaded information).
Let me know if you have anymore problems.
by , 17 years ago
Attachment: | 5126_file_uploads_latest_patch.diff added |
---|
Added 'state':'starting' to be more like mod_uploadprogress
.
follow-up: 129 comment:128 by , 17 years ago
Hi,
I have tested under window xp, apache 2.2.4, mod_python 3.3. When I upload a large file via admin interface.
A temp file with named by session generated by javascript client is created. However, a file with random name and '.upload' extension is also created. File stream is append to '.upload' version instead of session id one.
Any idea??
comment:129 by , 17 years ago
Replying to anonymous:
Hi,
I have tested under window xp, apache 2.2.4, mod_python 3.3. When I upload a large file via admin interface.
A temp file with named by session generated by javascript client is created. However, a file with random name and '.upload' extension is also created. File stream is append to '.upload' version instead of session id one.
Any idea??
Yes, this is how it is supposed to work, which is why there are hooks for middleware (see illustration middleware in #4165).
You see, we don't need to keep track of the actual file that's being uploaded, because it's always being uploaded (by the same thread, same loop, etc.). We do need to be able to look for the 'progress' of that file upload, which has to be stored in some persistent manner, keyed by the session id. Since the file upload code has no model, can't rely on cache, and does have a safe place to put temp files, it by default puts that state information in that location.
If you install the patch 5126_file_upload_contrib_app.diff
and add 'django.contrib.uploadprogress.middleware.FileProgressCached'
to your MIDDLEWARE_CLASSES
setting, the cache framework will be used, and no session_id file will be created.
Alternatively, you can use that patch and add 'django.contrib.uploadprogress.middleware.FileProgressDB'
to your MIDDLEWARE_CLASSES
and add 'django.contrib.uploadprogress'
to your INSTALLED_APPS
(and run a syncdb
...).
Or, you are welcome to look at the code in django.contrib.uploadprogress.middleware.FileProgressCached
and write your own method of storing this file progress information (it's in a dictionary format).
Sounds like it's working though, thanks for the feedback.
comment:130 by , 17 years ago
I should also explain the rationale for the naming of the files:
We use the X-Progress-ID
value for the name of the file that stores the progress because in subsequent requests we're going to need to get the file by that same value.
We use a (python) temporary file name for the actual file that's being uploaded because it guarantees uniqueness. Since X-Progress-ID
is provided by the client, we can't guarantee someone won't start auto-generating requests and cause a collision on the files themselves. I have thought of putting the X-Progress-ID
in the session. This can be done but it relies on the session app -- which is not a prereq for Django to work -- being activated and file uploading is a core feature.
It doesn't seem like too much of an issue if the progress tracking has a small chance of getting munged. The actual files, OTOH, would be a bit worrisome.
That being said, if you really don't like the naming, and have a better suggestion, by all means suggest it.
follow-up: 132 comment:131 by , 17 years ago
Replying to Michael Axiak <axiak@mit.edu>:
Alright, I tested/tweaked the patch on a few different platforms. Here are my notes:
...
My new report:
- Apache2/mod_python: Works well. Works with #4165 and generates pretty progress bar
- Apache2/mod_fastcgi: Works well. Works with #4165 and generates pretty progress bar
- Apache2/mod_scgi: Untested.
- LighTTPD/mod_fastcgi: I stand corrected (Thanks SmileyChris). It does help with LighTTPD inasmuch as it will make the interaction between lighttpd and python not eat all the memory and CPU. However, it will not show a progress bar (nor are other features in #4165 supported), as the streamed 'upload' only exists between lighttpd and python, not between the web client and the web server.
- manage.py development server: World well. Progress bar does not work AFAIK, the development server is singlethreaded and it will stay that way. The progress bar is an inherently multithreaded process (one to serve progress information, one to receive the uploaded information).
Also, when uploading a 4.5 GB file, the content length was reported incorrectly. This happened with both FastCGI and mod_python on apache2.
comment:132 by , 17 years ago
Replying to anonymous:
...
Also, when uploading a 4.5 GB file, the content length was reported incorrectly. This happened with both FastCGI and mod_python on apache2.
It turns out this happens when you don't have Large File support (on either the client or the server?).
by , 17 years ago
Attachment: | 5127_file_uploads_no_streaming_fixed.diff added |
---|
There was an error uploading large files with streaming turned off
comment:133 by , 17 years ago
I also tidied up the code a bit (just white space tweaking mostly).
I noticed that currently, you need to set an upload directory even if you are using a middleware which overrides the FileProgressDescriptor
of request
because of calls like this in http.__init__
:
if self._progress_filename: self._request.file_progress = {'state': 'done'}
I'm not sure if relying on _progress_filename
is the best way to do this.
comment:134 by , 17 years ago
Can we have a moratorium on uploading yet more changes to this ticket for a few days, please?
I've started reading through a few of the patches carefully with an eye to working out what parts look like they're ready to be committed. However, this isn't going to work if every day there is yet another slight tweak to something. It's going to take a few days to review all this carefully and think about all the issues raised in the comments, so everybody just take a deep breath and back away from the keyboards for a bit.
follow-up: 136 comment:135 by , 17 years ago
I've yet to test large files with the patch applied, but I've noticed my files are being created with 600 permissions instead of the 644 they received before patching. Not sure if it's just my setup, but I can't recall any other changes over the past day or so that would have affected it.
comment:136 by , 17 years ago
Replying to lukeman@gmail.com:
I've yet to test large files with the patch applied, but I've noticed my files are being created with 600 permissions instead of the 644 they received before patching. Not sure if it's just my setup, but I can't recall any other changes over the past day or so that would have affected it.
It should inherit the permissions from the temporary directory. I can't think of a better setup than this, but if people think this is unexpected/problematic maybe we can change it.
If people understand streaming files go to the temp folder then to the other folder, then I think inheriting the permissions from the temp folder would be expected behavior.
-Mike
follow-up: 138 comment:137 by , 17 years ago
I have tested multiple upload at the same time (difference browser instance). If there are more than two upload request, when a first request
is sent, it work properly. If I make a new call, an old stop immediately and a new one not create a session and temp file anymore.
Django is great in advance, please keep it simple and work in common task.
comment:138 by , 17 years ago
Replying to anonymous:
I have tested multiple upload at the same time (difference browser instance). If there are more than two upload request, when a first request
is sent, it work properly. If I make a new call, an old stop immediately and a new one not create a session and temp file anymore.
Django is great in advance, please keep it simple and work in common task.
Hi,
I have tested it with simultaneous file uploads and it worked fine. Could you please include more details? For instance, I don't think there would be any chance it would work in the development server. It may break in some FastCGI configurations, too. (The reason being that accepting multiple file uploads requires some multithreadedness.)
-Mike
comment:139 by , 17 years ago
Cc: | added |
---|
follow-up: 142 comment:140 by , 17 years ago
Hi
If I stop upload process by clicking a "Stop" button. It look like sever see a same upload request and never generate process id.
How to solve this problem?
Thank
comment:141 by , 17 years ago
Cc: | added |
---|
comment:142 by , 17 years ago
Replying to anonymous:
Hi
If I stop upload process by clicking a "Stop" button. It look like sever see a same upload request and never generate process id.
How to solve this problem?
Thank
I'm not sure why you'd be seeing this. If you're using #4165 then the javascript should create a new progress id on *every* upload. If you're not using #4165, then progress id is irrelevant. While testing I was able to stop and re-upload in FF 2, 1.5, IE 7, and IE 6 without a problem. Could you please describe your environment with more detail?
-Mike
by , 17 years ago
Attachment: | 5313_updated_file_progress.diff added |
---|
Removed some unneeded things. No file progress tracking by default.
follow-up: 144 comment:143 by , 17 years ago
I've cleaned the FileProgressDescriptor architecture a little bit. Three changes for this patch:
- using
request.__class__ = descriptor
rather than using a weird descriptor wrapper. - defaulting to a
NullFileProgressDescriptor
(which does nothing) versus one that stored the information in temp upload directory. - making
safe_file_move
do nothing if the source and destination are the same string.
If you want to track file upload progress you need to install the stuff from #4165. This ticket just contains the hooks to get #4165 there with just a minimal of contrib. This patch makes that even clearer by removing the last remnants of functional file progress tracking.
I *think* this is probably where we want to leave this ticket (in terms of what should be cut out of it, what should be put in). Although, I can see several other separation (take care of form stuff in another ticket, for instance).
It is worth noting (because someone had issue with this in IRC), that if you try to use this save_FIELD_file(name, content)
should now be given a request.FILE
item, rather than the actual content of the file.
For example:
object.save_video_file(request.FILE['video']['filename'], request.FILE['video']['content'])
is bad, since the mere invocation request.FILE['video']['content']
will load all the content into memory, exactly what we're trying to avoid.
The new version looks like:
object.save_video_file(request.FILE['video']['filename'], request.FILE['video'])
In reply to SmileyChris, we have to have an upload directory anyway, since the FileProgressDescriptor only changes how file *progress* is tracked, not the actual uploading of files (that is, it has to go into a directory, or else this entire thing is disabled). We could rename and clean up if people prefer it changed.
follow-up: 145 comment:144 by , 17 years ago
Replying to Michael Axiak <axiak@mit.edu>:
The new version looks like:
object.save_video_file(request.FILE['video']['filename'], request.FILE['video'])
Why not skip request.FILEvideofilename at all? This is provided by request.FILEvideo anyway.
IMHO having filename as an separate parameter makes only sense if this would be used somehow directly, but currently the filename/path where the file gets saved is changed by the field (field.get_directory_name() and field.get_filename(filename)).
comment:145 by , 17 years ago
Replying to anonymous:
Why not skip request.FILEvideofilename at all? This is provided by request.FILEvideo anyway.
IMHO having filename as an separate parameter makes only sense if this would be used somehow directly, but currently the filename/path where the file gets saved is changed by the field (field.get_directory_name() and field.get_filename(filename)).
Well, save_video_file() is meant to allow someone to save a file. Unless that file is specifically provided in a post (it's not necessarily, and in many of my applications it's not...generated images come to mind).
We *can* have it accept both sets of arguments, but then it's too nonorthogonal. Perhaps we should just create a new one? move_video_file()
??
by , 17 years ago
Attachment: | 5343_1_streaming_file_upload.diff added |
---|
Newer, cleaner version of file upload script
by , 17 years ago
Attachment: | 5343_streaming_file_upload_no_assert.diff added |
---|
Newer, cleaner version of file upload script (doh! no random commented assert)
by , 17 years ago
Attachment: | 5343_streaming_file_upload_best.diff added |
---|
Sorry about that -- this one is the better one.
comment:146 by , 17 years ago
A list of changes in this new version:
- I removed the file_progress descriptor handling and middleware -- if you need to track file progress, you can use
process_request
to achieve what you need.
- I cleaned up a lot of things, removed many catch-all
excepts
and replaced withexcept OSError
etc.
- Included cross-platform locking for saving/moving files manually (django/utils/file_locks.py)
- Made
move_file_safe(src,dest)
to care if a file already exists and throw an error if one does by default.
- Moved stuff out of
http/__init__.py
into various locations.
- In reply to SmileyChris' comment earlier, no longer uses
self._progress_file
to determine if a file is being tracked. Now usingself._track_progress
(a boolean)
save_field_file
can now optionally not take a filename (iffilename = None
). To avoid confusion, there is an additional function,instance.move_field_file
which takes one parameter: therequest.FILES
value. Code examples should usemove_field_file
to distinguish this new functionality.
- Stay tuned for updates to #4165 to reflect these and other changes!
Example of new move_video_file
Model instance function:
#from this foo.save_video_file(request.FILES['video']['filename'], request.FILES['video']) #to this foo.move_video_file(request.FILES['video'])
by , 17 years ago
Attachment: | 5343_cleaned_streaming_file_upload.diff added |
---|
It's amazing what Trac helps you see.
by , 17 years ago
Attachment: | 5343_cleaned_streaming_file_upload_tweaked.diff added |
---|
Made a subtle fix after testing with #4165.
follow-up: 148 comment:147 by , 17 years ago
def init(self, headers, input, request, file_upload_dir=None, streaming_min_post_size=None, chunk_size=1024*64):
parser = MultiPartParser(headers, input, request, file_upload_dir, streaming_min_post_size)
Is chunk_size is configurable? Do you hard code, aren't you.
Thank
comment:148 by , 17 years ago
Replying to anonymous:
def init(self, headers, input, request, file_upload_dir=None, streaming_min_post_size=None, chunk_size=1024*64):
parser = MultiPartParser(headers, input, request, file_upload_dir, streaming_min_post_size)
Is chunk_size is configurable? Do you hard code, aren't you.
Thank
Actually, that's a great point. I am going to change a few things (shift around save_FIELD_file
and move_FIELD_file
to make things more explicit) and on my list for the next upload would be to allow chunk sizes to be a settings variable. (I was actually discussing this with Øyvind in IRC, http://simon.bofh.ms/logger/django/2007/05/26/15/25 .)
Thanks for the comment, though. Feel free to post any other suggestions!
comment:149 by , 17 years ago
Works for me. Windows/Apache2.2.4/ModPython.
My page actually uploads 4 large files at once so I added 'filename' : self._filename to the file_progress dict created in multparser.py then I can display the current file in my javascript.
comment:150 by , 17 years ago
Cc: | added |
---|
comment:151 by , 17 years ago
5343_cleaned_streaming_file_upload_tweaked.diff works perfect for me (Linux/Apache 2.0.59/modPython). Uploading large zip files (10 to 100 MB each) to my photo gallery with no problems ... memory usage is dead straight at 65MB ... where as it explored to 600 MB per apache process before. Good work Michael!
comment:152 by , 17 years ago
111 def parse(self):
112 try:
113 self._parse()
114 finally:
115 if self._track_progress:
116 self._request.file_progress = {'state': 'done'}
117 return self._post, self._files
When user click 'stop' while uploading file, server return {'state':'done'} even file is not completely uploaded.
This cause unexpected behaviors under different browser. I think we should return actual file size and received size to make
client to be able to check if file is completely uploaded.
Thank
comment:153 by , 17 years ago
def parse(self):
try:
self._parse()
finally:
if self._progress_filename:
if self._received == self._size:
self._request.file_progress = {'received':self._received,
'size':self._size,
'state': 'done'}
else:
self._request.file_progress = {'received':self._received,
'size':self._size,
'state': 'abort'}
How about this?
comment:154 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | 5343_cleaned_streaming_file_upload_tweaked2.diff added |
---|
Fixed microscopic bug in an error path
comment:155 by , 17 years ago
Cc: | added |
---|
comment:156 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | 5722.2.diff added |
---|
Sorry, that last one patch also included #4165. This is the correct one.
by , 17 years ago
Attachment: | 5724_streaming_file_upload.diff added |
---|
updated to r5724 (previous patch was missing files too)
comment:157 by , 17 years ago
Cc: | added |
---|
comment:158 by , 17 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Reading over this, I think the latest version here looks good (5724_streaming_file_upload.diff). Is everyone happy with it?
comment:159 by , 17 years ago
Yes, I've been using it intensively for the last couple of days and it works beautifully.
comment:160 by , 17 years ago
I have been using this patch for a little while now and it does work very well. Haven't seen anything wrong with it.
comment:161 by , 17 years ago
I manually applied patch 5724... to the latest trunk version and noticed that there's a syntax error in line 254 of multipartparser.py. Instead of
except OSError, IOError:
it must be
except (OSError, IOError), e:
Besides that, the patch seems to work quite nicely and I hope it will be added to trunk ASAP.
follow-up: 163 comment:162 by , 17 years ago
Ran into a very small issue:
Files uploads that are streamed end up having 0600 permissions, while files that are uploaded normally have 0644 permissions. I can't put my finger on where this happens at a glance, will look into it tomorrow.
(In case anyone's wondering, I need read access to be able to scan newly uploaded files with clamav.)
follow-up: 166 comment:163 by , 17 years ago
Please read 136. This is standard Unix permissions. For example, if someone (heaven forbid) used PHP to upload, the files would inherit the permissions from the temporary upload directory.
-Mike
Replying to simonbun <simonbun@versea.be>:
Ran into a very small issue:
Files uploads that are streamed end up having 0600 permissions, while files that are uploaded normally have 0644 permissions. I can't put my finger on where this happens at a glance, will look into it tomorrow.
(In case anyone's wondering, I need read access to be able to scan newly uploaded files with clamav.)
comment:164 by , 17 years ago
Well the strange thing is my temporary dir and my uploads dir both have 777 permissions. I just checked again and the temporary .upload file has 600 permissions.
I'm using apache2 + mod_python + python2.5 if that would make a difference
comment:165 by , 17 years ago
Cc: | added |
---|
comment:166 by , 17 years ago
Replying to anonymous:
I'll second that. This one has bitten me too.
While setting permissions to 600 is sound, maybe a two liner on the documentation would help make that more obvious. It took me a while to understand what as going on.
Asides from that, thanks for the patch. I've been using it for a while (mod_python at webfaction's setup, post unicode merge) and it works well.
Cheers
Arthur
comment:167 by , 17 years ago
Cc: | added |
---|
comment:168 by , 17 years ago
Cc: | added |
---|
comment:169 by , 17 years ago
Cc: | added |
---|
comment:170 by , 17 years ago
Cc: | added |
---|
comment:171 by , 17 years ago
Cc: | added |
---|
comment:172 by , 17 years ago
Patch for r5820 does not work with forms.FileField and forms.ImageField, raw_field is now a StrAndUnicode object not a dict containing the raw data.
comment:174 by , 17 years ago
Keywords: | sprint added |
---|---|
Owner: | changed from | to
Status: | assigned → new |
Triage Stage: | Ready for checkin → Accepted |
I will take this on to as full extent as I can over the upcoming sprint.
If anyone has any more issues, please post them on here before Friday.
comment:175 by , 17 years ago
I would like to see the filename be part of the progress returned for when a form has multiple files:
self._request.file_progress = {'received': self._received, 'size': self._size, 'filename' : self._filename, 'state': 'uploading'}
comment:176 by , 17 years ago
The checks after an image upload (as seen in http://dpaste.com/hold/18468/) can be improved somewhat; at first glance it is entirely unclear what's supposed to happen, and the error message is repeated.
See http://dpaste.com/hold/19212/ (untested but looks OK).
comment:177 by , 17 years ago
Uploading empty files is treated as an error, why?
I would also like to see a file propgress tracker example implementation with unit tests and a stress test for uploading
large files.
On request I can provide an example of such a stress test programm.
follow-up: 179 comment:178 by , 17 years ago
I frankly fail to see the use case for uploading an empty file (as opposed to simply not uploading it in the first place).
If you have one, feel free to (a) tell us why you think it makes sense, (b) provide a patch which optionally allows that.
comment:179 by , 17 years ago
Replying to Matthias Urlichs <smurf@smurf.noris.de>:
I frankly fail to see the use case for uploading an empty file (as opposed to simply not uploading it in the first place).
There are several situations where one might want to upload empty files.
- It could be that one does not know the file is empty
- Maybe one actually needs to have an empty file, for example a ini.py in python indicates that a directory is a package
There is nothing special about an empty file, it is a file like any other file, it only contains a single character, the End Of File character. Why would you want to treat this file in any different way? If you do so it only means that your code is not generic enough.
One more note:
The reason that this ticket is languishing like this is that it is filled with tacit assumptions, unusual behavior (like the one I'm commenting on) and extra functionality that is not relevant to the actual problem it needs to solve first - streaming the uploads.
follow-up: 183 comment:180 by , 17 years ago
I'd argue that case 1 is a prime candidate for keeping the test, while case 2 can be handled by uploading a file with a comment in it.
In any case: if you think that's the reason, then please post either a list of (other) problems you have with this patch, or an improved patch.
Anythig else is not helpful. IMHO.
comment:181 by , 17 years ago
Keywords: | sprintsept14 feature added; sprint removed |
---|
comment:182 by , 17 years ago
Cc: | added |
---|
comment:183 by , 17 years ago
Replying to Matthias Urlichs <smurf@smurf.noris.de>:
while case 2 can be handled by uploading a file with a comment in it.
I would be willing to elaborate but this commment of yours indicates to me you have your mind set and don't actually want to understand the point that I was making: That is sometimes a file have to be empty.
You can't manually start adding bytes into files just because someone decided that uploading an empty file is an error. All you need is think about it for a second. As I said before an empty file is still a valid file, it has a name, a modification date ... even a size!. Maybe it is used to indicate a state, maybe it is log file that has yet to be written to etc. there are countless usecases that require the presence of a file that may not contain anything yet.
In your design if one were to simultaneously upload 10 files one of which is empty, the upload should fail once processing reaches the empty file ...
comment:184 by , 17 years ago
I do admit that you have a point; I admit that my response was a bit knee-jerk.
My core point, now that I think about it, is not that I'm dead set against uploading empty files. That test is easily removed, or made optional.
The real point is: What's holding up the change? Did axiak just not have enough time?
This is not the only patch in this bug tracker which fixes a real-world problem. We're carrying ~30 of these in our own repository and, frankly, I'd like that number to go down, one of these days. :-/
comment:185 by , 17 years ago
I know this is maybe a little late to chime in, but as opinion on this seems to be heavily divided as to whether empty files should be allowed, would it not just be best all-round to add a way to define whether empty files should be allowed or not?
Surely this wouldn't be too hard (I'd be willing to work on this if I get a few moments here-and-there).
Just a thought'''
comment:186 by , 17 years ago
I am going to have to go with +1 for not distinguishing between an empty file or a non-empty file.
Technically there is no difference between and empty and non-empty file so I don't see why django should distinguish between the two. It's definitely a user defined preference at that point, which should be possible after the file has been uploaded. Not wanting and empty file is an assumption, a popular one yes, but still an assumption about the upload.
comment:187 by , 17 years ago
Hey,
As well as Trey, I am also +1 for not making this distinction. If people need the validation, it belongs in another layer than the core http processing. (Perhaps the newforms filefield)
-Mike
comment:191 by , 17 years ago
Cc: | added |
---|
comment:192 by , 17 years ago
I'm getting
Ran 155 tests in 55.010s
FAILED (failures=1, errors=1)
when I run the tests on 5820 with 5820_streaming_file_upload.diff applied.
They look like
1) Traceback (most recent call last):
File "/usr/local/src/django/django-trunk.hg.5820+streaming/tests/modeltests/test_client/models.py", line 91, in test_post_file_view
response = self.client.post('/test_client/post_file_view/', post_data)
File "/usr/local/lib/python2.4/site-packages/django/test/client.py", line 222, in post
return self.request(r)
File "/usr/local/lib/python2.4/site-packages/django/core/handlers/base.py", line 77, in get_response
response = callback(request, *callback_args, callback_kwargs)
File "/usr/local/src/django/django-trunk.hg.5820+streaming/tests/modeltests/test_client/views.py", line 52, in post_file_view
c = Context({'file': request.FILESfile_file})
File "/usr/local/lib/python2.4/site-packages/django/utils/datastructures.py", line 143, in getitem
raise MultiValueDictKeyError, "Key %r not found in %r" % (key, self)
MultiValueDictKeyError: 'Key \'file_file\' not found in <MultiValueDict: {\'file\': [{\'content\': \'# coding: utf-8
n"""
n38. Testing usi
[...]
2) FAIL: test_simple_upload (regressiontests.test_client_regress.models.AssertFileUploadTests)
Traceback (most recent call last):
File "/usr/local/src/django/django-trunk.hg.5820+streaming/tests/regressiontests/test_client_regress/models.py", line 215, in test_simple_upload
self.assertEqual(response.status_code, 200)
AssertionError: 500 != 200
Am I doing something wrong here?
by , 17 years ago
Attachment: | 6469_streaming_file_upload.diff added |
---|
comment:193 by , 17 years ago
Uploaded new diff 6469_streaming_file_upload.diff, applying against trunk.
Currently one unit test is failing, traceback follows. There were two tests failing in the earlier version (5820), one of which was this one, but [560] helped me fix one of them on IRC.
[560] thinks this is an unicode error. If this version works for people, I'd like to hear about it. It would really nice to get this merged into trunk.
Thanks, Faheem.
*
FAIL: test_simple_upload (regressiontests.test_client_regress.models.FileUploadTests)
Traceback (most recent call last):
File "/usr/local/src/django/django-trunk.hg+mystreaming/tests/regressiontests/test_client_regress/models.py", line 244, in test_simple_upload
self.assertEqual(response.status_code, 200)
AssertionError: 500 != 200
Ran 205 tests in 85.939s
FAILED (failures=1)
follow-up: 195 comment:194 by , 17 years ago
Okay, so if I want to test this with a clean trunk, which of the above do I use? Trying to figure out if I need the middleware (if so which one), or just 6469 or what? I am needed to upload a 500M+ file on a weekly basis and would love to know HOW to use this.
follow-up: 196 comment:195 by , 17 years ago
Replying to anonymous:
Okay, so if I want to test this with a clean trunk, which of the above do I use? Trying to > figure out if I need the middleware (if so which one), or just 6469 or what? I am needed to > upload a 500M+ file on a weekly basis and would love to know HOW to use this.
6469_streaming_file_upload.diff should apply cleanly against trunk (hasn't been that long). Just tested it (patched against 6469 trunk) and it appears to work fine. You don't need anything else if you just want the streaming upload functionality. Note that apparently there are some issues with FileField/ImageField in forms/newforms, but I have no direct experience of this.
Would be interested to hear how it works for you.
Faheem.
comment:196 by , 17 years ago
Isn't that just the way. I wrote this post, re-re-rechecked out the trunk and applied the patch. Works marvelously. Combined with a crontab for file conversion and everything is working wonderfully.
Thanks!
Replying to Faheem Mitha <faheem@email.unc.edu>:
Replying to anonymous:
Okay, so if I want to test this with a clean trunk, which of the above do I use? Trying to > figure out if I need the middleware (if so which one), or just 6469 or what? I am needed to > upload a 500M+ file on a weekly basis and would love to know HOW to use this.
6469_streaming_file_upload.diff should apply cleanly against trunk (hasn't been that long). Just tested it (patched against 6469 trunk) and it appears to work fine. You don't need anything else if you just want the streaming upload functionality. Note that apparently there are some issues with FileField/ImageField in forms/newforms, but I have no direct experience of this.
Would be interested to hear how it works for you.
Faheem.
comment:197 by , 17 years ago
I noticed a problem with an older patch that seems to be still in 6469. Uploading a FORM with both a file and text INPUT and the input text containing umlauts or other non-ascii characters, you get an exception. I fixed it by replacing line 272 of multipartparser.py with self._post.appendlist(self._fieldname, unicode(self._field.getvalue(), "utf-8"))
.
by , 17 years ago
Attachment: | 6525_all_tests_pass.diff added |
---|
Fixed unicode in POST issues, added django.http.utils, moved str_to_unicode there
by , 17 years ago
Attachment: | 6603_all_tests_pass_uploadedfile_wrapper_fixed.diff added |
---|
Fixed a problem with UploadedFile wrapper and making sure content is not read in Fie/ImageField
comment:199 by , 17 years ago
Needs tests: | set |
---|
Everything seems to work now, just needs some form tests.
comment:200 by , 17 years ago
I've installed the latest trunk, applied this patch (6603_all_tests_pass_uploadedfile_wrapper_fixed.diff) and discovered two problems. Under mod_python 3.3.1 uploading 12Mb image results in apache consuming all available memory and dying. This is probably a mod_python error as it doesn't occur when using development server or mod_wsgi.
When I'm trying to upload 600Mb file under mod_wsgi, though, I get another problem:
Traceback (most recent call last): File "/var/lib/python-support/python2.5/django/core/handlers/base.py" in _real_get_response 81. response = callback(request, *callback_args, **callback_kwargs) File "/var/lib/python-support/python2.5/django/contrib/admin/views/decorators.py" in _checklogin 55. return view_func(request, *args, **kwargs) File "/var/lib/python-support/python2.5/django/views/decorators/cache.py" in _wrapped_view_func 39. response = view_func(request, *args, **kwargs) File "/var/lib/python-support/python2.5/django/contrib/admin/views/main.py" in change_stage 332. errors = manipulator.get_validation_errors(new_data) File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in get_validation_errors 61. errors.update(field.get_validation_errors(new_data)) File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in get_validation_errors 378. self.run_validator(new_data, validator) File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in run_validator 368. validator(new_data.get(self.field_name, ''), new_data) File "/var/lib/python-support/python2.5/django/oldforms/__init__.py" in isValidImage 713. validators.isValidImage(field_data, all_data) File "/var/lib/python-support/python2.5/django/core/validators.py" in isValidImage 180. content = field_data['content'] KeyError at /admin/account/pokerroom/5/ 'content'
Seems that patch is already outdated. Where can I get a newer version?
Eugene
comment:201 by , 17 years ago
Cc: | added |
---|
comment:202 by , 17 years ago
BTW, it seems that problem was caused by insufficient space on /tmp partition I'm using for uploads. Here's additional info:
POST Variable Value _file_upload_error u'Failed to write to temporary file.'
It's a bug anyways.
by , 17 years ago
Attachment: | 6652_isValidImage_using_tmpfilename.diff added |
---|
made the isValidImage validator try the tempfile before reading content
comment:203 by , 17 years ago
Hi Oyvind,
Just wondering what problem your most recent patch fixes. I tried using newforms and the patch breaks with it. (See traceback below), Does your patch fix this problem? I've heard of some problems related to newforms. Is this what it is?
Thanks, Faheem.
*
from models import FileUpload, FolderUpload
from django.newforms import form_for_model
FileUploadForm = form_for_model(FileUpload)
FolderUploadForm = form_for_model(FolderUpload)
*
*
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/core/handlers/base.py" in _real_get_response
- response = callback(request, *callback_args, callback_kwargs)
File "/usr/local/lib/python2.4/site-packages/django/contrib/auth/decorators.py" in _checklogin
- return view_func(request, *args, kwargs)
File "/usr/local/lib/python2.4/site-packages/bixfile_dev/views.py" in file_upload
- success = form.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py" in save
- return save_instance(self, model(), fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py" in save_instance
- f.save_form_data(instance, cleaned_data[f.name])
File "/usr/local/lib/python2.4/site-packages/django/db/models/fields/init.py" in save_form_data
- getattr(instance, "save_%s_file" % self.name)(data.filename, data.content, save=False)
File "/usr/local/lib/python2.4/site-packages/django/db/models/fields/init.py" in
- setattr(cls, 'save_%s_file' % self.name, lambda instance, filename, raw_field, save=True: instance._save_FIELD_file(self,
filename, raw_field, save))
File "/usr/local/lib/python2.4/site-packages/django/db/models/base.py" in _save_FIELD_file
- if raw_field.has_key('tmpfilename'):
AttributeError at /proto_dev/bixfile_dev/
'str' object has no attribute 'has_key'
comment:204 by , 17 years ago
Just changed the oldforms image validator to use the tempfile. The problem you have was fixed in the 6603 patch.
comment:205 by , 17 years ago
Hi oyvind,
I'm getting some test breakage with your patch.
Ran 211 tests in 90.369s
FAILED (failures=3)
The most common error is:
AttributeError: 'UploadedFile' object has no attribute 'content'.
by , 17 years ago
Attachment: | 6654_fixed_tests_and_file_clean.diff added |
---|
diff did not apply cleanly, fixed
comment:207 by , 17 years ago
Django patched with the latest streaming patch can now be checked out using mercurial.
hg clone http://streaming.cylon.no/streaming django_src
comment:208 by , 17 years ago
Cc: | added |
---|
comment:211 by , 17 years ago
Cc: | removed |
---|
comment:212 by , 17 years ago
Cc: | added |
---|
comment:213 by , 17 years ago
Cc: | added |
---|
follow-up: 215 comment:214 by , 17 years ago
Could someone post updated directions to get the latest incarnation of this patch working? I'm using Oyvind's mercurial source, but I'm not sure if I still need to modify the middleware, change my models at all, add settings to settings.py, etc.
Also, is the progress meter still included in this patch, and what do I need to see it in the Admin?
There hasn't been much activity here lately -- is this thing close to making it into the trunk and getting some proper documentation?
comment:215 by , 17 years ago
Replying to Joshua Works:
Could someone post updated directions to get the latest incarnation of this patch working? I'm using Oyvind's mercurial source, but I'm not sure if I still need to modify the middleware, change my models at all, add settings to settings.py, etc.
It works out of the box for me, at least it does for the last version I used it for, which corresponds to Oyvinds last patch. However, apparently this patch does not apply cleanly to current trunk. Oyvind asked me to do a merge, but I've been busy with other things. I'll do my best to get to this in the next day or two.
Also, is the progress meter still included in this patch, and what do I need to see it in the Admin?
No, that is in another patch http://code.djangoproject.com/ticket/4165, and is buggy.
See my comments there. Apart from a backend problem with the Python code, the JS code has
problems too. It should be easy to rewrite for someone who knows JS, though. I may end up doing it myself if nobody else does, but I'll probably use a library like PrototypeJS or Jquery.
There hasn't been much activity here lately -- is this thing close to making it into the trunk and getting some proper documentation?
I think it will need a bunch of people to give it a push. I've been intending to try and coordinate something, but I've been busy with other things.
If you don't hear anything, feel free to ping me via private email. I'm quite keen that the patch continue working till it is merged into trunk. I need it for some stuff myself.
Faheem.
comment:216 by , 17 years ago
It absolutely doesn't need "a bunch of people to give it a push". We know this ticket is here and at some point one of the maintainers will sit down and review it again to see if all the problems from the last review have been fixed and no new ones introduced.
by , 17 years ago
Attachment: | streaming.6710.patch added |
---|
Updated streaming patch to trunk rev 6710.
comment:217 by , 17 years ago
Added streaming.6710.patch against trunk rev 6710. This passes all Django tests. I haven't tested it extensively, but it appears to work. Some quick notes on how I generated this patch in case anyone is interested. I used mercurial and kdiff3 for merging (OS Debian etch).
Note that the only file that required manual intervention was django/http/utils.py. This file could probably use a little cleanup.
Everything else was handled by kdiff3.
See Section 12.8 of the hgbook for a recipe for doing this http://hgbook.red-bean.com/hgbookch12.html#x16-28700012.8 and also this wiki page,
http://www.selenic.com/mercurial/wiki/index.cgi/MqMerge
Note that the method described in the wiki is currently less likely to give problems than the method in the hgbook (otherwise similar).
See this issue for further discussion, http://www.selenic.com/mercurial/bts/issue844
comment:218 by , 17 years ago
Successfully patched trunk rev 7018 with streaming.6710.patch. Works well.
What are the main issues blocking trunk integration? This functionality is of high importance to many, as one could argue that without this patch the current FileField and ImageField implementation is both buggy and exposes a security risk (denial of service by uploading large files as Apache is eating up memory and peaking CPU usage + large file upload plainly fails in the end without the patch).
Thus strong +1 for integration ASAP.
comment:219 by , 17 years ago
Cc: | added |
---|
comment:220 by , 17 years ago
I also urgently need that feature ... any chance to see this merged soon?
comment:221 by , 17 years ago
Please do not ask "when will this be merged" questions in the ticket tracker; all it does is generate noise. If you're genuinely interested in helping to get a specific patch merged, consider asking on the developers list about why it hasn't yet been committed or, better yet, searching the archive of that list to see if there's a reason given.
comment:222 by , 17 years ago
Cc: | added |
---|
comment:223 by , 17 years ago
streaming.7092.patch gives some errors. As usual, I don't understand
what is going on well enough to fix them. See below.
*
======================================================================
FAIL: Doctest: modeltests.model_forms.models.test.API_TESTS
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 2180, in runTest
raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for modeltests.model_forms.models.test.API_TESTS
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line unknown line number, in API_TESTS
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
type(f.cleaned_datafile)
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[198]>", line 1, in ?
type(f.cleaned_datafile)
AttributeError: 'TextFileForm' object has no attribute 'cleaned_data'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[199]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.file
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[200]>", line 1, in ?
instance.file
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f = TextFileForm(data={'description': u'Assistance'}, instance=instance)
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[201]>", line 1, in ?
f = TextFileForm(data={'description': u'Assistance'}, instance=instance)
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.cleaned_datafile
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[203]>", line 1, in ?
f.cleaned_datafile
AttributeError: 'TextFileForm' object has no attribute 'cleaned_data'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[204]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.file
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[205]>", line 1, in ?
instance.file
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
os.unlink(instance.get_file_filename())
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[206]>", line 1, in ?
os.unlink(instance.get_file_filename())
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f = TextFileForm(data={'description': u'Assistance'}, files={'file': {'filename': 'test2.txt', 'content': 'hello world'}}, instance=instance)
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[207]>", line 1, in ?
f = TextFileForm(data={'description': u'Assistance'}, files={'file': {'filename': 'test2.txt', 'content': 'hello world'}}, instance=instance)
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[209]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.file
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[210]>", line 1, in ?
instance.file
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.delete()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[211]>", line 1, in ?
instance.delete()
NameError: name 'instance' is not defined
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[219]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be changed because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.file
Expected:
u'.../test3.txt'
Got:
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
type(f.cleaned_dataimage)
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[226]>", line 1, in ?
type(f.cleaned_dataimage)
AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[227]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The ImageFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[228]>", line 1, in ?
instance.image
AttributeError: 'TextFile' object has no attribute 'image'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.cleaned_dataimage
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[231]>", line 1, in ?
f.cleaned_dataimage
AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[232]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[233]>", line 1, in ?
instance.image
AttributeError: 'TextFile' object has no attribute 'image'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
os.unlink(instance.get_image_filename())
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[234]>", line 1, in ?
os.unlink(instance.get_image_filename())
AttributeError: 'TextFile' object has no attribute 'get_image_filename'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[237]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[238]>", line 1, in ?
instance.image
AttributeError: 'TextFile' object has no attribute 'image'
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.delete()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[239]>", line 1, in ?
instance.delete()
File "/usr/local/lib/python2.4/site-packages/django/db/models/base.py", line 324, in delete
assert self._get_pk_val() is not None, "%s object can't be deleted because its %s attribute is set to None." % (self._meta.object_name, self._meta.pk.attname)
AssertionError: TextFile object can't be deleted because its id attribute is set to None.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[247]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The ImageFile could not be changed because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Expected:
u'.../test3.png'
Got:
======================================================================
FAIL: Doctest: regressiontests.forms.tests.test.fields_tests
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 2180, in runTest
raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for regressiontests.forms.tests.test.fields_tests
File "/usr/local/src/django/django-trunk.patched/tests/regressiontests/forms/tests.py", line unknown line number, in fields_tests
File "/usr/local/src/django/django-trunk.patched/tests/regressiontests/forms/tests.py", line ?, in regressiontests.forms.tests.test.fields_tests
Failed example:
type(f.clean({'filename': 'name', 'content': 'Some File Content'}, 'files/test4.pdf'))
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest regressiontests.forms.tests.test.fields_tests[278]>", line 1, in ?
type(f.clean({'filename': 'name', 'content': 'Some File Content'}, 'files/test4.pdf'))
File "/usr/local/lib/python2.4/site-packages/django/newforms/fields.py", line 453, in clean
raise ValidationError(self.error_messagesempty)
ValidationError: [u'The submitted file is empty.']
Ran 234 tests in 63.274s
FAILED (failures=2)
comment:224 by , 17 years ago
http://code.djangoproject.com/browser/django/trunk/tests/modeltests/model_forms/models.py#L739
Add content-length it the data here.
by , 17 years ago
Attachment: | streaming.7092.patch.partial_tests_fix added |
---|
Slightly modified version of streaming.7092.patch with more tests passing.
comment:225 by , 17 years ago
With the patch I just uploaded (basically the changes are liberal use of 'content-length' as suggested by Oyvind, the errors are now reduced
to
======================================================================
FAIL: Doctest: modeltests.model_forms.models.test.API_TESTS
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 2180, in runTest
raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for modeltests.model_forms.models.test.API_TESTS
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line unknown line number, in API_TESTS
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
type(f.cleaned_dataimage)
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[226]>", line 1, in ?
type(f.cleaned_dataimage)
AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[227]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The ImageFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[228]>", line 1, in ?
instance.image
AttributeError: 'TextFile' object has no attribute 'image'
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.cleaned_dataimage
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[231]>", line 1, in ?
f.cleaned_dataimage
AttributeError: 'ImageFileForm' object has no attribute 'cleaned_data'
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[232]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[233]>", line 1, in ?
instance.image
AttributeError: 'TextFile' object has no attribute 'image'
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
os.unlink(instance.get_image_filename())
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[234]>", line 1, in ?
os.unlink(instance.get_image_filename())
AttributeError: 'TextFile' object has no attribute 'get_image_filename'
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
f.is_valid()
Expected:
True
Got:
False
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance = f.save()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[237]>", line 1, in ?
instance = f.save()
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 289, in save
return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
File "/usr/local/lib/python2.4/site-packages/django/newforms/models.py", line 35, in save_instance
raise ValueError("The %s could not be %s because the data didn't"
ValueError: The TextFile could not be created because the data didn't validate.
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.image
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[238]>", line 1, in ?
instance.image
AttributeError: 'TextFile' object has no attribute 'image'
File "/usr/local/src/django/django-trunk.patched.new/tests/modeltests/model_forms/models.py", line ?, in modeltests.model_forms.models.test.API_TESTS
Failed example:
instance.delete()
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.4/site-packages/django/test/_doctest.py", line 1267, in run
compileflags, 1) in test.globs
File "<doctest modeltests.model_forms.models.test.API_TESTS[239]>", line 1, in ?
instance.delete()
File "/usr/local/lib/python2.4/site-packages/django/db/models/base.py", line 324, in delete
assert self._get_pk_val() is not None, "%s object can't be deleted because its %s attribute is set to None." % (self._meta.object_name, self._meta.pk.attname)
AssertionError: TextFile object can't be deleted because its id attribute is set to None.
Ran 234 tests in 63.014s
FAILED (failures=1)
comment:226 by , 17 years ago
content-length is in the wrong dict in ImageField tests and probably some others too.
>>> f = ImageFileForm(data={'description': u'An image'}, files={'image': {'filename': 'test.png', 'content': image_data}, 'content-length':len(image_data)})
should be
>>> f = ImageFileForm(data={'description': u'An image'}, files={'image': {'filename': 'test.png', 'content': image_data, 'content-length':len(image_data)}})
comment:227 by , 17 years ago
Cc: | added |
---|
comment:228 by , 17 years ago
Cc: | added |
---|
comment:229 by , 17 years ago
Cc: | added |
---|
comment:230 by , 17 years ago
I've been rewriting the multipart parsing to be lazier and found something that seems to me like a DDOS exploit in this code.
Consider line 231-233 in streaming.7106.patch. What happens here is:
- A multipart boundary has been seen recently, so we are in the HEADER state
- We are looking to match the end of of the header
- If there's not enough data we simply break and add more bytes to our unmatched bytes from the network and try matching again later
Is there any reason why a "bad" client could not send a multipart request that *never* properly sends the header termination bytes and thus forces the multipart parser to load as much of the stream as possible (up to Content-Length bytes) into memory? As the amount of bytes increases towards the maximum Content-Length the call to .find() itself will become incredibly expensive. A few malformed requests might also cause the server to swap into oblivion or run out of address space on 32 bit systems.
comment:231 by , 17 years ago
Bad wording in the prior comment.
First sentence should be read as follows:
This patch appears to introduce a DOS attack vector into Django.
comment:232 by , 17 years ago
I've tested this with:
- Lighttpd 1.4 with fastcgi/scgi: doesn't work. As somebody else said above it looks like it isn't streaming the upload at all, or maybe lighttpd waits for the POST to be over before streaming it to the fcgi app.
- Apache 2.2 with scgi: works fine so far.
comment:233 by , 17 years ago
Component: | Core framework → HTTP handling |
---|---|
Has patch: | unset |
Needs tests: | unset |
Patch needs improvement: | set |
I've tested streaming.7106.patch. It works for the upload file with 860MB. However I tested it with 1.5G. It fails. The message is below.
Request Method: POST Request URL: http://192.168.1.4/imagecity/upload/ Exception Type: MemoryError Exception Value: out of memory Exception Location: /usr/lib/python2.5/site-packages/django/http/multipartparser.py in _read, line 285 /usr/lib/python2.5/site-packages/django/http/multipartparser.py in _read 285. self._file.write(data[start:end]) ...
comment:234 by , 17 years ago
save_FOO_file() works wrong with streaming.7106.patch
If raises
Traceback (most recent call last): ... File "/usr/lib/python2.5/site-packages/django/db/models/fields/__init__.py", line 783, in <lambda> setattr(cls, 'save_%s_file' % self.name, lambda instance, filename, raw_field, save=True: instance._save_FIELD_file(self, filename, raw_field, save)) File "/usr/lib/python2.5/site-packages/django/db/models/base.py", line 414, in _save_FIELD_file if raw_field.has_key('tmpfilename'): AttributeError: 'str' object has no attribute 'has_key'
comment:235 by , 17 years ago
I worked on a lazier (i.e. you can read from the stream without loading the entire thing onto disk) and hopefully better patch for this, in case anyone is getting antsy. I haven't been able to return to it for a week or two, but all the annoying parts are pretty much done.
It does, however, gut the current implementation. At that time I also tested mod_wsgi2's buffering behavior and found that it does not do aggressive buffering (a good thing), so sites running behind mod_wsgi2 can do flow control on clients. This was done by augmenting the current patch with .sleep() calls between blocks and then posting a large file.
This is particularly important if you have to send the material to another server and would prefer (or require) that it not be held on disk in full, if the stream is of indeterminate length, or if you wanted to hold the request in full to begin with (for example: filtering a dataset as it goes by and populating the database with a structured representation)
comment:236 by , 17 years ago
Status: | new → assigned |
---|
Alright I'm going to be working on this ticket this week. Before I do any work, I'm uploading the current diff updated to work against HEAD largely untouched. After reviewing the comments, before we can consider this ready for checking or close to that, I think that the following is needed:
- Updating this to work with the new-ish FileField backends.
- Pull the MultiPartParser logic apart from the actual dealing with the chunked data. (I.e. a user should be able to do whatever they want with the data.)
- Make the MultiPartParser lazier (i.e. not look for boundaries, but let them come when available).
- Go through the bugs that people have reported (thanks for doing so!) and ensuring that they no longer exist.
-Mike
Should there be a setting to enable/disable this?
Needs testing on python 2.3.