Opened 7 years ago

Closed 7 years ago

#8333 closed (duplicate)

FileField upload of large files looping and failing

Reported by: tapautler Owned by: nobody
Component: File uploads/storage Version: master
Severity: Keywords: filefield upload looping
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Tripped on this while trying to update photologue, testing a large zip file.
Then tried some of my own code and found that wasn't working with large files either.
Uploads work fine for 1 meg files
Tried 20meg files and it goes into an upload loop during the save.
Uploads the file, lets say "BIG_FILE", then "BIG_FILE_" and so on "BIG_FILE_..._" until it fails on a filename of 215+ characters.
All the files that are uploaded are complete and are not corrupt.
I'm running on Windows with Apache and mod_python.

MODEL

class UploadedDocument( models.Model ):
    file = models.FileField( upload_to = 'uploads/documents/')
    description = models.CharField( max_length=255, blank=True, help_text='Description of document.')
    uploaded_by = models.ForeignKey( User )

FORM

class DocumentField(forms.FileField): 
    def __init__(self, *args, **kwargs): 
        super(DocumentField, self).__init__(*args, **kwargs) 
        self.label = 'New File' 
        self.help_text = 'Select a doc, rtf, or pdf document'     

    def clean(self, data, UNKNOWN_FIELD): 
        f = super(DocumentField, self).clean(data)

        if data in (None, ''):
            " Required=False and no data... "
            return None
        file_types = ('application/msword', 'application/rtf', 'application/pdf') 
        file_exts = ('.doc', '.rtf', '.pdf') 
        if (not splitext(f.name)[1].lower() in file_exts) and (not data['content-type'] in file_types): 
            raise forms.ValidationError('Document types accepted: %s' % ' '.join(file_exts))
        return f

class UploadedDocumentForm(ModelForm):
    file = DocumentField(required=True, 
                         widget=forms.FileInput(attrs={'size':'80'}))
    description = forms.CharField(required=True,
                                  help_text=UploadedDocument._meta.get_field('description').help_text,
                                  widget=forms.Textarea(attrs={'style':'width:98%; height:3em;' }))
    
    class Meta:
        model = UploadedDocument
        fields = ( 'file', 'description')

VIEW (SAVE PIECE)

doc = UploadedDocument(uploaded_by = request.user)
doc.file.save(f.name, f, save=False)
doc.description = form.cleaned_data['description']
doc.save()

Change History (6)

comment:1 follow-up: Changed 7 years ago by julien

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to duplicate
  • Status changed from new to closed

This might well be a dupe of #8203. I gather from your report that the actual uploaded file is ok, and I assume that the problem you're having is with the temporary file, which is likely not cleaned up properly in Windows.
Reopen if you estimate that your issue is different than #8203, but please provide more details if that's the case.

comment:2 in reply to: ↑ 1 Changed 7 years ago by tapautler

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I dont think this is a duplicate. No temp file involved here.

When saving the object it uploads the file and then uploads it again and again and again, 200+ times until it fails. The file name of each file gets longer and longer as it appends _ to the new file's name. It appears that it would loop forever except that it eventually fails because the file name, with 200+ appended _'s exceeds the appearant OS filename limit of 215 characters.

So if I upload a file named "A" i will get 215 files uploaded before it fails.
If my file name is "0123456789" i would get 205 files uploaded before it fails.

Fyi, the db object never does get created.

It may be my configuration but its not just my code, latest svn of django-photologue is failing the same way when I try to upload to a large zip file.

comment:3 Changed 7 years ago by julien

Hmmm, that's an interesting issue... At first glance, the fact that it keeps sending the file repeatedly seems like it is an issue with the front-end, not with Django.
What is your front-end environment? Is there, for example, some javascript involved?

comment:4 Changed 7 years ago by julien

By the way, it is normal that it crashes when the filename reaches 215 characters as this is the allowed maximum in Windows.

comment:5 Changed 7 years ago by tapautler

Pulled the few scraps of js out of the form and retried. Still have the problem.
It occurs in both Firefox2 and IE7.

Looking at the code in storage.py

class FileSystemStorage(Storage):
    """
    Standard filesystem storage
    """

    def __init__(self, location=settings.MEDIA_ROOT, base_url=settings.MEDIA_URL):
        self.location = os.path.abspath(location)
        self.base_url = base_url

    def _open(self, name, mode='rb'):
        return File(open(self.path(name), mode))

    def _save(self, name, content):
        full_path = self.path(name)

        directory = os.path.dirname(full_path)
        if not os.path.exists(directory):
            os.makedirs(directory)
        elif not os.path.isdir(directory):
            raise IOError("%s exists and is not a directory." % directory)

        # There's a potential race condition between get_available_name and
        # saving the file; it's possible that two threads might return the
        # same name, at which point all sorts of fun happens. So we need to
        # try to create the file, but if it already exists we have to go back
        # to get_available_name() and try again.

        while True:
            try:
                # This file has a file path that we can move.
                if hasattr(content, 'temporary_file_path'):
                    file_move_safe(content.temporary_file_path(), full_path)
                    content.close()

                # This is a normal uploadedfile that we can stream.
                else:
                    # This fun binary flag incantation makes os.open throw an
                    # OSError if the file already exists before we open it.
                    fd = os.open(full_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, 'O_BINARY', 0))
                    try:
                        locks.lock(fd, locks.LOCK_EX)
                        for chunk in content.chunks():
                            os.write(fd, chunk)
                    finally:
                        locks.unlock(fd)
                        os.close(fd)
            except OSError:
                # Ooops, we need a new file name.
                name = self.get_available_name(name)
                full_path = self.path(name)
            else:
                # OK, the file save worked. Break out of the loop.
                break
                
        return name

I would say, that based on what I'm observing, I'm stuck in the "while True" loop. Since my file is uploaded with a new name over and over (whether I use a temp directory or not) then I must be getting an OSerror on the Content.close() and on the locks.unlock(fd) or os.close(fd).

comment:6 Changed 7 years ago by tapautler

  • Resolution set to duplicate
  • Status changed from reopened to closed

Mistake on previous assumption. I'm always going through file_move_safe(), regardless of whether or not I specify FILE_UPLOAD_TEMP_DIR (its just using C:\Windows\Temp if I don't). So, it's probably the same issue as #8203, just presenting in a different way. The "move" is working but I will bet I'm getting an OSerror on the attempt to unlock/close the temp file... so I do it all again and again. I'll stop testing on Windows and move over to a real OS 'till the patch goes in lol.

Note: See TracTickets for help on using tickets.
Back to Top