Opened 6 years ago

Closed 6 years ago

Last modified 3 years ago

#10300 closed (fixed)

Custom File Storage Backend broken by recent SVN commit.

Reported by: erikcw Owned by: mitsuhiko
Component: File uploads/storage Version: master
Severity: Keywords: file upload storage s3
Cc: erik@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

I just did an svn update to the latest trunk (from r9700), and now my file upload backend storage isn't behaving correctly. A 6MB file is being uploaded to my backend storage as a 1 byte file. When I revert by to r9700, the code behaves as intended.

My backend is a slightly modified (added encryption) version of the S3 backend from http://code.welldev.org/django-storages/wiki/Home

If any other info would be helpful in tracking this down, let me know and I'll post it.

Attachments (2)

S3Storage.py (6.1 KB) - added by erikcw 6 years ago.
Modified S3Storage.py to include Encrypt content.
10300.diff (543 bytes) - added by kmtracey 6 years ago.

Download all attachments as: .zip

Change History (12)

comment:1 Changed 6 years ago by mtredinnick

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Well, the next step is to work out which revision caused the problem. Since r9700 works and r9800+ doesn't work, use binary searching to find the version that breaks things. It will be at most seven checkouts and tests (and probably less, since not all those commits are to the same branch).

comment:2 Changed 6 years ago by erikcw

Ok, found the revision. Everything works great up to r9765. When I "svn up -r 9766" the problem begins.

comment:3 Changed 6 years ago by kmtracey

  • Triage Stage changed from Unreviewed to Accepted

I looked at this a little, though I cannot recreate since I do not have access to S3 (let alone the modified backend...the modifications may be relevant here). However, taking the S3 backend you point to and stubbing it out to not actually attempt to call S3 services but just log what it's called to do, I see a difference pre- and post- r9766 involving calls to the backend's size method. Specifically, prior to r9766 the S3Storage size method is not called when a file is uploaded (via admim). Running with r9766, however, it is called. Ultimately the call comes from the len(content) call in the FieldFile save method in django/db/models/fields/files.py:

    def save(self, name, content, save=True):
        name = self.field.generate_filename(self.instance, name)
        self._name = self.storage.save(name, content)
        setattr(self.instance, self.field.name, self.name)

        # Update the filesize cache
        print 'Prior to updating filesize cache: content.__class__.__bases__ is: %s' % str(content.__class__.__bases__)
        self._size = len(content)

        # Save the object because it has changed, unless save is False
        if save:
            self.instance.save()
    save.alters_data = True

Though this method itself did not change, that print (added by me for debug) shows that the bases for content has changed. Prior to r9766 it prints:

Prior to updating filesize cache: content.__class__.__bases__ is: (<class 'django.core.files.uploadedfile.UploadedFile'>,)

Post-r9766 it prints:

Prior to updating filesize cache: content.__class__.__bases__ is: (<class 'django.core.files.uploadedfile.InMemoryUploadedFile'>, <class django.db.models.fields.files.FieldFile'>)

The effect of this difference is that instead of django/core/files/base.py File's _get_size:

    def _get_size(self):
        if not hasattr(self, '_size'):
            if hasattr(self.file, 'size'):
                self._size = self.file.size
            elif os.path.exists(self.file.name):
                self._size = os.path.getsize(self.file.name)
            else:
                raise AttributeError("Unable to determine the file's size.")
        return self._size

    def _set_size(self, size):
        self._size = size

    size = property(_get_size, _set_size)

being called to determine len(content) (with _size having already been set previously at a point in the code I didn't track down), django/db/models/fields/files.py FieldFile's _get_size:

    def _get_size(self):
        self._require_file()
        return self.storage.size(self.name)
    size = property(_get_size)

is called instead.

Now the _save method in the S3 storage backend you point to does not rely on len(content) when it saves the data, but if your modified version does then it likely has a problem. If I put some prints into _save:

        print 'in _save: type(content) = %s, content.__class__.__bases__ is: %s' % (str(type(content)), str(content.__class__.__bases__))
        print 'in _save: len(content): %d' % len(content)

Prior to r9766 I get:

in _save: type(content) = <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>, content.__class__.__bases__ is: (<class 'django.core.files.uploadedfile.UploadedFile'>,)
in _save: len(content): 818

Post r9766 I get:

in _save: type(content) = <class 'django.db.models.fields.files.InMemoryUploadedFile'>, content.__class__.__bases__ is: (<class 'django.core.files.uploadedfile.InMemoryUploadedFile'>, <class 'django.db.models.fields.files.FieldFile'>

and my stubbed backend's size raises an exception because it is called to return the length of a file it hasn't saved yet.

So it seems there is a fairly non-obvious side-effect of r9766 involving len(content) for the content passed into the backend _save:

  • Prior to r9766 this would resolve to django/core/files/base.py File's _get_size method
  • Post-r9766 the backend's own size method will get called, which isn't likely to work since _save itself hasn't had a chance to save the file yet.

But I don't actually know for sure that this is the cause of the problem reported here, since I don't know that the modified backend relies on len(content) in its _save method. It does seem to be a problem we need to fix, though, since it seems valid for a backend _save to call len on the passed content, and it used to work...now it will find itself called to answer the len() question, which likely isn't going to work.

This is likely related to #10249...that one is reporting inability to determine a method resolution order while this one seems to be resulting from the fact that the method resolution order has changed.

Feedback from someone with a better understanding of r9766 would be helpful here.

Changed 6 years ago by erikcw

Modified S3Storage.py to include Encrypt content.

comment:4 follow-up: Changed 6 years ago by erikcw

I've attached a copy of the Modified storage backend (S3Storage.py) in case it will help in tracking this issue down.

comment:5 Changed 6 years ago by kmtracey

  • milestone set to 1.1

Changed 6 years ago by kmtracey

comment:6 in reply to: ↑ 4 Changed 6 years ago by kmtracey

Replying to erikcw:

I've attached a copy of the Modified storage backend (S3Storage.py) in case it will help in tracking this issue down.

Yes, I think it helps. Your S3Storage _put_file function looks like this:

    def _put_file(self, name, content):
        if self.encrypt == True:

            # Create a key object
            k = ezPyCrypto.key()

            # Read in a public key
            fd = open(settings.CRYPTO_KEYS_PUB, "rb")
            pubkey = fd.read()
            fd.close()

            # import this public key
            k.importKey(pubkey)

            # Now encrypt some text against this public key
            content = k.encString(content)

            # ...snip remainder...

looking at the ezPyCrypto code, that encString(content) call is going to result in it doing a len(content) to get the length of the data to encrypt. Given what I detailed above, that len call is going to result in the storage backend being called to report the length of something that hasn't been written to the backend yet.

I've attached a patch that may fix the issue. It changes the FieldFile's _get_size method so that the storage backend is called to supply the length only if _committed is true. If not, super is used to fallback to File's size property, which will be the size of the uploaded file. That may be all that's needed to fix this -- could you give it a try an let us know?

Review from someone more familiar with this code would be good. I'm still vaguely worried about he side-effects on method resolution order introduced by r9766, but that could be due to my relative unfamiliarity with the code here.

comment:7 Changed 6 years ago by erikcw

Thanks for the patch kmtracey! I applied it to my svn checkout and it seems to have fixed the problem. I'm going to keep an eye on it for a few days to make sure everything is behaving correctly.

Like you said, there could still be some other side-effects from r9766 -- hopefully the maintainer of this area of the branch will take a look and give the final sign-off for this fix.

I'll report if I notice any oddities...

comment:8 Changed 6 years ago by mitsuhiko

  • Owner changed from nobody to mitsuhiko

comment:9 Changed 6 years ago by jacob

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in [10717].

comment:10 Changed 3 years ago by jacob

  • milestone 1.1 deleted

Milestone 1.1 deleted

Note: See TracTickets for help on using tickets.
Back to Top