Opened 15 months ago

Last modified 3 weeks ago

#30509 new Cleanup/optimization

Various FileResponse fixes and changes

Reported by: Piotr Kunicki Owned by: nobody
Component: HTTP handling Version: master
Severity: Normal Keywords: FileResponse
Cc: Carlton Gibson Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

As I've mentioned in PR 11011 (of ticket #30196) I noticed a couple of issues with the way FileResponse is created (and other minor things), and in this ticket I'll try to list and fix them.

For initial clarity, this PR consists of 2 commits. The first is a set of tests, most of which show invalid behavior in some quite specific cases based on current code. The second contains my initial proposed fixes.

  1. tests:
    1. test_file_from_buffer_absolute_name:
      test ends in an error, because set_headers() attempts to read the 'name' property if the filename (which can be overwritten by the custom filename) is recognized as an absolute path by os.path.
    2. test_file_from_buffer_explicit_type1, test_file_from_buffer_explicit_type2:
      of these 2, first test shows the way to explicitly set Content-Type header of the response, while the second shows that in a case that the set value matches the beginning of the default value ('text/html; charset=%s' % self.charset), it will consider that value as unset and proceed to read the content type from the file.
    3. test_file_from_named_buffer1, test_file_from_named_buffer2:
      tests show how you can easily 'fool' FileResponse into thinking it's dealing with an actual file, resulting in incorrect header values if such file exists (1) or an error if it doesn't (2).
    4. test_file_from_disk_custom_name, test_file_from_disk_as_attachment_custom_name:
      these are just 2 missing tests to check whether giving a file a custom name even works.
  1. fixes:
    1. ignored custom filename before setting Content-Length and changed the following condition a bit to fix 1.a. and the error case of 1.c.
    2. added a self._default_content_type flag to HttpResponseBase.__init__() to mark that there was no content_type explicitly set, so it should be figured out from the file, if possible, to fix 1.b.
      • added a way to tell the filelike object's size if it's neither a file nor has a getbuffer attribute, using .seek() and .tell() if they're available, and otherwise .read()
      • moved os.path.getsize path to be the worst-case-scenario, to avoid reading the same file from disk again when possible
      • that fixes the remaining problem found in 1.c..
    3. moved encoding_map to initialize only when it's necessary.
    4. changed set_headers() to _set_headers() to mark it as an internal function.
    5. simplified the - IMO - unnecessarily detailed docs of FileResponse to just tell the user what do the parameters cause, instead of (almost) explaining what each line of code does - if the user wants to know which headers get what kind of values under which conditions they can just check the code, Python is nearly pseudocode anyway.

One thing i'm not sure about is if FileResponse should support reading filelikes from a different position than the beginning. If so, getting Content-Length would require a tiny change to accomodate a non-zero starting point, while otherwise it could use a filelike.seek(0) somewhere to be sure we're reading it from the very beginning.

Change History (6)

comment:1 Changed 15 months ago by Carlton Gibson

Triage Stage: UnreviewedAccepted

Hi Piotr.

Thank you for the follow-up and the PR. The test cases look correct, so lets accept this.

For interest, how do some of these cases come up in the wild?

  1. When would you have a buffer passing an absolute value for filename?
  2. Equally, when would you have a buffer with an absolute name, either pointing to an existing file or not, that wasn't e.g. a named pipe itself?

My imagination is lacking slightly just now as to how you'd hit these issues. :)

comment:2 Changed 15 months ago by Carlton Gibson

Cc: Carlton Gibson added
Patch needs improvement: set

Left a few comments on the PR. Just ticking Patch needs improvement for now. Please uncheck when it's ready for another look.

comment:3 Changed 15 months ago by Piotr Kunicki

Patch needs improvement: unset
  1. Well, let's say you want to return a dynamically modified file, that you have an absolute path to - you open and edit it using Pillow or something similar, write result to a buffer, and then pass that buffer as the filelike, with the path as the filename. Sure, you could get the basename from the path and pass it instead, but... well now you don't have to.
  2. I'm not sure what you mean here, as the named pipe's name isn't an absolute path, but a file descriptor integer.
    However, regarding why would anyone pass an e.g. BytesIO buffer with a name set - i'm not sure (alternative way to pass the filename to the response?), but the point there was mostly to not assume that a name attribute means it's a file (unless it's the last option to get the Content-Length and such file actually exists?).

comment:4 Changed 6 months ago by Carlton Gibson

Patch needs improvement: set

comment:5 Changed 6 months ago by Piotr Kunicki

Patch needs improvement: unset

comment:6 Changed 3 weeks ago by Carlton Gibson

Patch needs improvement: set
Note: See TracTickets for help on using tickets.
Back to Top