Opened 7 years ago

Closed 3 years ago

#27590 closed New feature (fixed)

Allow configuration of where to save staticfiles manifest.

Reported by: David Sanders Owned by: Jarosław Wygoda
Component: contrib.staticfiles Version: dev
Severity: Normal Keywords:
Cc: Kevin Christopher Henry, Carsten Fuchs Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

A standard Django deploy has all staticfiles accessible to all users. This is understandable, if undesirable. By itself this is not a huge problem since those on the public Internet don't know the filenames of all of the files a deployment has, and fuskering the entire possible namespace isn't feasible and is also detectable.

However, deployments that make use of ManifestStaticFilesStorage will most likely expose a master list of all static files to anyone who wants to look. It's not a huge security risk because you shouldn't be depending on security through obscurity, but there's certainly a leg up given when there's a master list of all files. Due to the way ManifestStaticFilesStorage is setup, the manifest ends up in the directory of publicly served files. If the files are stored locally this can be fixed by blacklisting the file from webserver access and only letting Django itself read the file off the local filesystem. This is the approach I've taken once I discovered the issue - I have a server deployment running Apache serving files on the local filesystem, but have CloudFront in front of that which fetches from Apache if the cache misses. I've since blacklisted the staticfiles manifest and invalidated any cached copies in CloudFront.

Here's what I consider the risks of having a publicly exposed staticfiles manifest:

  • Easily find trade secrets in JavaScript files meant to be used only internally by staff users
  • Find hardcoded secrets in internal files - anything in the static tree gets listed here, even pre-processed files like coffee or less if the developers use django-compressor
  • Find potential attack vectors by finding normally unlisted files that are exploitable which could be used to form URLs in phishing emails
  • Possible novel way to fingerprint Django versions using the easy master list of files, could be used to quickly identify potentially vulnerable Django servers

All that said, I don't have a great solution to the problem that Django itself could implement. Currently Django writes the manifest to the staticfiles root so it's always going to be readable unless you take extra steps. The real stickler is deployments that use something like S3BotoStorage which in effect needs Django to be able to access the manifest remotely. My understanding of that setup (I don't use it) would be that on load Django is going to read the manifest from S3, so it needs to be accessible over the web by default. Further steps could be taken to make it only accessible to Django itself, but that requires user action.

Potential solutions:

  • Encrypt the manifest on disk, decrypt on load into memory - loses human readability for debugging purposes but hides it from prying eyes by default
  • Fast-track ticket #26029 to make staticfiles storage configuration allow passing options to storage - use options to change manifest path somewhere non-public or configure a secret header to use with S3 to only give Django access to the file.

On a related note, this discovery has made me extra paranoid about the exposure of internal files meant for staff only and now I'm looking at a way to formalize restricted access to the files. With the exposure of the staticfiles manifest it's clear much of the business logic we use (in JavaScript under admin) is by default visible to the Web if you know the URL.

Change History (18)

comment:1 by Tim Graham, 7 years ago

Triage Stage: UnreviewedAccepted

Florian says,

I think we should just bite the dust and add a setting which the static manifest gets written to (directly, without involving storage). Moving this file out of STATIC_ROOT is the only sensible solution I see. S3BotoStorage should redirect the manifest to a different (secure) bucket if needed. Maybe we can make it easier to detect that this file is "sensitive".

comment:2 by Kevin Christopher Henry, 6 years ago

At #28764 I reached a similar conclusion for very different reasons.

I don't see much point in making the location configurable, though. It seems to me that the file should just be stored to some standard location within the codebase. The obvious analogy here would be to makemigrations, which does the same thing. As I argue in the other ticket, this is a configuration file that affects Django's behavior, is tied to a specific commit, and has nothing to do with the actual serving of static files. Storing it in the codebase would solve David's issues above, would solve the correctness and performance issues I mentioned, and would do so for all users without any new settings. Are there advantages to storing the file in an external location that I've overlooked?

comment:3 by Kevin Christopher Henry, 6 years ago

Cc: Kevin Christopher Henry added

comment:4 by Jarosław Wygoda, 4 years ago

Owner: changed from nobody to Jarosław Wygoda
Status: newassigned

I've created a pr adding STATICFILES_MANIFEST setting allowing storing manifest file with code.
https://github.com/django/django/pull/12187

comment:5 by Carsten Fuchs, 4 years ago

Cc: Carsten Fuchs added

comment:6 by Carsten Fuchs, 4 years ago

Maybe the staticfiles.json should also be "pretty printed" as a part of this change?
After this change, many users will likely keep this file with the project code and thus under version control. Having a well defined indentation and key order would keep the diffs small and comprehensible.

in reply to:  4 comment:7 by Carsten Fuchs, 4 years ago

Replying to Jarosław Wygoda:

I've created a pr adding STATICFILES_MANIFEST setting allowing storing manifest file with code.
https://github.com/django/django/pull/12187

Will you set the "has patch" flag for this ticket?

comment:8 by Mariusz Felisiak, 3 years ago

Has patch: set

comment:9 by Carlton Gibson, 3 years ago

Needs documentation: set
Patch needs improvement: set
Summary: Prevent public access of staticfiles manifestAllow configuration of where to save staticfiles manifest.

I think we need to look at where we're headed before we simply add a setting here. If we consider the upgrade path, what we don't want to do is add a new setting and then need to deprecate it as soon as we improve the configuration options, with something along the lines of #26029.

#26029 is needed because you currently need to provide a storage subclass in order to provide configuration parameters:

# In settings.py, where `CustomStorage` adjusts configuration in `__init__()`.
STATICFILES_STORAGE = 'my_app.CustomStorage'

The complaint in #26029 is that providing the subclass is cumbersome, and we should be able to specify a dictionary like DATABASES and so on.
But we should solve that problem over there.

The issue here is that ManifestStaticFilesStorage has no way of specifying where to store the manifest.

There's a patch file for #27541 (which I'm going to close as a duplicate of this ticket) that suggests adding an optional manifest_storage kwarg to ManifestFilesMixin — here you'd pass a configured storage to override where to store the manifest file:

# Something like... 
class CustomManifestStorage(ManifestStaticFilesStorage):
    def __init__(self, *args, **kwargs):
        manifest_storage = FileSystemStorage(
            location='path/for/manifest'
        )    
        super().__init__(*args, manifest_storage=manifest_storage, **kwargs)

Using a storage, rather than just giving a path, covers the separate s3 bucket use-case (and similar) but also avoids the cumbersome ManifestHandler wrapper from the initial PR here.

I think with that, and tests, and documentation of the new parameter, with an example of how to configure it, we'd have a fix here. (That the configuration is a little cumbersome would hopefully spur renewed interest in pushing forward with #26029.)

Last edited 3 years ago by Carlton Gibson (previous) (diff)

comment:10 by Kevin Christopher Henry, 3 years ago

I argued in #28764 and comment:2 that the location of staticfiles.json shouldn't be configurable at all, that it should just be stored in the code base along with other configuration files (including automatically generated ones, like migration files). What are the use cases for storing this file anywhere else?

comment:11 by Jarosław Wygoda, 3 years ago

Making the location of staticfiles.json configurable allows us to maintain a backward compatibility by setting the default manifest path to STATIC_ROOT. If we don't want to maintain a backward compatibility then I can make manifest location not configurable and get rid of ManifestHandler. Manifest will be created in the BASE_DIR instead and STATICFILES_MANIFEST setting won't be required anymore.

Last edited 3 years ago by Jarosław Wygoda (previous) (diff)

comment:12 by Kevin Christopher Henry, 3 years ago

If we take a step back, the problem here isn't that there's no way to configure the manifest location, it's that the standard location for it is bad. That's the reason people want to change it. So rather than making it possible for people to avoid the problem by introducing a new customization mechanism, we should just fix the standard value so that no one has the problem in the first place.

If we accept that the default manifest location needs to change at some point then there will inevitably be a backwards incompatibility.

One way to address that would be to just document in the release notes that people using ManifestStaticFilesStorage need to run collectstatic after upgrading. If they don't do that they will immediately see a helpful error message at startup since the manifest won't be found, and they can fix the problem then. Still, perhaps that's overly burdensome, I'm not sure.

Another approach would be to have a deprecation cycle where STATIC_ROOT continues to be checked if the manifest isn't found at the designated location in the code base, showing a deprecation warning in that case.

comment:13 by David Sanders, 3 years ago

Kevin, I think you're neglecting how collectstatic is used, and has been historically used. Changing the default to storing the manifest in the codebase would be a breaking change for some users. The ManifestFilesMixin hashes files based on content, and the content is going to depend on people's configurations, and the content will likely differ depending on the settings used between development and production. Minifying JS for production is a common use-case, but not for development. If the manifest by default lives in the codebase, it would both need to be re-generated after any changes (change in development process), and would end up with the output for the development environment. Django would probably need to give a large caveat to ignore the file for source control otherwise users will shoot themselves in the foot quite often.

The issue I originally brought up with S3BotoStorage is that ManifestFilesMixin is a mixin, so adding that to a storage class like S3BotoStorage means it will read and write the manifest from that storage - in that case from S3. Changing the behavior so that the manifest doesn't use that storage class could be unintuitive for users. It will have particular ramifications to how and when they call collectstatic for their deployments - when using a class like S3BotoStorage that sends the files to remote storage, they may not be currently invoking collectstatic at a point where it can write out the manifest into the codebase and have that change be persistent and make it into the deployment of the code to production as well. For example, if a Docker image is used to deploy to multiple instances, currently someone with that setup would probably run collectstatic once from a single instance during a deploy - and at that point the manifest would have to be re-baked into the image or it wouldn't exist on other instances, etc. For example, AWS Elastic Beanstalk guides show configuring such that migrate is run at deploy time in the containers, and they use leader_only so only one container runs it. Someone using S3 for storing the staticfiles might logically have a setup on Elastic Beanstalk where they'd use leader_only with collectstatic so that N containers don't try to send the files off to S3. So there'd be a lot of assumptions about how people are currently using it.

I haven't looked at this issue in 4 years, so that's going from memory. But I don't believe there's a simple "change the default location" solution given how many ways collectstatic might be used - you're probably going to break it for somebody. If there were a clear win here I think I probably would have just made a PR instead of this issue. :-) I spent a decent amount of time thinking through the problem before punting it to this issue. I fear if you make it a hard-coded change to the default, even with a deprecation cycle you'll end up with people coming out of the woodwork on why they *have* to have it working the way it originally did, and if there's no configuration ability then you've got a problem there.

Last edited 3 years ago by David Sanders (previous) (diff)

comment:14 by Carlton Gibson, 3 years ago

The breaking change is important. We can't just bring that in.

Adding a setting might be OK — folks could set it to STATIC_ROOT to get the old value — BUT #26029 is about restructuring all this anyway: to add a setting only to need to deprecate it is too much churn. (Even assuming we'd take the hit on breaking people's deployments as they failed to make the changes in time.)

Adding the manifest_storage kwarg (taking a storage) is small and simple. That's why I think we should advance with it. Later, when #26029 comes along, we can review the exact API we want there — we can either configure manifest_storage directly in the settings, or allow a path there that will used internally — but that's a separate conversation (one we can defer).

comment:15 by Jacob Walls, 3 years ago

Type: Cleanup/optimizationNew feature

comment:16 by Jacob Walls, 3 years ago

Needs documentation: unset
Patch needs improvement: unset
Version: 1.10dev

comment:17 by Mariusz Felisiak, 3 years ago

Triage Stage: AcceptedReady for checkin

comment:18 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In d3c4696:

Fixed #27590 -- Allowed customizing a manifest file storage in ManifestFilesMixin.

Note: See TracTickets for help on using tickets.
Back to Top