Opened 8 years ago
Closed 3 years ago
#27590 closed New feature (fixed)
Allow configuration of where to save staticfiles manifest.
Reported by: | David Sanders | Owned by: | Jarosław Wygoda |
---|---|---|---|
Component: | contrib.staticfiles | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Kevin Christopher Henry, Carsten Fuchs | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
A standard Django deploy has all staticfiles accessible to all users. This is understandable, if undesirable. By itself this is not a huge problem since those on the public Internet don't know the filenames of all of the files a deployment has, and fuskering the entire possible namespace isn't feasible and is also detectable.
However, deployments that make use of ManifestStaticFilesStorage
will most likely expose a master list of all static files to anyone who wants to look. It's not a huge security risk because you shouldn't be depending on security through obscurity, but there's certainly a leg up given when there's a master list of all files. Due to the way ManifestStaticFilesStorage
is setup, the manifest ends up in the directory of publicly served files. If the files are stored locally this can be fixed by blacklisting the file from webserver access and only letting Django itself read the file off the local filesystem. This is the approach I've taken once I discovered the issue - I have a server deployment running Apache serving files on the local filesystem, but have CloudFront in front of that which fetches from Apache if the cache misses. I've since blacklisted the staticfiles manifest and invalidated any cached copies in CloudFront.
Here's what I consider the risks of having a publicly exposed staticfiles manifest:
- Easily find trade secrets in JavaScript files meant to be used only internally by staff users
- Find hardcoded secrets in internal files - anything in the static tree gets listed here, even pre-processed files like coffee or less if the developers use django-compressor
- Find potential attack vectors by finding normally unlisted files that are exploitable which could be used to form URLs in phishing emails
- Possible novel way to fingerprint Django versions using the easy master list of files, could be used to quickly identify potentially vulnerable Django servers
All that said, I don't have a great solution to the problem that Django itself could implement. Currently Django writes the manifest to the staticfiles root so it's always going to be readable unless you take extra steps. The real stickler is deployments that use something like S3BotoStorage which in effect needs Django to be able to access the manifest remotely. My understanding of that setup (I don't use it) would be that on load Django is going to read the manifest from S3, so it needs to be accessible over the web by default. Further steps could be taken to make it only accessible to Django itself, but that requires user action.
Potential solutions:
- Encrypt the manifest on disk, decrypt on load into memory - loses human readability for debugging purposes but hides it from prying eyes by default
- Fast-track ticket #26029 to make staticfiles storage configuration allow passing options to storage - use options to change manifest path somewhere non-public or configure a secret header to use with S3 to only give Django access to the file.
On a related note, this discovery has made me extra paranoid about the exposure of internal files meant for staff only and now I'm looking at a way to formalize restricted access to the files. With the exposure of the staticfiles manifest it's clear much of the business logic we use (in JavaScript under admin) is by default visible to the Web if you know the URL.
Change History (18)
comment:1 by , 8 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 7 years ago
At #28764 I reached a similar conclusion for very different reasons.
I don't see much point in making the location configurable, though. It seems to me that the file should just be stored to some standard location within the codebase. The obvious analogy here would be to makemigrations
, which does the same thing. As I argue in the other ticket, this is a configuration file that affects Django's behavior, is tied to a specific commit, and has nothing to do with the actual serving of static files. Storing it in the codebase would solve David's issues above, would solve the correctness and performance issues I mentioned, and would do so for all users without any new settings. Are there advantages to storing the file in an external location that I've overlooked?
comment:3 by , 7 years ago
Cc: | added |
---|
follow-up: 7 comment:4 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I've created a pr adding STATICFILES_MANIFEST setting allowing storing manifest file with code.
https://github.com/django/django/pull/12187
comment:5 by , 5 years ago
Cc: | added |
---|
comment:6 by , 5 years ago
Maybe the staticfiles.json
should also be "pretty printed" as a part of this change?
After this change, many users will likely keep this file with the project code and thus under version control. Having a well defined indentation and key order would keep the diffs small and comprehensible.
comment:7 by , 4 years ago
Replying to Jarosław Wygoda:
I've created a pr adding STATICFILES_MANIFEST setting allowing storing manifest file with code.
https://github.com/django/django/pull/12187
Will you set the "has patch" flag for this ticket?
comment:8 by , 4 years ago
Has patch: | set |
---|
comment:9 by , 4 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Summary: | Prevent public access of staticfiles manifest → Allow configuration of where to save staticfiles manifest. |
I think we need to look at where we're headed before we simply add a setting here. If we consider the upgrade path, what we don't want to do is add a new setting and then need to deprecate it as soon as we improve the configuration options, with something along the lines of #26029.
#26029 is needed because you currently need to provide a storage subclass in order to provide configuration parameters:
# In settings.py, where `CustomStorage` adjusts configuration in `__init__()`. STATICFILES_STORAGE = 'my_app.CustomStorage'
The complaint in #26029 is that providing the subclass is cumbersome, and we should be able to specify a dictionary like DATABASES
and so on.
But we should solve that problem over there.
The issue here is that ManifestStaticFilesStorage
has no way of specifying where to store the manifest.
There's a patch file for #27541 (which I'm going to close as a duplicate of this ticket) that suggests adding an optional manifest_storage
kwarg to ManifestFilesMixin
— here you'd pass a configured storage to override where to store the manifest file:
# Something like... class CustomManifestStorage(ManifestStaticFilesStorage): def __init__(self, *args, **kwargs): manifest_storage = FileSystemStorage( location='path/for/manifest' ) super().__init__(*args, manifest_storage=manifest_storage, **kwargs)
Using a storage, rather than just giving a path, covers the separate s3 bucket use-case (and similar) but also avoids the cumbersome ManifestHandler wrapper from the initial PR here.
I think with that, and tests, and documentation of the new parameter, with an example of how to configure it, we'd have a fix here. (That the configuration is a little cumbersome would hopefully spur renewed interest in pushing forward with #26029.)
comment:10 by , 4 years ago
I argued in #28764 and comment:2 that the location of staticfiles.json
shouldn't be configurable at all, that it should just be stored in the code base along with other configuration files (including automatically generated ones, like migration files). What are the use cases for storing this file anywhere else?
comment:11 by , 4 years ago
Making the location of staticfiles.json
configurable allows us to maintain a backward compatibility by setting the default manifest path to STATIC_ROOT
. If we don't want to maintain a backward compatibility then I can make manifest location not configurable and get rid of ManifestHandler
. Manifest will be created in the BASE_DIR
instead and STATICFILES_MANIFEST
setting won't be required anymore.
comment:12 by , 4 years ago
If we take a step back, the problem here isn't that there's no way to configure the manifest location, it's that the standard location for it is bad. That's the reason people want to change it. So rather than making it possible for people to avoid the problem by introducing a new customization mechanism, we should just fix the standard value so that no one has the problem in the first place.
If we accept that the default manifest location needs to change at some point then there will inevitably be a backwards incompatibility.
One way to address that would be to just document in the release notes that people using ManifestStaticFilesStorage
need to run collectstatic
after upgrading. If they don't do that they will immediately see a helpful error message at startup since the manifest won't be found, and they can fix the problem then. Still, perhaps that's overly burdensome, I'm not sure.
Another approach would be to have a deprecation cycle where STATIC_ROOT
continues to be checked if the manifest isn't found at the designated location in the code base, showing a deprecation warning in that case.
comment:13 by , 4 years ago
Kevin, I think you're neglecting how collectstatic
is used, and has been historically used. Changing the default to storing the manifest in the codebase would be a breaking change for some users. The ManifestFilesMixin
hashes files based on content, and the content is going to depend on people's configurations, and the content will likely differ depending on the settings used between development and production. Minifying JS for production is a common use-case, but not for development. If the manifest by default lives in the codebase, it would both need to be re-generated after any changes (change in development process), and would end up with the output for the development environment. Django would probably need to give a large caveat to ignore the file for source control otherwise users will shoot themselves in the foot quite often.
The issue I originally brought up with S3BotoStorage
is that ManifestFilesMixin
is a mixin, so adding that to a storage class like S3BotoStorage
means it will read and write the manifest from that storage - in that case from S3. Changing the behavior so that the manifest doesn't use that storage class could be unintuitive for users. It will have particular ramifications to how and when they call collectstatic
for their deployments - when using a class like S3BotoStorage
that sends the files to remote storage, they may not be currently invoking collectstatic
at a point where it can write out the manifest into the codebase and have that change be persistent and make it into the deployment of the code to production as well. For example, if a Docker image is used to deploy to multiple instances, currently someone with that setup would probably run collectstatic
once from a single instance during a deploy - and at that point the manifest would have to be re-baked into the image or it wouldn't exist on other instances, etc. For example, AWS Elastic Beanstalk guides show configuring such that migrate
is run at deploy time in the containers, and they use leader_only
so only one container runs it. Someone using S3 for storing the staticfiles might logically have a setup on Elastic Beanstalk where they'd use leader_only
with collectstatic
so that N containers don't try to send the files off to S3. So there'd be a lot of assumptions about how people are currently using it.
I haven't looked at this issue in 4 years, so that's going from memory. But I don't believe there's a simple "change the default location" solution given how many ways collectstatic
might be used - you're probably going to break it for somebody. If there were a clear win here I think I probably would have just made a PR instead of this issue. :-) I spent a decent amount of time thinking through the problem before punting it to this issue. I fear if you make it a hard-coded change to the default, even with a deprecation cycle you'll end up with people coming out of the woodwork on why they *have* to have it working the way it originally did, and if there's no configuration ability then you've got a problem there.
comment:14 by , 4 years ago
The breaking change is important. We can't just bring that in.
Adding a setting might be OK — folks could set it to STATIC_ROOT to get the old value — BUT #26029 is about restructuring all this anyway: to add a setting only to need to deprecate it is too much churn. (Even assuming we'd take the hit on breaking people's deployments as they failed to make the changes in time.)
Adding the manifest_storage kwarg (taking a storage) is small and simple. That's why I think we should advance with it. Later, when #26029 comes along, we can review the exact API we want there — we can either configure manifest_storage
directly in the settings, or allow a path there that will used internally — but that's a separate conversation (one we can defer).
comment:15 by , 3 years ago
Type: | Cleanup/optimization → New feature |
---|
comment:16 by , 3 years ago
Needs documentation: | unset |
---|---|
Patch needs improvement: | unset |
Version: | 1.10 → dev |
comment:17 by , 3 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Florian says,