Opened 11 years ago
Last modified 15 months ago
#21080 new Bug
collectstatic post-processing fails for references inside comments
Reported by: | Owned by: | ||
---|---|---|---|
Component: | contrib.staticfiles | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Joshua Smith, Manel Clos, Petr Přikryl, powderflask | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
"python manage.py collectstatic" is attempting to parse references inside css comments and generating errors during post-processing. I am using:
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.CachedStaticFilesStorage'
eg.
The following snippet of code in test.css:
.gfg-collapse-closed { /* background-image : url('arrow_close.gif'); */ }
produces the following error:
ValueError: The file 'stylesheets/arrow_close.gif' could not be found with <django.contrib.staticfiles.storage.CachedStaticFilesStorage object at 0x1078a3910>. collectstatic
Ideally, collectstatic should respect CSS comments and should not attempt to parse/reference files in lines that are commented out.
If the fix is too complex, a simple workaround might be to include a --ignore-error
flag that would allow the application to continue post-processing even when it sees errors
Attachments (1)
Change History (32)
comment:1 by , 11 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 11 years ago
Could you recommend a path forward? Can we at least have it skip failed imports with a warning rather than break?
-Shreyas
comment:3 by , 11 years ago
I'm agree with the comment of @marfire, the correct way to do this is with a CSS parser, because there are many cases to be treated (multi-line comments, nested comments, broken comments, etc.)
Adding an option to skip failed imports was reported in ticket:19650 and it was marked as a duplicated of ticket:18958.
If you really need ignore these errors, you can create a subclass of CachedFilesMixin
on your own project and add a try...except
statement to handle these kind of exceptions.
comment:4 by , 11 years ago
This bit me today. "Dumb" workaround: replace that url
in your comments with u r l
.
comment:5 by , 11 years ago
Just as an additional datapoint and another thing to consider if it comes to parsing the CSS, I recently started working on a project where the designers used some CSS expressions for an IE-specific stylesheet. The JScript used in one expression puts a particularly crazy bit like .replace('url("','').replace('")','')
. As marfire said, that url
bit in a string raises an exception. Fortunately I can overcome this for my project with a simple sublcass to make a very small change to the regular expression used. But that is one other thing to consider if it is decided that the CachedStaticFilesStorage
needs to be more robust.
comment:6 by , 7 years ago
hi, apparently this issue is still open in django 1.11.
my deploy to heroku failed with the following error log:
... $ python manage.py collectstatic --noinput Post-processing 'jquery-ui-dist/jquery-ui.theme.css' failed! Traceback (most recent call last): File "manage.py", line 10, in <module> execute_from_command_line(sys.argv) File "/app/.heroku/python/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line utility.execute() File "/app/.heroku/python/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "/app/.heroku/python/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv self.execute(*args, **cmd_options) File "/app/.heroku/python/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute output = self.handle(*args, **options) File "/app/.heroku/python/lib/python3.6/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 199, in handle collected = self.collect() File "/app/.heroku/python/lib/python3.6/site-packages/django/contrib/staticfiles/management/commands/collectstatic.py", line 145, in collect raise processed whitenoise.storage.MissingFileError: The file 'jquery-ui-dist/"images/ui-icons_555555_256x240.png"' could not be found with <whitenoise.storage.CompressedManifestStaticFilesStorage object at 0x7ff880f551d0>. The CSS file 'jquery-ui-dist/jquery-ui.theme.css' references a file which could not be found: jquery-ui-dist/"images/ui-icons_555555_256x240.png" Please check the URL references in this CSS file, particularly any relative paths which might be pointing to the wrong location. ! Error while running '$ python manage.py collectstatic --noinput'. See traceback above for details. You may need to update application code to resolve this error. Or, you can disable collectstatic for this application: $ heroku config:set DISABLE_COLLECTSTATIC=1 https://devcenter.heroku.com/articles/django-assets ! Push rejected, failed to compile Python app. ! Push failed
the "collectstatic" command fails here:
... The file 'jquery-ui-dist/"images/ui-icons_555555_256x240.png"' could not be found with ...
because it tries to parse js comments.
The file path should be
- jquery-ui-dist/images/ui-icons_555555_256x240.png
and not
- jquery-ui-dist/"images/ui-icons_555555_256x240.png"
other people on stackoverflow run inmto this issue as well:
- https://stackoverflow.com/questions/41772144/bad-filename-formed-by-collectstatic-from-jquery-ui-1-12-0-min-css
- https://stackoverflow.com/questions/40010040/django-pipeline-throwing-valueerror-the-file-could-not-be-found
would be great if this issue could be fixed. thanks ;)
comment:7 by , 7 years ago
Version: | 1.5 → 1.11 |
---|
comment:8 by , 7 years ago
Cc: | added |
---|---|
Version: | 1.11 → 2.0 |
comment:9 by , 7 years ago
Version: | 2.0 → master |
---|
follow-up: 12 comment:10 by , 7 years ago
Re: comment:6 is a slightly different issue and MUCH easier to patch.
This issue is caused by generators (like http://jqueryui.com/themeroller) that urlencode stuff in a comment block.
E.g.
/*! jQuery UI - v1.12.1 - 2016-09-14 * http://jqueryui.com * Includes: core.css, accordion.css, ... * To view and modify this theme, visit http://jqueryui.com/themeroller/?...&iconsHover=url(%22images%2Fui-icons_555555_256x240.png%22)... * Copyright jQuery Foundation and other contributors; Licensed MIT */
The regular expression in storage.HashedFilesMixin matches url(...)
in the comment, but it does NOT unquote the urlencoded quotation mark: %22
The last step of converter()
unquotes the transformed url: return template % unquote(transformed_url)
, converting the urlencoded quotation marks back, and thus messing up the filename, as can be seen in the posted stack trace:
The file 'jquery-ui-dist/"images/ui-icons_555555_256x240.png"' could not be found...
(Notice the excess quotation marks in the path.)
Patch:
storage.py line 61
- r"""(url\(['"]{0,1}\s*(.*?)["']{0,1}\))""",
+ r"""(url\((?:['"]|%22|%27){0,1}\s*(.*?)(?:['"]|%22|%27){0,1}\))""",
This matches url encoded single- or double-quotes and fixed several such issues with 3rd party packages that included generated CSS files like this.
IF a general fix for ignoring comments in CSS files is implemented, that would resolve this issue too.
But IF NOT, this is a simple fix that prevents collectstatic
from crashing on common 3rd party packages.
Happy to submit a patch for this if this seems worthwhile.
comment:11 by , 7 years ago
As long as we are using regular expressions, excluding comment contents seems out of question. So your patch might be worthfile in the short term.
A proper resolution of this ticket would be to use a JS parser to get links. This will be much more invasive though.
comment:12 by , 7 years ago
Replying to powderflask:
If you need to resolve this issue in the interim, override ManifestStaticFilesStorage (or CachedStaticFilesStorage) and re-define the class patterns property:
from django.contrib.staticfiles import storage class PatchedManifestStaticFilesStorage(storage.ManifestStaticFilesStorage): """ Override the replacement patterns to match URL-encoded quotations. """ patterns = ( ("*.css", ( r"""(url\((?:['"]|%22|%27){0,1}\s*(.*?)(?:['"]|%22|%27){0,1}\))""", (r"""(@import\s*["']\s*(.*?)["'])""", """@import url("%s")"""), )), )
comment:13 by , 6 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:14 by , 6 years ago
Has patch: | set |
---|
comment:15 by , 6 years ago
Patch needs improvement: | set |
---|
comment:16 by , 5 years ago
Just in case anyone is looking for a _really_ simple hotfix, you can simply strip the quotes from the name
in hashed_name()
.
Specifically, somewhere at the top of this method: https://github.com/django/django/blob/efc3e32d6d7fb9bb41be73b80c8607b653c1fbd6/django/contrib/staticfiles/storage.py#L79-L111
Or... put this monkeypatch somewhere in your code (settings file for example):
from django.contrib.staticfiles import storage import functools original_hashed_name = storage.HashedFilesMixin.hashed_name @functools.wraps(original_hashed_name) def hashed_name(self, name, *args, **kwargs): return original_hashed_name(self, name.strip('"'), *args, **kwargs) storage.HashedFilesMixin.hashed_name = hashed_name
comment:17 by , 4 years ago
Owner: | changed from | to
---|---|
Patch needs improvement: | unset |
Resetting flags for more recent, unreviewed PR.
comment:18 by , 4 years ago
Needs tests: | set |
---|---|
Patch needs improvement: | set |
comment:19 by , 4 years ago
FWIW, the current approach of parsing CSS with regexes also doesn't work with SVG data inlined as data: URLs when they reference for example <linearGradient id='someid' ...>..</..> <use fill='url(%23someid)' ...></use>
. This is changed to something like fill='url("#someid#someid")'
.
Two shortcomings here are:
- double quotes are inserted that weren't there before, breaking the quoting of the data:-URL that the svg data is in
- the doubling of the id fragment.
Concerning the second problem: I didn't look too far into it, but that seems to happen because the font-face branch is triggered https://github.com/django/django/blob/96f55ccf798c7592a1203f798a4dffaf173a9263/django/contrib/staticfiles/storage.py#L131-L142
It's also at least curious that after line 131 final_url already has any fragment hash marks escaped (# -> %23), so urlsplit in line 137 doesn't catch it as a fragment.
The relevant extract of our CSS looks like this:
.sc_modal .content.branded-bg {
background-image: url("data:image/svg+xml,%3csvg width='375' [...] %3cuse fill='url(%23c)' [...]");
}
comment:24 by , 22 months ago
If someone has time to make a little study on how we could use a real parser (e.g. lark or pyparsing) to strip comments from CSS and JS, would be awesome! I'm sure a general parser could be used in other parts of Django as well.
comment:25 by , 22 months ago
I made a very quick and dirty experiment inside HashedFilesMixin._post_process
using the pyparsing
capability to remove comments from files (pyparsing.cpp_style_comment.suppress()
). Running the Django staticfiles tests with or without that comments removal showed a performance penalty of ~330% (8.6 secs instead of 2.6 secs). A first optimization to only strip comments when an error occurs reduces the penalty to ~170% (strongly dependent on files at hand, of course).
It is expected that using a parser instead of regexes will affect performance more or less badly. What performance penalty are we ready to pay for fixing this?
by , 22 months ago
Attachment: | 21080-pyparsing.patch added |
---|
comment:26 by , 22 months ago
I noticed that the behavior got worse in Django 4.2.0b1 compared to pre-4.2 versions.
Let's say you have a javascript file with these contents (minimized exampled):
function return_text(text_one, text_two) { return `${text_one} ${text_two}`; } class MyClass { render(){ return return_text("import App from './App.vue'", 'import { Component } from "@org/components/loader";'); } }
The "import" statements here are part of strings, not actually executable JavaScript code. In the original snippet, they're part of a component that shows information on how to use the component library and, as such, contains strings with imports as part of the documentation.
In Django 3.2.x, 4.0.x, 4.1.x, running collectstatic
works fine.
However, in the new beta, you get an error:
(django-42-collectstatic-bug) D:\playground\django-42-collectstatic-bug>python manage.py collectstatic --clear --no-input -v 0 Post-processing 'example.js' failed! Traceback (most recent call last): File "D:\playground\django-42-collectstatic-bug\manage.py", line 22, in <module> main() File "D:\playground\django-42-collectstatic-bug\manage.py", line 18, in main execute_from_command_line(sys.argv) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\core\management\__init__.py", line 442, in execute_from_command_line utility.execute() File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\core\management\__init__.py", line 436, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\core\management\base.py", line 412, in run_from_argv self.execute(*args, **cmd_options) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\core\management\base.py", line 458, in execute output = self.handle(*args, **options) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\management\commands\collectstatic.py", line 209, in handle collected = self.collect() File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\management\commands\collectstatic.py", line 154, in collect raise processed File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\storage.py", line 364, in _post_process content = pattern.sub(converter, content) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\storage.py", line 241, in converter hashed_url = self._url( File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\storage.py", line 174, in _url hashed_name = hashed_name_func(*args) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\storage.py", line 414, in _stored_name cache_name = self.clean_name(self.hashed_name(name)) File "D:\virtualenvs\django-42-collectstatic-bug\lib\site-packages\django\contrib\staticfiles\storage.py", line 135, in hashed_name raise ValueError( ValueError: The file 'App.vue'", 'import { Component } from "@org/components/loader' could not be found with <django.contrib.staticfiles.storage.ManifestStaticFilesStorage object at 0x00000229D19679D0>.
Unfortunately, this a third-party dependency that we don't control. This means that we'd have to disable the manifest-backed static file storage completely in order to continue using this dependency (that we can't really get rid of), unless I've overlooked an option to disable import errors for specific files.
comment:27 by , 21 months ago
I'd also like to note that this bug also errors out when processing the CSS from Bootstrap 5, as it gets hung up on the SVG data URIs:
ValueError: The file '\"data:image/svg+xml,<svg xmlns='http:/www.w3.org/2000/svg' viewBox='0 0 16 16'><path fill='none' stroke='' could not be found with <django.contrib.staticfiles.storage.ManifestStaticFilesStorage object at 0x106955d00>.
This seems like a pretty significant issue to me, given the popularity of Bootstrap. Simply parsing out the comments wouldn't be enough to fix the above issue. It seems like what is needed is either a proper CSS parser (like in the linked thread about JS files--perhaps https://tinycss.readthedocs.io/?) or a better regex that can exclude url(data:)
items.
comment:28 by , 19 months ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:29 by , 19 months ago
Cc: | added |
---|
comment:30 by , 18 months ago
Cc: | added |
---|
comment:31 by , 15 months ago
Cc: | added |
---|
CachedFilesMixin
is using simple regular expressions to matchurl
and@import
statements, and so is picking up the ones inside comments. It would also have a problem if the text was inside a string.If it's possible to rewrite the regex to cover all the corner cases, that would be the easy fix. But I doubt it. The robust way would be to parse the css files properly, but that would mean rewriting a lot of code...