﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
28055	Staticfiles HashedFilesMixin postprocess optimization	Konrad Lisiczyński	Konrad Lisiczyński	"Django 1.11 refactored the way how collectstatic postprocessing is handled to calculate content based hashes and saving those files.
Good point of this refactoring is, than now handling of nested css is done correctly, for example here, in style.css:
{{{
@import 'substyles.css';

body {
  background-color: green;
}
}}}
when content of substyles.css in changed, the hash of substyles.css will also change, which at the same time will change the content of style.css (updated for example to @import 'substyles.some-hash.css';) and now Django will take into consideration by upgrading hash of style.css as well. This is good, and in previous versions of Django hash of style.css wouldnt be updated.

However, the problem is, that HashedFilesMixin.postprocess is written in such a way, that adjustable_paths, so with the default implementation - css files - will be always saved 4 times! Below snippet taken from django source code explains this:

{{{
## Snippet from HashedFilesMixin.post_process

# Do a single pass first. Post-process all files once, then repeat for
# adjustable files.
for name, hashed_name, processed, _ in self._post_process(paths, adjustable_paths, hashed_files):
    yield name, hashed_name, processed

paths = {path: paths[path] for path in adjustable_paths}
print('adjusted paths are', paths)

for i in range(self.max_post_process_passes):
    substitutions = False
    for name, hashed_name, processed, subst in self._post_process(paths, adjustable_paths, hashed_files):


## Snippet from HashedFilesMixin._post_process

if name in adjustable_paths:
    old_hashed_name = hashed_name
    content = original_file.read().decode(settings.FILE_CHARSET)
    for extension, patterns in iteritems(self._patterns):
        if matches_patterns(path, (extension,)):
            for pattern, template in patterns:
                converter = self.url_converter(name, hashed_files, template)
                try:
                    content = pattern.sub(converter, content)
                except ValueError as exc:
                    yield name, None, exc, False
    if hashed_file_exists:
        self.delete(hashed_name)
    # then save the processed result
    content_file = ContentFile(force_bytes(content))
    # Save intermediate file for reference
    saved_name = self._save(hashed_name, content_file)
    hashed_name = self.hashed_name(name, content_file)

    if self.exists(hashed_name):
        self.delete(hashed_name)

    saved_name = self._save(hashed_name, content_file)
    hashed_name = force_text(self.clean_name(saved_name))
    # If the file hash stayed the same, this file didn't change
    if old_hashed_name == hashed_name:
        substitutions = False
    processed = True
}}}

After looking at the above code carefully, you will see that all css will be saved 4 times, even when there is no point doing so in the first place, like for example when there is no @import statement in your css. I am also not sure why _save method is called twice in _postprocess.

Generally it would't be maybe extremely big problem for staticstorage which extends local file storage, but for remote storages like AWS this is unacceptable in my opinion. It would be great to change postprocessing algorythm to reduce number of save calls to minimum.

"	Cleanup/optimization	closed	contrib.staticfiles	1.11	Normal	wontfix	staticfiles HashedFilesMixin post_process	David Sanders	Unreviewed	0	0	0	0	0	0
