#27741 closed Bug (fixed)
staticfiles_tests test_corrupt_intermediate_files fails intermittently when run in parallel
Reported by: | Chris Lamb | Owned by: | nobody |
---|---|---|---|
Component: | contrib.staticfiles | Version: | 1.11 |
Severity: | Release blocker | Keywords: | |
Cc: | David Sanders | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
====================================================================== ERROR: test_corrupt_intermediate_files (staticfiles_tests.test_storage.TestCollectionCachedStorage) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/tmp/buildd/python-django-1.11~alpha1/tests/staticfiles_tests/test_storage.py", line 296, in test_corrupt_intermediate_files self.hashed_file_path('cached/css/img/window.png') File "/tmp/buildd/python-django-1.11~alpha1/tests/staticfiles_tests/test_storage.py", line 24, in hashed_file_path fullpath = test.render_template(test.static_template_snippet(path)) File "/tmp/buildd/python-django-1.11~alpha1/tests/staticfiles_tests/cases.py", line 38, in render_template return template.render(Context(**kwargs)).strip() File "/tmp/buildd/python-django-1.11~alpha1/django/template/base.py", line 207, in render return self._render(context) File "/tmp/buildd/python-django-1.11~alpha1/django/test/utils.py", line 107, in instrumented_test_render return self.nodelist.render(context) File "/tmp/buildd/python-django-1.11~alpha1/django/template/base.py", line 990, in render bit = node.render_annotated(context) File "/tmp/buildd/python-django-1.11~alpha1/django/template/base.py", line 957, in render_annotated return self.render(context) File "/tmp/buildd/python-django-1.11~alpha1/django/templatetags/static.py", line 105, in render url = self.url(context) File "/tmp/buildd/python-django-1.11~alpha1/django/templatetags/static.py", line 102, in url return self.handle_simple(path) File "/tmp/buildd/python-django-1.11~alpha1/django/templatetags/static.py", line 117, in handle_simple return staticfiles_storage.url(path) File "/tmp/buildd/python-django-1.11~alpha1/django/contrib/staticfiles/storage.py", line 162, in url return self._url(self.stored_name, name, force) File "/tmp/buildd/python-django-1.11~alpha1/django/contrib/staticfiles/storage.py", line 141, in _url hashed_name = hashed_name_func(*args) File "/tmp/buildd/python-django-1.11~alpha1/django/contrib/staticfiles/storage.py", line 365, in stored_name self.hashed_name(name, content=None, filename=intermediate_name) File "/tmp/buildd/python-django-1.11~alpha1/django/contrib/staticfiles/storage.py", line 101, in hashed_name raise ValueError("The file '%s' could not be found with %r." % (filename, self)) ValueError: The file 'cached/css/img/window.a836fe39729e.png' could not be found with <django.contrib.staticfiles.storage.CachedStaticFilesStorage object at 0x7f43b6114e90>. ---------------------------------------------------------------------- Ran 11906 tests in 171.474s FAILED (errors=1, skipped=1152, expected failures=4)
This is from the 1.11 alpha1 tarball, rather than git itself.
Change History (28)
comment:1 by , 8 years ago
Cc: | added |
---|---|
Component: | Uncategorized → contrib.staticfiles |
Severity: | Normal → Release blocker |
Summary: | test_corrupt_intermediate_files (staticfiles_tests.test_storage.TestCollectionCachedStorage) fails with ValueError: The file 'cached/css/img/window.a836fe39729e.png' could not be found → staticfiles_tests test_corrupt_intermediate_files fails in source distribution |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
follow-up: 5 comment:2 by , 8 years ago
comment:3 by , 8 years ago
Extracting the tarball and running the tests from there reproduced it for me.
comment:4 by , 8 years ago
I can't reproduce with those instructions.
[root@fce6b18201fb tests]# ./runtests.py "staticfiles_tests.test_storage.TestCollectionCachedStorage" Testing against Django installed in '/tmp/django/Django-1.11a1/django' with up to 2 processes Creating test database for alias 'default'... Creating test database for alias 'other'... System check identified no issues (0 silenced). ................... ---------------------------------------------------------------------- Ran 19 tests in 0.802s OK Destroying test database for alias 'default'... Destroying test database for alias 'other'...
root@fce6b18201fb tests]# ./runtests.py "staticfiles_tests.test_storage.TestCollectionCachedStorage.test_corrupt_intermediate_files" Testing against Django installed in '/tmp/django/Django-1.11a1/django' with up to 2 processes Creating test database for alias 'default'... Creating test database for alias 'other'... System check identified no issues (0 silenced). . ---------------------------------------------------------------------- Ran 1 test in 0.042s OK Destroying test database for alias 'default'... Destroying test database for alias 'other'...
That's running with Python2.7 on a CentOS 7.3 Docker image.
Tim, are you reproducing by running all tests or just the targeted tests cases like in my example output above?
follow-up: 6 comment:5 by , 8 years ago
Replying to David Sanders:
Going to need more information than that. What are the steps to reproduce?
Sorry about that, I should have been clearer when referring to the tarball. I'm running the entire testsuite.
follow-up: 7 comment:6 by , 8 years ago
Replying to Chris Lamb:
Replying to David Sanders:
Going to need more information than that. What are the steps to reproduce?
Sorry about that, I should have been clearer when referring to the tarball. I'm running the entire testsuite.
Does the error still occur when you run only the single test or the group of tests like in my example output above?
Entire testsuite takes my Macbook 20+ minutes to run at least, so I'd like to narrow down the reproduce case if it's to be tracked down in a reasonable amount of time.
follow-up: 8 comment:7 by , 8 years ago
Does the error still occur when you run only the single test or the group of tests like in my example output above?
I'm clearly failing at communicating today - my bad! It only appears when I run the entire testsuite. If I just run the single test (or its parent group), the test does not fail.
comment:8 by , 8 years ago
Replying to Chris Lamb:
Does the error still occur when you run only the single test or the group of tests like in my example output above?
I'm clearly failing at communicating today - my bad! It only appears when I run the entire testsuite. If I just run the single test (or its parent group), the test does not fail.
Well, that's good info, since it seems to show there's something elsewhere in the testsuite breaking it.
However, that's a real bummer considering how long it takes me to run the entire testsuite, tracking it down is going to be a PITA. Perhaps I can selectively delete test modules that are unlikely to be the interference and that'll speed things up...
follow-up: 10 comment:9 by , 8 years ago
David, you might want to try the --bisect
or --pair
options of runtests.py
to speed up the process.
follow-up: 11 comment:10 by , 8 years ago
Replying to Simon Charette:
David, you might want to try the
--bisect
or--pair
options ofruntests.py
to speed up the process.
Thanks for the tip Simon, both of those were quite speedy compared to running the full test suite.
Unfortunately neither revealed any problems. Ran them both 3 times to ensure there wasn't anything inconsistent:
./runtests.py --pair "staticfiles_tests.test_storage.TestCollectionCachedStorage.test_corrupt_intermediate_files" ./runtests.py --bisect "staticfiles_tests.test_storage.TestCollectionCachedStorage.test_corrupt_intermediate_files"
I did, however, have to remove one group of tests because they were failing for me and seem unrelated - is anyone able to do run ./runtests.py "staticfiles_tests.test_views"
and have it succeed? For me that always fails complaining about AUTH_USER
problems. If run with other tests that error doesn't occur, only when it is run alone.
Running the entire testsuite for me hangs and I've killed it after 40 minutes without it reproducing the original error.
Chris, could you try running ./runtests.py "staticfiles_tests.test_views"
and responding with your results? If that run causes an error, can you try deleting the test_views.py
file under staticfiles_tests
and then run the entire testsuite again and see if the failure still reproduces?
follow-up: 12 comment:11 by , 8 years ago
Replying to David Sanders:
Chris, could you try running
./runtests.py "staticfiles_tests.test_views"
and responding with your results?
Fails with https://gist.github.com/lamby/b424c3310fade40245f046b712913f44/raw
If that run causes an error, can you try deleting the
test_views.py
file understaticfiles_tests
and then run the entire testsuite again and see if the failure still reproduces?
Fails with https://gist.githubusercontent.com/lamby/647a790e935ec11c25221e69bc951263/raw
comment:12 by , 8 years ago
Replying to Chris Lamb:
Replying to David Sanders:
Chris, could you try running
./runtests.py "staticfiles_tests.test_views"
and responding with your results?
Fails with https://gist.github.com/lamby/b424c3310fade40245f046b712913f44/raw
If that run causes an error, can you try deleting the
test_views.py
file understaticfiles_tests
and then run the entire testsuite again and see if the failure still reproduces?
Fails with https://gist.githubusercontent.com/lamby/647a790e935ec11c25221e69bc951263/raw
Thanks! So it looks like staticfiles_tests.test_views
can't be run by itself for anyone, that probably should be fixed in general since each test should be able to be run in isolation and not fail.
For the second case it looks like your setup didn't appreciate deleting that test for some reason, not sure why. Need to find a way to exclude that test to see if it's the one causing problems.
follow-up: 14 comment:13 by , 8 years ago
Any progress? Would love to see this alpha version in Debian experimental...
comment:14 by , 8 years ago
Replying to Chris Lamb:
Any progress? Would love to see this alpha version in Debian experimental...
Sorry Chris, I haven't been working on it. I've got too much else on my plate and no easy way to reproduce this (for some reason running the full testsuite from the tarball hangs for me).
We do know that the test in isolation succeeds, so it's some interaction with other tests, however both --bisect
and --pair
don't yield any information.
Can any Django test gurus take a look to try to find what the test interaction might be?
comment:15 by , 8 years ago
If it helps, if I limit parallelism with "taskset --cpu-list 0" the testsuite does not fail.
comment:16 by , 8 years ago
I have a PR to fix the test_views
failure when run in isolation.
I can still reproduce a non-deterministic failure of test_corrupt_intermediate_files
when running only the staticfiles tests (actually only staticfiles_tests.test_storage
). I use a loop to repeat the tests until it fails:
for i in {1..500}; do python tests/runtests.py staticfiles_tests.test_storage retcode=$? if [ $retcode = 1 ]; then echo $i; echo 'failed'; break; fi done
comment:17 by , 8 years ago
Thanks Tim, that script helps me quickly reproduce on my setup.
I've further narrowed it and confirmed what Chris said about parallelism being a factor. If you use --parallel 1
to disable parallelism it won't reproduce.
I've narrowed it down to the parallel interaction between two test cases. It makes sense, the latter is also dealing with window.png
and changes the modification timestamp on it which changes the hash.
for i in {1..500}; do python tests/runtests.py staticfiles_tests.test_storage.TestCollectionCachedStorage staticfiles_tests.test_storage.TestCollectionHashedFilesCache retcode=$? if [ $retcode = 1 ]; then echo $i; echo 'failed'; break; fi done
That's all the digging I've got time for tonight. I'll start to look into a fix tomorrow, unless someone wants to pounce on it before me now that it's narrowed down.
comment:18 by , 8 years ago
To isolate the test, you can get inspiration from tests.i18n.utils.RunInTmpDirMixin
, that is create a temporary directory to copy the static tree in, override the STATICFILES_DIRS setting to include that temp dir, then remove the dir in the test tearDown.
follow-up: 22 comment:21 by , 8 years ago
Summary: | staticfiles_tests test_corrupt_intermediate_files fails in source distribution → staticfiles_tests test_corrupt_intermediate_files fails intermittently when run in parallel |
---|
I'm feeling a bit lazy about isolating the tests using the approach Claude suggested. How about this patch, at least for now:
-
tests/staticfiles_tests/test_storage.py
diff --git a/tests/staticfiles_tests/test_storage.py b/tests/staticfiles_tests/test_storage.py index 7e7b31f..11d1cf3 100644
a b class TestCollectionCachedStorage(TestHashedFiles, CollectionTestCase): 290 290 # high enough. 291 291 configured_storage.max_post_process_passes = 1 292 292 # File without intermediates that can be rehashed without a problem. 293 self.hashed_file_path('cached/css/ img/window.png')293 self.hashed_file_path('cached/css/fonts/font.svg') 294 294 # File with too many intermediates to rehash with the low max 295 295 # post-process passes. 296 296 err_msg = "The name 'cached/styles.css' could not be hashed with %r." % (configured_storage._wrapped,)
comment:22 by , 8 years ago
Replying to Tim Graham:
I'm feeling a bit lazy about isolating the tests using the approach Claude suggested. How about this patch, at least for now:
Can you elaborate why this would fix it? :) In isolation it seems a bit arbitrary.
follow-up: 24 comment:23 by , 8 years ago
From David's comment:17, "I've narrowed it down to the parallel interaction between two test cases. It makes sense, the latter is also dealing with window.png and changes the modification timestamp on it which changes the hash." If the test uses some other file that's not modified in another test, there's no issue as far as I've tested.
comment:24 by , 8 years ago
Replying to Tim Graham:
From David's comment:17, "I've narrowed it down to the parallel interaction between two test cases. It makes sense, the latter is also dealing with window.png and changes the modification timestamp on it which changes the hash." If the test uses some other file that's not modified in another test, there's no issue as far as I've tested.
I've actually got a patch for it isolated, Tim. Just finished it up the other day, but I've been too swamped to get it cleaned up and in a PR. Sorry about that.
Should I submit the PR for the stable/1.11.x branch and then do a separate PR for master?
Going to need more information than that. What are the steps to reproduce?