Opened 12 years ago
Closed 12 years ago
#18687 closed Bug (fixed)
Randomly failing test: test_performance_scalability
Reported by: | Aymeric Augustin | Owned by: | nobody |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Release blocker | Keywords: | |
Cc: | | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
TestUtilsCryptoPBKDF2.test_performance_scalability
fails randomly in about one build out of ten on the CI server.
As a result, everyone is simply learning to ignore the failures, which makes this test fairly useless.
Attachments (1)
Change History (7)
comment:1 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
comment:2 by , 12 years ago
We should probably test something other than the proportion of total runtime.
I think the underlying problem is that we are testing on a machine which is running a lot of other tests at the same time. This can cause spikes in the measured times, and those spikes lead to false positives. The problem is not imprecise measurement; it is that the runtime of the same code is not guaranteed to stay the same between consecutive runs.
So maybe doing nine significantly shorter runs of each iteration count in alternating fashion, and then comparing the medians, would work better. In code, something like:
    n1 = 1000
    n2 = 10000
    runs_1000 = []
    runs_10000 = []
    for i in range(9):
        t1 = elapsed('pbkdf2("password", "salt", iterations=%d)' % n1)
        t2 = elapsed('pbkdf2("password", "salt", iterations=%d)' % n2)
        runs_1000.append(t1)
        runs_10000.append(t2)
    runs_1000.sort()
    runs_10000.sort()
    t1 = runs_1000[4]
    t2 = runs_10000[4]
    measured_scale_exponent = math.log(t2 / t1, n2 / n1)
    self.assertLess(measured_scale_exponent, 1.1)
I haven't actually tested the above.
comment:3 by , 12 years ago
If we had mocking in the Django test suite, it would just be a matter of testing that _fast_hmac is called <iterations> times;
a different unit test would then cover the behavior of a single run of _fast_hmac.
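A minimal, untested sketch of that mock-based approach, assuming the pure-Python pbkdf2() in django.utils.crypto calls a module-level _fast_hmac() helper a fixed number of times per iteration; the iteration counts and the counting helper below are illustrative, and the standalone mock package stands in for the later unittest.mock:

    import mock  # the standalone "mock" package; unittest.mock on Python 3.3+

    from django.utils import crypto

    def count_fast_hmac_calls(iterations):
        # Wrap _fast_hmac so pbkdf2() still works while the calls are counted.
        with mock.patch.object(crypto, '_fast_hmac',
                               wraps=crypto._fast_hmac) as spy:
            crypto.pbkdf2('password', 'salt', iterations=iterations)
        return spy.call_count

    calls_small = count_fast_hmac_calls(1000)
    calls_large = count_fast_hmac_calls(10000)

    # Assumes _fast_hmac is called a fixed number of times per iteration;
    # ten times the work factor should then mean ten times the calls.
    assert calls_large == 10 * calls_small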
comment:4 by , 12 years ago
Mock doesn't seem like the right answer here: it's trivial to validate the number of hmac calls, but the idea of this test is to verify that if you increase the workload factor you really get more work. However, I'm not sure it's adding any real value.
by , 12 years ago
Attachment: | 18687.diff added |
---|---|
comment:5 by , 12 years ago
In the short term, can we make the test faster by relaxing the check a bit? For instance, I'm attaching a patch that checks that 20 times more rounds are at least 10 times slower, which makes the test about 50 times faster than the current one.
In the long term, I'd like to hear from the original author of this code before making a decision.
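For reference, a minimal sketch of the kind of relaxed check described here; this is not the attached 18687.diff, and the iteration counts, the timeit-based elapsed() helper, and the factor-of-two slack are illustrative assumptions:

    import timeit

    from django.utils.crypto import pbkdf2

    def elapsed(iterations):
        # Time a single pbkdf2() call with the given work factor.
        return timeit.timeit(
            lambda: pbkdf2('password', 'salt', iterations=iterations),
            number=1)

    t_fast = elapsed(500)    # baseline
    t_slow = elapsed(10000)  # 20 times more rounds

    # 20x the rounds should be at least 10x slower; the factor-of-two slack
    # absorbs timing noise on a busy CI machine.
    assert t_slow > 10 * t_fast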
comment:6 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
The single test accounts for roughly 2.5% of the time it takes to run the full test suite, and it only indirectly tests that the target function implements a linearly scaling loop. The output and logic of the function are tested elsewhere in the test case.