Opened 4 months ago
Last modified 4 months ago
#35728 assigned Cleanup/optimization
Lazily compute assertion messages
Reported by: | Adam Johnson | Owned by: | Rish |
---|---|---|---|
Component: | Testing framework | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Django’s custom test assertions have some rich failure messages. Typically, assertions pass, so most computation of these messages is wasted.
This overhead is quite noticeable for messages based on large strings. For example, assertContains
calls repr()
on the whole response content, typically thousands of bytes.
To mitigate this, I propose that all custom assert methods pass lazy-computing objects to unittest’s msg
arguments. unittest’s _formatMessage() effectively calls str()
on the given object when displaying, so this should work well.
To measure the overhead, I created this test script, named benchmark.py
:
import unittest from django.conf import settings from django.http import HttpResponse from django.test import SimpleTestCase if not settings.configured: settings.configure() class ExampleTests(SimpleTestCase): def test_example(self): response = HttpResponse("Apple\n" * 1_000) for _ in range(100_000): self.assertContains(response, "Apple") if __name__ == '__main__': unittest.main(module='benchmark')
I ran it under cProfile with:
$ python -m cProfile -o profile -m benchmark
I tried it without and with the below patch, which disables the custom message for assertContains
:
diff --git django/test/testcases.py django/test/testcases.py index cd7e7b45d6..de86cb55ec 100644 --- django/test/testcases.py +++ django/test/testcases.py @@ -580,13 +580,13 @@ def _assert_contains(self, response, text, status_code, msg_prefix, html): content = b"".join(response.streaming_content) else: content = response.content - content_repr = safe_repr(content) + content_repr = "" if not isinstance(text, bytes) or html: text = str(text) content = content.decode(response.charset) - text_repr = "'%s'" % text + text_repr = "" else: - text_repr = repr(text) + text_repr = "" if html: content = assert_and_parse_html( self, content, None, "Response's content is not valid HTML:" @@ -623,10 +623,10 @@ def assertContains( else: self.assertTrue( real_count != 0, - ( - f"{msg_prefix}Couldn't find {text_repr} in the following response\n" - f"{content_repr}" - ), + # ( + # f"{msg_prefix}Couldn't find {text_repr} in the following response\n" + # f"{content_repr}" + # ), ) def assertNotContains(
Here are the stats.
Before:
- 2,724,770 function calls in 2.280 seconds
- 2.119 seconds spent in
assertContains
calls
After:
- 2,524,805 function calls in 1.117 seconds
- 0.949 seconds spent in
assertContains
calls
It looks like ~50% of the cost of calling assertContains
is forming the message.
Change History (4)
comment:1 by , 4 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 4 months ago
Nice benchmarking idea. Looks like we may be able to trim the overhead further.
For reference, assertIn
avoids the overhead by performing the test first:
I think we should be able to adopt that pattern too. No need for a lazy string, really.
Accepting following the above with the note that we should, ideally, be mindful of potential subclassing of Django's TestCase when designing how the laziness will be incorported to ensure backwards compatibility.
Sure, but I don’t think we don’t need to support, say, an overridden _assert_contains
, because that’s the private method under the public assertContains
.
comment:4 by , 4 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
Thank you Adam for this ticket and analysis, I agree with your rationale.
I performed a simpler test: I added the proposed testcase in an existing
tests.py
file, and ran with the usualpython -Wall manage.py test testapp.tests.ExampleTests
. Then, I replacedassertContains
with a call similar toself.assertIn(text, response.content.decode())
. My timings are:ExampleTests
runs in about 4 secondsExampleTests
runs in around a second (0.950s)Accepting following the above with the note that we should, ideally, be mindful of potential subclassing of Django's TestCase when designing how the laziness will be incorported to ensure backwards compatibility.