#36705 closed Cleanup/optimization (fixed)
Repeated string concatenation is slow on some python implementations, in particular PyPy
| Reported by: | Jacob Walls | Owned by: | Varun Kasyap Pentamaraju |
|---|---|---|---|
| Component: | Utilities | Version: | dev |
| Severity: | Normal | Keywords: | PyPy |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
There is a known limitation in PyPy that repeated string concatenation is slow. The discussion there observes that CPython has an optimization, but it's fragile and best not relied on.
Seokchan Yoon (thanks for the report!) raised this with the security team, and we decided any hardening here could be done in public.
To close this ticket, we should:
- implement targeted optimizations in django.utils.numberformat.format() to avoid repeated string concatenation
- audit for other similar cases
- Add a disclaimer to our documentation around PyPy, for example, wording like:
That said, a lot of a web framework's work is done by concatenating strings, and PyPy has an issue with that (see https://pypy.org/posts/2023/01/string-concatenation-quadratic.html). This may cause performance issues, depending on your use.
Change History (13)
comment:1 by , 3 weeks ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:4 by , 3 weeks ago
| Patch needs improvement: | set |
|---|
comment:5 by , 3 weeks ago
| Patch needs improvement: | unset |
|---|
comment:6 by , 3 weeks ago
| Patch needs improvement: | set |
|---|
comment:7 by , 3 weeks ago
| Patch needs improvement: | unset |
|---|
comment:8 by , 3 weeks ago
| Patch needs improvement: | set |
|---|
comment:9 by , 3 weeks ago
| Patch needs improvement: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
comment:11 by , 7 days ago
I'm a little bit skeptical of this change's necessity, and I think it may have introduced an issue: https://github.com/django/django/pull/20045/files#r2544009071 . PyPy currently only supports Python 3.11, while Django's main branch targets Python 3.12+, so it's not clear when anyone could use the change. Also, I've never heard of a Django project using PyPy—it seems impractical ecosystemwise.
comment:12 by , 6 days ago
Thanks for spotting the logic change. I replied that there's a chance the logic change is more correct: we should add a test to be sure.
I heard that Django has never supported PyPy as a deployment target. Despite whispers of PyPy winding down, it seems to be slowly keeping up in its own time.
For me, this change wasn't just about PyPy. I was persuaded by the linked blog post's explanation that even in CPython adequate performance on catastrophic string concatenation depends on a fragile optimization.
Thank you Jacob!