Opened 3 years ago
Last modified 3 years ago
#33007 new Cleanup/optimization
`form_clean()` performance
Reported by: | David Smith | Owned by: | nobody |
---|---|---|---|
Component: | Forms | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Chris Jerdonek, Simon Charette | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I ran the benchmark suite to see if anything had changed in the last few months. See link.
This highlighted a regression (c.15% increase) in the performance of `form_clean()` following this commit, see benchmark.
I'm not certain what to suggest here, or if it's even that much of an issue and should be wontfixed, but thought it best to at least raise it.
Change History (6)
comment:1 by , 3 years ago
Cc: | added |
---|---|
Component: | Uncategorized → Forms |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
comment:2 by , 3 years ago
That ticket (#32920) was primarily about correctness rather than trying to do anything about performance. Nevertheless, one possible optimization that occurred to me while working on it was adding a flag to `BaseForm` called something like `self._bound_fields_cache_filled`. Then, the `BaseForm._bound_items()` and `BaseForm.__iter__()` iterators could consult that flag at their outset. If set, the `BoundField` items could be served directly from `self._bound_fields_cache` instead of going through the extra layer of indirection of `BaseForm.__getitem__()`. However, I don't know what's contributing most in terms of the slow-down. Maybe it's that `BoundField.initial` uses `@cached_property`?
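A minimal sketch of that idea, with simplified stand-ins for the real `BaseForm` internals (the flag name comes from the suggestion above; everything else is illustrative, not Django's actual code):

```python
class BaseForm:
    """Simplified stand-in for django.forms.BaseForm internals."""

    def __init__(self, fields):
        self.fields = fields  # {name: Field} mapping, in declaration order
        self._bound_fields_cache = {}
        # Proposed flag: True once every field has a cached BoundField.
        self._bound_fields_cache_filled = False

    def __getitem__(self, name):
        # The indirection being avoided: a per-name cache check and
        # possible BoundField creation on every access.
        try:
            return self._bound_fields_cache[name]
        except KeyError:
            pass
        field = self.fields[name]  # KeyError here means an unknown field
        bound = self._bound_fields_cache[name] = field.get_bound_field(self, name)
        if len(self._bound_fields_cache) == len(self.fields):
            self._bound_fields_cache_filled = True
        return bound

    def _bound_items(self):
        if self._bound_fields_cache_filled:
            # Fast path: every BoundField is already cached, so serve
            # them straight from the cache (still in field order).
            cache = self._bound_fields_cache
            return ((name, cache[name]) for name in self.fields)
        return ((name, self[name]) for name in self.fields)

    def __iter__(self):
        return (bound_field for _, bound_field in self._bound_items())
```

Whether this actually recovers the regression would still need measuring; the `@cached_property` on `BoundField.initial` mentioned above is a separate candidate.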
comment:3 by , 3 years ago
Cc: | added |
---|---|
While this particular commit was made to address a correctness issue, the recent influx of optimization PRs merged in the past few weeks made me wonder whether we'd rather invest in some form of CI to confirm their benefit in the grand scheme of things.
I greatly appreciate having David run these benchmarks from time to time, but it would be great to back optimization PRs with additions to the performance suite, to make sure we don't trade one improvement for another over time. That seems like a reasonable ask given how invested we are in writing regression tests for all bug fixes and feature additions.
Thoughts? Maybe I should bring that up on the mailing list instead?
comment:4 by , 3 years ago
Thoughts?
If a PR's purpose is to optimize, but it adds complexity (like adding caching), then it seems reasonable to me that there be an objective measure that people can look to that shows improvement. A related example: just in the last couple of days, it was pointed out in this PR that `_lazy_re_compile()` (which was originally added as an optimization) was actually slower in one instance. How many more are like that?
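For context, here is a simplified sketch of the lazy-compilation pattern behind `_lazy_re_compile()` (Django's real helper is built on `SimpleLazyObject`; this wrapper is illustrative only). It shows where the cost can come from: every attribute access on the wrapper pays an extra indirection, which is how a "lazy" optimization can end up slower on a hot path:

```python
import re


class LazyRegex:
    """Defer re.compile() until the pattern is first used.

    Illustrative only; not Django's implementation.
    """

    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None

    def __getattr__(self, name):
        # Called for match(), search(), etc. This per-access
        # indirection is the cost that can outweigh the saving
        # when the regex is used often after import.
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return getattr(self._compiled, name)


# Wins when the pattern is rarely (or never) used after import;
# loses when the wrapper sits on a hot path.
digits = LazyRegex(r"\d+")
assert digits.match("123")
```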
On the other hand, if a PR is doing something like removing a redundant operation or eliminating an unneeded argument, I think it should be okay even if a benchmark doesn't show a noticeable improvement. I know in some cases like that, the commit message wound up saying something like "Optimized ...," even if the impact might not really be noticeable. (I think a message like "Simplified ..." or "Removed redundant ..." might be more accurate in cases like that, so it doesn't overstate the impact.)
follow-up: comment:6
comment:5 by , 3 years ago
+1 to building on David's work: I believe the next stage is that we need to provision the benchmarks to run on a small but otherwise empty instance that isn't David's own machine. But I guess details can be discussed on the django-bench repo… The general point is that it would be great to have these running regularly (maybe daily on main, and on PRs on request?).
comment:6 by , 3 years ago
Replying to Carlton Gibson:
But I guess details can be discussed on the django-bench repo…
I finally got round to finishing my thoughts on benchmarking more generally. See post.
Thanks! It's definitely worth investigating.