#36215 closed Cleanup/optimization (fixed)
Update Python code style in contribution docs to mention using PEP 448 unpacking to concatenate iterables
Description ¶
https://peps.python.org/pep-0448/ (shipped in Python 3.5) adds support for unpacking in tuple/list displays.
This is faster in micro-benchmarks (every little helps), more concise, arguably more readable, and avoids certain classes of bugs where users pass in tuples where Django expects lists and vice versa (see e.g. https://github.com/django/django/pull/15270).
Some instances of this can be automatically found and fixed with the Ruff linter's RUF005 rule (https://docs.astral.sh/ruff/rules/collection-literal-concatenation/).
Change History (11)
comment:1 by , 5 weeks ago
Cc: | added |
---|
comment:2 by , 5 weeks ago
Ah, sorry, yeah, I left the microbenchmarks out of the original ticket text for brevity.
BLANK_CHOICE_DASH = [("", "---------")] def get_action_choices_1(default_choices=BLANK_CHOICE_DASH): choices = [] + default_choices ... return choices def get_action_choices_2(default_choices=BLANK_CHOICE_DASH): choices = [*default_choices] ... return choices
(adapted from what get_action_choices
in the admin does)
results in
$ uv run --python=3.13 concates.py name='get_action_choices_1' iters=5000000 time=0.247 iters_per_sec=20206826.98 name='get_action_choices_2' iters=5000000 time=0.207 iters_per_sec=24111147.61 $ uv run --python=3.12 concates.py name='get_action_choices_1' iters=5000000 time=0.333 iters_per_sec=15023921.85 name='get_action_choices_2' iters=5000000 time=0.220 iters_per_sec=22699169.61
However, now that I benchmark some other expressions (such as some tuple concatenations), they may actually be slower in some of these cases; it looks like CPython internally converts the first input tuple to a list (BUILD_LIST
), then calls an INTRINSIC_LIST_TO_TUPLE
opcode on it...
comment:3 by , 4 weeks ago
Component: | Uncategorized → Documentation |
---|---|
Owner: | set to |
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
Version: | 5.1 → 5.2 |
I performed some more "traditional" benchmarking (see the test script and results below), and while there is a performance gain, I wouldn't necessarily accept this ticket based solely on that. However, I do believe we can accept the ticket and PR due to the existing precedent and preference in #28909. Additionally, I agree that the code reads better and is more robust (the tuple argument really stood out to me).
To proceed with the proposed changes, I think we need in the two separate commits as follows:
- One commit making the code changes with commit message as used before:
Refs #28909 -- Simplifed code using unpacking generalizations.
. - A second commit that refs this ticket which updates the contributing docs regarding python style.
Bonus point for providing a linter that would check for this in our code when running lint checks!
Test script and results:
import timeit setup_code = """ d1 = {str(i): i for i in range(500)} d2 = {str(i): -i for i in range(500, 1000)} d3 = {str(i): i for i in range(1, 500, 3)} merged = {} """ stmt_unpack = "merged = {**d1, **d2, **d3}" stmt_update = "merged.update(d1); merged.update(d2); merged.update(d3)" print("Dictionary unpacking **:", timeit.timeit(stmt_unpack, setup=setup_code, number=100000)) print("Dictionary update:", timeit.timeit(stmt_update, setup=setup_code, number=100000)) setup_code = """ list1 = list(range(0, 100)) list2 = list(range(0, 100, 3)) set1 = {str(i) for i in range(100, 200)} gen1 = (i for i in range(200, 300)) merged = [] merged_set = set() """ # Using + stmt_plus = "merged = list1 + list2 + list(set1) + list(gen1)" # Using extend() stmt_extend = "merged.extend(list1); merged.extend(list2); merged.extend(set1); merged.extend(gen1)" # Using update() stmt_update = "merged_set.update(list1); merged_set.update(list2); merged_set.update(set1); merged_set.update(gen1)" # Using * unpacking stmt_star = "merged = [*list1, *list2, *set1, *gen1]" # Using * unpacking on set stmt_star_set = "merged_set = {*list1, *list2, *set1, *gen1}" print("Using + :", timeit.timeit(stmt_plus, setup=setup_code, number=100000)) print("Using extend():", timeit.timeit(stmt_extend, setup=setup_code, number=100000)) print("Using update():", timeit.timeit(stmt_update, setup=setup_code, number=100000)) print("Using * unpacking:", timeit.timeit(stmt_star, setup=setup_code, number=100000)) print("Using * unpacking for set:", timeit.timeit(stmt_star_set, setup=setup_code, number=100000))
With results:
$ python unpacking.py Dictionary unpacking **: 1.0474329520002357 Dictionary update: 1.2098236809979426 Using + : 0.12143729099989287 Using extend(): 0.1061674349984969 Using update(): 0.13288522899892996 Using * unpacking: 0.0665277590014739 Using * unpacking for set: 0.1808751239987032
comment:4 by , 4 weeks ago
Summary: | Use PEP 448 unpacking to concatenate iterables → Update Python code style in contribution docs to mention using PEP 448 unpacking to concatenate iterables |
---|
comment:5 by , 4 weeks ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Setting as patch needs improvement and needs docs until the commits are arranged and referenced as requested.
follow-up: 8 comment:6 by , 4 weeks ago
Thanks! PR rebased and updated – as before, it doesn't touch tests, just to make it easier for the reviewer/merger to prove that these have no functional changes. I can do a follow-up PR to do the same for tests.
As for "Bonus point for providing a linter that would check for this in our code when running lint checks!", I did note Ruff's RUF005 rule in the second commit that adds a bit into the style guide, but didn't yet do anything more about it (see #34460, #35527, #35832 about Ruff).
comment:7 by , 4 weeks ago
Needs documentation: | unset |
---|---|
Patch needs improvement: | unset |
comment:8 by , 7 days ago
Replying to Aarni Koskela:
Thanks! PR rebased and updated – as before, it doesn't touch tests, just to make it easier for the reviewer/merger to prove that these have no functional changes. I can do a follow-up PR to do the same for tests.
Thank you!
As for "Bonus point for providing a linter that would check for this in our code when running lint checks!", I did note Ruff's RUF005 rule in the second commit that adds a bit into the style guide, but didn't yet do anything more about it (see #34460, #35527, #35832 about Ruff).
Right, we are not going to incorporate nor reference Ruff in the short term. I was mostly suggesting we'd analyze how feasible it would be to do something custom for Django using the linting tools we already have/
comment:9 by , 43 hours ago
Triage Stage: | Accepted → Ready for checkin |
---|
I don't love micro-optimization tickets, especially without benchmarks, as it feels like unnecessary churn.
If we are to do this, I think it makes sense to add a point in the Python style guide. Possibly even add a linter.
I see we had a ticket for this previously (#28909).
I am going to cc some folks who were involved in the original ticket in case they have an opinion.