Opened 5 weeks ago

Closed 39 hours ago

Last modified 39 hours ago

#36215 closed Cleanup/optimization (fixed)

Update Python code style in contribution docs to mention using PEP 448 unpacking to concatenate iterables

Reported by: Aarni Koskela Owned by: Aarni Koskela
Component: Documentation Version: 5.2
Severity: Normal Keywords:
Cc: Nick Pope, Tim Graham, Mariusz Felisiak Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

https://peps.python.org/pep-0448/ (shipped in Python 3.5) adds support for unpacking in tuple/list displays.

This is faster in micro-benchmarks (every little helps), more concise, arguably more readable, and avoids certain classes of bugs where users pass in tuples where Django expects lists and vice versa (see e.g. https://github.com/django/django/pull/15270).

Some instances of this can be automatically found and fixed with the Ruff linter's RUF005 rule (https://docs.astral.sh/ruff/rules/collection-literal-concatenation/).

Change History (11)

comment:1 by Sarah Boyce, 5 weeks ago

Cc: Nick Pope Tim Graham Mariusz Felisiak added

I don't love micro-optimization tickets, especially without benchmarks, as it feels like unnecessary churn.
If we are to do this, I think it makes sense to add a point in the Python style guide. Possibly even add a linter.

I see we had a ticket for this previously (#28909).
I am going to cc some folks who were involved in the original ticket in case they have an opinion.

comment:2 by Aarni Koskela, 5 weeks ago

Ah, sorry, yeah, I left the microbenchmarks out of the original ticket text for brevity.

BLANK_CHOICE_DASH = [("", "---------")]

def get_action_choices_1(default_choices=BLANK_CHOICE_DASH):
    choices = [] + default_choices
    ...
    return choices


def get_action_choices_2(default_choices=BLANK_CHOICE_DASH):
    choices = [*default_choices]
    ...
    return choices

(adapted from what get_action_choices in the admin does)

results in

$ uv run --python=3.13 concates.py
name='get_action_choices_1' iters=5000000 time=0.247 iters_per_sec=20206826.98
name='get_action_choices_2' iters=5000000 time=0.207 iters_per_sec=24111147.61
$ uv run --python=3.12 concates.py
name='get_action_choices_1' iters=5000000 time=0.333 iters_per_sec=15023921.85
name='get_action_choices_2' iters=5000000 time=0.220 iters_per_sec=22699169.61

However, now that I benchmark some other expressions (such as some tuple concatenations), they may actually be slower in some of these cases; it looks like CPython internally converts the first input tuple to a list (BUILD_LIST), then calls an INTRINSIC_LIST_TO_TUPLE opcode on it...

comment:3 by Natalia Bidart, 4 weeks ago

Component: UncategorizedDocumentation
Owner: set to Aarni Koskela
Status: newassigned
Triage Stage: UnreviewedAccepted
Version: 5.15.2

I performed some more "traditional" benchmarking (see the test script and results below), and while there is a performance gain, I wouldn't necessarily accept this ticket based solely on that. However, I do believe we can accept the ticket and PR due to the existing precedent and preference in #28909. Additionally, I agree that the code reads better and is more robust (the tuple argument really stood out to me).

To proceed with the proposed changes, I think we need in the two separate commits as follows:

  • One commit making the code changes with commit message as used before: Refs #28909 -- Simplifed code using unpacking generalizations..
  • A second commit that refs this ticket which updates the contributing docs regarding python style.

Bonus point for providing a linter that would check for this in our code when running lint checks!

Test script and results:

import timeit

setup_code = """
d1 = {str(i): i for i in range(500)}
d2 = {str(i): -i for i in range(500, 1000)}
d3 = {str(i): i for i in range(1, 500, 3)}
merged = {}
"""

stmt_unpack = "merged = {**d1, **d2, **d3}"
stmt_update = "merged.update(d1); merged.update(d2); merged.update(d3)"

print("Dictionary unpacking **:", timeit.timeit(stmt_unpack, setup=setup_code, number=100000))
print("Dictionary update:", timeit.timeit(stmt_update, setup=setup_code, number=100000))

setup_code = """
list1 = list(range(0, 100))
list2 = list(range(0, 100, 3))
set1 = {str(i) for i in range(100, 200)}
gen1 = (i for i in range(200, 300))
merged = []
merged_set = set()
"""

# Using +
stmt_plus = "merged = list1 + list2 + list(set1) + list(gen1)"

# Using extend()
stmt_extend = "merged.extend(list1); merged.extend(list2); merged.extend(set1); merged.extend(gen1)"

# Using update()
stmt_update = "merged_set.update(list1); merged_set.update(list2); merged_set.update(set1); merged_set.update(gen1)"

# Using * unpacking
stmt_star = "merged = [*list1, *list2, *set1, *gen1]"

# Using * unpacking on set
stmt_star_set = "merged_set = {*list1, *list2, *set1, *gen1}"

print("Using + :", timeit.timeit(stmt_plus, setup=setup_code, number=100000))
print("Using extend():", timeit.timeit(stmt_extend, setup=setup_code, number=100000))
print("Using update():", timeit.timeit(stmt_update, setup=setup_code, number=100000))
print("Using * unpacking:", timeit.timeit(stmt_star, setup=setup_code, number=100000))
print("Using * unpacking for set:", timeit.timeit(stmt_star_set, setup=setup_code, number=100000))

With results:

$ python unpacking.py 
Dictionary unpacking **: 1.0474329520002357
Dictionary update: 1.2098236809979426

Using + : 0.12143729099989287
Using extend(): 0.1061674349984969
Using update(): 0.13288522899892996
Using * unpacking: 0.0665277590014739
Using * unpacking for set: 0.1808751239987032

comment:4 by Natalia Bidart, 4 weeks ago

Summary: Use PEP 448 unpacking to concatenate iterablesUpdate Python code style in contribution docs to mention using PEP 448 unpacking to concatenate iterables

comment:5 by Natalia Bidart, 4 weeks ago

Needs documentation: set
Patch needs improvement: set

Setting as patch needs improvement and needs docs until the commits are arranged and referenced as requested.

comment:6 by Aarni Koskela, 4 weeks ago

Thanks! PR rebased and updated – as before, it doesn't touch tests, just to make it easier for the reviewer/merger to prove that these have no functional changes. I can do a follow-up PR to do the same for tests.

As for "Bonus point for providing a linter that would check for this in our code when running lint checks!", I did note Ruff's RUF005 rule in the second commit that adds a bit into the style guide, but didn't yet do anything more about it (see #34460, #35527, #35832 about Ruff).

comment:7 by Aarni Koskela, 4 weeks ago

Needs documentation: unset
Patch needs improvement: unset

in reply to:  6 comment:8 by Natalia Bidart, 7 days ago

Replying to Aarni Koskela:

Thanks! PR rebased and updated – as before, it doesn't touch tests, just to make it easier for the reviewer/merger to prove that these have no functional changes. I can do a follow-up PR to do the same for tests.

Thank you!

As for "Bonus point for providing a linter that would check for this in our code when running lint checks!", I did note Ruff's RUF005 rule in the second commit that adds a bit into the style guide, but didn't yet do anything more about it (see #34460, #35527, #35832 about Ruff).

Right, we are not going to incorporate nor reference Ruff in the short term. I was mostly suggesting we'd analyze how feasible it would be to do something custom for Django using the linting tools we already have/

comment:9 by Natalia Bidart, 43 hours ago

Triage Stage: AcceptedReady for checkin

comment:10 by nessita <124304+nessita@…>, 39 hours ago

Resolution: fixed
Status: assignedclosed

In 6b325067:

Fixes #36215 -- Included unpacking generalization notes in coding style guide (PEP-448).

comment:11 by Natalia <124304+nessita@…>, 39 hours ago

In 0581ec2:

[5.2.x] Fixes #36215 -- Included unpacking generalization notes in coding style guide (PEP-448).

Backport of 6b3250673937b105af44f2f14247e56876f8dbe1 from main.

Note: See TracTickets for help on using tickets.
Back to Top