Opened 3 months ago
Last modified 3 months ago
#35572 assigned Cleanup/optimization
Improve performance replacing os.listdir() with os.scandir()
Reported by: | Paolo Melchiorre | Owned by: | Amir Karimi |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Normal | Keywords: | scandir listdir python os |
Cc: | Paolo Melchiorre | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Use os.scandir() instead of os.listdir() in the remaining occurrences in the code:
https://github.com/search?q=repo%3Adjango%2Fdjango+os.listdir&type=code
Based on the Python documentation:
Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory.
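As a rough illustration of the case the documentation describes, the sketch below lists the subdirectories of a path in both styles. The function names are hypothetical and not taken from Django's code base. The os.listdir() version makes a separate os.path.isdir() call for every name, while the os.scandir() version asks each os.DirEntry, which can usually answer from information gathered during the directory scan.

```python
import os


def subdirs_with_listdir(path):
    """Return sorted subdirectory names, checking each name with os.path.isdir()."""
    return sorted(
        name
        for name in os.listdir(path)
        # One extra filesystem lookup per entry.
        if os.path.isdir(os.path.join(path, name))
    )


def subdirs_with_scandir(path):
    """Return sorted subdirectory names using os.DirEntry.is_dir()."""
    # The context manager closes the scandir iterator promptly.
    with os.scandir(path) as entries:
        # is_dir() can often reuse the file type reported by the directory scan.
        return sorted(entry.name for entry in entries if entry.is_dir())
```

Both helpers follow symbolic links by default, so they should return the same names for the same directory; any speed-up depends on the platform and on how many entries the directory holds.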
Change History (4)
comment:1 by , 3 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 3 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
follow-up: 4 comment:3 by , 3 months ago
Component: | Uncategorized → Core (Other) |
---|
The description makes it sound like this is a simple find and replace all, however, do all usages "also need file type or file attribute information"?
comment:4 by , 3 months ago
Replying to Tim Graham:
The description makes it sound like this is a simple find and replace all, however, do all usages "also need file type or file attribute information"?
Good point! Except for this case: https://github.com/django/django/blob/aa74c4083e047473ac385753e047e075e8f04890/scripts/manage_translations.py#L42
I didn't find any other cases where file attributes (is_dir, etc.) are needed; the remaining usages need only the entry names or the number of entries returned by os.listdir(). The only edge that os.scandir() may still have is its lower memory consumption for large directories (which I suspect is the case in any of these usages).
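To make the distinction in this comment concrete, here is a minimal sketch of the two "names only" patterns mentioned: collecting entry names and counting entries. The helper names are hypothetical and not taken from Django's code base. When only names are needed, os.listdir() is the simplest call. For counting, os.scandir() returns an iterator, so the entries can be counted without first building a full list of names.

```python
import os


def entry_names(path):
    """Return the names in a directory; no file attributes are consulted."""
    return os.listdir(path)


def count_entries(path):
    """Count directory entries lazily, without materializing a list of names."""
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)
```

For small directories the difference is likely negligible, which is consistent with the doubt raised about a blanket find and replace.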
Similar to #29689; accepting, thank you.
Note that additional benchmarks in django-asv are always welcome 👍