Opened 2 hours ago
Last modified 2 hours ago
#36858 assigned Cleanup/optimization
Optimize `db_default` creation
| Reported by: | Adam Johnson | Owned by: | Adam Johnson |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Currently, Field._get_default() returns a lambda that instantiates a new DatabaseDefault each time it is called. This expression is identical each time, so this work is wasted. We can optimize it by creating a single expression that is returned each time.
Thanks to Adam Sołtysik for the hint in https://forum.djangoproject.com/t/faster-bulk-create-using-dictionaries/43891
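The idea can be illustrated outside Django with a minimal sketch. `FakeDatabaseDefault` and both factory functions below are hypothetical stand-ins written for this illustration, not Django's actual code; they only contrast a callable that builds a new object on every call with one that returns a single pre-built object.

```python
class FakeDatabaseDefault:
    """Hypothetical stand-in for an expression wrapper; not Django's class."""

    instances_created = 0  # counts constructor calls for the comparison below

    def __init__(self, expression):
        type(self).instances_created += 1
        self.expression = expression


def make_default_per_call(expression):
    # "Before" pattern: each call to the returned callable builds a new object.
    return lambda: FakeDatabaseDefault(expression)


def make_default_shared(expression):
    # "After" pattern: build the object once and hand back the same one.
    shared = FakeDatabaseDefault(expression)
    return lambda: shared


FakeDatabaseDefault.instances_created = 0
per_call = make_default_per_call("NOW()")
per_call_results = [per_call() for _ in range(1000)]
per_call_count = FakeDatabaseDefault.instances_created

FakeDatabaseDefault.instances_created = 0
shared_default = make_default_shared("NOW()")
shared_results = [shared_default() for _ in range(1000)]
shared_count = FakeDatabaseDefault.instances_created

print(per_call_count, shared_count)
```

Sharing one object like this is only safe if callers treat the returned value as immutable, which is an assumption of this sketch.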
Benchmarked with a quick modification to the test suite:
    diff --git ./tests/bulk_create/tests.py ./tests/bulk_create/tests.py
    index 397fcb9186..ddfe315c0c 100644
    --- ./tests/bulk_create/tests.py
    +++ ./tests/bulk_create/tests.py
    @@ -877,6 +877,16 @@ def test_db_default_field_excluded(self):
                 2 if connection.features.can_return_rows_from_bulk_insert else 1,
             )
     
    +
    +    def test_benchmark(self):
    +        import time
    +        start = time.perf_counter()
    +        DbDefaultModel.objects.bulk_create(
    +            [DbDefaultModel(name=f"obj {i}") for i in range(100_000)]
    +        )
    +        end = time.perf_counter()
    +        print(f"{end - start:.3f} seconds")
    +
         @skipUnlessDBFeature(
             "can_return_rows_from_bulk_insert", "supports_expression_defaults"
         )
Run with:
    ./runtests.py --parallel 1 bulk_create.tests.BulkCreateTests.test_benchmark -v 0
    System check identified no issues (0 silenced).
    0.617 seconds
    ----------------------------------------------------------------------
    Ran 1 test in 0.618s

    OK
Results, best of 3:
Before: 0.691 seconds
After: 0.610 seconds
A ~12% speedup on this bulk_create() operation, on a model with a single db_default field.
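For reference, the percentage follows from the two timings above. This is a small arithmetic sketch; the variable names are illustrative only.

```python
before = 0.691  # seconds, best of 3 before the change
after = 0.610   # seconds, best of 3 after the change

# Reduction in wall-clock time relative to the original run.
time_saved_pct = (before - after) / before * 100

# The same change expressed as a throughput gain (rows per second).
throughput_gain_pct = (before / after - 1) * 100

print(f"{time_saved_pct:.1f}% less time, {throughput_gain_pct:.1f}% more throughput")
```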
Thanks for that, Adam! When I read the thread I figured something was definitely wrong, but I didn't have the time to look at it. Could you also submit a minimal benchmark against django-asv that makes use of db_default?