Opened 3 weeks ago

Closed 3 weeks ago

Last modified 3 weeks ago

#36858 closed Cleanup/optimization (fixed)

Optimize `db_default` creation

Reported by: Adam Johnson
Owned by: Adam Johnson
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Currently, Field._get_default() returns a lambda that instantiates a new DatabaseDefault each time it is called. This expression is identical each time, so this work is wasted. We can optimize it by creating a single expression that is returned each time.
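The idea can be illustrated with a minimal, Django-free sketch. The `DatabaseDefault` class below is a simplified stand-in that only counts how often it is instantiated, and `make_default_before`, `make_default_after`, and `instances_for` are hypothetical helper names chosen for this example, not Django's actual code.

```python
class DatabaseDefault:
    """Simplified stand-in for the real expression class (assumption)."""

    instances_created = 0

    def __init__(self, expression, output_field=None):
        DatabaseDefault.instances_created += 1
        self.expression = expression
        self.output_field = output_field


def make_default_before(expression, output_field=None):
    # A new DatabaseDefault is built on every call of the returned lambda.
    return lambda: DatabaseDefault(expression, output_field=output_field)


def make_default_after(expression, output_field=None):
    # One DatabaseDefault is built up front and returned on every call.
    shared = DatabaseDefault(expression, output_field=output_field)
    return lambda: shared


def instances_for(factory, calls=1000):
    """Count how many instances are created for `calls` default lookups."""
    DatabaseDefault.instances_created = 0
    get_default = factory("now()")
    for _ in range(calls):
        get_default()
    return DatabaseDefault.instances_created


print("before:", instances_for(make_default_before))
print("after: ", instances_for(make_default_after))
```

Sharing one instance is only safe if callers treat it as read-only; this sketch assumes they do.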

Thanks to Adam Sołtysik for the hint in https://forum.djangoproject.com/t/faster-bulk-create-using-dictionaries/43891

Benchmarked with a quick modification to the test suite:

  • ./tests/bulk_create/tests.py

    diff --git ./tests/bulk_create/tests.py ./tests/bulk_create/tests.py
    index 397fcb9186..ddfe315c0c 100644
    --- ./tests/bulk_create/tests.py
    +++ ./tests/bulk_create/tests.py
    @@ -877,6 +877,16 @@ def test_db_default_field_excluded(self):
                 2 if connection.features.can_return_rows_from_bulk_insert else 1,
             )
     
    +
    +    def test_benchmark(self):
    +        import time
    +        start = time.perf_counter()
    +        DbDefaultModel.objects.bulk_create(
    +            [DbDefaultModel(name=f"obj {i}") for i in range(100_000)]
    +        )
    +        end = time.perf_counter()
    +        print(f"{end - start:.3f} seconds")
    +
         @skipUnlessDBFeature(
             "can_return_rows_from_bulk_insert", "supports_expression_defaults"
         )

Run with:

./runtests.py --parallel 1 bulk_create.tests.BulkCreateTests.test_benchmark -v 0
System check identified no issues (0 silenced).
0.617 seconds
----------------------------------------------------------------------
Ran 1 test in 0.618s

OK

Results, best of 3:

Before: 0.691 seconds
After: 0.610 seconds

A ~12% speedup on this bulk_create() operation, on a model with a single db_default field.
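The "best of 3" figure and the percentage can be reproduced with a small helper. This is only a sketch: `best_of` and `speedup_percent` are hypothetical names, and the workload passed to `best_of` is left to the reader.

```python
import time


def best_of(func, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup_percent(before, after):
    """Percentage reduction in run time going from `before` to `after`."""
    return (before - after) / before * 100


# Timings reported in this ticket.
print(f"{speedup_percent(0.691, 0.610):.1f}% faster")
```

With the timings above this gives about 11.7%, which the ticket rounds to ~12%.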

Change History (4)

comment:1 by Simon Charette, 3 weeks ago

Triage Stage: Unreviewed → Accepted

Thanks for that Adam! When I read the thread I figured something was definitely wrong but I didn't have the time to look at it. Could you also submit a minimal benchmark against django-asv that makes use of db_default?

comment:2 by Jacob Walls, 3 weeks ago

Triage Stage: Accepted → Ready for checkin

comment:3 by Jacob Walls <jacobtylerwalls@…>, 3 weeks ago

Resolution: fixed
Status: assigned → closed

In 2b192bff:

Fixed #36858 -- Optimized Field._get_default() for db_default case.

Create and share a single instance of DatabaseDefault instead of making a new
one each time the lambda is called. The quick benchmark on the ticket shows a
~12% speedup for a large bulk_create() operation.

comment:4 by Adam Johnson, 3 weeks ago (in reply to: comment:1)

Replying to Simon Charette:

Thanks for that Adam! When I read the thread I figured something was definitely wrong but I didn't have the time to look at it. Could you also submit a minimal benchmark against django-asv that makes use of db_default?

https://github.com/django/django-asv/pull/95
