Opened 24 hours ago
Last modified 18 hours ago
#36815 new Cleanup/optimization
Avoid unnecessary prepare_value calls when inserting db_defaults
| Reported by: | Adam Sołtysik | Owned by: | |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 6.0 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
In SQLInsertCompiler.as_sql, it looks like prepare_value is unnecessarily called for each DatabaseDefault value on every inserted object. These calls are quite expensive (as they enter the if hasattr(value, "resolve_expression") branch) and waste a lot of time when using bulk_create.
I've written a potential patch and a simple benchmark: https://github.com/adamsol/django/commit/d6f275c8960988c92a9da5eb323b6bd6fc7717cb. The benchmark runs 2x faster with the fix, even though there's only one field with db_default in that model. The difference may become larger with more fields.
Change History (6)
comment:1 by , 19 hours ago
comment:2 by , 19 hours ago
python runtests.py bulk_create -k perf
Lowest times from several runs:
- 0.117 with the fix
- 0.225 after reverting the fix
comment:3 by , 19 hours ago
Moreover it seems your potential fix is breaking this testcase, I will like to look more into this.
FAIL: test_crafted_xml_performance (serializers.test_deserialization.TestDeserializer.test_crafted_xml_performance) [constant depth, varying length]
The time to process invalid inputs is not quadratic.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python3.14/unittest/case.py", line 58, in testPartExecutor
yield
File "/usr/lib64/python3.14/unittest/case.py", line 565, in subTest
yield
File "/home/rishu/Developer/OSS/django/tests/serializers/test_deserialization.py", line 182, in assertFactor
self.assertLessEqual(sum(factors) / len(factors), factor)
^^^^^^^
File "/usr/lib64/python3.14/unittest/case.py", line 1303, in assertLessEqual
self.fail(self._formatMessage(msg, standardMsg))
^^^^^^^^^^^
File "/usr/lib64/python3.14/unittest/case.py", line 750, in fail
raise self.failureException(msg)
^^^^^^^^^^^^^^^
AssertionError: 2.2543277454075485 not less than or equal to 2
----------------------------------------------------------------------
Ran 18875 tests in 103.044s
FAILED (failures=1, skipped=1458, expected failures=5)
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database
Otherwise it works fine:
...s..................s.................................................................................... ---------------------------------------------------------------------- Ran 18875 tests in 106.334s OK (skipped=1458, expected failures=5) Destroying test database for alias 'default'... Destroying test database for alias 'default'... Destroying test database for alias 'default'... Destroying test database for alias 'default'...
comment:4 by , 19 hours ago
That test can apparently sometimes fail randomly, see the comment:
# Assert based on the average factor to reduce test flakiness. self.assertLessEqual(sum(factors) / len(factors), factor)
comment:5 by , 18 hours ago
Thanks. Also, can you please help me understand why this fix would work here specifically? I thought about patching this in the prepare_value method, somewhat like:
django/db/models/sql/compiler.py:1722
def prepare_value(self, field, value):
from django.db.models.expressions import DatabaseDefault
"""
Prepare a value to be used in a query by resolving it if it is an
expression and otherwise calling the field's get_db_prep_save().
"""
if hasattr(value, "resolve_expression"):
if not isinstance(value, DatabaseDefault):
return value
value = value.resolve_expression(
self.query, allow_joins=False, for_save=True
)
And an absurd number of testcases failed due to that. Maybe this might be a problem with my approach of generalization or this optimization is really specific to the case here. Either way, adding that context would he helpful.
comment:6 by , 18 hours ago
First of all, the condition has the wrong logic in this variant, it should be without negation. But apart from that, the field_prepare function (and so prepare_value) is called on DatabaseDefault values also later in the code, for databases that don't satisfy supports_default_keyword_in_bulk_insert (Oracle), so this approach probably wouldn't work there.
Can you please share the benchmark results?