Opened 3 years ago

Last modified 2 years ago

#28459 new Cleanup/optimization

Improve performance of QuerySet

Reported by: Sergey Fedoseev Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Before:

In [2]: %timeit for x in City.objects.values_list('id'): pass
106 ms ± 2.88 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in City.objects.values_list('id', 'id'): pass
124 ms ± 846 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in City.objects.values_list('id'): pass
77.7 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in City.objects.values_list('id', 'id'): pass
95.9 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

PR

Change History (36)

comment:1 Changed 3 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: newclosed

In 2d136ede:

Fixed #28459 -- Improved performance of ValuesListIterable.

comment:2 Changed 3 years ago by Sergey Fedoseev

Summary: Improve performance of ValuesListIterableImprove performance of QuerySet

PR (Improved performance of Model.from_db() when fields are deferred)

Before:

In [3]: %timeit for x in City.objects.only('id'): pass               
1.07 s ± 10.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [2]: %timeit for x in City.objects.only('id'): pass
961 ms ± 5.68 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

comment:3 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of select_related() when model is prefetched from its parent.)

Before:

In [2]: %timeit for x in Human.objects.select_related('person'): pass
196 ms ± 2.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in Human.objects.select_related('person'): pass
186 ms ± 1.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:4 Changed 3 years ago by Tim Graham <timograham@…>

In 78f7f68:

Refs #28459 -- Improved performance of Model.from_db() when fields are deferred.

comment:5 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of SQLCompiler.results_iter())

Before:

In [3]: %timeit for x in City.objects.values_list('id'): pass
74 ms ± 1.45 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [2]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
343 ms ± 2.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [4]: %timeit for x in City.objects.values_list('id'): pass
69.1 ms ± 744 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
264 ms ± 3.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Last edited 3 years ago by Sergey Fedoseev (previous) (diff)

comment:6 Changed 3 years ago by Sergey Fedoseev

PR (Optimized ModelState instantiation)

Before:

In [5]: %timeit PersonSkill()
8.9 µs ± 13 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [2]: %timeit for x in PersonSkill.objects.all(): pass
83 ms ± 628 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is None for all objects
In [13]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
111 ms ± 446 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is not None for all objects
In [10]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
187 ms ± 735 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [18]: %timeit PersonSkill()
8.54 µs ± 23.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [10]: %timeit for x in PersonSkill.objects.all(): pass
75.1 ms ± 825 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is None for all objects
In [5]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
108 ms ± 342 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is not None for all objects
In [7]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
178 ms ± 817 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Last edited 3 years ago by Sergey Fedoseev (previous) (diff)

comment:7 Changed 3 years ago by Josh Smeaton

These are awesome! We need more of these kind of patches.

comment:8 Changed 3 years ago by Tim Graham <timograham@…>

In ca46f468:

Refs #28459 -- Improved performance of SQLCompiler.results_iter().

comment:9 Changed 3 years ago by Tim Graham <timograham@…>

In ba1d82df:

Refs #28459 -- Improved performance of select_related() when model is prefetched from its parent.

comment:10 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of ValuesIterable)

Before:

In [6]: %timeit -n 10 for x in City.objects.values('id', 'b'): pass
212 ms ± 4.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [5]: %timeit for x in City.objects.values('id', 'b'): pass
151 ms ± 850 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:11 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of BaseExpression.convert_value())

Before:

In [2]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
262 ms ± 3.42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [3]: %timeit for x in City.objects.annotate(v=models.Value('1', output_field=models.TextField())).values_list('v'): pass
243 ms ± 3.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [2]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
191 ms ± 2.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [3]: %timeit for x in City.objects.annotate(v=models.Value('1', output_field=models.TextField())).values_list('v'): pass
65.6 ms ± 878 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:12 Changed 3 years ago by Sergey Fedoseev <fedoseev.sergey@…>

In fb0bdff:

Refs #28459 -- Improved performance of ValuesIterable.

comment:13 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of sql.compiler.cursor_iter())
Before:

In [6]: %timeit for x in City.objects.values_list('id'): pass
67.2 ms ± 1.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [6]: %timeit for x in City.objects.values_list('id'): pass
58 ms ± 580 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:14 Changed 3 years ago by Tim Graham <timograham@…>

In b8731c3:

Refs #28459 -- Improved performance of sql.compiler.cursor_iter().

comment:15 Changed 3 years ago by Tim Graham <timograham@…>

In 22ff86e:

Refs #28459 -- Made Oracle get_db_converters() return converter for empty strings only when it's needed.

comment:16 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of loading DecimalField on SQLite)
Before:

In [5]: %timeit for x in Test.objects.values_list('decimal'): pass
150 ms ± 1.76 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %timeit for x in Test.objects.values_list(models.F('decimal') + 1): pass
123 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in Test.objects.values_list('decimal'): pass
28.8 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in Test.objects.values_list(models.F('decimal') + 1): pass
84.6 ms ± 541 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:17 Changed 3 years ago by Tim Graham <timograham@…>

In a146b656:

Refs #28459 -- Improved performance of loading DecimalField on SQLite.

comment:18 Changed 3 years ago by Tim Graham <timograham@…>

In 5cc74620:

Refs #28459 -- Optimized ModelState instantiation.

comment:19 Changed 3 years ago by Tim Graham <timograham@…>

In 47ccefea:

Refs #28459 -- Improved performance of BaseExpression.convert_value().

comment:20 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of FlatValuesListIterable)

Before:

In [3]: %timeit -n 100 for x in City.objects.values_list('id', flat=True): pass
61.2 ms ± 719 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

After:

In [30]: %timeit -n 100 for x in City.objects.values_list('id', flat=True): pass
58.8 ms ± 514 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

comment:21 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of SQLCompiler.apply_converters())

Before:

In [3]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
176 ms ± 725 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values('v'): pass
257 ms ± 2.18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [3]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
170 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [5]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values('v'): pass
224 ms ± 1.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

comment:22 Changed 3 years ago by Tim Graham <timograham@…>

In 4dfd6b88:

Refs #28459 -- Improved performance of FlatValuesListIterable.

comment:23 Changed 3 years ago by Tim Graham <timograham@…>

In dca67bb:

Refs #28459 -- Improved performance of SQLCompiler.apply_converters().

comment:24 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of loading models from DB)

class TestModel(models.Model):
    field1 = models.IntegerField(null=True)
    field2 = models.IntegerField(null=True)
    field3 = models.IntegerField(null=True)
    field4 = models.IntegerField(null=True)
    field5 = models.IntegerField(null=True)

TestModel.objects.count() == 20000, values for all fields is None.

Before:

In [7]: %timeit for x in TestModel.objects.all(): pass
163 ms ± 1.41 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [7]: %timeit for x in TestModel.objects.all(): pass
143 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:25 Changed 3 years ago by Tomer Chachamu

Resolution: fixed
Status: closednew

There's a patch so reopening the card.

comment:26 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of loading DecimalField on SQLite)

Before:

In [2]: %timeit for x in DecimalModel.objects.values_list('d'): pass
128 ms ± 531 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit for x in DecimalModel.objects.values_list(models.F('d') + 1): pass
39.5 ms ± 542 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in DecimalModel.objects.values_list('d'): pass
31.8 ms ± 490 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit for x in DecimalModel.objects.values_list(models.F('d') + 1): pass
20.3 ms ± 579 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Last edited 3 years ago by Sergey Fedoseev (previous) (diff)

comment:27 Changed 3 years ago by Tim Graham <timograham@…>

In d0f569b3:

Refs #28459 -- Improved performance of loading DecimalField on SQLite.

comment:28 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of loading DurationField on SQLite and MySQL.)

Before:

%timeit for x in DurationModel.objects.values_list('d'): pass
168 ms ± 678 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

%timeit for x in DurationModel.objects.values_list('d'): pass
23.1 ms ± 243 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:29 Changed 3 years ago by Tim Graham <timograham@…>

In ae1baa7d:

Refs #28459 -- Improved performance of loading DurationField on SQLite and MySQL.

comment:30 Changed 3 years ago by Sergey Fedoseev

PR (Used default date converter on SQLite for better performance)

Before:

%timeit for x in TestModel.objects.values_list('date'): pass
83.3 ms ± 829 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

%timeit for x in TestModel.objects.values_list('date'): pass
40.7 ms ± 613 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:31 Changed 3 years ago by Sergey Fedoseev <fedoseev.sergey@…>

In 51ae4e1f:

Refs #28459 -- Used default date converter on SQLite for better performance.

See https://docs.python.org/3/library/sqlite3.html#default-adapters-and-converters.

comment:32 Changed 3 years ago by Sergey Fedoseev

PR (Improved performance of duration expressions on SQLite)

Before:

%timeit for x in Date.objects.annotate(d=models.ExpressionWrapper(models.F('date') + models.F('duration'), models.DateTimeField())).values_list('d'): pass
591 ms ± 21.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

%timeit for x in Date.objects.annotate(d=models.ExpressionWrapper(models.F('date') + models.F('duration'), models.DateTimeField())).values_list('d'): pass
390 ms ± 2.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

comment:33 Changed 3 years ago by Tim Graham <timograham@…>

In 2115be6:

Refs #28459 -- Improved performance of duration expressions on SQLite.

comment:34 Changed 2 years ago by Sergey Fedoseev

PR (Improved performance of time difference expressions on MySQL)

Before:

%timeit bool(TestModel.objects.values_list(models.ExpressionWrapper(models.F('time') - models.F('time'), output_field=models.DurationField())))
35.4 ms ± 901 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

%timeit bool(TestModel.objects.values_list(models.ExpressionWrapper(models.F('time') - models.F('time'), output_field=models.DurationField())))
30 ms ± 810 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:35 Changed 2 years ago by Sergey Fedoseev <fedoseev.sergey@…>

In ba4a9862:

Refs #28459 -- Improved performance of time difference expressions on MySQL.

comment:36 Changed 2 years ago by Tim Graham

Patch needs improvement: set
Note: See TracTickets for help on using tickets.
Back to Top