Opened 8 years ago

Closed 3 years ago

#28459 closed Cleanup/optimization (fixed)

Improve performance of QuerySet

Reported by: Sergey Fedoseev Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no
Pull Requests:9070 unmerged, 8854 merged, 9739 merged, 8934 merged, 9542 merged, 9486 merged, 9509 merged, 9494 merged, 8851 merged, 8935 merged, 8867 merged, 8835 merged, 8849 merged, 8878 merged, 8871 merged, 8843 merged, 8845 merged, 8882 merged

Description

Before:

In [2]: %timeit for x in City.objects.values_list('id'): pass
106 ms ± 2.88 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in City.objects.values_list('id', 'id'): pass
124 ms ± 846 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in City.objects.values_list('id'): pass
77.7 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in City.objects.values_list('id', 'id'): pass
95.9 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

PR

Change History (37)

comment:1 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: newclosed

In 2d136ede:

Fixed #28459 -- Improved performance of ValuesListIterable.

comment:2 by Sergey Fedoseev, 8 years ago

Summary: Improve performance of ValuesListIterableImprove performance of QuerySet

PR (Improved performance of Model.from_db() when fields are deferred)

Before:

In [3]: %timeit for x in City.objects.only('id'): pass               
1.07 s ± 10.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [2]: %timeit for x in City.objects.only('id'): pass
961 ms ± 5.68 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

comment:3 by Sergey Fedoseev, 8 years ago

PR (Improved performance of select_related() when model is prefetched from its parent.)

Before:

In [2]: %timeit for x in Human.objects.select_related('person'): pass
196 ms ± 2.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in Human.objects.select_related('person'): pass
186 ms ± 1.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:4 by Tim Graham <timograham@…>, 8 years ago

In 78f7f68:

Refs #28459 -- Improved performance of Model.from_db() when fields are deferred.

comment:5 by Sergey Fedoseev, 8 years ago

PR (Improved performance of SQLCompiler.results_iter())

Before:

In [3]: %timeit for x in City.objects.values_list('id'): pass
74 ms ± 1.45 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [2]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
343 ms ± 2.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [4]: %timeit for x in City.objects.values_list('id'): pass
69.1 ms ± 744 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
264 ms ± 3.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Last edited 8 years ago by Sergey Fedoseev (previous) (diff)

comment:6 by Sergey Fedoseev, 8 years ago

PR (Optimized ModelState instantiation)

Before:

In [5]: %timeit PersonSkill()
8.9 µs ± 13 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [2]: %timeit for x in PersonSkill.objects.all(): pass
83 ms ± 628 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is None for all objects
In [13]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
111 ms ± 446 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is not None for all objects
In [10]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
187 ms ± 735 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [18]: %timeit PersonSkill()
8.54 µs ± 23.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [10]: %timeit for x in PersonSkill.objects.all(): pass
75.1 ms ± 825 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is None for all objects
In [5]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
108 ms ± 342 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# skill is not None for all objects
In [7]: %timeit for x in PersonSkill.objects.select_related('skill'): pass
178 ms ± 817 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Last edited 8 years ago by Sergey Fedoseev (previous) (diff)

comment:7 by Josh Smeaton, 8 years ago

These are awesome! We need more of these kind of patches.

comment:8 by Tim Graham <timograham@…>, 8 years ago

In ca46f468:

Refs #28459 -- Improved performance of SQLCompiler.results_iter().

comment:9 by Tim Graham <timograham@…>, 8 years ago

In ba1d82df:

Refs #28459 -- Improved performance of select_related() when model is prefetched from its parent.

comment:10 by Sergey Fedoseev, 8 years ago

PR (Improved performance of ValuesIterable)

Before:

In [6]: %timeit -n 10 for x in City.objects.values('id', 'b'): pass
212 ms ± 4.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [5]: %timeit for x in City.objects.values('id', 'b'): pass
151 ms ± 850 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:11 by Sergey Fedoseev, 8 years ago

PR (Improved performance of BaseExpression.convert_value())

Before:

In [2]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
262 ms ± 3.42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [3]: %timeit for x in City.objects.annotate(v=models.Value('1', output_field=models.TextField())).values_list('v'): pass
243 ms ± 3.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [2]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
191 ms ± 2.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [3]: %timeit for x in City.objects.annotate(v=models.Value('1', output_field=models.TextField())).values_list('v'): pass
65.6 ms ± 878 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:12 by Sergey Fedoseev <fedoseev.sergey@…>, 8 years ago

In fb0bdff:

Refs #28459 -- Improved performance of ValuesIterable.

comment:13 by Sergey Fedoseev, 8 years ago

PR (Improved performance of sql.compiler.cursor_iter())
Before:

In [6]: %timeit for x in City.objects.values_list('id'): pass
67.2 ms ± 1.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [6]: %timeit for x in City.objects.values_list('id'): pass
58 ms ± 580 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:14 by Tim Graham <timograham@…>, 8 years ago

In b8731c3:

Refs #28459 -- Improved performance of sql.compiler.cursor_iter().

comment:15 by Tim Graham <timograham@…>, 8 years ago

In 22ff86e:

Refs #28459 -- Made Oracle get_db_converters() return converter for empty strings only when it's needed.

comment:16 by Sergey Fedoseev, 8 years ago

PR (Improved performance of loading DecimalField on SQLite)
Before:

In [5]: %timeit for x in Test.objects.values_list('decimal'): pass
150 ms ± 1.76 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %timeit for x in Test.objects.values_list(models.F('decimal') + 1): pass
123 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in Test.objects.values_list('decimal'): pass
28.8 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit for x in Test.objects.values_list(models.F('decimal') + 1): pass
84.6 ms ± 541 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:17 by Tim Graham <timograham@…>, 8 years ago

In a146b656:

Refs #28459 -- Improved performance of loading DecimalField on SQLite.

comment:18 by Tim Graham <timograham@…>, 8 years ago

In 5cc74620:

Refs #28459 -- Optimized ModelState instantiation.

comment:19 by Tim Graham <timograham@…>, 8 years ago

In 47ccefea:

Refs #28459 -- Improved performance of BaseExpression.convert_value().

comment:20 by Sergey Fedoseev, 8 years ago

PR (Improved performance of FlatValuesListIterable)

Before:

In [3]: %timeit -n 100 for x in City.objects.values_list('id', flat=True): pass
61.2 ms ± 719 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

After:

In [30]: %timeit -n 100 for x in City.objects.values_list('id', flat=True): pass
58.8 ms ± 514 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

comment:21 by Sergey Fedoseev, 8 years ago

PR (Improved performance of SQLCompiler.apply_converters())

Before:

In [3]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
176 ms ± 725 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values('v'): pass
257 ms ± 2.18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

In [3]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values_list('v'): pass
170 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [5]: %timeit for x in City.objects.annotate(v=models.Value(1, output_field=models.IntegerField())).values('v'): pass
224 ms ± 1.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

comment:22 by Tim Graham <timograham@…>, 8 years ago

In 4dfd6b88:

Refs #28459 -- Improved performance of FlatValuesListIterable.

comment:23 by Tim Graham <timograham@…>, 8 years ago

In dca67bb:

Refs #28459 -- Improved performance of SQLCompiler.apply_converters().

comment:24 by Sergey Fedoseev, 7 years ago

PR (Improved performance of loading models from DB)

class TestModel(models.Model):
    field1 = models.IntegerField(null=True)
    field2 = models.IntegerField(null=True)
    field3 = models.IntegerField(null=True)
    field4 = models.IntegerField(null=True)
    field5 = models.IntegerField(null=True)

TestModel.objects.count() == 20000, values for all fields is None.

Before:

In [7]: %timeit for x in TestModel.objects.all(): pass
163 ms ± 1.41 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [7]: %timeit for x in TestModel.objects.all(): pass
143 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:25 by Tomer Chachamu, 7 years ago

Resolution: fixed
Status: closednew

There's a patch so reopening the card.

comment:26 by Sergey Fedoseev, 7 years ago

PR (Improved performance of loading DecimalField on SQLite)

Before:

In [2]: %timeit for x in DecimalModel.objects.values_list('d'): pass
128 ms ± 531 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit for x in DecimalModel.objects.values_list(models.F('d') + 1): pass
39.5 ms ± 542 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [2]: %timeit for x in DecimalModel.objects.values_list('d'): pass
31.8 ms ± 490 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit for x in DecimalModel.objects.values_list(models.F('d') + 1): pass
20.3 ms ± 579 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Last edited 7 years ago by Sergey Fedoseev (previous) (diff)

comment:27 by Tim Graham <timograham@…>, 7 years ago

In d0f569b3:

Refs #28459 -- Improved performance of loading DecimalField on SQLite.

comment:28 by Sergey Fedoseev, 7 years ago

PR (Improved performance of loading DurationField on SQLite and MySQL.)

Before:

%timeit for x in DurationModel.objects.values_list('d'): pass
168 ms ± 678 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

%timeit for x in DurationModel.objects.values_list('d'): pass
23.1 ms ± 243 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:29 by Tim Graham <timograham@…>, 7 years ago

In ae1baa7d:

Refs #28459 -- Improved performance of loading DurationField on SQLite and MySQL.

comment:30 by Sergey Fedoseev, 7 years ago

PR (Used default date converter on SQLite for better performance)

Before:

%timeit for x in TestModel.objects.values_list('date'): pass
83.3 ms ± 829 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

%timeit for x in TestModel.objects.values_list('date'): pass
40.7 ms ± 613 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:31 by Sergey Fedoseev <fedoseev.sergey@…>, 7 years ago

In 51ae4e1f:

Refs #28459 -- Used default date converter on SQLite for better performance.

See https://docs.python.org/3/library/sqlite3.html#default-adapters-and-converters.

comment:32 by Sergey Fedoseev, 7 years ago

PR (Improved performance of duration expressions on SQLite)

Before:

%timeit for x in Date.objects.annotate(d=models.ExpressionWrapper(models.F('date') + models.F('duration'), models.DateTimeField())).values_list('d'): pass
591 ms ± 21.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

%timeit for x in Date.objects.annotate(d=models.ExpressionWrapper(models.F('date') + models.F('duration'), models.DateTimeField())).values_list('d'): pass
390 ms ± 2.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

comment:33 by Tim Graham <timograham@…>, 7 years ago

In 2115be6:

Refs #28459 -- Improved performance of duration expressions on SQLite.

comment:34 by Sergey Fedoseev, 7 years ago

PR (Improved performance of time difference expressions on MySQL)

Before:

%timeit bool(TestModel.objects.values_list(models.ExpressionWrapper(models.F('time') - models.F('time'), output_field=models.DurationField())))
35.4 ms ± 901 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

%timeit bool(TestModel.objects.values_list(models.ExpressionWrapper(models.F('time') - models.F('time'), output_field=models.DurationField())))
30 ms ± 810 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

comment:35 by Sergey Fedoseev <fedoseev.sergey@…>, 7 years ago

In ba4a9862:

Refs #28459 -- Improved performance of time difference expressions on MySQL.

comment:36 by Tim Graham, 7 years ago

Patch needs improvement: set

comment:37 by Simon Charette, 3 years ago

Resolution: fixed
Status: newclosed

I think we can close this one as fixed now. There is nothing actionable here and separate tickets can be created if further performance improvement opportunities are identified.

Note: See TracTickets for help on using tickets.
Back to Top