Opened 23 months ago

Closed 22 months ago

Last modified 21 months ago

#20950 closed Cleanup/optimization (fixed)

Use OrderedDicts in ORM only when needed

Reported by: akaariai Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Initializing OrderedDicts seems to be really slow, at least on Python 2.7. By instantiating OrderedDicts in Query only when needed, one can save considerable time. For example, the model_save_existing benchmark is 1.3x faster and qs_filter_chaining is 1.35x faster. Nearly all of the query_ benchmarks show a speedup of at least 10%.

Patch at https://github.com/akaariai/django/tree/ordered_dict_on_need. Together with the splitted_clone branch, this gives a speedup of over 1.5x for model_save_existing.

There might be some cleaner way to implement the "initiate only on need" for Query._aggregates and Query._extra. Ideas welcome.
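One possible shape for the "only on need" idea is sketched below. It is a minimal illustration, not the code from the patch: LazyQuery is a hypothetical stand-in for sql.Query, and the assumption is that the private _aggregates and _extra attributes start as None and are replaced by an OrderedDict on first access through a property.

```python
from collections import OrderedDict


class LazyQuery:
    """Hypothetical stand-in for sql.Query; not Django's actual class."""

    def __init__(self):
        # Store None instead of an empty OrderedDict. Most queries never
        # use aggregates or extra selects, so nothing is built for them.
        self._aggregates = None
        self._extra = None

    @property
    def aggregates(self):
        # Create the OrderedDict on first access only.
        if self._aggregates is None:
            self._aggregates = OrderedDict()
        return self._aggregates

    @property
    def extra(self):
        # Same lazy pattern as for aggregates.
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra


query = LazyQuery()
untouched = query._aggregates is None   # nothing allocated yet
query.aggregates["total"] = "SUM(price)"
created = isinstance(query._aggregates, OrderedDict)
```

The cost of this approach is that any code reading query.aggregates or query.extra creates the dictionary as a side effect, so read-only callers need to check the private attribute first.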

I'll accept this directly, as it seems like a good idea. This trades code cleanliness for performance, but in this particular case I think it is worth it.

Change History (3)

comment:1 Changed 22 months ago by akaariai

I benchmarked the change that introduced OrderedDict (that is, commit 07876cf02b6db453ca0397c29c225668872fa96d). It seems the change introduces a slowdown of around 15% in the model_save_existing benchmark. Initializing an empty OrderedDict is around 50% slower than initializing Django's SortedDict was (different algorithms, different tradeoffs).

Using Python's version of the ordered dictionary is the correct thing to do. I am pretty sure Python's OrderedDict will get an optimised implementation some day. But until that happens, it seems like a good idea to avoid initializing empty OrderedDicts where possible.
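The construction cost discussed here can be measured with a small timeit script. This is only a rough sketch and not the benchmark suite used for the numbers in this ticket; the helper name time_empty_construction is made up for illustration, and it compares OrderedDict against a plain dict rather than SortedDict. Results depend heavily on the Python version: since Python 3.5 OrderedDict is implemented in C, so the gap is much smaller than on Python 2.7.

```python
import timeit
from collections import OrderedDict


def time_empty_construction(factory, number=100000):
    """Return the seconds taken to call factory() `number` times."""
    return timeit.timeit(factory, number=number)


dict_seconds = time_empty_construction(dict)
ordered_seconds = time_empty_construction(OrderedDict)

print("dict():        %.4f s" % dict_seconds)
print("OrderedDict(): %.4f s" % ordered_seconds)
print("ratio:         %.2fx" % (ordered_seconds / dict_seconds))
```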

comment:2 Changed 22 months ago by Anssi Kääriäinen <anssi.kaariainen@…>

  • Resolution set to fixed
  • Status changed from new to closed

In ff723d894d9272ea721d1996432ffc806c2b8180:

Fixed #20950 -- Instantiate OrderedDict() only when needed

The use of OrderedDict (even an empty one) was surprisingly slow. By
initializing OrderedDict only when needed it is possible to save a
non-trivial amount of computing time (Model.save() is around 30% faster,
for example).

This commit targeted sql.Query only; there are likely other places
which could use similar optimizations.

comment:3 Changed 21 months ago by Anssi Kääriäinen <akaariai@…>

In d64060a73650360dcabfdb4928a9e92d090925b1:

OrderedDict creation avoidance for .values() queries

Avoid accessing query.extra and query.aggregates directly for .values()
queries. Refs #20950.
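The kind of avoidance described in this commit could look roughly like the following. This is a hedged sketch rather than the committed code: LazyQuery and extra_names are hypothetical names, and the assumption is that the query keeps a private _extra attribute that stays None until the extra property is first accessed.

```python
from collections import OrderedDict


class LazyQuery:
    """Hypothetical query object with a lazily created `extra` mapping."""

    def __init__(self):
        self._extra = None

    @property
    def extra(self):
        # Accessing the property creates the OrderedDict as a side effect.
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra


def extra_names(query):
    """List extra select names without forcing OrderedDict creation."""
    # Read the private attribute directly; going through query.extra
    # would allocate an empty OrderedDict just to find out it is empty.
    if query._extra is None:
        return []
    return list(query._extra)


query = LazyQuery()
names_before = extra_names(query)
still_unallocated = query._extra is None
query.extra["is_recent"] = "pub_date > '2013-01-01'"
names_after = extra_names(query)
```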
