Opened 12 years ago

Closed 11 years ago

Last modified 11 years ago

#20950 closed Cleanup/optimization (fixed)

Use OrderedDicts in ORM only when needed

Reported by: Anssi Kääriäinen Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Initializing OrderedDicts seems to be really slow, at least on Python 2.7. By instantiating OrderedDicts in Query only when needed, one can save considerable time. For example, the model_save_existing benchmark is 1.3x faster and qs_filter_chaining is 1.35x faster. Nearly all of the query_ benchmarks show at least a 10% speedup.

Patch at https://github.com/akaariai/django/tree/ordered_dict_on_need. Together with the splitted_clone branch, this gives a speedup of over 1.5x to model_save_existing.

There might be some cleaner way to implement the "initiate only on need" for Query._aggregates and Query._extra. Ideas welcome.
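One possible shape for "initiate only on need" is a property that creates the OrderedDict on first access. The following is only a minimal sketch under stated assumptions: the `Query` class here is a hypothetical stand-in, not Django's actual sql.Query, and the public property names `aggregates` and `extra` are assumed for illustration.

```python
from collections import OrderedDict


class Query:
    """Hypothetical stand-in for sql.Query; only the lazy pattern is shown."""

    def __init__(self):
        # Store None rather than an empty OrderedDict. Most queries never
        # use aggregates or extra selects, so nothing is allocated for them.
        self._aggregates = None
        self._extra = None

    @property
    def aggregates(self):
        # Create the OrderedDict on first access only.
        if self._aggregates is None:
            self._aggregates = OrderedDict()
        return self._aggregates

    @property
    def extra(self):
        # Same pattern as aggregates.
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra


q = Query()
q.aggregates["total"] = "SUM(price)"  # first access allocates the dict
```

The cost of this approach is that code which only reads the attribute still triggers allocation, so read-only callers have to check the private attribute first.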

I'll accept this directly, as it seems like a good idea. This trades code cleanliness for performance, but in this particular case I think it is worth it.

Change History (3)

comment:1 by Anssi Kääriäinen, 12 years ago

I benchmarked the change that introduced OrderedDict (that is, commit 07876cf02b6db453ca0397c29c225668872fa96d). It seems the change introduces a slowdown of around 15% in the model_save_existing benchmark. Initializing an empty OrderedDict is around 50% slower than initializing Django's SortedDict was (different algorithms, different tradeoffs).

Using Python's version of the ordered dictionary is the correct thing to do. I am pretty sure Python's OrderedDict will get an optimised implementation some day. But until that happens, it seems like a good idea to avoid initializing empty OrderedDicts where possible.
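The construction cost can be checked locally with a small timing script. This is a rough sketch, not the benchmark suite used for the numbers above: the helper name `time_empty_construction` is made up for illustration, it compares against a plain dict rather than SortedDict, and the results depend heavily on the Python version in use.

```python
import timeit
from collections import OrderedDict


def time_empty_construction(factory, number=100000):
    """Return the seconds taken to call factory() `number` times."""
    return timeit.timeit(factory, number=number)


dict_seconds = time_empty_construction(dict)
ordered_seconds = time_empty_construction(OrderedDict)

print("dict():        %.4f s" % dict_seconds)
print("OrderedDict(): %.4f s" % ordered_seconds)
print("ratio:         %.2fx" % (ordered_seconds / dict_seconds))
```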

comment:2 by Anssi Kääriäinen <anssi.kaariainen@…>, 11 years ago

Resolution: fixed
Status: new → closed

In ff723d894d9272ea721d1996432ffc806c2b8180:

Fixed #20950 -- Instantiate OrderedDict() only when needed

The use of OrderedDict (even an empty one) was surprisingly slow. By
initializing OrderedDict only when needed it is possible to save a
non-trivial amount of computing time (Model.save() is around 30% faster,
for example).

This commit targeted sql.Query only; there are likely other places
which could use similar optimizations.

comment:3 by Anssi Kääriäinen <akaariai@…>, 11 years ago

In d64060a73650360dcabfdb4928a9e92d090925b1:

OrderedDict creation avoidance for .values() queries

Avoid accessing query.extra and query.aggregates directly for .values()
queries. Refs #20950.
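To illustrate why direct access matters: if the public attribute is a lazily creating property, merely reading it allocates an OrderedDict. The sketch below shows one way a read-only caller could avoid that by checking the private attribute first. It is an assumption about the general technique, not the code from this commit; the `Query` class and the `extra_names` helper are hypothetical.

```python
from collections import OrderedDict


class Query:
    """Hypothetical stand-in for sql.Query with a lazily created extra dict."""

    def __init__(self):
        self._extra = None

    @property
    def extra(self):
        # Reading this property allocates an OrderedDict if none exists.
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra


def extra_names(query):
    """Return the extra select names without forcing OrderedDict creation."""
    # Check the private attribute first so that a query with no extra
    # selects stays allocation-free.
    if query._extra is None:
        return []
    return list(query._extra)


query = Query()
names = extra_names(query)  # query._extra is still None afterwards
```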
