Opened 8 months ago

Closed 7 months ago

Last modified 7 months ago

#20950 closed Cleanup/optimization (fixed)

Use OrderedDicts in ORM only when needed

Reported by: akaariai
Owned by: nobody
Component: Database layer (models, ORM)
Version: master
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Initializing OrderedDicts seems to be really slow, at least on Python 2.7. By instantiating OrderedDicts in Query only when needed, one can save considerable time. For example, the model_save_existing benchmark is 1.3x faster and qs_filter_chaining is 1.35x faster. Nearly all of the query_ benchmarks show a speedup of at least 10%.

Patch at https://github.com/akaariai/django/tree/ordered_dict_on_need. Together with the splitted_clone branch, this gives a speedup of over 1.5x to model_save_existing.

There might be some cleaner way to implement the "initiate only on need" for Query._aggregates and Query._extra. Ideas welcome.
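One possible shape for the "only on need" idea is a property that creates the dictionary on first access. The following is a minimal sketch, not the code from the patch: the Query class here is a hypothetical, stripped-down stand-in, and only the attribute names _aggregates and _extra come from this ticket.

```python
from collections import OrderedDict


class Query:
    """Hypothetical, stripped-down stand-in for the ORM's Query class."""

    def __init__(self):
        # Store None instead of an empty OrderedDict. Queries that never
        # use aggregates or extra selects then allocate nothing for them.
        self._aggregates = None
        self._extra = None

    @property
    def aggregates(self):
        # Create the OrderedDict on first access only.
        if self._aggregates is None:
            self._aggregates = OrderedDict()
        return self._aggregates

    @property
    def extra(self):
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra

    def clone(self):
        # Copy a dictionary only if the original ever created one.
        obj = Query()
        if self._aggregates is not None:
            obj._aggregates = self._aggregates.copy()
        if self._extra is not None:
            obj._extra = self._extra.copy()
        return obj
```

With this layout, code that writes to query.extra or query.aggregates keeps working unchanged, while construction and cloning of plain queries skip the OrderedDict entirely.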

I'll accept this directly, as it seems like a good idea. It trades code cleanliness for performance, but in this particular case I think it is worth it.

Attachments (0)

Change History (3)

comment:1 Changed 8 months ago by akaariai

I benchmarked the change that introduced OrderedDict (that is, commit 07876cf02b6db453ca0397c29c225668872fa96d). It seems the change introduces a slowdown of around 15% in the model_save_existing benchmark. Initializing an empty OrderedDict is around 50% slower than initializing Django's SortedDict was (different algorithms, different tradeoffs).
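The raw construction cost can be checked with a micro-benchmark along these lines. This is a rough sketch using the standard timeit module, not the benchmark suite referred to above; the helper name time_constructor is made up for illustration, and the results depend heavily on the Python version, so no expected numbers are given.

```python
import timeit
from collections import OrderedDict


def time_constructor(factory, number=100_000):
    """Return the seconds taken to call factory() `number` times."""
    return timeit.timeit(factory, number=number)


if __name__ == "__main__":
    # Compare creating empty plain dicts with creating empty OrderedDicts.
    dict_seconds = time_constructor(dict)
    ordered_seconds = time_constructor(OrderedDict)
    print("dict():        %.4f s" % dict_seconds)
    print("OrderedDict(): %.4f s" % ordered_seconds)
    print("ratio:         %.2fx" % (ordered_seconds / dict_seconds))
```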

Using Python's version of the ordered dictionary is the correct thing to do. I am pretty sure Python's OrderedDict will get an optimised implementation some day. But until that happens, it seems like a good idea to avoid initializing empty OrderedDicts where possible.

comment:2 Changed 7 months ago by Anssi Kääriäinen <anssi.kaariainen@…>

  • Resolution set to fixed
  • Status changed from new to closed

In ff723d894d9272ea721d1996432ffc806c2b8180:

Fixed #20950 -- Instantiate OrderedDict() only when needed

The use of OrderedDict (even an empty one) was surprisingly slow. By
initializing OrderedDict only when needed it is possible to save a
non-trivial amount of computing time (Model.save() is around 30% faster,
for example).

This commit targeted sql.Query only; there are likely other places
which could use similar optimizations.

comment:3 Changed 7 months ago by Anssi Kääriäinen <akaariai@…>

In d64060a73650360dcabfdb4928a9e92d090925b1:

OrderedDict creation avoidance for .values() queries

Avoid accessing query.extra and query.aggregates directly for .values()
queries. Refs #20950.
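Why direct access matters: if extra and aggregates create their dictionary on first access, then merely reading them forces the allocation the optimization is meant to avoid. The sketch below illustrates one way a reader could sidestep that. It assumes a lazily created _extra attribute; LazyQuery and extra_names are hypothetical names for illustration and are not taken from the commit.

```python
from collections import OrderedDict


class LazyQuery:
    """Hypothetical stand-in for a query with a lazily created `extra` dict."""

    def __init__(self):
        self._extra = None

    @property
    def extra(self):
        # Reading this property allocates the OrderedDict if it is missing.
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra


def extra_names(query):
    """Return the extra-select names without forcing an OrderedDict to exist."""
    # Check the backing attribute first, so a query with no extras stays
    # allocation-free instead of gaining an empty OrderedDict.
    if query._extra is None:
        return []
    return list(query._extra)
```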
