Opened 13 years ago

Closed 13 years ago

Last modified 13 years ago

#16639 closed Cleanup/optimization (wontfix)

Optimize Model initialization by sending pre and post init signals only if there are listeners

Reported by: Anssi Kääriäinen Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords:
Cc: anssi.kaariainen@… Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

The attached patch implements an optimization to Model object initialization. The idea is to check if there are any listeners to signals. If not, then we can skip the sending and save time. The attached benchmark.tar.gz has some tests. Unpack and run python speedtester.py. See the file for details.

The most interesting test is a loop of 10000 model initializations with a post and pre init signal listeners connected to a _different_ model. The result is that pre patch it takes 0.21 seconds to initialize objects, after patch only 0.07 seconds.

This situation is interesting because GenericForeignKey will attach to pre_init signal, and ImageField to post_init signal. So if you have both you will pay 0.14 seconds extra for every batch of 10000 objects created. That is true for all models in your project. After patch, you will pay the price only when there is somebody actually listening.

The patch is complete, expect that I don't know if support for .connect() without sender kwarg is needed. It is possible to support that case also, though a little more code is needed.

As a completely non-scientific data point, the test suite ran 2% faster after patch (and was passed).

Attachments (2)

benchmark.tar.gz (20.0 KB ) - added by Anssi Kääriäinen 13 years ago.
obj-creation-speed.diff (7.1 KB ) - added by Anssi Kääriäinen 13 years ago.

Download all attachments as: .zip

Change History (12)

by Anssi Kääriäinen, 13 years ago

Attachment: benchmark.tar.gz added

comment:1 by Anssi Kääriäinen, 13 years ago

Needs tests: set
Patch needs improvement: set

I checked the documentation, and pre_init and post_init can be listened globally (without specifying in which model you are interested). Not that it makes much sense to do so... But, I will update the patch so that it handles this situation also and write a test for that situation.

comment:2 by Anssi Kääriäinen, 13 years ago

Needs tests: unset
Patch needs improvement: unset

Updated patch, now with tests. Now works when .connect() is called without the sender kwarg.

Notably there were no tests for pre_init and post_init signals before (although some tests do depend on post and pre init signals indirectly).

I think the patch is ready for a committer, but a review is needed. The only ugly thing is that the disconnect methods do not clear the has_listener flags. This can result in some overhead if you connect and then disconnect a listener, especially when using connect() without sender.

In summary, the patch gets rid of signal sending overhead when initiating models by not trying to send signals when there are no listeners. This can result in almost 70% more performance in the best case, and should result in 20-30% better performance in many common cases. The downside is that there is some special handling needed for pre and post init signals.

If the patch is not accepted, at least the added tests should be included.

by Anssi Kääriäinen, 13 years ago

Attachment: obj-creation-speed.diff added

comment:3 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedDesign decision needed

The goal is to avoid a function call to the signals library, so this idea can't be implemented purely within the signals library. I'm uneasy with adding coupling between the models and the signals libraries. But the patch uses only public APIs, so it isn't too much of a hack.

IMO, instantiating thousands of models and not saving them to the database isn't a very realistic workload. Is saving 14µs (µs, not ms) per model instanciation sufficient to justify this, especially compared to the time required to fetch an instance from the database or save it?

in reply to:  3 comment:4 by Anssi Kääriäinen, 13 years ago

Replying to aaugustin:

The goal is to avoid a function call to the signals library, so this idea can't be implemented purely within the signals library. I'm uneasy with adding coupling between the models and the signals libraries. But the patch uses only public APIs, so it isn't too much of a hack.

There are two goals:

  1. Avoid the function call.
  2. Avoid the bigger overhead produced by signal listeners to OTHER models than the model you are currently instantiating. That is, if you have a signal for model T2 and you instantiate T1, you pay overhead. The overhead of function call is 0.01s per 10000 objects on my laptop. The overhead of calling the function when there is no listener for your model but there is a listener for some other model is 0.07s per 10000 objects on my laptop.

So, if you have ANY registered signal handlers for pre_init in you project, you will pay 0.07s for 10000 objects for EVERY model in your whole project. This is the situation if you use GenericForeignKey. You will have a post_init signal if you any ImageFields in your project. That means another 0.07s overhead for. For the whole project.

IMO, instantiating thousands of models and not saving them to the database isn't a very realistic workload. Is saving 14µs (µs, not ms) per model instanciation sufficient to justify this, especially compared to the time required to fetch an instance from the database or save it?

According to my tests it will take 0.13s per 10000 simple objects (just the id field) from the DB without any signal handling overhead. If you have both signals in use somewhere in you project, this will jump to 0.27s, or over 100% overhead. Without signal handling it takes 0.33 for a "fatter" model (id + 10 text fields). With the overhead of signals defined for other models, this is 0.47s, or overhead of 42%. These numbers are based on a post at the django-developers [1] and calculation, I can't run the benchmarks at the moment.

As the main thing to guard against is not the function call, but the overhead of signals defined for other models, it seems much better to have the per-model checking in the .send() function, not in the model.__init__ function. That would be clean and you would get most of the benefit. I started to guard against the function call, later on realized that was not the main cause of the signals problem. Thus the patch seems like I am guarding against the function call which is not the case.

[1] http://groups.google.com/group/django-developers/browse_thread/thread/5de2b14acfa0997c

comment:5 by Aymeric Augustin, 13 years ago

Sorry, I hadn't read your post on django-developers. It provides enough information about how much this optimization saves.

You still need to get a core developer's attention to this; unfortunately, the thread on django-developers is more a monologue than a discussion. You should send a link to this ticket, saying that it contains a patch. You could try #django-dev on IRC, too.

comment:6 by Anssi Kääriäinen, 13 years ago

Patch needs improvement: set

I will need to update the patch so that the logic is all contained in the PreInitSignal and PostInitSignal classes. Or maybe an InitSignal subclass with customized __init__, connect() and send() could do the trick. This way there needs to be no logic in models.Options and Model.__init__. Seems like the right way to go. You will need to pay the function call overhead however, but I think compared to the current patch you can save at maximum 0.01s per 10000 objects by getting rid of the function call. That means somewhere around 1-5% percent overhead in the more realistic use cases. Probably not worth the complexity...

I wonder if it would be possible to make the base Signal class more intelligent itself... I will check that approach, also.

After that I will try to get some attention from core developers.

comment:7 by Anssi Kääriäinen, 13 years ago

I tested the approach of localizing the changes to a new InitSignal class. Otherwise a very good idea, except it adds too much overhead. My tests indicate that instantiating 10000 objects with the current patch takes 0.085s, and 0.110s with the localized approach. This is without signals, and this is actually just a bit slower than trunk. It does however remove the larger overhead when there are signals for other models.

In my opinion, the current patch is the way to go. Unless somebody has other ideas...

comment:8 by Russell Keith-Magee, 13 years ago

Resolution: wontfix
Status: newclosed

I'm going to make an executive decision and mark this wontfix. The signals framework already has a "don't fire if there aren't any signals" case, which exists specifically as a performance gain. This patch microoptimizes this for one specific use case, at the cost of some more memory and some code path complexity.

If there's a gain to be made here, I'd be more interested in seeing it at the base signals level, rather than the level of a a microoptimization for just pre and post. That way *every* signal will see the benefit, rather than just 2 very specific signals.

comment:9 by Anssi Kääriäinen, 13 years ago

You are correct. The patch is uqly and causes code path complexity. However I do think that the model init signals are somewhat special in that they are fired more often than other signals, and the overhead they cause can be 2x more than the cost of rest of the action (in this case model's __init__). I do not believe that to be true for other signals.

The "don't fire if there aren't any signals" case is fast enough. The problematic case which is not fast enough is "don't fire if there aren't any signals to this class". That is, signal connected to class T2 will make send for class T1 (which does not have any listeners) more expensive than it needs to be. I will see if there is any nice way to fix that in the base Signal class. If I found a nice way, I guess a new ticket is the way to go?

in reply to:  9 comment:10 by Russell Keith-Magee, 13 years ago

Replying to akaariai:

You are correct. The patch is uqly and causes code path complexity. However I do think that the model init signals are somewhat special in that they are fired more often than other signals, and the overhead they cause can be 2x more than the cost of rest of the action (in this case model's __init__). I do not believe that to be true for other signals.

A general rule of thumb is that it's the first signal that is the expensive one. After that, the increase is acceptable and linear with the number of signals; the only cheap case is "No signals" (IIRC, the cost there is roughly comparable to 1 function call).

The "don't fire if there aren't any signals" case is fast enough. The problematic case which is not fast enough is "don't fire if there aren't any signals to this class". That is, signal connected to class T2 will make send for class T1 (which does not have any listeners) more expensive than it needs to be. I will see if there is any nice way to fix that in the base Signal class. If I found a nice way, I guess a new ticket is the way to go?

Yes - if you can find a way to optimize the "no signals to this class" (or, for that matter, any other optimizations to the signal infrastructure for other cases), a new ticket would be called for.

Note: See TracTickets for help on using tickets.
Back to Top