#13132 closed (wontfix)
[signals] Storing id()s instead of actual instances considered harmful
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Keywords: | signals | |
Cc: | real.human@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The signal dispatcher module goes to great lengths to avoid keeping references to the actual senders and receivers and instead holds ids and weakrefs. Aside from the fact that the whole idea is debatable at least (never seen a use case that relied on implicit garbage collection of senders/receivers without doing an explicit disconnect()), it is particularly error prone if one happens to send() an equal but not identical sender to the one used in connect().
I was bitten by this recently in an app that uses plain byte strings as senders. Then some other app called send() passing the sender string as unicode (because it was resolved from a template), which led to an obscure bug since the receivers were not called. And in case someone comes back with "well duh, it's your responsibility to ensure you send byte strings, not the framework's", it gets worse: simply sending str(sender) still fails because it creates a *new* string instance, which has different id() from the one connected. What eventually worked was intern(str(sender)) but that's hardly acceptable. Not only it is an ugly kludge but it works only for byte strings; unicode, integers, tuples, etc. are still bad choices for senders.
The ideal solution would be to get rid of id()s altogether and store the actual senders. If for some reason this is not possible or desirable, a new 'sender_dispatcher_uid' (in the same vein as 'dispatcher_uid') optional parameter should be introduced.
Change History (8)
comment:1 by , 15 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 15 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
If the way signals are supposed to be used is only with class senders, then sure the fault is with the usage. However that's not documented and I'm certainly not the first to use non-class senders; in fact there was a Djangocon presentation on signals that encourages using string labels ("That pesky sender attribute ? If in doubt, use a string").
Simple example:
from django.dispatch import Signal numcalled = 0 def receiver(sender, **kwds): global numcalled; numcalled += 1 print ' called from %r' % sender sig = Signal() sig.connect(receiver, sender='can_i_haz_string_senders') sig.connect(receiver, sender=13132) senders = [ 'can_i_haz_string_senders', 'can_i_haz_' + 'string_senders', '_'.join(['can', 'i', 'haz', 'string', 'senders']), u'can_i_haz_string_senders', str(u'can_i_haz_string_senders'), 13132, int('13132'), 13131+1, 13132L, int(13132L), ] print 'Waiting for %d replies' % len(senders) for sender in senders: sig.send(sender=sender) print 'Got only %d replies' % numcalled
comment:3 by , 15 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
Triage Stage: | Unreviewed → Accepted |
I'm going to close this wontfix.
"If in doubt, use a string" isn't especially good advice. A string will work, but it's the instance of the string object that matters. The whole sender framework is supposed to be identifying the object that sent the message. If you don't want a sender, you should be using the Anonymous sender, rather than trying to fake it with a string.
However, this isn't really well communicated in the docs, so this ticket is a good indicator that the documentation needs to be improved. #13080 is currently tracking documentation issues surrounding the sender argument; I'll make a note that the issue described in this ticket needs addressing as part of the fix for #13080.
On a historical note, the id-based technique that is currently used for signals was introduced during the last big refactor of Django's signal framework. Previous to this change, signal senders were keyed using instances. The signal refactor introduced speedups of up to 90% in some cases. See [8223] for details.
comment:4 by , 15 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
The whole sender framework is supposed to be identifying the object that sent the message
Sure but what we're debating is whether identification should be based on instance identity or equality. I really don't see the reason for using identity here.
It's disappointing that the argument here is performance while the current implementation uses a list instead of a dict to store the receivers and does a linear search for lookups! Regardless, I seriously doubt that changing _make_id(target)
to return target
instead of id(target)
would have any measurable difference in performance. If there are benchmarks available, I'm willing to run them and post the results.
If this stays as wontfix, at least please consider adding a 'sender_dispatch_uid' parameter; it's basically the same reason that 'dispatch_uid' was introduced (fragility of id-based approach).
comment:5 by , 15 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
As documented in the Contributing docs, please do not reopen tickets marked wontfix by a core developer. If you feel the issue merits further conversation, bring it up on the django-developers mailing list with a link to the ticket. Trac is a bad place for design conversations.
comment:6 by , 15 years ago
Sorry about that, didn't know about these rules; will follow up on the mailing list.
comment:7 by , 14 years ago
Cc: | added |
---|
I'm closing this invalid. It sounds like you're trying to use signals in a way that they're not supposed to be used, and getting bitten as a result; in this case, the fault is with usage, not with the signals.
However, it's impossible to tell for sure, because all you've given us to work with is a prose description. Given that you're describing a complex problem, Is a simple code example too much to ask for?