Code

Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#13132 closed (wontfix)

[signals] Storing id()s instead of actual instances considered harmful

Reported by: george.sakkis@… Owned by: nobody
Component: Core (Other) Version: master
Severity: Keywords: signals
Cc: real.human@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

The signal dispatcher module goes to great lengths to avoid keeping references to the actual senders and receivers and instead holds ids and weakrefs. Aside from the fact that the whole idea is debatable at least (never seen a use case that relied on implicit garbage collection of senders/receivers without doing an explicit disconnect()), it is particularly error prone if one happens to send() an equal but not identical sender to the one used in connect().

I was bitten by this recently in an app that uses plain byte strings as senders. Then some other app called send() passing the sender string as unicode (because it was resolved from a template), which led to an obscure bug since the receivers were not called. And in case someone comes back with "well duh, it's your responsibility to ensure you send byte strings, not the framework's", it gets worse: simply sending str(sender) still fails because it creates a *new* string instance, which has different id() from the one connected. What eventually worked was intern(str(sender)) but that's hardly acceptable. Not only it is an ugly kludge but it works only for byte strings; unicode, integers, tuples, etc. are still bad choices for senders.

The ideal solution would be to get rid of id()s altogether and store the actual senders. If for some reason this is not possible or desirable, a new 'sender_dispatcher_uid' (in the same vein as 'dispatcher_uid') optional parameter should be introduced.

Attachments (0)

Change History (8)

comment:1 Changed 4 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

I'm closing this invalid. It sounds like you're trying to use signals in a way that they're not supposed to be used, and getting bitten as a result; in this case, the fault is with usage, not with the signals.

However, it's impossible to tell for sure, because all you've given us to work with is a prose description. Given that you're describing a complex problem, Is a simple code example too much to ask for?

comment:2 Changed 4 years ago by anonymous

  • Resolution invalid deleted
  • Status changed from closed to reopened

If the way signals are supposed to be used is only with class senders, then sure the fault is with the usage. However that's not documented and I'm certainly not the first to use non-class senders; in fact there was a Djangocon presentation on signals that encourages using string labels ("That pesky sender attribute ? If in doubt, use a string").

Simple example:

from django.dispatch import Signal

numcalled = 0
def receiver(sender, **kwds):
    global numcalled; numcalled += 1
    print '  called from %r' % sender

sig = Signal()
sig.connect(receiver, sender='can_i_haz_string_senders')
sig.connect(receiver, sender=13132)

senders = [
    'can_i_haz_string_senders',
    'can_i_haz_' + 'string_senders',
    '_'.join(['can', 'i', 'haz', 'string', 'senders']),
    u'can_i_haz_string_senders',
    str(u'can_i_haz_string_senders'),

    13132,
    int('13132'),
    13131+1,
    13132L,
    int(13132L),
]

print 'Waiting for %d replies' % len(senders)
for sender in senders:
    sig.send(sender=sender)
print 'Got only %d replies' % numcalled

comment:3 Changed 4 years ago by russellm

  • Resolution set to wontfix
  • Status changed from reopened to closed
  • Triage Stage changed from Unreviewed to Accepted

I'm going to close this wontfix.

"If in doubt, use a string" isn't especially good advice. A string will work, but it's the instance of the string object that matters. The whole sender framework is supposed to be identifying the object that sent the message. If you don't want a sender, you should be using the Anonymous sender, rather than trying to fake it with a string.

However, this isn't really well communicated in the docs, so this ticket is a good indicator that the documentation needs to be improved. #13080 is currently tracking documentation issues surrounding the sender argument; I'll make a note that the issue described in this ticket needs addressing as part of the fix for #13080.

On a historical note, the id-based technique that is currently used for signals was introduced during the last big refactor of Django's signal framework. Previous to this change, signal senders were keyed using instances. The signal refactor introduced speedups of up to 90% in some cases. See [8223] for details.

comment:4 Changed 4 years ago by anonymous

  • Resolution wontfix deleted
  • Status changed from closed to reopened

The whole sender framework is supposed to be identifying the object that sent the message

Sure but what we're debating is whether identification should be based on instance identity or equality. I really don't see the reason for using identity here.

It's disappointing that the argument here is performance while the current implementation uses a list instead of a dict to store the receivers and does a linear search for lookups! Regardless, I seriously doubt that changing _make_id(target) to return target instead of id(target) would have any measurable difference in performance. If there are benchmarks available, I'm willing to run them and post the results.

If this stays as wontfix, at least please consider adding a 'sender_dispatch_uid' parameter; it's basically the same reason that 'dispatch_uid' was introduced (fragility of id-based approach).

comment:5 Changed 4 years ago by carljm

  • Resolution set to wontfix
  • Status changed from reopened to closed

As documented in the Contributing docs, please do not reopen tickets marked wontfix by a core developer. If you feel the issue merits further conversation, bring it up on the django-developers mailing list with a link to the ticket. Trac is a bad place for design conversations.

comment:6 Changed 4 years ago by anonymous

Sorry about that, didn't know about these rules; will follow up on the mailing list.

comment:7 Changed 3 years ago by mrmachine

  • Cc real.human@… added

comment:8 Changed 3 years ago by jacob

  • milestone 1.2 deleted

Milestone 1.2 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.