#7128 closed (fixed)
Infinite Recursion in new QuerySet Refactor when using a post_init signal
Reported by: | Owned by: | Malcolm Tredinnick | |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Keywords: | qsrf-cleanup QuerySet Refactor Infinite Recursion | |
Cc: | Triage Stage: | Design decision needed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The new Queryset refactor uses deepcopy for the where element of Query.clone()
in django.db.models.sql.Query.py line 169(ish) obj.where = deepcopy(self.where)
One gets an object that has a post_init signal (or post_save). (best shown be example)
class MyObj(models.Model): cola = models.Char... colb = models.Char... .. attr_a = None # none DB attribute attr_b = None # none DB attribute class AnotherObj(models.Model): my_obj = models.ForeignKey(MyObj) cola = models.Char... colb = models.Char... class YetAnotherObj(models.Model): my_obj = models.ForeignKey(MyObj) cola = models.Char... colb = models.Char... def after_my_obj_init(sender, instance, signal, *args, **kwargs): # try to fill in the instances 'other attributes' with some preset # querySets for use as properties later #the deepcopy in the of the Querset clone # will cause an intinite HERE (clone on the order_by) instance.attr_a = AnotherObj.objects.filter( my_obj = instance, cola__contains = "Some Text" ).order_by('cola') #No Infinite Recusrion as there is no Clone() called instance.attr_b = AnotherObj.objects.filter( my_obj = instance, cola__contains = "Some Other Text" ) dispatcher.connect(after_my_obj_init, signal=signals.post_init, sender=MyObj)
The above signal will now result in an infinite recursion .. as apparently it keeps trying to call the post_init function on 'instance' as it marches done the Where Tree.
There is a solution, but becuase it can also happen Outside the initsignal on other QuerySets, and seems rather sporatic sometimes for instance anytime clone is used it can crash.
an_obj = MyObj.objects.get(pk = 1) # No infinite recursion my_stuf = YetAnotherObj.objects.filter(cola = "something", my_obj = an_obj) my_stuf = YetAnotherObj.objects.filter(cola = "something") #BOOM (due to the clone()) my_stuf = my_stuf.filter(my_obj = an_obj)
The work around is to use the Primary Key
#in the signal def after_my_obj_init(sender, instance, signal, *args, **kwargs): # try to fill in the instances 'other attributes' with some preset # querySets for use as properties later #the deepcopy in the of the Querset clone # will cause an intinite HERE (clone on the order_by) instance.attr_a = AnotherObj.objects.filter( my_obj__pk = instance.pk, cola__contains = "Some Text" ).order_by('cola') #No Infinite Recusrion as there is no Clone() called instance.attr_b = AnotherObj.objects.filter( my_obj__pk = instance.pk, cola__contains = "Some Other Text" ) #in the other qset my_stuf = my_stuf.filter(my_obj__pk = an_obj.pk)
As you can probably guess, this behavior is rather scary. Changing the "deepcopy" to a normal "copy" did seem to help the issue, but i imagine that will break other tests.
Change History (9)
comment:1 by , 16 years ago
Keywords: | qsrf-cleanup added |
---|
comment:2 by , 16 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:3 by , 16 years ago
milestone: | → 1.0 |
---|
comment:4 by , 16 years ago
Triage Stage: | Accepted → Design decision needed |
---|
comment:5 by , 16 years ago
Owner: | changed from | to
---|
comment:6 by , 16 years ago
Most of these issues i'm finding tends to result from old (like 10 years old) legacy things and 'functions' that we are attempting to port into django ..
I'm sure that by writing the app in django from the beginning, none of these things would appear, as our models would be a bit more 'robust', coherent, etc
But yes there certainly are workarounds
namely making these attribute setters properties of the Model Class, However, the purpose of the post_init was to 'add extra' attributes to other models in other applications
# in app1/models.py class MyObj(models.Model): cola = models.Char... colb = models.Char... .. attr_a = None # none DB attribute attr_b = None # none DB attribute ---------------- #in app2/models.py class YetAnotherObj(models.Model): my_obj = models.ForeignKey(MyObj) cola = models.Char... colb = models.Char... ## define a post_init on MyObj here def post_init_my_obj(sender, instance, signal, *args, **kwargs): instance.attr_a = {Some Query Set depending on YetAnotherObj} dispatcher.connect(post_init_my_obj, signal=signals.post_init, sender=MyObj)
I imagine what you mean as the base work around is this
## define a post_init on MyObj here def post_init_my_obj(sender, instance, signal, *args, **kwargs): if not hasattr(instance, '_set_attr_a'): instance.attr_a = {Some Query Set depending on YetAnotherObj} instance._set_attr_a = True dispatcher.connect(post_init_my_obj, signal=signals.post_init, sender=MyObj)
I think i tried this at some point, but after a few cascades of the post_init (like add a YetAnotherObj post_init, and more and more) the ordering, attributes and some things never got fully set. They would be in some instances, but not other depending on the entry point on the post_init and which app it came from.
So the real issues is that in the application splits "app2" may or may not exist/be loaded, etc.
So the attributes set by app2 would be None .. so yes the work around, all be it ugly
(as you need to know that app2 exists, at least in name), is this
# in app1/models.py class MyObj(models.Model): cola = models.Char... colb = models.Char... .. @property def attr_a(self): try: from app2.models.import YetAnotherObj return {Some query using YetAnotherObj} except ImportError: return None @property def attr_b(self): try: from app2.models.import YetAnotherObj return {Some query using YetAnotherObj} except: return None
comment:7 by , 16 years ago
I'm about to close this as part of a commit that I think fixes the root problem. We'll no longer be storing any reference to models or fields in the Where
object in the normal course of things (people creating their own where-nodes might do so, but that's not something I can control). I think this will fix the cloning problems you're seeing.
Reopen if there's still a case that fails, but I think this is about the best we can hope for as a compromise between speed and robustness (it should actually now be completely robust, unless I've missed something subtle).
The
deepcopy()
call is absolutely needed, since each !Queryset clone must be independent, which means their where-clauses must not share references to internal data structures.I'll play a bit with the
__deepcopy__()
implementation to try and not descend into certain types of objects, but if that slows things down too much (cloning !Querysets is already very expensive), this is going to be a case of hard cheese. The workaround is to set a flag oninstance
in you signal handler to say you are currently initialising it and check that flag before doing anything else (if the flag is set, just return). That way, you will only do the real work once and all the other calls will just return. It's really a problem with trying to use circular references to objects and since this is a pretty unusual sort of situation, having to do a little bit of planning ahead to avoid it won't be the end of the world. So if the attempt to fix it at the root level is too costly, I'm not going to spend too much effort on making that better, since there is a workaround.