Opened 7 years ago

Closed 7 years ago

Last modified 3 years ago

#7128 closed (fixed)

Infinite Recursion in new QuerySet Refactor when using a post_init signal

Reported by: bo <bo.blanton@…> Owned by: mtredinnick
Component: Core (Other) Version: master
Severity: Keywords: qsrf-cleanup QuerySet Refactor Infinite Recursion
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

The new Queryset refactor uses deepcopy for the where element of Query.clone()

in django.db.models.sql.Query.py line 169(ish)
        obj.where = deepcopy(self.where)

One gets an object that has a post_init signal (or post_save). (best shown be example)

class MyObj(models.Model):
	cola = models.Char...
	colb = models.Char...
	..
	attr_a = None # none DB attribute
	attr_b = None # none DB attribute

class AnotherObj(models.Model):
	my_obj = models.ForeignKey(MyObj)
	cola = models.Char...
	colb = models.Char...

class YetAnotherObj(models.Model):
	my_obj = models.ForeignKey(MyObj)
	cola = models.Char...
	colb = models.Char...
	

def after_my_obj_init(sender, instance, signal, *args, **kwargs):
	# try to fill in the instances 'other attributes' with some preset
	# querySets for use as properties later
	
	#the deepcopy in the of the Querset clone
	# will cause an intinite HERE (clone on the order_by)
	instance.attr_a = AnotherObj.objects.filter(
		my_obj = instance, 
		cola__contains = "Some Text"
	).order_by('cola')
	
	#No Infinite Recusrion as there is no Clone() called
	instance.attr_b = AnotherObj.objects.filter(
		my_obj = instance, 
		cola__contains = "Some Other Text"
	)
	
dispatcher.connect(after_my_obj_init, signal=signals.post_init, sender=MyObj)

The above signal will now result in an infinite recursion .. as apparently it keeps trying to call the post_init function on 'instance' as it marches done the Where Tree.

There is a solution, but becuase it can also happen Outside the initsignal on other QuerySets, and seems rather sporatic sometimes for instance anytime clone is used it can crash.

an_obj = MyObj.objects.get(pk = 1)

# No infinite recursion
my_stuf = YetAnotherObj.objects.filter(cola = "something", my_obj = an_obj)
my_stuf = YetAnotherObj.objects.filter(cola = "something")

#BOOM (due to the clone())
my_stuf = my_stuf.filter(my_obj = an_obj)

The work around is to use the Primary Key

#in the signal
def after_my_obj_init(sender, instance, signal, *args, **kwargs):
	# try to fill in the instances 'other attributes' with some preset
	# querySets for use as properties later
	
	#the deepcopy in the of the Querset clone
	# will cause an intinite HERE (clone on the order_by)
	instance.attr_a = AnotherObj.objects.filter(
		my_obj__pk = instance.pk, 
		cola__contains = "Some Text"
	).order_by('cola')
	
	#No Infinite Recusrion as there is no Clone() called
	instance.attr_b = AnotherObj.objects.filter(
		my_obj__pk = instance.pk, 
		cola__contains = "Some Other Text"
	)

#in the other qset
my_stuf = my_stuf.filter(my_obj__pk = an_obj.pk)

As you can probably guess, this behavior is rather scary. Changing the "deepcopy" to a normal "copy" did seem to help the issue, but i imagine that will break other tests.

Change History (9)

comment:1 Changed 7 years ago by gav

  • Keywords qsrf-cleanup added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 7 years ago by Simon Greenhill

  • Triage Stage changed from Unreviewed to Accepted

comment:3 Changed 7 years ago by jacob

  • milestone set to 1.0

comment:4 Changed 7 years ago by mtredinnick

  • Triage Stage changed from Accepted to Design decision needed

The deepcopy() call is absolutely needed, since each !Queryset clone must be independent, which means their where-clauses must not share references to internal data structures.

I'll play a bit with the __deepcopy__() implementation to try and not descend into certain types of objects, but if that slows things down too much (cloning !Querysets is already very expensive), this is going to be a case of hard cheese. The workaround is to set a flag on instance in you signal handler to say you are currently initialising it and check that flag before doing anything else (if the flag is set, just return). That way, you will only do the real work once and all the other calls will just return. It's really a problem with trying to use circular references to objects and since this is a pretty unusual sort of situation, having to do a little bit of planning ahead to avoid it won't be the end of the world. So if the attempt to fix it at the root level is too costly, I'm not going to spend too much effort on making that better, since there is a workaround.

comment:5 Changed 7 years ago by mtredinnick

  • Owner changed from nobody to mtredinnick

comment:6 Changed 7 years ago by bo <bo.blanton@…>

Most of these issues i'm finding tends to result from old (like 10 years old) legacy things and 'functions' that we are attempting to port into django ..
I'm sure that by writing the app in django from the beginning, none of these things would appear, as our models would be a bit more 'robust', coherent, etc

But yes there certainly are workarounds

namely making these attribute setters properties of the Model Class, However, the purpose of the post_init was to 'add extra' attributes to other models in other applications

# in app1/models.py
class MyObj(models.Model):
	cola = models.Char...
	colb = models.Char...
	..
	attr_a = None # none DB attribute
	attr_b = None # none DB attribute

----------------
#in app2/models.py

class YetAnotherObj(models.Model):
	my_obj = models.ForeignKey(MyObj)
	cola = models.Char...
	colb = models.Char...

## define a post_init on MyObj here
def post_init_my_obj(sender, instance, signal, *args, **kwargs):
    instance.attr_a = {Some Query Set depending on YetAnotherObj}

dispatcher.connect(post_init_my_obj, signal=signals.post_init, sender=MyObj)

I imagine what you mean as the base work around is this

## define a post_init on MyObj here
def post_init_my_obj(sender, instance, signal, *args, **kwargs):
    if not hasattr(instance, '_set_attr_a'):
            instance.attr_a = {Some Query Set depending on YetAnotherObj}
            instance._set_attr_a = True

dispatcher.connect(post_init_my_obj, signal=signals.post_init, sender=MyObj)

I think i tried this at some point, but after a few cascades of the post_init (like add a YetAnotherObj post_init, and more and more) the ordering, attributes and some things never got fully set. They would be in some instances, but not other depending on the entry point on the post_init and which app it came from.

So the real issues is that in the application splits "app2" may or may not exist/be loaded, etc.
So the attributes set by app2 would be None .. so yes the work around, all be it ugly
(as you need to know that app2 exists, at least in name), is this

# in app1/models.py
class MyObj(models.Model):
	cola = models.Char...
	colb = models.Char...
	..
        @property
        def attr_a(self):
              try:
                  from app2.models.import YetAnotherObj
                  return {Some query using YetAnotherObj}
              except ImportError:
                  return None

        @property
        def attr_b(self):
              try:
                     from app2.models.import YetAnotherObj
                     return {Some query using YetAnotherObj}
              except:
                     return None

comment:7 Changed 7 years ago by mtredinnick

I'm about to close this as part of a commit that I think fixes the root problem. We'll no longer be storing any reference to models or fields in the Where object in the normal course of things (people creating their own where-nodes might do so, but that's not something I can control). I think this will fix the cloning problems you're seeing.

Reopen if there's still a case that fails, but I think this is about the best we can hope for as a compromise between speed and robustness (it should actually now be completely robust, unless I've missed something subtle).

comment:8 Changed 7 years ago by mtredinnick

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in [7773].

comment:9 Changed 3 years ago by jacob

  • milestone 1.0 deleted

Milestone 1.0 deleted

Note: See TracTickets for help on using tickets.
Back to Top