Code

Opened 4 years ago

Closed 2 years ago

Last modified 19 months ago

#13839 closed Bug (fixed)

select_related caches None for non-existent objects in reverse one-to-one relations

Reported by: shauncutts Owned by: aaugustin
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords:
Cc: shaun@…, vzima, tomasz.zielinski@…, Helot, sebastian.goll@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Suppose we have the following:

class AM( models.Manager ):
    def get_query_set( self ):
        q = super( AM, self ).get_query_set()
        q = q.select_related( 'b' )
        return q

class A( models.Model ):
    objects = AM()
    
    pass

class B( A ):
    pass

Then when we create an A, and retrieve via the manager we have:

>>> A.objects.create()
<A: A object>
>>> a = A.objects.all()[ 0 ]
>>> a.__dict__
{'_b_cache': None,
 '_state': <django.db.models.base.ModelState object at 0x1fc6930>,
 'id': 2}

This causes problems, as a.b yields None rather than triggering
ObjectDoesNotExist, which causes problems -- e.g.:

>>> a.delete()
/Users/shauncutts/dev/django/db/models/base.pyc in delete(self, using)
    645         # Find all the objects than need to be deleted.

    646         seen_objs = CollectedObjects()
--> 647         self._collect_sub_objects(seen_objs)
    648 
    649         # Actually delete the objects.


/Users/shauncutts/dev/django/db/models/base.pyc in _collect_sub_objects(self, seen_objs, parent, nullable)
--> 580                     sub_obj._collect_sub_objects(seen_objs, self, related.field.null)
    581             else:
    582                 # To make sure we can access all elements, we can't use the


AttributeError: 'NoneType' object has no attribute '_collect_sub_objects'

Attachments (5)

13839_1.diff (830 bytes) - added by shauncutts 4 years ago.
test13839.patch (2.0 KB) - added by mk 4 years ago.
fix13839.patch (2.8 KB) - added by mk 4 years ago.
13839-4.patch (4.7 KB) - added by aaugustin 2 years ago.
13839-5.patch (8.9 KB) - added by aaugustin 2 years ago.

Download all attachments as: .zip

Change History (30)

comment:1 Changed 4 years ago by shauncutts

  • Cc shaun@… added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Changed 4 years ago by shauncutts

comment:2 Changed 4 years ago by shauncutts

The patch seems to do the trick... at the very least though, the error message probably needs a tweak -- what info do you typically put into them? Also, still needs a test.

comment:3 Changed 4 years ago by vzima

  • Cc vzima added

comment:4 Changed 4 years ago by mk

Here's a patch for the testsuite which currently fails, but should not.

Changed 4 years ago by mk

comment:5 Changed 4 years ago by mk

  • Has patch set
  • milestone set to 1.3
  • Triage Stage changed from Unreviewed to Accepted

Proposed fix attached. This isn't my area of expertise though but the full test suite passed on Python 2.6 / sqlite3.

Assigning milestone 1.3 because this bug is a possible cause of crashes and should be looked into before another release is made.

Changed 4 years ago by mk

comment:6 Changed 3 years ago by trey

Is this patch planned for inclusion in milestone 1.3?

I am witnessing problems from this bug. I am using code similar to the example code given in the bug description, overriding get_query_set to include select_related on a reverse one-to-one relation. Whenever I delete a model that references an instance of the problem model an error is thrown due to the failed cascading delete on the problem model.

comment:7 Changed 3 years ago by mk

Hi trey

If this patch fixes the problem you are encountering (and it's the problem referenced in this ticket, as you write) you could change the triage state of this ticket to "Ready for checkin".

It might be included in 1.3 if the committers think this patch is ready for checkin too.

Thanks!

comment:8 Changed 3 years ago by anonymous

  • Triage Stage changed from Accepted to Ready for checkin

I have tested the attached patch and confirmed that it fixes the problem I have been encountering.

comment:9 Changed 3 years ago by russellm

  • Component changed from ORM aggregation to Database layer (models, ORM)
  • Owner set to nobody

comment:10 Changed 3 years ago by DrMeers

As long as the fix doesn't necessitate an additional query to determine that the related object doesn't exist. It's quite handy to be able to say x = x.subclass_one or x.subclass_two or x with the None behaviour. Raising DoesNotExist without actually running another query would be acceptable.

comment:11 Changed 3 years ago by TomaszZielinski

  • Cc tomasz.zielinski@… added

Personally I rely on the None behaviour and don't consider it a bug, but it looks like I'm a minority here so I'll just look at how this ends.

Last edited 3 years ago by TomaszZielinski (previous) (diff)

comment:12 Changed 3 years ago by Helot

  • Cc Helot added

I just got bit by this bug, and am now in the position of having some rather clunky code to test for both possibilities, as I don't know whether select_related will have been run once I hit the code section where I need the test.

Hope this makes it into 1.3!

Last edited 3 years ago by Helot (previous) (diff)

comment:13 Changed 3 years ago by anonymous

For those that rely on the current None behavior, if this patch is approved you should be able to get the same results by using

getattr(a, _b_cache, None)

instead of

a._b_cache

comment:14 Changed 3 years ago by graham_king

  • Severity set to Normal
  • Type set to Bug

comment:15 follow-up: Changed 3 years ago by lukeplant

  • Easy pickings unset
  • Patch needs improvement set
  • Triage Stage changed from Ready for checkin to Accepted

If I'm reading this correctly, there are a couple of issues:

  1. The point of selected_related() is that it avoids doing an extra lookup. With the proposed fix, when the related object does not exist, accessing the attribute will cause the lookup to happen. This throws the DoesNotExist as it ought to, but the lookup shouldn't happen at all. I imagine that one way to fix this, while preserving the efficiency of select_related, is to use some other sentinel value that triggers a DoesNotExist exception inside the OneToOne descriptor. (Since querysets need to be pickled, remember that the simplest choice of sentinel value is a class).
  1. It doesn't look like it handles nullable OneToOne relations. In that case, the None object should be cached.

Whether I'm wrong or right in either case, tests that prove the behaviour are needed. In the first case, use assertNumQueries can be used.

comment:16 Changed 3 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:17 Changed 3 years ago by lrekucki

  • Owner changed from nobody to lrekucki
  • UI/UX unset

comment:18 in reply to: ↑ 15 Changed 3 years ago by lrekucki

New patch: https://github.com/django/django/pull/102

  1. The point of selected_related() is that it avoids doing an extra lookup. With the proposed fix, when the related object does not exist, accessing the attribute will cause the lookup to happen. This throws the DoesNotExist as it ought to, but the lookup shouldn't happen at all. I imagine that one way to fix this, while preserving the efficiency of select_related, is to use some other sentinel value that triggers a DoesNotExist exception inside the OneToOne descriptor. (Since querysets need to be pickled, remember that the simplest choice of sentinel value is a class).

Added test for this. A second sentinel is not needed, because we can just check the field.null if None is a valid value (Edit: currently, this is always False so we can just raise the exception on None, see below).

  1. It doesn't look like it handles nullable OneToOne relations. In that case, the None object should be cached.

Modified the current test to explicitly check a nullable OneToOneField that it returns None

One thing to note here is that the null flag on OneToOneField is irrelevant to this issue as it controls the other side of relation. At least in my understanding which is:

Take two models:

class Parent(Model):
    pass

class Child(Model):
    parent = OneToOneField(Parent)

The following is true:

  1. This defines a 1-1 relation between the two with Child having a DB field which is a non-nullable foreign key.
  2. It is possible to create a Parent model, with no Child model.
  3. Currently Parent.objects.select_related('child') will cache None as child value which is not correct.
  4. There is no way of creating a Child model without a Parent instance (as it is DB enforced).
  5. On databases without strict FK checking, creating a Child model with an FK pointing to non-existant Parent is possible. But Child.object.select_related('parent') will result in an INNER JOIN and prevent such instances of Child from being returned (is this intentional ?).
  6. Adding null=true, allows creating Child instances without a Parent (it doesn't affect the reverse relation, which is declared to be required, but isn't enforced at the DB level see 2.). Child.object.select_related('parent') will return all Child instances, caching None for missing Parents. This is a 1?-1 (or 0-1) relation.

Defining 1-1? (1-0) relations is subject of ticket #10227. Without that parent.child should never return None.

Last edited 3 years ago by lrekucki (previous) (diff)

comment:19 Changed 3 years ago by sebastian

  • Cc sebastian.goll@… added

comment:20 Changed 2 years ago by lrekucki

  • Patch needs improvement unset

comment:21 Changed 2 years ago by aaugustin

  • Owner changed from lrekucki to aaugustin
  • Patch needs improvement set

I had started working in this area and I had missed the pull request hidden in the middle of the comments :/

In fact, I've used the same approach as Łukasz, except that I've also used the new meaning of None to cache DoesNotExist errors in the general case.

This is basically an improvement of caching over OneToOne relations. It has the side effect of fixing select_related — because it gives a meaning to None in the cache, and that meaning is to raise an exception. So I'm uploading the patch here.


I'm going to demonstrate the changes with an example, here are the models I'll use:

class Group(models.Model):
    name = models.CharField(max_length=10)

class GroupExtra(models.Model):
    group = models.OneToOneField(Group)

The patch does two significant changes.

1) Cache DoesNotExist in SingleRelatedObjectDescriptor

>>> group = Group.objects.create(name='test')
>>> group.groupextra                # 1 SQL query, raises DoesNotExist
>>> group.groupextra                # no SQL query, raises DoesNotExist (previously, did 1 SQL query)

2) Cache the reverse relation in SingleRelatedObjectDescriptor and ReverseSingleRelatedObjectDescriptor

>>> group = Group.objects.create(name='test')
>>> groupextra = GroupExtra.objects.create(group=group)
>>> group.groupextra                # no SQL query, returns groupextra (previously, did 1 SQL query)

>>> group_copy = Group.objects.get(pk=group.pk)
>>> group_copy.groupextra           # 1 SQL query, return groupextra
>>> group_copy.groupextra.group     # no SQL query, returns group_copy (previously, did 1 SQL query)

It also does a minor optimization:

3) Use self.cache_name instead of self.field.get_cache_name(), since it is computed in __init__.


This still needs a few more tests with assertNumQueries.

Changed 2 years ago by aaugustin

Changed 2 years ago by aaugustin

comment:22 Changed 2 years ago by aaugustin

  • Patch needs improvement unset

This latest patch adds lots of tests and a little bit more refactoring to ensure relations are always cached both ways on OneToOne fields.

Assuming the test suite passes, I'm ready to commit that.

comment:23 Changed 2 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

In [17890]:

Made the caching of related and reverse related objects consistent in OneToOneFields. Fixed #13839. Refs #17439.

comment:24 Changed 19 months ago by Aymeric Augustin <aymeric.augustin@…>

In e9c24bef74e55729b190cf07e0ac452aa4c86fcd:

Fix #19524 -- Incorrect caching of parents of unsaved model instances.

Thanks qcwxezdas for the report. Refs #13839.

comment:25 Changed 19 months ago by Aymeric Augustin <aymeric.augustin@…>

In 5097d3c5faab2b6582c4cebee2b265fcdbb893eb:

[1.5.x] Fix #19524 -- Incorrect caching of parents of unsaved model instances.

Thanks qcwxezdas for the report. Refs #13839.

Backport of e9c24be.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.