Opened 14 years ago

Closed 13 years ago

Last modified 12 years ago

#13839 closed Bug (fixed)

select_related caches None for non-existent objects in reverse one-to-one relations

Reported by: Shaun Cutts Owned by: Aymeric Augustin
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords:
Cc: shaun@…, Vlastimil Zíma, tomasz.zielinski@…, John Krukoff, sebastian.goll@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Suppose we have the following:

class AM( models.Manager ):
    def get_query_set( self ):
        q = super( AM, self ).get_query_set()
        q = q.select_related( 'b' )
        return q

class A( models.Model ):
    objects = AM()
    
    pass

class B( A ):
    pass

Then when we create an A, and retrieve via the manager we have:

>>> A.objects.create()
<A: A object>
>>> a = A.objects.all()[ 0 ]
>>> a.__dict__
{'_b_cache': None,
 '_state': <django.db.models.base.ModelState object at 0x1fc6930>,
 'id': 2}

This causes problems, as a.b yields None rather than triggering
ObjectDoesNotExist, which causes problems -- e.g.:

>>> a.delete()
/Users/shauncutts/dev/django/db/models/base.pyc in delete(self, using)
    645         # Find all the objects than need to be deleted.

    646         seen_objs = CollectedObjects()
--> 647         self._collect_sub_objects(seen_objs)
    648 
    649         # Actually delete the objects.


/Users/shauncutts/dev/django/db/models/base.pyc in _collect_sub_objects(self, seen_objs, parent, nullable)
--> 580                     sub_obj._collect_sub_objects(seen_objs, self, related.field.null)
    581             else:
    582                 # To make sure we can access all elements, we can't use the


AttributeError: 'NoneType' object has no attribute '_collect_sub_objects'

Attachments (5)

13839_1.diff (830 bytes ) - added by Shaun Cutts 14 years ago.
test13839.patch (2.0 KB ) - added by Matthias Kestenholz 14 years ago.
fix13839.patch (2.8 KB ) - added by Matthias Kestenholz 14 years ago.
13839-4.patch (4.7 KB ) - added by Aymeric Augustin 13 years ago.
13839-5.patch (8.9 KB ) - added by Aymeric Augustin 13 years ago.

Download all attachments as: .zip

Change History (30)

comment:1 by Shaun Cutts, 14 years ago

Cc: shaun@… added

by Shaun Cutts, 14 years ago

Attachment: 13839_1.diff added

comment:2 by Shaun Cutts, 14 years ago

The patch seems to do the trick... at the very least though, the error message probably needs a tweak -- what info do you typically put into them? Also, still needs a test.

comment:3 by Vlastimil Zíma, 14 years ago

Cc: Vlastimil Zíma added

comment:4 by Matthias Kestenholz, 14 years ago

Here's a patch for the testsuite which currently fails, but should not.

by Matthias Kestenholz, 14 years ago

Attachment: test13839.patch added

comment:5 by Matthias Kestenholz, 14 years ago

Has patch: set
milestone: 1.3
Triage Stage: UnreviewedAccepted

Proposed fix attached. This isn't my area of expertise though but the full test suite passed on Python 2.6 / sqlite3.

Assigning milestone 1.3 because this bug is a possible cause of crashes and should be looked into before another release is made.

by Matthias Kestenholz, 14 years ago

Attachment: fix13839.patch added

comment:6 by Trey, 14 years ago

Is this patch planned for inclusion in milestone 1.3?

I am witnessing problems from this bug. I am using code similar to the example code given in the bug description, overriding get_query_set to include select_related on a reverse one-to-one relation. Whenever I delete a model that references an instance of the problem model an error is thrown due to the failed cascading delete on the problem model.

comment:7 by Matthias Kestenholz, 14 years ago

Hi trey

If this patch fixes the problem you are encountering (and it's the problem referenced in this ticket, as you write) you could change the triage state of this ticket to "Ready for checkin".

It might be included in 1.3 if the committers think this patch is ready for checkin too.

Thanks!

comment:8 by anonymous, 14 years ago

Triage Stage: AcceptedReady for checkin

I have tested the attached patch and confirmed that it fixes the problem I have been encountering.

comment:9 by Russell Keith-Magee, 14 years ago

Component: ORM aggregationDatabase layer (models, ORM)
Owner: set to nobody

comment:10 by Simon Meers, 14 years ago

As long as the fix doesn't necessitate an additional query to determine that the related object doesn't exist. It's quite handy to be able to say x = x.subclass_one or x.subclass_two or x with the None behaviour. Raising DoesNotExist without actually running another query would be acceptable.

comment:11 by Tomasz Zieliński, 14 years ago

Cc: tomasz.zielinski@… added

Personally I rely on the None behaviour and don't consider it a bug, but it looks like I'm a minority here so I'll just look at how this ends.

Last edited 14 years ago by Tomasz Zieliński (previous) (diff)

comment:12 by John Krukoff, 14 years ago

Cc: John Krukoff added

I just got bit by this bug, and am now in the position of having some rather clunky code to test for both possibilities, as I don't know whether select_related will have been run once I hit the code section where I need the test.

Hope this makes it into 1.3!

Last edited 14 years ago by John Krukoff (previous) (diff)

comment:13 by anonymous, 14 years ago

For those that rely on the current None behavior, if this patch is approved you should be able to get the same results by using

getattr(a, _b_cache, None)

instead of

a._b_cache

comment:14 by Graham King, 14 years ago

Severity: Normal
Type: Bug

comment:15 by Luke Plant, 14 years ago

Easy pickings: unset
Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

If I'm reading this correctly, there are a couple of issues:

  1. The point of selected_related() is that it avoids doing an extra lookup. With the proposed fix, when the related object does not exist, accessing the attribute will cause the lookup to happen. This throws the DoesNotExist as it ought to, but the lookup shouldn't happen at all. I imagine that one way to fix this, while preserving the efficiency of select_related, is to use some other sentinel value that triggers a DoesNotExist exception inside the OneToOne descriptor. (Since querysets need to be pickled, remember that the simplest choice of sentinel value is a class).
  1. It doesn't look like it handles nullable OneToOne relations. In that case, the None object should be cached.

Whether I'm wrong or right in either case, tests that prove the behaviour are needed. In the first case, use assertNumQueries can be used.

comment:16 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

comment:17 by Łukasz Rekucki, 13 years ago

Owner: changed from nobody to Łukasz Rekucki
UI/UX: unset

in reply to:  15 comment:18 by Łukasz Rekucki, 13 years ago

New patch: https://github.com/django/django/pull/102

  1. The point of selected_related() is that it avoids doing an extra lookup. With the proposed fix, when the related object does not exist, accessing the attribute will cause the lookup to happen. This throws the DoesNotExist as it ought to, but the lookup shouldn't happen at all. I imagine that one way to fix this, while preserving the efficiency of select_related, is to use some other sentinel value that triggers a DoesNotExist exception inside the OneToOne descriptor. (Since querysets need to be pickled, remember that the simplest choice of sentinel value is a class).

Added test for this. A second sentinel is not needed, because we can just check the field.null if None is a valid value (Edit: currently, this is always False so we can just raise the exception on None, see below).

  1. It doesn't look like it handles nullable OneToOne relations. In that case, the None object should be cached.

Modified the current test to explicitly check a nullable OneToOneField that it returns None

One thing to note here is that the null flag on OneToOneField is irrelevant to this issue as it controls the other side of relation. At least in my understanding which is:

Take two models:

class Parent(Model):
    pass

class Child(Model):
    parent = OneToOneField(Parent)

The following is true:

  1. This defines a 1-1 relation between the two with Child having a DB field which is a non-nullable foreign key.
  2. It is possible to create a Parent model, with no Child model.
  3. Currently Parent.objects.select_related('child') will cache None as child value which is not correct.
  4. There is no way of creating a Child model without a Parent instance (as it is DB enforced).
  5. On databases without strict FK checking, creating a Child model with an FK pointing to non-existant Parent is possible. But Child.object.select_related('parent') will result in an INNER JOIN and prevent such instances of Child from being returned (is this intentional ?).
  6. Adding null=true, allows creating Child instances without a Parent (it doesn't affect the reverse relation, which is declared to be required, but isn't enforced at the DB level see 2.). Child.object.select_related('parent') will return all Child instances, caching None for missing Parents. This is a 1?-1 (or 0-1) relation.

Defining 1-1? (1-0) relations is subject of ticket #10227. Without that parent.child should never return None.

Last edited 13 years ago by Łukasz Rekucki (previous) (diff)

comment:19 by Sebastian Goll, 13 years ago

Cc: sebastian.goll@… added

comment:20 by Łukasz Rekucki, 13 years ago

Patch needs improvement: unset

comment:21 by Aymeric Augustin, 13 years ago

Owner: changed from Łukasz Rekucki to Aymeric Augustin
Patch needs improvement: set

I had started working in this area and I had missed the pull request hidden in the middle of the comments :/

In fact, I've used the same approach as Łukasz, except that I've also used the new meaning of None to cache DoesNotExist errors in the general case.

This is basically an improvement of caching over OneToOne relations. It has the side effect of fixing select_related — because it gives a meaning to None in the cache, and that meaning is to raise an exception. So I'm uploading the patch here.


I'm going to demonstrate the changes with an example, here are the models I'll use:

class Group(models.Model):
    name = models.CharField(max_length=10)

class GroupExtra(models.Model):
    group = models.OneToOneField(Group)

The patch does two significant changes.

1) Cache DoesNotExist in SingleRelatedObjectDescriptor

>>> group = Group.objects.create(name='test')
>>> group.groupextra                # 1 SQL query, raises DoesNotExist
>>> group.groupextra                # no SQL query, raises DoesNotExist (previously, did 1 SQL query)

2) Cache the reverse relation in SingleRelatedObjectDescriptor and ReverseSingleRelatedObjectDescriptor

>>> group = Group.objects.create(name='test')
>>> groupextra = GroupExtra.objects.create(group=group)
>>> group.groupextra                # no SQL query, returns groupextra (previously, did 1 SQL query)

>>> group_copy = Group.objects.get(pk=group.pk)
>>> group_copy.groupextra           # 1 SQL query, return groupextra
>>> group_copy.groupextra.group     # no SQL query, returns group_copy (previously, did 1 SQL query)

It also does a minor optimization:

3) Use self.cache_name instead of self.field.get_cache_name(), since it is computed in __init__.


This still needs a few more tests with assertNumQueries.

by Aymeric Augustin, 13 years ago

Attachment: 13839-4.patch added

by Aymeric Augustin, 13 years ago

Attachment: 13839-5.patch added

comment:22 by Aymeric Augustin, 13 years ago

Patch needs improvement: unset

This latest patch adds lots of tests and a little bit more refactoring to ensure relations are always cached both ways on OneToOne fields.

Assuming the test suite passes, I'm ready to commit that.

comment:23 by Aymeric Augustin, 13 years ago

Resolution: fixed
Status: newclosed

In [17890]:

Made the caching of related and reverse related objects consistent in OneToOneFields. Fixed #13839. Refs #17439.

comment:24 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In e9c24bef74e55729b190cf07e0ac452aa4c86fcd:

Fix #19524 -- Incorrect caching of parents of unsaved model instances.

Thanks qcwxezdas for the report. Refs #13839.

comment:25 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In 5097d3c5faab2b6582c4cebee2b265fcdbb893eb:

[1.5.x] Fix #19524 -- Incorrect caching of parents of unsaved model instances.

Thanks qcwxezdas for the report. Refs #13839.

Backport of e9c24be.

Note: See TracTickets for help on using tickets.
Back to Top