Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#33608 closed Bug (invalid)

select_related() to nullable FK with db_constraint=False do not promote OUTER joins.

Reported by: Mathieu Hinderyckx Owned by: nobody
Component: Database layer (models, ORM) Version: 3.2
Severity: Normal Keywords:
Cc: Anssi Kääriäinen Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When using a FK which has db_constraint=False, the sql generated when using the combination of a filter() and select_related() are wrong, causing results which should be included in a queryset to not be.

The problem occurs when the remote object does not exist in the table where the FK points to. In some cases the ORM will generate an INNER JOIN statement instead of a LEFT OUTER JOIN causing a row to be wrongly omitted.

I have made a POC setup for this behaviour on https://github.com/mhindery/django_fk_issue, with testcases containing some working lookups, and one specific method that demonstrate the behaviour. I've included in comments in the tests the query which gets generated, showing the problematic join.

Change History (6)

comment:1 by Mathieu Hinderyckx, 3 years ago

Version: 4.03.2

comment:2 by Mariusz Felisiak, 3 years ago

Cc: Anssi Kääriäinen added
Summary: Wrong sql generation for FK with db_constraint=False causing results to incorrectly get dropped from a querysetselect_related() to nullable FK with db_constraint=False do not promote OUTER joins.
Triage Stage: UnreviewedAccepted

Thanks for the report. Tentatively accepted for future investigation.

comment:3 by Simon Charette, 3 years ago

The question here is should the ORM account for possible integrity violations in the first place, lets not forget that db_constraint was basically introduced as some form of shadow option without in-dept discussion about its desired behaviour with regards to violations.

How the ORM should account for possible integrity violation was already discussed in the past plenty when usage of the MySQL's MyISAM backend (and SQLite in older versions) which doesn't support foreign keys was more common and caused users to run into this exact issue.

Given we've opted not to special case JOIN'ing logic for supports_foreign_keys in the past I don't see why we should take a different stance for db_constraint=False.

If we allowed Object.objects.filter(remote_object_id=1234).select_related('remote_object') to return Object instances what would their .remote_object be? A mock RemoteObject(id=1234) object? Simply None? Aren't both approaches breaking symmetry with Object.objects.filter(remote_object_id=1234).first().remote_object raising a RemoteObject.DoesNotExist exception?

comment:4 by Mariusz Felisiak, 3 years ago

Resolution: invalid
Status: newclosed

Simon, Thanks for the comment. Agreed, db_constraint = False in general should not be used, except for a few really specific cases. Closing as "invalid" as this case is not supported.

in reply to:  3 comment:5 by Mathieu Hinderyckx, 3 years ago

Replying to Simon Charette:

The question here is should the ORM account for possible integrity violations in the first place, lets not forget that db_constraint was basically introduced as some form of shadow option without in-dept discussion about its desired behaviour with regards to violations.

How the ORM should account for possible integrity violation was already discussed in the past plenty when usage of the MySQL's MyISAM backend (and SQLite in older versions) which doesn't support foreign keys was more common and caused users to run into this exact issue.

Given we've opted not to special case JOIN'ing logic for supports_foreign_keys in the past I don't see why we should take a different stance for db_constraint=False.

If we allowed Object.objects.filter(remote_object_id=1234).select_related('remote_object') to return Object instances what would their .remote_object be? A mock RemoteObject(id=1234) object? Simply None? Aren't both approaches breaking symmetry with Object.objects.filter(remote_object_id=1234).first().remote_object raising a RemoteObject.DoesNotExist exception?

Object.objects.filter(remote_object_id=1234).first().remote_object doesn't raise a DoesNotExist exception. It returns null, as it should be. First it fetches the Object with remote_object_id=1234 which exists. Secondly, It then accesses the FK field on that Object pointing to a RemoteObject, which is null as it does not exist in the remote table (hence why the db_constraint=False). That works as expected, just like an 'ordinary' FK. What doesn't work as expected is that adding a select_related() breaks the filter statement. Select_related is an optimization, the expectation is that it does the same two steps as before but with a single DB query. That isn't the case currently; it silently breaks the filtering logic, and doesn't event return the Object at all.

comment:6 by Simon Charette, 3 years ago

Object.objects.filter(remote_object_id=1234).first().remote_object doesn't raise a DoesNotExist exception. It returns null, as it should be.

I think you might be wrong here. Did you test this statement? I did with Django 3.2.12 and 4.0.3.

  • poc/poc_models/tests.py

    diff --git a/poc/poc_models/tests.py b/poc/poc_models/tests.py
    index 27b8249..f7edb32 100644
    a b def setUpTestData(cls):  
    1010
    1111    # All of the test methods below try to fetch the same object which was created in setUpTestData
    1212
     13    def test_access_remote_object_broken_referential_integrity(self):
     14        with self.assertRaises(RemoteObject.DoesNotExist):
     15            Object.objects.filter(remote_object_id=1234).first().remote_object
     16
    1317    def test_remote_object_does_exist_object_should_be_found(self):
    1418        RemoteObject.objects.create(remote_id=1234)

A DoesNotExist is raised when accessing .remote_object because your data breaks integrity expectations that the ORM is built around. If you break these integrity expectations you should expect undefined behaviour.

Note: See TracTickets for help on using tickets.
Back to Top