#33608 closed Bug (invalid)
select_related() to nullable FK with db_constraint=False do not promote OUTER joins.
Reported by: | Mathieu Hinderyckx | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Anssi Kääriäinen | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
When using a FK which has db_constraint=False, the sql generated when using the combination of a filter() and select_related() are wrong, causing results which should be included in a queryset to not be.
The problem occurs when the remote object does not exist in the table where the FK points to. In some cases the ORM will generate an INNER JOIN
statement instead of a LEFT OUTER JOIN
causing a row to be wrongly omitted.
I have made a POC setup for this behaviour on https://github.com/mhindery/django_fk_issue, with testcases containing some working lookups, and one specific method that demonstrate the behaviour. I've included in comments in the tests the query which gets generated, showing the problematic join.
Change History (6)
comment:1 by , 3 years ago
Version: | 4.0 → 3.2 |
---|
comment:2 by , 3 years ago
Cc: | added |
---|---|
Summary: | Wrong sql generation for FK with db_constraint=False causing results to incorrectly get dropped from a queryset → select_related() to nullable FK with db_constraint=False do not promote OUTER joins. |
Triage Stage: | Unreviewed → Accepted |
follow-up: 5 comment:3 by , 3 years ago
The question here is should the ORM account for possible integrity violations in the first place, lets not forget that db_constraint
was basically introduced as some form of shadow option without in-dept discussion about its desired behaviour with regards to violations.
How the ORM should account for possible integrity violation was already discussed in the past plenty when usage of the MySQL's MyISAM backend (and SQLite in older versions) which doesn't support foreign keys was more common and caused users to run into this exact issue.
Given we've opted not to special case JOIN'ing logic for supports_foreign_keys
in the past I don't see why we should take a different stance for db_constraint=False
.
If we allowed Object.objects.filter(remote_object_id=1234).select_related('remote_object')
to return Object
instances what would their .remote_object
be? A mock RemoteObject(id=1234)
object? Simply None
? Aren't both approaches breaking symmetry with Object.objects.filter(remote_object_id=1234).first().remote_object
raising a RemoteObject.DoesNotExist
exception?
comment:4 by , 3 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Simon, Thanks for the comment. Agreed, db_constraint = False
in general should not be used, except for a few really specific cases. Closing as "invalid" as this case is not supported.
comment:5 by , 3 years ago
Replying to Simon Charette:
The question here is should the ORM account for possible integrity violations in the first place, lets not forget that
db_constraint
was basically introduced as some form of shadow option without in-dept discussion about its desired behaviour with regards to violations.
How the ORM should account for possible integrity violation was already discussed in the past plenty when usage of the MySQL's MyISAM backend (and SQLite in older versions) which doesn't support foreign keys was more common and caused users to run into this exact issue.
Given we've opted not to special case JOIN'ing logic for
supports_foreign_keys
in the past I don't see why we should take a different stance fordb_constraint=False
.
If we allowed
Object.objects.filter(remote_object_id=1234).select_related('remote_object')
to returnObject
instances what would their.remote_object
be? A mockRemoteObject(id=1234)
object? SimplyNone
? Aren't both approaches breaking symmetry withObject.objects.filter(remote_object_id=1234).first().remote_object
raising aRemoteObject.DoesNotExist
exception?
Object.objects.filter(remote_object_id=1234).first().remote_object
doesn't raise a DoesNotExist exception. It returns null, as it should be. First it fetches the Object with remote_object_id=1234
which exists. Secondly, It then accesses the FK field on that Object pointing to a RemoteObject, which is null as it does not exist in the remote table (hence why the db_constraint=False). That works as expected, just like an 'ordinary' FK. What doesn't work as expected is that adding a select_related() breaks the filter statement. Select_related is an optimization, the expectation is that it does the same two steps as before but with a single DB query. That isn't the case currently; it silently breaks the filtering logic, and doesn't event return the Object at all.
comment:6 by , 3 years ago
Object.objects.filter(remote_object_id=1234).first().remote_object
doesn't raise aDoesNotExist
exception. It returns null, as it should be.
I think you might be wrong here. Did you test this statement? I did with Django 3.2.12 and 4.0.3.
-
poc/poc_models/tests.py
diff --git a/poc/poc_models/tests.py b/poc/poc_models/tests.py index 27b8249..f7edb32 100644
a b def setUpTestData(cls): 10 10 11 11 # All of the test methods below try to fetch the same object which was created in setUpTestData 12 12 13 def test_access_remote_object_broken_referential_integrity(self): 14 with self.assertRaises(RemoteObject.DoesNotExist): 15 Object.objects.filter(remote_object_id=1234).first().remote_object 16 13 17 def test_remote_object_does_exist_object_should_be_found(self): 14 18 RemoteObject.objects.create(remote_id=1234)
A DoesNotExist
is raised when accessing .remote_object
because your data breaks integrity expectations that the ORM is built around. If you break these integrity expectations you should expect undefined behaviour.
Thanks for the report. Tentatively accepted for future investigation.