Opened 9 years ago
Closed 10 months ago
#25298 closed Cleanup/optimization (wontfix)
Forbid QuerySet filtering by a related field that expects a single value when multiple values are returned
Reported by: | Tim Graham | Owned by: | Tyler Gehrs |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | github@… | Triage Stage: | Unreviewed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
As raised in #25284, different databases error in different cases on queries like Model.objects.filter(related_id=RelatedModel.objects.all())
- PostgreSQL, always.
- MySQL, if the subquery returns multiple results.
- SQLite, never.
It would be useful to see if we could always throw an error in this case.
Change History (12)
comment:1 by , 9 years ago
Cc: | added |
---|
comment:2 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
follow-up: 6 comment:3 by , 9 years ago
Notably Foo.objects.filter(related_model__exact=RelatedModel.objects.values('pk'))
should work when the RelatedModel query returns just a single row. This is a completely valid query, equivalent to:
SELECT ... FROM foo WHERE related_model_id = (SELECT id FROM relatedmodel)
If the subquery returns one or zero rows, this query should work on all backends.
comment:4 by , 9 years ago
For reference, here's some sql fiddles:
select * from Child where rel_id = (select id from Parent where cnt=?);
Engine | SubQuery Rows | Result | Link |
---|---|---|---|
MySQL | 0 | No error, 0 records | http://sqlfiddle.com/#!9/1d497/1 |
MySQL | 1 | No error, 1 record | http://sqlfiddle.com/#!9/1d497/2 |
MySQL | 2 | Error | http://sqlfiddle.com/#!9/1d497/3 |
PostgreSQL | 0 | No error, 0 records | http://sqlfiddle.com/#!15/1d497/3 |
PostgreSQL | 1 | No error, 1 record | http://sqlfiddle.com/#!15/1d497/2 |
PostgreSQL | 2 | Error | http://sqlfiddle.com/#!15/1d497/1 |
SQLite | 0 | No error, 0 records | http://sqlfiddle.com/#!7/1d497/3 |
SQLite | 1 | No error, 1 record | http://sqlfiddle.com/#!7/1d497/2 |
SQLite | 2 | No error, 2 records | http://sqlfiddle.com/#!7/1d497/1 |
comment:5 by , 9 years ago
Thank you very much for your feedback Naddiseo & akaariai
I'm not very experienced with raw SQL and your comments really help to get on the right track.
comment:6 by , 9 years ago
Replying to akaariai:
Notably
Foo.objects.filter(related_model__exact=RelatedModel.objects.values('pk'))
should work when the RelatedModel query returns just a single row. This is a completely valid query, equivalent to:
SELECT ... FROM foo WHERE related_model_id = (SELECT id FROM relatedmodel)If the subquery returns one or zero rows, this query should work on all backends.
Hello akaariai.
Don't you think it would be better to forbid such cases as well? If by any means the subquery returns more than one row it'll fail. I think we can save many user errors preventing this before it happens, if we raise an exception whenever a subquery that may return multiple rows is used the users will know and prevent said cases.
Open to comments and suggestions, of course.
comment:7 by , 9 years ago
It looks like it isn't possible to know beforehand if the subquery will return more than one row, therefore the two possibles solutions I got while discussing this at #django-dev are:
- Remove said feature at all, validating the queryset passed to the filter, raising an exception if it can return more than one row.
- Leave the feature as-is and add a warning on the documentation, specifying that SQLite lacks of said validation.
I request more comments / feedback about this please. What should be the desired solution here and why.
Thanks.
comment:8 by , 9 years ago
To re-add my opinion on this from the other issue: I think the ORM should be consistent across backends, so it should throw an exception if the inner queryset could return more than a single row. My reasoning is basically to prevent vendor lock-in; Django was (is?) marketed as being mix-and-match for the backends such as database and templating, and Django would take care of the rest. (But, I'm fairly biased on this since I just spent a few unpleasant weeks converting raw mysql in our code to use ORM so that we can eventually switch to postgresql.)
For the use-case that akaariai brought up, perhaps it would have to be changed to accept a limit of one:
Foo.objects.filter(related_model__exact=RelatedModel.objects.values('pk')[:1]) # SELECT * FROM Foo WHERE related_model_id = (SELECT pk FROM RelatedModel LIMIT 1);
This way it's explicit that you only want 0 or 1 returned from the subquery. As a side note, does __exact
allow passing in a model instance?
Foo.objects.filter(related_model__exact=RelatedModel.objects.get(pk=1))
Of course, it's not exactly the same since the .get
can throw a DoesNotExist
exception, and it generates two queries.
An alternate solution is to implicitly add limits, or change =
to IN
if the subquery may return multiple results. However, I am against this behaviour since - although it may be intuitive - it's implicit and may be the result of a programmer error that should really be raised as an error instead of silently accepted.
comment:9 by , 3 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:10 by , 11 months ago
Has patch: | set |
---|---|
Owner: | set to |
Status: | new → assigned |
Version: | 1.8 → dev |
comment:11 by , 10 months ago
Patch needs improvement: | set |
---|
I don't think there is a real value in documenting this discrepancy. This is an SQLite caveat that will go unnoticed in most cases, leaving a bug in users code. Adding a warning to docs will not change this or help in any way.
Please first start a discussion on the Django Forum, where you'll reach a wider audience, and see what other think. Assuming there is no way to detect the number of rows in advance (as far as I'm aware there is not), I'd close it as "wontfix" (or "invalid").
Marking ticket as need improvement pending discussion.
comment:12 by , 10 months ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
Triage Stage: | Accepted → Unreviewed |
I'm taking this one.
Will update once I send the Github pullrequest.