Opened 9 years ago

Closed 10 months ago

#25298 closed Cleanup/optimization (wontfix)

Forbid QuerySet filtering by a related field that expects a single value when multiple values are returned

Reported by: Tim Graham Owned by: Tyler Gehrs
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: github@… Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

As raised in #25284, different databases error in different cases on queries like Model.objects.filter(related_id=RelatedModel.objects.all())

  • PostgreSQL, always.
  • MySQL, if the subquery returns multiple results.
  • SQLite, never.

It would be useful to see if we could always throw an error in this case.

Change History (12)

comment:1 by no, 9 years ago

Cc: github@… added

comment:2 by Y3K, 9 years ago

Owner: changed from nobody to Y3K
Status: newassigned

I'm taking this one.

Will update once I send the Github pullrequest.

comment:3 by Anssi Kääriäinen, 9 years ago

Notably Foo.objects.filter(related_model__exact=RelatedModel.objects.values('pk')) should work when the RelatedModel query returns just a single row. This is a completely valid query, equivalent to:

SELECT ... FROM foo WHERE related_model_id = (SELECT id FROM relatedmodel)

If the subquery returns one or zero rows, this query should work on all backends.

comment:4 by no, 9 years ago

For reference, here's some sql fiddles:
select * from Child where rel_id = (select id from Parent where cnt=?);

Engine SubQuery Rows Result Link
MySQL 0 No error, 0 records http://sqlfiddle.com/#!9/1d497/1
MySQL 1 No error, 1 record http://sqlfiddle.com/#!9/1d497/2
MySQL 2 Error http://sqlfiddle.com/#!9/1d497/3
PostgreSQL 0 No error, 0 records http://sqlfiddle.com/#!15/1d497/3
PostgreSQL 1 No error, 1 record http://sqlfiddle.com/#!15/1d497/2
PostgreSQL 2 Error http://sqlfiddle.com/#!15/1d497/1
SQLite 0 No error, 0 records http://sqlfiddle.com/#!7/1d497/3
SQLite 1 No error, 1 record http://sqlfiddle.com/#!7/1d497/2
SQLite 2 No error, 2 records http://sqlfiddle.com/#!7/1d497/1

comment:5 by Y3K, 9 years ago

Thank you very much for your feedback Naddiseo & akaariai

I'm not very experienced with raw SQL and your comments really help to get on the right track.

in reply to:  3 comment:6 by Y3K, 9 years ago

Replying to akaariai:

Notably Foo.objects.filter(related_model__exact=RelatedModel.objects.values('pk')) should work when the RelatedModel query returns just a single row. This is a completely valid query, equivalent to:

SELECT ... FROM foo WHERE related_model_id = (SELECT id FROM relatedmodel)

If the subquery returns one or zero rows, this query should work on all backends.

Hello akaariai.

Don't you think it would be better to forbid such cases as well? If by any means the subquery returns more than one row it'll fail. I think we can save many user errors preventing this before it happens, if we raise an exception whenever a subquery that may return multiple rows is used the users will know and prevent said cases.

Open to comments and suggestions, of course.

comment:7 by Y3K, 9 years ago

It looks like it isn't possible to know beforehand if the subquery will return more than one row (unless we execute the query to check first, but we all know this isn't a sane solution), therefore the two possibles solutions I got while discussing this at #django-dev are:

  • Remove said feature at all, validating the queryset passed to the filter, raising an exception if it can return more than one row.
  • Leave the feature as-is and add a warning on the documentation, specifying that SQLite lacks of said validation.

I request more comments / feedback about this please. What should be the desired solution here and why.

Thanks.

Last edited 9 years ago by Y3K (previous) (diff)

comment:8 by no, 9 years ago

To re-add my opinion on this from the other issue: I think the ORM should be consistent across backends, so it should throw an exception if the inner queryset could return more than a single row. My reasoning is basically to prevent vendor lock-in; Django was (is?) marketed as being mix-and-match for the backends such as database and templating, and Django would take care of the rest. (But, I'm fairly biased on this since I just spent a few unpleasant weeks converting raw mysql in our code to use ORM so that we can eventually switch to postgresql.)

For the use-case that akaariai brought up, perhaps it would have to be changed to accept a limit of one:

Foo.objects.filter(related_model__exact=RelatedModel.objects.values('pk')[:1])

# SELECT * FROM Foo WHERE related_model_id = (SELECT pk FROM RelatedModel LIMIT 1);

This way it's explicit that you only want 0 or 1 returned from the subquery. As a side note, does __exact allow passing in a model instance?

Foo.objects.filter(related_model__exact=RelatedModel.objects.get(pk=1))

Of course, it's not exactly the same since the .get can throw a DoesNotExist exception, and it generates two queries.

An alternate solution is to implicitly add limits, or change = to IN if the subquery may return multiple results. However, I am against this behaviour since - although it may be intuitive - it's implicit and may be the result of a programmer error that should really be raised as an error instead of silently accepted.

comment:9 by Mariusz Felisiak, 3 years ago

Owner: Y3K removed
Status: assignednew

comment:10 by Tyler Gehrs, 11 months ago

Has patch: set
Owner: set to Tyler Gehrs
Status: newassigned
Version: 1.8dev

comment:11 by Mariusz Felisiak, 10 months ago

Patch needs improvement: set

I don't think there is a real value in documenting this discrepancy. This is an SQLite caveat that will go unnoticed in most cases, leaving a bug in users code. Adding a warning to docs will not change this or help in any way.

Please first start a discussion on the Django Forum, where you'll reach a wider audience, and see what other think. Assuming there is no way to detect the number of rows in advance (as far as I'm aware there is not), I'd close it as "wontfix" (or "invalid").

Marking ticket as need improvement pending discussion.

comment:12 by Mariusz Felisiak, 10 months ago

Resolution: wontfix
Status: assignedclosed
Triage Stage: AcceptedUnreviewed
Note: See TracTickets for help on using tickets.
Back to Top