Opened 3 years ago

Last modified 4 months ago

#28560 new Bug

distinct() on ordered queryset with restricted list of columns returns incorrect result

Reported by: Mariusz Felisiak Owned by:
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: distinct values
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Tobias McNulty)

When distinct() is used with values() (or values_list()) on ordered queryset and a list of fields in values() doesn't contain all fields from ORDER BY, then it doesn't return correct result because columns from ORDER BY clause must be included in SELECT.

After some discussion on the mailing list (https://groups.google.com/g/django-developers/c/DNVRFqVBsfk/m/xDUvaq3DAAAJ) it looks like the consensus is to require an explicit opt-in or raise an error (i.e., never add a column implicitly if the user specified a list of columns via values() or values_list()).

Change History (14)

comment:2 Changed 3 years ago by Mariusz Felisiak

Summary: distinct() on None valuesdistinct() on ordered queryset with restricted list of columns returns incorrect result

comment:3 Changed 3 years ago by Mariusz Felisiak

Description: modified (diff)

comment:4 Changed 3 years ago by Simon Charette

Couldn't we wrap the query in a subquery like we did in #24254 in this case?

comment:5 Changed 3 years ago by Mariusz Felisiak

Good idea! it should be feasible. I will try to prepare patch in this week.

comment:6 Changed 3 years ago by Simon Charette

By the way Mariusz, do you have an opinion on https://code.djangoproject.com/ticket/14357#comment:11?

comment:7 Changed 3 years ago by Mariusz Felisiak

Description: modified (diff)

comment:8 Changed 3 years ago by Mariusz Felisiak

Description: modified (diff)

comment:9 Changed 3 years ago by Mariusz Felisiak

Has patch: set

comment:10 Changed 3 years ago by Tim Graham

Triage Stage: UnreviewedAccepted

comment:11 Changed 3 years ago by Tim Graham

Patch needs improvement: set

comment:12 Changed 3 years ago by Mariusz Felisiak

Has patch: unset
Owner: Mariusz Felisiak deleted
Patch needs improvement: unset
Status: assignednew

comment:13 Changed 4 months ago by Tobias McNulty

Should/could we force the user to include in values() any ordering columns required in the SELECT per #7070?

While this doesn't fix the issue per se (though I'm not sure fixing it is possible...), it seems most explicit for the user and consistent with the Postgres error noted in #5321.

Last edited 4 months ago by Tobias McNulty (previous) (diff)

comment:14 Changed 4 months ago by Tobias McNulty

After some discussion on the mailing list (https://groups.google.com/g/django-developers/c/DNVRFqVBsfk/m/xDUvaq3DAAAJ) it looks like the consensus is to require an explicit opt-in or raise an error (i.e., never add a column implicitly if the user specified a list of columns via values() or values_list()).

I am not too familiar with the ORM. I believe these columns are added in get_extra_select() (django/db/models/sql/compiler.py), which could be adapted initially to raise a deprecation warning and eventually an exception. Does that sound right?

I may not have time right away so if anyone feels like picking this up, go for it.

comment:15 Changed 4 months ago by Tobias McNulty

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top