﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
24218	Use sub-query in ORM when distinct and order_by columns do not match	Miroslav Shubernetskiy	nobody	"This ticket is to propose a slight change in ORM - use subqueries when querying a model where {{{.distinct()}}} and {{{.order_by()}}} (or {{{.extra(order_by=())}}}) leftmost columns do not match. For example:

{{{#!python
Model.objects.all().distinct('foo').order_by('bar')
}}}

The above generates the following SQL:

{{{#!sql
SELECT DISTINCT ON (""app_model"".""foo"") <lots of columns here>
FROM ""app_model""
ORDER BY ""app_model"".""bar"" ASC;
}}}

I am not sure about all backends however the above syntax is not allowed in PostgreSQL which produces the following error:

{{{#!sql
ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions
}}}

Here are [http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-DISTINCT PostgreSQL docs] explaining why that is not allowed:

  DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the ""first row"" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

This ticket proposes to use subqueries in such situations which would use SQL:

{{{#!sql
SELECT *
FROM (
  SELECT DISTINCT ON (""app_model"".""foo"") <lots of columns here>
  FROM ""app_model""
) result
ORDER BY ""app_model"".""bar"" ASC;
}}}

The above is perfectly valid SQL and produces expected results (please note that {{{ORDER_BY}}} is in the outer query to guarantee that distinct results are correctly sorted).

I created a simple proof-of-concept [https://gist.github.com/miki725/ce26d8b6ee2f3075884a patch] by overwriting few things in {{{SQLCompiler.as_sql()}}} which seems to work pretty well. The patch only creates subquery when the above dilema is encountered which should not have any negative side-effects on existing queries (since such queries were not allowed by SQL). The patch also works on the {{{.count()}}} queries since Django then strips any ordering hence the subquery is never created."	New feature	new	Database layer (models, ORM)	dev	Normal		subquery distinct order_by		Unreviewed	1	0	0	0	0	0
