id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
30685	Optimize QuerySet.count() with distinct()	Adam Sołtysik	Akash Kumar Sen	"I have a PostgreSQL table with 100 000 records and 15 columns.

A simple `.count()` query results in a fast SQL (execution time in seconds on the left):

{{{(0.015) SELECT COUNT(*) AS ""__count"" FROM ""table"";}}}

When we add `.distinct()`, a subquery is created with all columns SELECTed:

{{{(0.178) SELECT COUNT(*) FROM (SELECT DISTINCT ""table"".""id"" AS Col1, ... (15 columns) FROM ""table"") subquery;}}}

When instead of `.distinct()` we write `.distinct('id')` and add `.order_by('id', 'col1', 'col2')`, the subquery is additionally ORDERed:

{{{(0.151) SELECT COUNT(*) FROM (SELECT DISTINCT ON (""table"".""id"") ""table"".""id"" AS Col1, ... (15 columns) FROM ""table"" ORDER BY ""table"".""id"" ASC, ""table"".""col1"" ASC, ""table"".""col2"" ASC) subquery;}}}

Funny thing is that without `.distinct('id')` we can write `.order_by('non_existing_column')` and it works without any exception.

After adding `.values('id')` and an empty `.order_by()`, the query is as fast as it can be with DISTINCT:

{{{(0.053) SELECT COUNT(*) FROM (SELECT DISTINCT ON (""table"".""id"") ""table"".""id"" AS Col1 FROM ""table"") subquery;}}}

I think that the subquery for `count()` should never contain additional columns nor be ordered (and the same probably goes for other aggregations)."	Cleanup/optimization	assigned	Database layer (models, ORM)	2.2	Normal				Accepted	1	0	0	1	0	0