﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
29577	"Generated SQL in PostGres for exists filter does Exists(...)=true.  Having just Exists(...) without ""true"" is faster by nearly 50% in many of my queries"	Michael Chiciak	nobody	"I am trying to optimize my exists filters, since on large record sets I noticed some slowness.  I have narrowed down part of it to the ""= true"" portion of the ""Exists(...)"" subquery.  Removing the ""= true"" gives a noticeable performance boost

Also, it would be really nice to be able to just filter and not have a field inside the select portion of the query.  When I use the Paginator class for pagination, it turns the count query into one big subquery and adds all my Exists filters as select fields and thus performs a Group BY on each of them, drastically slowing down my count query.  Only having the Exists in the where clause would be very nice.  Is there a way to do this already?


Sample django code

{{{
fits_vehicle = ProductFitment.objects.filter_vehicle(2008, 2013, 'Chevrolet', 'Corvette').filter(product=OuterRef('pk')).only(""pk"")
queryset = queryset.annotate(fits_vehicle=Exists(fits_vehicle)).filter(fits_vehicle=True)

# With exists = true 500ms
SELECT COUNT(*)
FROM ""aces_pies_data_product"" 
WHERE EXISTS(SELECT 1 FROM ""aces_pies_data_productfitment"" U0 INNER JOIN ""django_vehiclefitment_vehicle"" U1 ON (U0.""vehicle_id"" = U1.""id"") INNER JOIN ""django_vehiclefitment_vehiclemake"" U2 ON (U1.""make_id"" = U2.""id"") INNER JOIN ""django_vehiclefitment_vehiclemodel"" U3 ON (U1.""model_id"" = U3.""id"") WHERE (U2.""name"" = 'Chevrolet' AND U3.""name"" = 'Corvette' AND (2008 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2009 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2010 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2011 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2012 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2013 BETWEEN U0.""start_year"" AND U0.""end_year"") AND U0.""product_id"" = (""aces_pies_data_product"".""id""))) = true

#With no ""true"" 350MS
SELECT COUNT(*)
FROM ""aces_pies_data_product"" 
WHERE EXISTS(SELECT 1 FROM ""aces_pies_data_productfitment"" U0 INNER JOIN ""django_vehiclefitment_vehicle"" U1 ON (U0.""vehicle_id"" = U1.""id"") INNER JOIN ""django_vehiclefitment_vehiclemake"" U2 ON (U1.""make_id"" = U2.""id"") INNER JOIN ""django_vehiclefitment_vehiclemodel"" U3 ON (U1.""model_id"" = U3.""id"") WHERE (U2.""name"" = 'Chevrolet' AND U3.""name"" = 'Corvette' AND (2008 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2009 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2010 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2011 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2012 BETWEEN U0.""start_year"" AND U0.""end_year"" OR 2013 BETWEEN U0.""start_year"" AND U0.""end_year"") AND U0.""product_id"" = (""aces_pies_data_product"".""id"")))

}}}


**Edit**
Poked around some of the annotation source code and got a work around

To remove the annotations from my selects and thus drastically increase the count(*) query speed, I just did products.query.annotation_select.pop('fits_vehicle').  This correctly filters without actually selecting, big improvement in speed!

"	Cleanup/optimization	closed	Database layer (models, ORM)	2.0	Normal	duplicate	slow query		Unreviewed	0	0	0	0	0	0
