Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#29577 closed Cleanup/optimization (duplicate)

Generated SQL in PostGres for exists filter does Exists(...)=true. Having just Exists(...) without "true" is faster by nearly 50% in many of my queries

Reported by: Michael Chiciak Owned by: nobody
Component: Database layer (models, ORM) Version: 2.0
Severity: Normal Keywords: slow query
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Michael Chiciak)

I am trying to optimize my exists filters, since on large record sets I noticed some slowness. I have narrowed down part of it to the "= true" portion of the "Exists(...)" subquery. Removing the "= true" gives a noticeable performance boost

Also, it would be really nice to be able to just filter and not have a field inside the select portion of the query. When I use the Paginator class for pagination, it turns the count query into one big subquery and adds all my Exists filters as select fields and thus performs a Group BY on each of them, drastically slowing down my count query. Only having the Exists in the where clause would be very nice. Is there a way to do this already?

Sample django code

fits_vehicle = ProductFitment.objects.filter_vehicle(2008, 2013, 'Chevrolet', 'Corvette').filter(product=OuterRef('pk')).only("pk")
queryset = queryset.annotate(fits_vehicle=Exists(fits_vehicle)).filter(fits_vehicle=True)

# With exists = true 500ms
SELECT COUNT(*)
FROM "aces_pies_data_product" 
WHERE EXISTS(SELECT 1 FROM "aces_pies_data_productfitment" U0 INNER JOIN "django_vehiclefitment_vehicle" U1 ON (U0."vehicle_id" = U1."id") INNER JOIN "django_vehiclefitment_vehiclemake" U2 ON (U1."make_id" = U2."id") INNER JOIN "django_vehiclefitment_vehiclemodel" U3 ON (U1."model_id" = U3."id") WHERE (U2."name" = 'Chevrolet' AND U3."name" = 'Corvette' AND (2008 BETWEEN U0."start_year" AND U0."end_year" OR 2009 BETWEEN U0."start_year" AND U0."end_year" OR 2010 BETWEEN U0."start_year" AND U0."end_year" OR 2011 BETWEEN U0."start_year" AND U0."end_year" OR 2012 BETWEEN U0."start_year" AND U0."end_year" OR 2013 BETWEEN U0."start_year" AND U0."end_year") AND U0."product_id" = ("aces_pies_data_product"."id"))) = true

#With no "true" 350MS
SELECT COUNT(*)
FROM "aces_pies_data_product" 
WHERE EXISTS(SELECT 1 FROM "aces_pies_data_productfitment" U0 INNER JOIN "django_vehiclefitment_vehicle" U1 ON (U0."vehicle_id" = U1."id") INNER JOIN "django_vehiclefitment_vehiclemake" U2 ON (U1."make_id" = U2."id") INNER JOIN "django_vehiclefitment_vehiclemodel" U3 ON (U1."model_id" = U3."id") WHERE (U2."name" = 'Chevrolet' AND U3."name" = 'Corvette' AND (2008 BETWEEN U0."start_year" AND U0."end_year" OR 2009 BETWEEN U0."start_year" AND U0."end_year" OR 2010 BETWEEN U0."start_year" AND U0."end_year" OR 2011 BETWEEN U0."start_year" AND U0."end_year" OR 2012 BETWEEN U0."start_year" AND U0."end_year" OR 2013 BETWEEN U0."start_year" AND U0."end_year") AND U0."product_id" = ("aces_pies_data_product"."id")))

Edit
Poked around some of the annotation source code and got a work around

To remove the annotations from my selects and thus drastically increase the count(*) query speed, I just did products.query.annotation_select.pop('fits_vehicle'). This correctly filters without actually selecting, big improvement in speed!

Change History (5)

comment:1 by Michael Chiciak, 6 years ago

Did some playing around with the "only having columns as filters". If you use "values", it does take away the annotation which is what I want. But it would be nice if defer behaved the same way so I can still retain all the model fields naturally.

comment:2 by Michael Chiciak, 6 years ago

Summary: Generated SQL for exists filter does Exists(...)=true. Having just Exists(...) without "true" is faster by nearly 50% in many of my queriesGenerated SQL in PostGres for exists filter does Exists(...)=true. Having just Exists(...) without "true" is faster by nearly 50% in many of my queries

comment:3 by Michael Chiciak, 6 years ago

Description: modified (diff)

comment:4 by Simon Charette, 6 years ago

Resolution: duplicate
Status: newclosed

Duplicate of #25367 which would allow the Exist() expression to be directly passed to filter instead of having to annotate it and filter against.

comment:5 by Simon Charette, 6 years ago

Here's a discussion about not including the =true part on backends that support it that might interest you.

https://github.com/django/django/pull/8119#issuecomment-373844543

Note: See TracTickets for help on using tickets.
Back to Top