Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#27269 closed Uncategorized (needsinfo)

ORM extremely slow to run annotate/filter query, much faster directly in dbshell

Reported by: Jerome Leclanche Owned by: nobody
Component: Database layer (models, ORM) Version: 1.10
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have a table with a few million rows on Postgresql 9.5.4. The model looks like this:

class GlobalGame(Model):
    digest = CharField(max_length=40)
    ...

The following query takes several seconds to run (results unaltered - there aren't many):

>>> GlobalGame.objects.exclude(digest=None).order_by().values("digest").annotate(c=Count("id")).filter(c__gt=1, id__lt=100000)
<QuerySet [{'c': 2, 'digest': '682c11b036d1a81cf24bc8619922af3ac6d25a25'}, {'c': 2, 'digest': '9bba9dc5df0c0db37f3cc06e2a1a30824c5c6812'}, {'c': 2, 'digest': 'b3dda775e8d9641225c4a113d8145a4817fae37d'}, {'c': 2, 'digest': 'd23e4bfeb1eacc531439e32d4a8d6d330e678f3f'}, {'c': 2, 'digest': 'e215de0646134713cac57edf430acfb60378a816'}]>

Despite that, if I print the .query object and run *that* directly into psql, I get the results in 2ms. EXPLAIN ANALYZE below:

"""
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=40396.36..40477.82 rows=6517 width=49) (actual time=2.024..2.158 rows=5 loops=1)
   Group Key: digest
   Filter: (count(id) > 1)
   Rows Removed by Filter: 2123
   ->  Bitmap Heap Scan on games_globalgame  (cost=573.34..40292.09 rows=13902 width=49) (actual time=0.294..1.151 rows=2133 loops=1)
         Recheck Cond: (id < 100000)
         Filter: (digest IS NOT NULL)
         Rows Removed by Filter: 632
         Heap Blocks: exact=90
         ->  Bitmap Index Scan on games_globalgame_pkey  (cost=0.00..569.86 rows=14058 width=0) (actual time=0.217..0.217 rows=2765 loops=1)
               Index Cond: (id < 100000)
 Planning time: 0.087 ms
 Execution time: 2.203 ms
(13 rows)
"""

It's worth noting that if I remove the id__lt condition, or make it much larger (more permissive), I get the results back *immediately* - as long as there's enough results to show "remaining elements truncated".

This leads me to believe either Django isn't running the query it says it's running, or it's doing a bunch of extremely unoptimized operations for no good reason.

Thoughts?

Change History (2)

comment:1 by Tim Graham, 7 years ago

Resolution: needsinfo
Status: newclosed

You can use logging or a print statement to check the query Django is running, or try to do some profiling to see what's slow. In general, you should use support channels to ask questions like this and open a ticket once you've identified that Django is it fault. Feel free to reopen with more specifics if your investigation leads to that conclusion. Thanks.

comment:2 by Aymeric Augustin, 7 years ago

The reporter says he printed queryset.query and that query runs fast. This suggests:

  • either time is spent elsewhere, but I'm not sure what could account for a delay of several seconds on such a query
  • the query being run isn't the query he captured

I suggest configuring PostgreSQL to log all statements (just slow statements could suffice), to see:

  • how much time is spent executing the statement in PostgreSQL
  • if the query executed by PostgreSQL is the same query you tested separately

Honestly the chances that someone can reproduce the problem with the info you provided are slim; such problems are highly environment-dependent; that's why I suggest investigating further in the environment where you're seeing the problem.

It seems unlikely to me that Django is at fault here so I'll leave the ticket closed for now.

Last edited 7 years ago by Aymeric Augustin (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top