#33973 closed Uncategorized (needsinfo)
Performance regression when moving from 3.1 to 3.2
Reported by: | Marc Parizeau | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | performance regression |
Cc: | Simon Charette | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I am seing a sharp increase in execution time for some of my queries when moving from Django 3.1 to 3.2. And the performance hit appears to be the same for Django 4.
My backend is Postgres 14.2.
My Django project has forums for different types of content. The forum app consists essentially of 5 tables:
- a
Post
table that contains forum posts (essentially a text field); - a
Thread
table where rows point to a specific content and post; - a
FollowUp
table where rows point to a specific thread and post; - a
ThreadEntry
table where rows point to a thread, a user, and the last seen thread post for this user; - a
FollowUpEntry
table where rows point to a followup, a user, and the last seen followup post for this user.
Here is an example query that executes 2 times slower on 3.2 than on 3.1:
Content.objects.all().annotate( has_unread_posts=Greatest( # a content is unread if at least one condition is true Exists( # a thread has never been read (has no user entry) Thread.objects.filter( content=OuterRef('pk'), ).exclude(threadentry__user=user) ), Exists( # a thread entry is not up-to-date ThreadEntry.objects.filter( thread__content=OuterRef('pk'), user=user, ).exclude(post=F('thread__post')) ), Exists( # a followup has never been read FollowUp.objects.filter( thread__content=OuterRef('pk') ).exclude(followupentry__user=user) ), Exists( # a followup entry is not up-to-date FollowUpEntry.objects.filter( followup__thread__content=OuterRef('pk'), user=user, ).exclude(post=F('followup__post')) ), ) ).filter( has_unread_posts=True, ).order_by( 'course__uid', '-version__start', ).select_related( 'course', 'version', )
Course
and Version
are other tables related to Content
.
I want to know with this query, for each content, whether or not there is something new in the corresponding forum for a given user. There is something new if any one of the following condition is true:
- there exists a thread for which the user has no thread entry (an entry is added when the thread is first read by the user);
- there exists a user thread entry for which the last read post is not up to date with the current thread post (the thread owner has modified the post since);
- there exists a followup for which the user has no followup entry (an entry is added when the followup is first read by the user);
- there exists a user followup entry for which the last read post is not up to date with the followup post (the followup owner has modified the post since).
On my machine, just by changing the Django version using pip, and nothing else, this query takes about 1 second of execution on Django 3.1.14, and a little more than 2 seconds on Django 3.2.15, so about a 2x increase. Here are the current table sizes for my forum app:
Thread
: ~30KFollowUp
: ~46KThreadEntry
: ~1.3MFollowUpEntry
: ~4.5MPost
: ~103K
And there is 33 Content
rows.
Am I the only one observing such performance regressions with Django 3.2? On other more complex queries that contain nested subqueries, I have seen up to 30x execution time increases.
Did something major happen in SQL generation from 3.1 to 3.2?
Am I doing something wrong? How can this happen?
Any help on understanding what is going on with Django 3.2 would be much appreciated.
Best regards,
Change History (8)
comment:2 by , 2 years ago
So the exact Django query that lead to the above results is:
Content.objects.all().annotate( unread=Exists( Thread.objects.filter( content=OuterRef('pk'), ).exclude(threadentry__user=user) ).filter( unread=True, )
Which produces the following SQL.
On Django 3.1.14:
SELECT "content_content"."id", "content_content"."created", "content_content"."description", "content_content"."disabled", "content_content"."modified", "content_content"."objectives", "content_content"."published", "content_content"."title", "content_content"."course_id", "content_content"."version_id", "content_content"."exercise_weight", "content_content"."midterm_weight", "content_content"."final_weight", "content_content"."project_weight", "content_content"."forum_weight", "content_content"."welcome_video", "content_content"."navigate_video", "content_content"."forum_video", "content_content"."lambda_fct_id", "content_content"."dl_key_version", "content_content"."dl_key_token", "content_content"."exam_key_token", "content_content"."lectures_url", EXISTS( SELECT V0."id", V0."created", V0."disabled", V0."modified", V0."title", V0."content_id", V0."owner_id", V0."post_id", V0."tag", V0."start", V0."uid", V0."question", V0."important", V0."resolved", V0."authorized", V0."votes", V0."mptt_id", V0."linked_content_type_id", V0."linked_content_id" FROM "forum_thread" V0 WHERE ( V0."content_id" = "content_content"."id" AND NOT ( V0."id" IN ( SELECT U1."thread_id" FROM "forum_threadentry" U1 WHERE ( U1."user_id" = 59 AND U1."thread_id" IS NOT NULL ) ) ) ) ) AS "unread" FROM "content_content" LEFT OUTER JOIN "content_course" ON ( "content_content"."course_id" = "content_course"."id" ) LEFT OUTER JOIN "content_version" ON ( "content_content"."version_id" = "content_version"."id" ) WHERE EXISTS( SELECT V0."id", V0."created", V0."disabled", V0."modified", V0."title", V0."content_id", V0."owner_id", V0."post_id", V0."tag", V0."start", V0."uid", V0."question", V0."important", V0."resolved", V0."authorized", V0."votes", V0."mptt_id", V0."linked_content_type_id", V0."linked_content_id" FROM "forum_thread" V0 WHERE ( V0."content_id" = "content_content"."id" AND NOT ( V0."id" IN ( SELECT U1."thread_id" FROM "forum_threadentry" U1 WHERE ( U1."user_id" = 59 AND U1."thread_id" IS NOT NULL ) ) ) ) ) ORDER BY "content_course"."uid" ASC, "content_version"."start" DESC
On Django 3.2.15:
SELECT "content_content"."id", "content_content"."created", "content_content"."description", "content_content"."disabled", "content_content"."modified", "content_content"."objectives", "content_content"."published", "content_content"."title", "content_content"."course_id", "content_content"."version_id", "content_content"."exercise_weight", "content_content"."midterm_weight", "content_content"."final_weight", "content_content"."project_weight", "content_content"."forum_weight", "content_content"."welcome_video", "content_content"."navigate_video", "content_content"."forum_video", "content_content"."lambda_fct_id", "content_content"."dl_key_version", "content_content"."dl_key_token", "content_content"."exam_key_token", "content_content"."lectures_url", EXISTS( SELECT (1) AS "a" FROM "forum_thread" V0 WHERE ( V0."content_id" = "content_content"."id" AND NOT ( EXISTS( SELECT (1) AS "a" FROM "forum_threadentry" U1 WHERE ( U1."user_id" = 59 AND U1."thread_id" = V0."id" ) LIMIT 1 ) ) ) LIMIT 1 ) AS "unread" FROM "content_content" LEFT OUTER JOIN "content_course" ON ( "content_content"."course_id" = "content_course"."id" ) LEFT OUTER JOIN "content_version" ON ( "content_content"."version_id" = "content_version"."id" ) WHERE EXISTS( SELECT (1) AS "a" FROM "forum_thread" V0 WHERE ( V0."content_id" = "content_content"."id" AND NOT ( EXISTS( SELECT (1) AS "a" FROM "forum_threadentry" U1 WHERE ( U1."user_id" = 59 AND U1."thread_id" = V0."id" ) LIMIT 1 ) ) ) LIMIT 1 ) ORDER BY "content_course"."uid" ASC, "content_version"."start" DESC}}} Again, hope this helps!
comment:3 by , 2 years ago
Did something major happen in SQL generation from 3.1 to 3.2?
Each version of Django includes many bugfixes and new ORM features that affect the generated SQL.
Can you reproduce the same regression on Django 4.1 and the current main
branch? Can you bisect a regression?
comment:4 by , 2 years ago
Here is the "explanation" for Django 4.0.7:
time 0.3011052919999955 Sort (cost=62.30..62.31 rows=1 width=1207) Sort Key: content_course.uid, content_version.start DESC -> Nested Loop Left Join (cost=0.41..62.29 rows=1 width=1207) -> Nested Loop Left Join (cost=0.28..40.58 rows=1 width=1202) -> Nested Loop (cost=0.14..39.68 rows=1 width=1124) -> Seq Scan on content_teacher (cost=0.00..7.65 rows=1 width=8) Filter: (user_id = 59) -> Index Scan using content_semester_pkey on content_content (cost=0.14..28.99 rows=1 width=1124) Index Cond: (id = content_teacher.content_id) Filter: ((id <> 58) AND GREATEST((SubPlan 3), false)) SubPlan 3 -> Nested Loop Anti Join (cost=0.43..18547.19 rows=909 width=0) -> Seq Scan on forum_thread v0_1 (cost=0.00..1799.79 rows=1174 width=8) Filter: (content_id = content_content.id) -> Index Only Scan using forum_threadentry_thread_id_user_id_cd21c4a5_uniq on forum_threadentry u1_1 (cost=0.43..55.66 rows=33 width=8) Index Cond: ((thread_id = v0_1.id) AND (user_id = 59)) -> Index Scan using content_course_pkey on content_course (cost=0.14..0.82 rows=1 width=86) Index Cond: (id = content_content.course_id) -> Index Scan using content_version_pkey on content_version (cost=0.14..0.82 rows=1 width=12) Index Cond: (id = content_content.version_id) SubPlan 1 -> Nested Loop Anti Join (cost=0.43..18547.19 rows=909 width=0) -> Seq Scan on forum_thread v0 (cost=0.00..1799.79 rows=1174 width=8) Filter: (content_id = content_content.id) -> Index Only Scan using forum_threadentry_thread_id_user_id_cd21c4a5_uniq on forum_threadentry u1 (cost=0.43..55.66 rows=33 width=8) Index Cond: ((thread_id = v0.id) AND (user_id = 59))
And the corresponding SQL:
SELECT "content_content"."id", "content_content"."created", "content_content"."description", "content_content"."disabled", "content_content"."modified", "content_content"."objectives", "content_content"."published", "content_content"."title", "content_content"."course_id", "content_content"."version_id", "content_content"."exercise_weight", "content_content"."midterm_weight", "content_content"."final_weight", "content_content"."project_weight", "content_content"."forum_weight", "content_content"."welcome_video", "content_content"."navigate_video", "content_content"."forum_video", "content_content"."lambda_fct_id", "content_content"."dl_key_version", "content_content"."dl_key_token", "content_content"."exam_key_token", "content_content"."lectures_url", GREATEST(EXISTS(SELECT (1) AS "a" FROM "forum_thread" V0 WHERE (V0."content_id" = ("content_content"."id") AND NOT (EXISTS(SELECT (1) AS "a" FROM "forum_threadentry" U1 WHERE (U1."user_id" = 59 AND U1."thread_id" = (V0."id")) LIMIT 1))) LIMIT 1), False) AS "unread" FROM "content_content" INNER JOIN "content_teacher" ON ("content_content"."id" = "content_teacher"."content_id") LEFT OUTER JOIN "content_course" ON ("content_content"."course_id" = "content_course"."id") LEFT OUTER JOIN "content_version" ON ("content_content"."version_id" = "content_version"."id") WHERE ("content_teacher"."user_id" = 59 AND NOT ("content_content"."id" = 58) AND GREATEST(EXISTS(SELECT (1) AS "a" FROM "forum_thread" V0 WHERE (V0."content_id" = ("content_content"."id") AND NOT (EXISTS(SELECT (1) AS "a" FROM "forum_threadentry" U1 WHERE (U1."user_id" = 59 AND U1."thread_id" = (V0."id")) LIMIT 1))) LIMIT 1), False)) ORDER BY "content_course"."uid" ASC, "content_version"."start" DESC
As for Django 4.1, I can't do it right now, because I am getting Expression contains mixed types: IntegerField, SmallIntegerField
errors which I haven't been able to remove so far.
But my issues definitely started with Django 3.2, not with Django 4 which seems to behave similarly for my specific query.
follow-up: 6 comment:5 by , 2 years ago
Cc: | added |
---|---|
Component: | Uncategorized → Database layer (models, ORM) |
Resolution: | → needsinfo |
Status: | new → closed |
Using EXISTS
to exclude multi-valued relationship was introduced in 8593e162c9cb63a6c0b06daf045bc1c21eb4d7c1. However this should be is easier to optimize for query planers, it's not clear to me why you're facing a performance regression. Perhaps the query plan should be regenerated or this is an issue in PostgreSQL 14.2. I don't think there is anything that we could improve in Django.
comment:6 by , 2 years ago
Replying to Mariusz Felisiak:
Using
EXISTS
to exclude multi-valued relationship was introduced in 8593e162c9cb63a6c0b06daf045bc1c21eb4d7c1. However this should be is easier to optimize for query planers, it's not clear to me why you're facing a performance regression. Perhaps the query plan should be regenerated or this is an issue in PostgreSQL 14.2. I don't think there is anything that we could improve in Django.
Not sure I understand what you are saying. You are saying that the regression I am observing is definitely not caused by the Django generated SQL? That the issue must stem from the query planers of postgres? Are you sure about that?
Any suggestion on what I should do moving forward? What postgres version do you suggest I use to get the proper query planers for Django 3.2/4?
follow-up: 8 comment:7 by , 2 years ago
Bonjour Marc,
There are some sources listed on the PR that introduced these changes that documents the performance benefits of using EXISTS
over IN
and the case where IN
will be faster which I believe is what you're experiencing here.
Given there are performance benefits in most cases and IN
is only faster when the subquery returns a small number of rows as a hashmap can be created I believe we should keep the change around.
The good news is that you can restore the previous behaviour by being more explicit in your usage of exclude
by using an explicit IN
lookup instead.
user_thread_entries = ThreadEntry.objects.filter(user=user, thread__isnull=False).values("thread") Content.objects.annotate( unread=Exists( Thread.objects.filter( content=OuterRef('pk'), ).exclude(pk__in=user_thread_entries) ) ).filter( unread=True, )
Keep in mind though that, as pointed out in the article, this query will only be faster for users with a relatively small number of thread entries as Postgres will turn the set of rows into a hashmap which is means that this query will become more expensive as the number of threads user creates increase.
comment:8 by , 2 years ago
Replying to Simon Charette:
Thanks for the suggestion. I've tried it and it does not seem to reduce the execution time for my test case that has about 26K threads for it's user. So, the limitation that you mention probably applies. I am still trying to understand what might be going on (my mastery of SQL is somewhat limited), and why Django 3.1 was doing better than 3.2 for my specific query. Anyway, for the time being, I've decided to move on to other problems, since this one is mostly severe for myself, and not for my regular users who have access to only a subset of the available contents. I worry, however, that the problem will only grow as more content is added...
I've also tried indexing, but so far to no improvement.
If I eventually find a solution, I will report it here.
Thanks for your help!
After more digging, it appears that the two worst culprits in my query are the 1st and 3rd
Exists
. The 1st is about 10x slower on 3.2.15 than on 3.1.14. The 3rd is about 6x slower on 3.2.15 than on 3.1.14. The execution times for the 2nd and 4thExists
are about the same for 3.2 and 3.1.If I use only the 1st
Exists
in my query, by commenting out the 3 others, and by adding a False for theGreatest
requirement of min 2 args, I obtain de following explanations (I also removed theorder_by
andselect_related
clauses to simplify the query as much as possible):On Django 3.1.14:
On Django 3.2.15:
So it appears that the strategies of the two Django versions are indeed not the same. And it also appears that the regression stems from the implementation of the
exclude(threadentry__user=user)
clause.Hope this info can help a gourou isolate the problem source (SQL optimization is out of my league).