Opened 6 weeks ago

Closed 6 weeks ago

#35637 closed Bug (invalid)

Paginator executes additional SQL query when using QuerySet

Reported by: amir-mahdih86 Owned by:
Component: Database layer (models, ORM) Version: 5.0
Severity: Normal Keywords: Paginator, Queryset, Performance, SQL Queries
Cc: amir-mahdih86 Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

When using Django's Paginator class with a QuerySet, it executes an additional SQL count query to determine the total count of items. This does not happen when a list is passed to the Paginator.

Steps to Reproduce:

  1. Create a view that uses Paginator with a QuerySet object.
  2. Compare the SQL queries executed with those executed when a list is used instead.

Expected Behavior:
The Paginator shouldn't execute an additional query to count a QuerySet's items while it can use len() to do that.

Actual Behavior:
An additional query is executed when a QuerySet is passed while it's possible to convert QuerySet to a list before passing it to Paginator with no difference in returned object but 1 less database query execution.

Example Code:

# This code is simplified to be smaller

# This bellow view executes 3 SQL queries
class PostList(APIView):
    def get(self, request):
        posts = Post.objects.filter(is_published=True)
        paginator = Paginator(posts, 10)
        posts = paginator.page(1)
        serializer = PostSerializer(posts, many=True)
        return Response(serializer.data)


# But this one executes 2 SQL queries
class PostList(APIView):
    def get(self, request):
        posts = list(Post.objects.filter(is_published=True))
        paginator = Paginator(posts, 10)
        posts = paginator.page(1)
        serializer = PostSerializer(posts, many=True)
        return Response(serializer.data)

My Solution:
A simple solution is to modify the __init__ method of the Paginator class to convert object_list to a list object.
This is just a simple solution and it may be necessary to enhance the problem in a deeper layer.

Here is the modified version of that method:

    def __init__(
        self,
        object_list,
        per_page,
        orphans=0,
        allow_empty_first_page=True,
        error_messages=None,
    ):
        self.object_list = list(object_list)
        self._check_object_list_is_ordered()
        self.per_page = int(per_page)
        self.orphans = int(orphans)
        self.allow_empty_first_page = allow_empty_first_page
        self.error_messages = (
            self.default_error_messages
            if error_messages is None
            else self.default_error_messages | error_messages
        )

Change History (3)

comment:1 by amir-mahdih86, 6 weeks ago

Patch needs improvement: set

comment:2 by amir-mahdih86, 6 weeks ago

Type: UncategorizedBug

comment:3 by David Sanders, 6 weeks ago

Resolution: invalid
Status: newclosed

Thanks though this is intentional. Part of the reason why pagination is used is to avoid loading large datasets which would likely result in failures. The additional count query is to get the total number of records without loading the entire result set.

As you've discovered, if folks want to avoid the additional query they can materialise the queryset before passing to the paginator.

Note: See TracTickets for help on using tickets.
Back to Top