Opened 2 months ago

Closed 2 months ago

#34067 closed Bug (needsinfo)

django.core.Paginator wrong query slicing

Reported by: Hristo Trendafilov Owned by: nobody
Component: Core (Other) Version: 3.2
Severity: Normal Keywords: Paginator, slice, queryset
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Hristo Trendafilov)

A have a ListView defined like so:

class AllPracticesListView(LoginRequiredMixin, PermissionRequiredMixin, ListView, ToolBoxFunctions):
    template_name = 'practice_list.html'
    model = Practice
    paginate_by = 6
    permission_required = 'customauth.access_practices'

    def get_queryset(self):
        qs = self.get_all_practices_by_user().order_by(
            'company_branch__pk', 'practice_profile__personal_profile__last_name').prefetch_related('practice_profile')
        
        # method < get_all_practices_by_user > is a toolbox method that returns:  
        # Practice.objects.filter(company_branch__owner=self.request.user).order_by('pk')

        return qs

Models are defined like so:

      
 class Practice(models.Model):           
                company_branch = models.ForeignKey(
                         CompanyBranch,
                         on_delete=models.PROTECT,
                         related_name='practices',
                    )                
                # CompanyBranch model has an owner field which ForeignKey to the user model       
                
                practice_profile = models.ForeignKey(
                    PracticeProfile,
                    on_delete=models.PROTECT,
                    related_name='practices',
                )
      
class PracticeProfile(models.Model):               
                    personal_profile = models.ForeignKey(
                        PersonalProfile,
                        on_delete=models.PROTECT,
                        blank=True,
                        null=True,
                    )
                    # PersonalProfile model has a field called < last_name > 

Or schematically:

Practice object relations: 
      -> CompanyBranch 
      -> PracticeProfile 
          -> PersonalProfile /*NULLABLE RELATION/
       

When the code is run, the view does not display results correctly.

I did like so to get the WRONG result:
Added a breakpoint on Paginator._get_page return statement

Queryset is passed okay and is available in
Paginator.object_list

But in the args of Paginator._get_page the queryset is totally different than expected / should be a slice from the first six elements of Paginator.object_list /

I have tried to add a breakpoint like so https://imgur.com/SdaQUt6
Then I get this result https://imgur.com/5zzLNV0.

As you can see, PKs 1632, 1624, etc. are missing, objects are different/marked red and purple/, and are actually removed and never shown in the view.

If you run slice manually at this point - everything is again fine https://imgur.com/yvn8KMO
This happens only on NULL PersonalProfile objects.

I did like so to get OKAY results:

  1. Added a simple breakpoint is added on Paginator.page like so https://imgur.com/d7J5BMZ and that breakpoint is just run then the result is okay https://imgur.com/ILDHAMN
  1. changed paginate_by to 10
  1. changed order_by in get_queryset to -pk
  1. call again the page = paginator.page(page_number)

It is Django fault for those reasons:

  • order_by might be complicated but is supported by the framework
  • Chained order_by is also supported
  • queryset slicing is a feature of Django queryset API.
  • queryset slicing should act like list and string slicing and namely to provide you with results [ from : to ] indexes provided and not to change or lose data in between. Or in other words - what is passed in the Paginator.object_list should be just sliced [ from : to ]. Paginator should never care for the queryset, what is the ordering of the queryset, what is excluded and so on - it should just work with it.
  • changing paginate_by to 10 or greater fixes the problem. paginate_by is a standard and documented Django ListView setting.
  • changing order_by to some other order_by also fixes the problem. order_by should work the same or if any limitations those should be well documented
  • removing order_by also fixes the problem.

P.S. That is also happening in Django 2.2

Attachments (2)

images.zip (189.1 KB) - added by Hristo Trendafilov 2 months ago.
Some of the images
images2.zip (254.6 KB) - added by Hristo Trendafilov 2 months ago.
Other images

Download all attachments as: .zip

Change History (11)

Changed 2 months ago by Hristo Trendafilov

Attachment: images.zip added

Some of the images

Changed 2 months ago by Hristo Trendafilov

Attachment: images2.zip added

Other images

comment:1 Changed 2 months ago by Hristo Trendafilov

Description: modified (diff)

comment:2 Changed 2 months ago by Mariusz Felisiak

Resolution: needsinfo
Status: newclosed

Thanks for the report, however Paginator works for me. I'm also not sure why you're using _get_page() which is a private undocumented API. I don't think you've explained the issue in enough detail to confirm a bug in Django. Please reopen the ticket if you can debug your issue and provide details about why and where Django is at fault. If you're having trouble in debugging, see TicketClosingReasons/UseSupportChannels for ways to get help.

comment:3 in reply to:  2 Changed 2 months ago by Hristo Trendafilov

Description: modified (diff)

Replying to Mariusz Felisiak:

Thanks for the report, however Paginator works for me. I'm also not sure why you're using _get_page() which is a private undocumented API. I don't think you've explained the issue in enough detail to confirm a bug in Django. Please reopen the ticket if you can debug your issue and provide details about why and where Django is at fault. If you're having trouble in debugging, see TicketClosingReasons/UseSupportChannels for ways to get help.

comment:4 in reply to:  2 Changed 2 months ago by Hristo Trendafilov

Description: modified (diff)

Replying to Mariusz Felisiak:

Thanks for the report, however Paginator works for me. I'm also not sure why you're using _get_page() which is a private undocumented API. I don't think you've explained the issue in enough detail to confirm a bug in Django. Please reopen the ticket if you can debug your issue and provide details about why and where Django is at fault. If you're having trouble in debugging, see TicketClosingReasons/UseSupportChannels for ways to get help.

I don't use _get_page(), I have just added a breakpoint inside it so that I could trace what is populated inside of it since it is what generates the Page object after all. It is clear that args that Page object is called with are different than expected.
What more details do you need me to provide?
I could debug and replicate that issue every time.
It is clearly visible from the screenshot that I have provided that slicing gives a different result than expected and is also skipping data which is happening in a production environment.

comment:5 Changed 2 months ago by Tim Graham

You should explain why Django is at fault. I'm guessing that your complicated query might not be returning results in a consistent order. There's nothing Django can do about that.

comment:6 Changed 2 months ago by Hristo Trendafilov

Description: modified (diff)

comment:7 in reply to:  6 Changed 2 months ago by Hristo Trendafilov

Replying to Hristo Trendafilov:

Just added that in the description. At the end of the day, it is expected Paginator just slice whatever is provided, not to change or lose data in between. Changing paginate_by leads to fix and that is a feature of Django.

comment:8 Changed 2 months ago by Hristo Trendafilov

Resolution: needsinfo
Status: closednew

comment:9 in reply to:  description Changed 2 months ago by Mariusz Felisiak

Resolution: needsinfo
Status: newclosed

It is Django fault for those reasons:

  • order_by might be complicated but is supported by the framework
  • Chained order_by is also supported

Yes, but you should be aware what you are using in the ORDER BY clause and that some fields may cause issues, e.g. multi-valued fields (see notes in docs). Django cannot decide for you.

  • queryset slicing is a feature of Django queryset API.
  • queryset slicing should act like list and string slicing and namely to provide you with results [ from : to ] indexes provided and not to change or lose data in between. Or in other words - what is passed in the Paginator.object_list should be just sliced [ from : to ]. Paginator should never care for the queryset, what is the ordering of the queryset, what is excluded and so on - it should just work with it.

This would require fetching all results and slicing on the Python side not in the database which is very ineffective and would cause many performance issues.

  • changing paginate_by to 10 or greater fixes the problem. paginate_by is a standard and documented Django ListView setting.
  • changing order_by to some other order_by also fixes the problem. order_by should work the same or if any limitations those should be well documented

Limitations are mostly documented, but we cannot document all caveats.

P.S. That is also happening in Django 2.2

Django 2.2 is not supported anymore, and Django 3.2 is in extended support so it receives only security patches.

I understand that you spend a lot of time trying to debug your issue, but we don't have your app, your data, and your database, so we're unable to help. Also, Trac is not a support channel.

I'd like to highly recommend using one of support channel where folks will help you. If you'll still be convinced that Django is at fault, then please attach a sample minimal project that reproduces the issue.

Note: See TracTickets for help on using tickets.
Back to Top