Opened 8 years ago

Closed 8 years ago

#27159 closed Bug (fixed)

Pickling query with an __in=inner_qs lookup causes evaluation evaluation of inner_qs

Reported by: Jani Tiainen Owned by: Jani Tiainen
Component: Database layer (models, ORM) Version: 1.10
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

According to documentation pickling queryset.query should be correct approach and it should only pickle only relevant information to recreate queryset.query again.

Given models:

from django.db import models

# Create your models here.
class ParentModel(models.Model):
    name = models.CharField(max_length=64)

class ChildModel(models.Model):
    name = models.CharField(max_length=64)

    parent = models.ForeignKey('ParentModel', related_name='children')

Following querys, when pickled causes unwanted evaluation and creating quite large pickled result:

for x in range(1,10000):
    ParentModel(name='Parent {}'.format(x), ).save()

ChildModel(name='Child 1', parent=ParentModel.objects.all()[0]).save()

parents_1 = ParentModel.objects.all().values_list('pk', flat=True)
children_1 = ChildModel.objects.filter(parent__in=parents_1)

pickled_stuff_1 = pickle.dumps(children_1.query)

parents_2 = ParentModel.objects.all()
children_2 = ChildModel.objects.filter(parent__in=parents_2)

pickled_stuff_2 = pickle.dumps(children_2.query)

# First len is about 74 kilobytes, second len is about 2 megabytes.
print len(pickled_stuff_1), len(pickled_stuff_2)

When comparing sizes of pickled queries both are relatively big. When inspecting latter pickle it can be seen that it actually contains all instances from fully evaluated queryset. Same behavior exists in 1.8 and 1.10 as well.

Attachments (1)

ticket_27159.diff (2.4 KB ) - added by DavidFozo 8 years ago.

Download all attachments as: .zip

Change History (13)

comment:1 by Tim Graham, 8 years ago

Component: UncategorizedDatabase layer (models, ORM)
Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

I didn't attempt to reproduce but I trust the reporter. If you could transform the reproduction information into a test case for queryset_pickle, that's always helpful.

comment:2 by Jani Tiainen, 8 years ago

I added crude, initial test to queryset_pickle to https://github.com/jtiai/django/tree/issue_27159

Interesting enough, if you check pickled output it's clearly visible that it contains all instances from Parent-model.

If you compare query output as a string both are equal.

comment:3 by Jani Tiainen, 8 years ago

After doing some debugging it looks like pickle goes to WhereNode which in turn goes to RelatedIn query, which goes to QuerySet.__getstate__ which, by design populates full cache.

comment:4 by DavidFozo, 8 years ago

Owner: changed from nobody to DavidFozo
Status: newassigned

by DavidFozo, 8 years ago

Attachment: ticket_27159.diff added

comment:5 by DavidFozo, 8 years ago

I deleted self._fetch_all() from django/db/models/query.py QuerySet.__getstate__ , so now it doesn't populate the cache before pickling. Is it a valid solution? All tests are passing.

comment:6 by Jani Tiainen, 8 years ago

Has patch: set
Last edited 8 years ago by Tim Graham (previous) (diff)

comment:7 by Tim Graham, 8 years ago

Patch needs improvement: set

I left some comments for improvement on the PR.

comment:8 by Tim Graham, 8 years ago

Patch needs improvement: unset

comment:9 by Jani Tiainen, 8 years ago

Owner: changed from DavidFozo to Jani Tiainen
Summary: Pickling query from queryset causes unintended queryset evaluationPickling query with an __in=inner_qs lookup causes evaluation evaluation of inner_qs

comment:10 by Simon Charette, 8 years ago

Patch needs improvement: set

Left some comments on the PR.

comment:11 by Jani Tiainen, 8 years ago

Patch needs improvement: unset

Made all suggested changes and improved test cases.

comment:12 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: assignedclosed

In 7a2c271:

Fixed #27159 -- Prevented pickling a query with an in=inner_qs lookup from evaluating inner_qs.

Note: See TracTickets for help on using tickets.
Back to Top