Opened 7 years ago

Closed 3 months ago

#22841 closed New feature (wontfix)

ModelChoiceField does not make it easy to reuse querysets

Reported by: Marc Tamlyn Owned by:
Component: Forms Version: master
Severity: Normal Keywords:
Cc: loic84, robinchew@…, mattdentremont@…, Carlton Gibson Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

ModelChoiceField is designed to aggressively ensure that the queryset is always copied and executed anew each time. This is generally a good idea, but it has performance implications, especially where formsets of modelforms are concerned. The situation is not restricted to formsets however, there are other use cases where you may already have the executed queryset you need for the form within that request/response cycle.

Here's a simple example of a view and form with the API I might like to see.

# views.py

def book_create(request):
    categories = Category.objects.all()
    if request.method == 'POST':
        form = BookForm(data=request.POST, categories=categories)
        if form.is_valid():
            form.save()
            return HttpResponseRedirect(reverse('book_list'))
    else:
        form = BookForm(categories=categories)
    context = {
        'categories': categories,
        'form': form,
    }
    return render('book_form.html', context)

# forms.py

class BookForm(forms.ModelForm):
    class Meta:
        model = Book
        fields = ['name', 'category']

    def __init__(self, categories, **kwargs):
        super(BookForm, self).__init__(**kwargs)
        self.fields['category'].evaluated_queryset = categories

So we have a view to create a book, but that view has the list of categories in the context as it also includes a by-category navigation in a sidebar. As a result, in order to render the view we currently have to execute Category.objects.all() twice - once to render the navigation and once for the form.

I have introduced a new proposed API to the ModelChoiceField (form.fields['category'] in the example), currently called evaluated_queryset. This will be used by the ModelChoiceIterator *without* calling .all() on it, allowing the same queryset cache to be used twice within the view.


The current "best approach" for doing this that I've found looks as follows:

class BookForm(forms.ModelForm):
    # ...
    def __init__(self, categories, **kwargs):
        super(BookForm, self).__init__(**kwargs)
        iterator = ModelChoiceIterator(self.fields['category'])
        choices = [iterator.choice(obj) for obj in categories]
        self.fields['category'].choices = choices

Whilst this is functional, it is not a particularly nice API. If we are happy with it as the correct pattern, we should document it, but at present ModelChoiceIterator is not documented, and it probably shouldn't be.


Possible more general improvements which become possible with a feature like this:

  • Automatic sharing of querysets between identical forms in a formset
  • Similarly, if the queryset has been executed then we can check inside it instead of doing the additional .get() query on data validation. This has a small performance gain on write in certain circumstances - in particular where you have a formset with 10+ fields, loading the full queryset once will be more efficient than doing 10 .get() queries.
  • Inlines and list editable in the admin could use this feature for significant performance improvements

Change History (10)

comment:1 Changed 7 years ago by Baptiste Mispelon

Triage Stage: UnreviewedAccepted

+1

comment:2 Changed 7 years ago by loic84

Cc: loic84 added

+1.

I'm not sure if we document that form instances should be short-lived (i.e. the span of a request), but if we don't we should and fixing this ticket would make it a requirement.

comment:3 Changed 7 years ago by Lucio Asnaghi

i encounter this in every app i'm coding that has lot of foreign keys. looking at the produced sql queries there are tons of equals queries just to populate the same identical dropdowns. should definately be addressed as it is a huge performance bottleneck !

+1

comment:4 Changed 7 years ago by tomek

+1

I see this is quite naive solution, but couldn't we just eliminate all() like that?

class ModelChoiceIterator(object):
    def __iter__(self):
         if self.field.empty_label is not None:
             yield ("", self.field.empty_label)
         if self.field.cache_choices:
             if self.field.choice_cache is None:
                self.field.choice_cache = [
                    self.choice(obj) for obj in self.queryset.all()
                ]
            for choice in self.field.choice_cache:
                yield choice
        else:
            for obj in self.queryset.all():
                yield self.choice(obj)
class ModelChoiceIterator(object):
    def __iter__(self):
         if self.field.empty_label is not None:
             yield ("", self.field.empty_label)
         if self.field.cache_choices:
             if self.field.choice_cache is None:
                self.field.choice_cache = [
                    self.choice(obj) for obj in self.queryset.all()
                ]
            for choice in self.field.choice_cache:
                yield choice
        else:
            try:
                for obj in self.queryset:
                    yield self.choice(obj)
            except TypeError:
                for obj in self.queryset.all():
                    yield self.choice(obj)

As I said, naive... so I'll be glad if someone points out why not this way.

comment:5 Changed 6 years ago by Robin

Cc: robinchew@… added

comment:6 Changed 5 years ago by Matt d'Entremont

Cc: mattdentremont@… added

comment:7 Changed 4 years ago by Michael Vasiliou

Owner: changed from nobody to Michael Vasiliou
Status: newassigned

comment:8 Changed 11 months ago by Carlton Gibson

Cc: Carlton Gibson added
Owner: Michael Vasiliou deleted
Status: assignednew

comment:9 Changed 3 months ago by powderflask

+1
I run into this a fair bit with ModelFormsets that include ModelChoiceFields. Work arounds are clunky and largely undocumented. Thanks to anyone who has a clever idea.

comment:10 Changed 3 months ago by Carlton Gibson

Resolution: wontfix
Status: newclosed

I gave a talk on this exact topic for DjangoCon Europe 2020. It's only 20 minutes. I show examples for a DRF serializer form, a regular form with django-filter and a FormSet via the admin using list_editable

The video is here: Choose and Choose Quickly: Optimizing ModelChoiceField

  • We can't drop the all() calls, since they're explicitly there to avoid reusing cached data. (From the talk: "There's only one thing worse that slow data, and that's wrong data.")
  • I don't think an additional evaluated_queryset attribute is necessary because (at any place you have that code in hand) setting choices directly is no more work (certainly not enough to merit extra API)
  • Setting choices is exactly the intended API here. There's not an exact optimization pattern documented but, choices has been ≈forever, and ModelChoiceIterator has been documented since 5da85ea73724d75e609c5ee4316e7e5be8f17810.

TBH I don't think one needs to bring ModelChoiceIterator into play. The example I use in the talk is just:

author_choices = list(forms.ModelChoiceField(Author.objects.all()).choices)

Because I'm doing this outside of the form context, in order to provide to the form later:

self.fields["author"].choices = author_choices

The example in the description, purely inside BookForm.__init__() could well just be:

self.fields['category'].choices  = list(self.fields['category'].choices)

You evaluate once, setting the output for reuse later.

Note: See TracTickets for help on using tickets.
Back to Top