Opened 15 months ago

Last modified 4 months ago

#22841 new New feature

ModelChoiceField does not make it easy to reuse querysets

Reported by: mjtamlyn Owned by: nobody
Component: Forms Version: master
Severity: Normal Keywords:
Cc: loic84, robinchew@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

ModelChoiceField is designed to aggressively ensure that the queryset is always copied and executed anew each time. This is generally a good idea, but it has performance implications, especially where formsets of modelforms are concerned. The situation is not restricted to formsets however, there are other use cases where you may already have the executed queryset you need for the form within that request/response cycle.

Here's a simple example of a view and form with the API I might like to see.

# views.py

def book_create(request):
    categories = Category.objects.all()
    if request.method == 'POST':
        form = BookForm(data=request.POST, categories=categories)
        if form.is_valid():
            form.save()
            return HttpResponseRedirect(reverse('book_list'))
    else:
        form = BookForm(categories=categories)
    context = {
        'categories': categories,
        'form': form,
    }
    return render('book_form.html', context)

# forms.py

class BookForm(forms.ModelForm):
    class Meta:
        model = Book
        fields = ['name', 'category']

    def __init__(self, categories, **kwargs):
        super(BookForm, self).__init__(**kwargs)
        self.fields['category'].evaluated_queryset = categories

So we have a view to create a book, but that view has the list of categories in the context as it also includes a by-category navigation in a sidebar. As a result, in order to render the view we currently have to execute Category.objects.all() twice - once to render the navigation and once for the form.

I have introduced a new proposed API to the ModelChoiceField (form.fields['category'] in the example), currently called evaluated_queryset. This will be used by the ModelChoiceIterator *without* calling .all() on it, allowing the same queryset cache to be used twice within the view.


The current "best approach" for doing this that I've found looks as follows:

class BookForm(forms.ModelForm):
    # ...
    def __init__(self, categories, **kwargs):
        super(BookForm, self).__init__(**kwargs)
        iterator = ModelChoiceIterator(self.fields['category'])
        choices = [iterator.choice(obj) for obj in categories]
        self.fields['category'].choices = choices

Whilst this is functional, it is not a particularly nice API. If we are happy with it as the correct pattern, we should document it, but at present ModelChoiceIterator is not documented, and it probably shouldn't be.


Possible more general improvements which become possible with a feature like this:

  • Automatic sharing of querysets between identical forms in a formset
  • Similarly, if the queryset has been executed then we can check inside it instead of doing the additional .get() query on data validation. This has a small performance gain on write in certain circumstances - in particular where you have a formset with 10+ fields, loading the full queryset once will be more efficient than doing 10 .get() queries.
  • Inlines and list editable in the admin could use this feature for significant performance improvements

Change History (5)

comment:1 Changed 15 months ago by bmispelon

  • Triage Stage changed from Unreviewed to Accepted

+1

comment:2 Changed 15 months ago by loic84

  • Cc loic84 added

+1.

I'm not sure if we document that form instances should be short-lived (i.e. the span of a request), but if we don't we should and fixing this ticket would make it a requirement.

comment:3 Changed 14 months ago by kunitoki

i encounter this in every app i'm coding that has lot of foreign keys. looking at the produced sql queries there are tons of equals queries just to populate the same identical dropdowns. should definately be addressed as it is a huge performance bottleneck !

+1

comment:4 Changed 14 months ago by tomek

+1

I see this is quite naive solution, but couldn't we just eliminate all() like that?

class ModelChoiceIterator(object):
    def __iter__(self):
         if self.field.empty_label is not None:
             yield ("", self.field.empty_label)
         if self.field.cache_choices:
             if self.field.choice_cache is None:
                self.field.choice_cache = [
                    self.choice(obj) for obj in self.queryset.all()
                ]
            for choice in self.field.choice_cache:
                yield choice
        else:
            for obj in self.queryset.all():
                yield self.choice(obj)
class ModelChoiceIterator(object):
    def __iter__(self):
         if self.field.empty_label is not None:
             yield ("", self.field.empty_label)
         if self.field.cache_choices:
             if self.field.choice_cache is None:
                self.field.choice_cache = [
                    self.choice(obj) for obj in self.queryset.all()
                ]
            for choice in self.field.choice_cache:
                yield choice
        else:
            try:
                for obj in self.queryset:
                    yield self.choice(obj)
            except TypeError:
                for obj in self.queryset.all():
                    yield self.choice(obj)

As I said, naive... so I'll be glad if someone points out why not this way.

comment:5 Changed 4 months ago by robin

  • Cc robinchew@… added
Note: See TracTickets for help on using tickets.
Back to Top