Opened 4 years ago
Last modified 2 years ago
#32244 closed Cleanup/optimization
ORM inefficiency: ModelFormSet executes a single-object SELECT query per formset instance when saving/validating — at Initial Version
Reported by: | Lushen Wu | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.1 |
Severity: | Normal | Keywords: | formsets |
Cc: | | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Conceptual summary of the issue:
Let's say we have a Django app with `Author` and `Book` models, and use a `BookFormSet` to add / modify / delete books that are created by a given `Author`. The problem is that when the `BookFormSet` is validated, `ModelChoiceField.to_python()` ends up calling `self.queryset.get(id=123)`, which results in a single-object SELECT query for each book in the formset. That means if I want to update 15 books, Django performs 15 separate SELECT queries, which seems incredibly inefficient. (Our actual app is an editor that can update any number of objects in a single formset, e.g. 50+.)
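To make the query count concrete, here is a rough sketch that uses only the standard-library `sqlite3` module, not Django. The `book` table, its columns, and the `submitted_ids` list are assumptions made up for illustration; the trace callback simply records every SELECT statement the connection runs, so the per-object pattern can be compared against a single `IN (...)` query.

```python
import sqlite3

# In-memory database with a hypothetical "book" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO book (id, author_id, title) VALUES (?, ?, ?)",
    [(i, 1, f"Book {i}") for i in range(1, 16)],
)

# Record every SELECT statement that the connection executes.
selects = []


def record_select(sql):
    if sql.lstrip().upper().startswith("SELECT"):
        selects.append(sql)


conn.set_trace_callback(record_select)

# Primary keys as they might arrive in a submitted formset (assumed values).
submitted_ids = list(range(1, 16))

# Pattern 1: one single-object SELECT per submitted form.
per_row = [
    conn.execute("SELECT id, title FROM book WHERE id = ?", (pk,)).fetchone()
    for pk in submitted_ids
]
per_row_queries = len(selects)

# Pattern 2: one bulk SELECT, then in-memory lookups keyed by primary key.
selects.clear()
placeholders = ",".join("?" * len(submitted_ids))
rows = conn.execute(
    f"SELECT id, title FROM book WHERE id IN ({placeholders})", submitted_ids
).fetchall()
by_id = {row[0]: row for row in rows}
bulk = [by_id[pk] for pk in submitted_ids]
bulk_queries = len(selects)

print(per_row_queries, bulk_queries)
```

Both patterns return the same rows; only the number of round trips to the database differs, and the gap grows linearly with the number of forms in the formset.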
My failed attempts to solve this:
- First I tried passing a queryset to the `BookFormSet`, i.e. `formset = BookFormSet(data=request.POST, queryset=Book.objects.filter(author=1))`, but the `ModelChoiceField` still does its single-object SELECT queries.
- Then I tried to see where the `ModelChoiceField` defines its queryset, which seems to be in `BaseModelFormSet.add_fields()`. I tried initializing the `ModelChoiceField` with the same queryset that I passed to the formset, e.g. `Book.objects.filter(author=1)`, instead of the original code, which would be `Book._default_manager.get_queryset()`. But this doesn't help, because I guess the new queryset I defined isn't actually linked to what was passed to the formset (and we don't have a cache running). So the multiple SELECT queries still happen. (Note: I realize `_default_manager.get_queryset` is necessary for use cases where you would want to switch one model instance for another one which might not be in the original queryset passed to the `BaseModelFormSet`, but this is not our use case.)
- I noticed that `BaseModelFormSet._existing_object()` provides a way to check whether an object exists in the queryset that was given to the formset constructor, which means that queryset is evaluated at most once and the results stored in `BaseModelFormSet._object_dict`. I thought there might be some way to have `ModelChoiceField.to_python()` do something similar before calling `self.queryset.get(id=123)`, but I don't think `ModelChoiceField` is aware of `BaseFormSet`, and it would seem an anti-pattern to reach up the hierarchy like this.
The easiest solution seems to me to pass `BaseModelFormSet._object_dict` in some way to each `ModelForm` that's created, and then allow the `ModelChoiceField` to check `_object_dict` before making another SELECT query.
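The idea could look roughly like the following pure-Python sketch. None of this is Django code: `FakeQuerySet` and `CachedChoiceField` are hypothetical stand-ins invented for illustration, and `query_count` only simulates the number of SELECTs that would be issued. The point is the lookup order: check a shared dict of already-fetched objects first, and fall back to a single-object `get()` only on a miss.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a queryset; counts simulated SELECT queries."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.query_count = 0

    def all(self):
        # Simulates one bulk SELECT returning (pk, object) pairs.
        self.query_count += 1
        return list(self._rows.items())

    def get(self, pk):
        # Simulates one single-object SELECT.
        self.query_count += 1
        return self._rows[pk]


class CachedChoiceField:
    """Hypothetical field that consults a shared object dict before querying."""

    def __init__(self, queryset, object_dict=None):
        self.queryset = queryset
        self.object_dict = object_dict if object_dict is not None else {}

    def to_python(self, pk):
        # Prefer the objects the formset has already fetched.
        if pk in self.object_dict:
            return self.object_dict[pk]
        # Fall back to a per-object lookup on a cache miss.
        return self.queryset.get(pk)


books = {pk: f"Book {pk}" for pk in range(1, 16)}

# With a shared dict: one bulk fetch, then no further queries.
queryset = FakeQuerySet(books)
object_dict = dict(queryset.all())
field = CachedChoiceField(queryset, object_dict)
cleaned = [field.to_python(pk) for pk in range(1, 16)]
cached_queries = queryset.query_count

# Without the dict: one simulated query per submitted value.
plain_queryset = FakeQuerySet(books)
plain_field = CachedChoiceField(plain_queryset)
plain_cleaned = [plain_field.to_python(pk) for pk in range(1, 16)]
uncached_queries = plain_queryset.query_count

print(cached_queries, uncached_queries)
```

Keeping the fallback to `get()` preserves the case where a submitted value is not in the queryset originally passed to the formset; how the dict would actually be handed to each form's field in Django is left open here.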