unhelpful queryset handling for model formsets with data
|Reported by:||Jim Bailey||Owned by:|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||yes||Patch needs improvement:||yes|
When a model formset is constructed with data there are two related issues with the queryset handling.
The first is a performance issue: if a queryset is not specified, the _object_dict cache gets filled with every model instance in the database, even if the data only deals with a subset of the instances.
The second problem can happen when a queryset is specified, and can lead to unexpected duplication of records in the database. If the specified queryset does not cover all the instances specified in the data, then the mismatched instances are assumed to be new, their pk is ignored, and they are written as new instances on save.
For example, a user has their pagination set to 2 items per page. There are the following Authors in the system:
They load the second page of authors sorted by name so the formset has details for (3, 'e') and (4, 'g').
Another user then adds another author (5, 'b').
The first user then saves the page without changes. The model formset is constructed with the data (3, 'e') and (4, 'g'), but also with a queryset for the second page of authors sorted by name, which now contains (2, 'c') and (3, 'e').
The record with pk 3 is found and updated as expected. The record with pk 4 is not found in the queryset, so a new Author instance is created: (6, 'g').
There are now two authors with the name 'g'.
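The sequence above can be reproduced with a rough plain-Python simulation that does not use Django at all. The helper names `page` and `save_formset` are made up for illustration, and the name 'a' for the author with pk 1 is an assumed placeholder, since only the other authors are named in this report. The `save_formset` function only mimics the behaviour described here; it is not the actual formset code.

```python
# Plain-Python simulation of the scenario; no Django involved.
# `page` and `save_formset` are hypothetical helpers, not Django APIs.

def page(authors, number, per_page=2):
    """Return the pks on the given 1-based page, ordered by author name."""
    ordered = sorted(authors, key=lambda pk: authors[pk])
    start = (number - 1) * per_page
    return ordered[start:start + per_page]


def save_formset(authors, submitted, queryset_pks):
    """Mimic the reported behaviour: a submitted pk that is missing from
    the queryset is ignored and the row is written as a new instance."""
    for pk, name in submitted:
        if pk in queryset_pks:
            authors[pk] = name                 # found: update in place
        else:
            authors[max(authors) + 1] = name   # not found: insert as new


# The name for pk 1 is an assumption; the others come from the report.
authors = {1: "a", 2: "c", 3: "e", 4: "g"}

# First user loads page 2: (3, 'e') and (4, 'g').
submitted = [(pk, authors[pk]) for pk in page(authors, 2)]

# Another user adds (5, 'b'), which shifts the page boundaries.
authors[5] = "b"

# First user saves; the queryset for page 2 is now pks 2 and 3.
save_formset(authors, submitted, page(authors, 2))

duplicates = sorted(pk for pk, name in authors.items() if name == "g")
print(duplicates)
```

Running this leaves two entries named 'g', one with the original pk 4 and one with the newly assigned pk 6.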
I think the solution to both issues is the same. When a model formset is constructed with data, the only records that should be loaded from the database and operated on are those specified by the pks in the data. The queryset used to pull in those records can and should be constructed automatically from the data. Any queryset specified in the constructor should be ignored.
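As a minimal sketch of that idea, the pks could be collected from the submitted data before any database access. The helper `submitted_pks` below is hypothetical and is not the attached patch; it assumes the default formset prefix "form" and a primary key field named "id", so that the keys look like "form-0-id".

```python
import re


def submitted_pks(data, prefix="form", pk_name="id"):
    """Collect the non-empty pk values found in formset-style POST data.

    Hypothetical helper: assumes keys of the form "<prefix>-<index>-<pk_name>".
    """
    pattern = re.compile(
        r"^%s-\d+-%s$" % (re.escape(prefix), re.escape(pk_name))
    )
    # Extra (new) forms submit an empty pk, so those are skipped.
    return [value for key, value in data.items()
            if pattern.match(key) and value]


data = {
    "form-TOTAL_FORMS": "3",
    "form-INITIAL_FORMS": "2",
    "form-0-id": "3", "form-0-name": "e",
    "form-1-id": "4", "form-1-name": "g",
    "form-2-id": "", "form-2-name": "",   # blank extra form
}

pks = submitted_pks(data)
print(pks)

# With Django available, the queryset could then be restricted to these
# rows, for example: Author.objects.filter(pk__in=pks)
```

With a queryset limited this way, only the two submitted authors would be loaded, and pk 4 would be matched regardless of which page it currently falls on.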