Opened 3 weeks ago
Last modified 3 weeks ago
#36913 assigned Cleanup/optimization
Optimise ChoiceField / MultipleChoiceField handling of duplicate submissions
| Reported by: | Jake Howard | Owned by: | Sarah Boyce |
|---|---|---|---|
| Component: | Forms | Version: | 6.0 |
| Severity: | Normal | Keywords: | |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
When a ChoiceField / MultipleChoiceField has 5 possible choices but the form submits 25 values, the choices are compared once per submitted value. If the submitted values are duplicates, validation does not skip the repeats, so it can spend a lot of time re-validating values it has already checked. This can be very slow when large numbers (~30k) of values are submitted.
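To make the cost concrete, here is a simplified model of the behaviour described above. It is not Django's code: `count_choice_comparisons` is a hypothetical helper that only counts how many comparisons happen when each submitted value is checked against the choices in turn.

```python
def count_choice_comparisons(choices, submitted):
    """Count comparisons when each submitted value is scanned against the choices."""
    comparisons = 0
    for value in submitted:
        for choice in choices:
            comparisons += 1
            if value == choice:
                break  # this value is valid; move on to the next submitted value
    return comparisons


choices = ["a", "b", "c", "d", "e"]
submitted = ["e"] * 25  # 25 copies of the last choice

print(count_choice_comparisons(choices, submitted))       # 125
print(count_choice_comparisons(choices, set(submitted)))  # 5
```

In this model the work grows with the number of submitted values times the number of choices, even though only one distinct value was submitted.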
A suggested fix is to only validate the unique submitted values (for example for val in set(value)).
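A minimal sketch of that idea, written outside Django so it runs on its own. The names `validate_submitted` and `is_valid` are hypothetical, and `ValueError` stands in for the form's validation error. It uses `dict.fromkeys()` rather than `set()` on the assumption that keeping first-seen order is desirable, so that the first invalid value reported is the same one a plain loop would report. Both approaches assume the submitted values are hashable.

```python
def validate_submitted(valid_value, submitted):
    """Check each distinct submitted value once and return the list unchanged."""
    # dict.fromkeys() drops duplicates while preserving first-seen order.
    for value in dict.fromkeys(submitted):
        if not valid_value(value):
            raise ValueError(f"{value!r} is not one of the available choices.")
    return submitted


checked = []  # records every value that was actually validated


def is_valid(value):
    checked.append(value)
    return value in {"a", "b", "c", "d", "e"}


submitted = ["a"] * 30_000 + ["b"]
result = validate_submitted(is_valid, submitted)
print(len(result), checked)  # 30001 ['a', 'b']
```

The submitted list itself is returned untouched, duplicates included; only the validation loop is deduplicated.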
This issue was reported to the Security Team, but deemed not a security issue due to the minimal impact when given reasonable input (in the bounds of the security policy).
Change History (5)
comment:1 by , 3 weeks ago
comment:2 by , 3 weeks ago
| Triage Stage: | Unreviewed → Accepted |
|---|
follow-up: 5 comment:3 by , 3 weeks ago
I’d like to work on this issue.
From what I understand, the slowdown happens because duplicate submitted values are validated again and again. I’m thinking of validating only the unique values internally while keeping the original list unchanged so behaviour stays the same. I’ll add some tests too to make sure duplicates and invalid values are still handled correctly.
If this sounds reasonable and no one is working on this, I would like to assign the issue to myself.
comment:4 by , 3 weeks ago
I’d like to work on this issue.
From what I understand, the slowdown happens because duplicate submitted values are validated again and again. I’m thinking of validating only the unique values internally while keeping the original list unchanged so behaviour stays the same. I’ll add some tests too to make sure duplicates and invalid values are still handled correctly.
If this sounds reasonable and no one else is working on it, I would like to assign the issue to myself.
comment:5 by , 3 weeks ago
Replying to Abhimanyu Singh Negi:
> I’d like to work on this issue.
> From what I understand, the slowdown happens because duplicate submitted values are validated again and again. I’m thinking of validating only the unique values internally while keeping the original list unchanged so behaviour stays the same. I’ll add some tests too to make sure duplicates and invalid values are still handled correctly.
> If this sounds reasonable and no one is working on this, I would like to assign the issue to myself.
This ticket has already been assigned and has been reserved for evaluation under the Djangonaut Space umbrella. Sorry!
Note that validating that submissions (and choices) are unique is being handled in a separate feature request: https://github.com/django/new-features/issues/121