Opened 6 weeks ago

Last modified 42 hours ago

#36913 assigned Cleanup/optimization

Optimise ChoiceField / MultipleChoiceField handling of duplicate submissions — at Version 9

Reported by: Jake Howard
Owned by: Afenomamy
Component: Forms
Version: 6.0
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Afenomamy)

When a ChoiceField / MultipleChoiceField has 5 possible choices but the form submits 25 values, the choices are compared once per submitted value. If the submitted values are duplicates, validation does not skip the repeats, so it spends time re-validating values it has already checked. This can be very slow when large numbers (~30k) of values are submitted.

A suggested fix is to validate only the unique submitted values (for example, for val in set(value)).
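To make the cost concrete, here is a minimal standalone sketch, not Django's actual code. All names (CHOICES, valid_value, validate_all, validate_unique) are hypothetical stand-ins, and it assumes submitted values are hashable strings. It uses dict.fromkeys(values) rather than set(value) so that distinct values are checked in first-seen order, which is an assumption made here to keep the reported invalid value deterministic, not something the ticket prescribes.

```python
# Standalone sketch: none of these names are Django APIs.
CHOICES = [("a", "A"), ("b", "B"), ("c", "C"), ("d", "D"), ("e", "E")]


def valid_value(value, choices, counter):
    """Linear scan over the choices, counting each comparison made."""
    for key, _label in choices:
        counter["comparisons"] += 1
        if value == key:
            return True
    return False


def validate_all(values, choices):
    """Check every submitted value, duplicates included."""
    counter = {"comparisons": 0}
    for val in values:
        if not valid_value(val, choices, counter):
            raise ValueError(f"{val!r} is not a valid choice")
    return counter["comparisons"]


def validate_unique(values, choices):
    """Check each distinct submitted value once, in first-seen order."""
    counter = {"comparisons": 0}
    for val in dict.fromkeys(values):
        if not valid_value(val, choices, counter):
            raise ValueError(f"{val!r} is not a valid choice")
    return counter["comparisons"]


# 30,000 copies of the last choice: the worst case for a linear scan.
submitted = ["e"] * 30_000
naive = validate_all(submitted, CHOICES)
deduped = validate_unique(submitted, CHOICES)
```

In this sketch the per-value work drops from one scan per submitted value to one scan per distinct value; building the deduplicated collection is still linear in the number of submitted values.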

This issue was reported to the Security Team, but deemed not a security issue due to the minimal impact when given reasonable input (in the bounds of the security policy).

Suggested PR: https://github.com/django/django/pull/20960

Change History (9)

comment:1 by Jake Howard, 6 weeks ago

Note that validation that submissions (and choices) are unique is being handled in a separate feature request: https://github.com/django/new-features/issues/121

comment:2 by Jacob Walls, 6 weeks ago

Triage Stage: Unreviewed → Accepted

comment:3 by Abhimanyu Singh Negi, 6 weeks ago

I’d like to work on this issue.

From what I understand the slowdown happens because duplicate submitted values are validated again and again. I’m thinking of validating only the unique values internally while keeping the original list unchanged so behaviour stays the same. I’ll add some tests too to make sure duplicates and invalid values are still handled correctly.

If this sounds reasonable and no one is working on this, I would like to assign the issue to myself.
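The kind of tests described above could look roughly like the following sketch. It is an illustration only: validate_choices is a hypothetical stand-in for the field's validation step, not Django's implementation, and ValueError stands in for the form validation error.

```python
import unittest


def validate_choices(values, choices):
    """Hypothetical stand-in: check each distinct submitted value once."""
    for val in dict.fromkeys(values):  # iterates a new dict, input untouched
        if not any(val == key for key, _label in choices):
            raise ValueError(f"{val!r} is not a valid choice")


class DuplicateSubmissionTests(unittest.TestCase):
    choices = [("1", "One"), ("2", "Two")]

    def test_duplicates_of_valid_values_pass(self):
        # Should not raise.
        validate_choices(["1", "1", "2", "1"], self.choices)

    def test_invalid_value_among_duplicates_is_rejected(self):
        with self.assertRaises(ValueError):
            validate_choices(["1", "1", "3", "1"], self.choices)

    def test_submitted_list_is_not_modified(self):
        submitted = ["2", "2", "1"]
        validate_choices(submitted, self.choices)
        self.assertEqual(submitted, ["2", "2", "1"])
```

The third test covers the "keeping the original list unchanged" point: deduplication happens only for the purpose of validation, so the submitted list keeps its duplicates and order.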

Last edited 6 weeks ago by Abhimanyu Singh Negi (previous) (diff)

comment:4 by Abhimanyu Singh Negi, 6 weeks ago

Last edited 6 weeks ago by Abhimanyu Singh Negi (previous) (diff)

in reply to:  3 comment:5 by Natalia Bidart, 6 weeks ago

Replying to Abhimanyu Singh Negi:

> I’d like to work on this issue.
>
> From what I understand the slowdown happens because duplicate submitted values are validated again and again. I’m thinking of validating only the unique values internally while keeping the original list unchanged so behaviour stays the same. I’ll add some tests too to make sure duplicates and invalid values are still handled correctly.
>
> If this sounds reasonable and no one is working on this, I would like to assign the issue to myself.

This ticket has been assigned already and has been reserved to evaluate under the Djangonaut Space umbrella. Sorry!

comment:6 by Afenomamy, 2 weeks ago

Hello, I am Aina (Team Saturn) from Djangonaut Space. I will be looking at this ticket.

comment:7 by Afenomamy, 2 weeks ago

Owner: changed from Sarah Boyce to Afenomamy

comment:8 by Afenomamy, 42 hours ago

Has patch: set

comment:9 by Afenomamy, 42 hours ago

Description: modified (diff)