Opened 4 weeks ago

Closed 3 weeks ago

Last modified 3 weeks ago

#36833 closed Bug (needsinfo)

HttpRequest.accepted_types incorrectly splits Accept header on commas inside quoted parameter values

Reported by: Naveed Qadir
Owned by: Naveed Qadir
Component: HTTP handling
Version: dev
Severity: Normal
Keywords: HTTP_ACCEPT, accept
Cc: Naveed Qadir
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

The accepted_types property in HttpRequest uses str.split(",") to parse the Accept header, which incorrectly splits on commas that appear inside quoted parameter values.

Example

# Accept header with quoted parameter containing comma
header = 'text/plain; param="a,b", application/json'

# Current behavior (WRONG):
header.split(",")
# Returns: ['text/plain; param="a', 'b"', ' application/json']
# 3 parts - comma inside quotes was incorrectly treated as separator

# Expected behavior (per RFC 7231):
# Should return 2 media types:
# 1. text/plain; param="a,b"
# 2. application/json

RFC Reference

RFC 7231 Section 5.3.2 specifies that media-type parameters can contain quoted-string values, and RFC 7230 Section 3.2.6 allows commas within quoted strings.

Proposed Fix

Add a split_header_words() helper function to django/utils/http.py that splits on commas while respecting quoted strings, similar to how _parseparam() handles semicolons.

A patch with tests is available.
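
For illustration, a quote-aware splitter along these lines might look like the sketch below. This is not the code from the attached patch: the function body, its handling of backslash escapes, and the choice to drop empty elements are all assumptions made for the example. Only the name split_header_words() comes from the proposal above.

```python
def split_header_words(header):
    """Split a header value on commas that fall outside quoted strings.

    Hypothetical sketch only; not the implementation from the patch.
    """
    parts = []
    current = []
    in_quotes = False
    escaped = False
    for char in header:
        if escaped:
            # The character after a backslash in a quoted string is literal.
            current.append(char)
            escaped = False
        elif in_quotes and char == "\\":
            current.append(char)
            escaped = True
        elif char == '"':
            # Entering or leaving a quoted string.
            in_quotes = not in_quotes
            current.append(char)
        elif char == "," and not in_quotes:
            # A comma outside quotes separates list elements.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    parts.append("".join(current).strip())
    # Assumption: empty elements (e.g. from a trailing comma) are dropped.
    return [part for part in parts if part]


header = 'text/plain; param="a,b", application/json'
media_types = split_header_words(header)
# media_types == ['text/plain; param="a,b"', 'application/json']
```

The sketch walks the string one character at a time, so it is slower than str.split(","), which may matter for the performance concern raised in the comments.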

Change History (2)

comment:1 by Jacob Walls, 3 weeks ago

Keywords: HTTP_ACCEPT accept added
Resolution: needsinfo
Status: assigned → closed

Do you have an example of real-world HTTP traffic that sends params like that for the accept header?

Looking at the provided patch, this is too much complexity for the benefit. I'd also expect to block this on a resolution for #35440, with the hope that we can leverage some existing pattern for param parsing using python's stdlib.

in reply to:  1 comment:2 by Naveed Qadir, 3 weeks ago

Thanks for the feedback and for pointing me to #35440 — I agree that reusing a stdlib-based approach would be preferable if we can do so without performance regressions.
Regarding real-world usage: I’m not aware of common browsers or clients emitting Accept headers with quoted parameters containing commas or escaped quotes. The change was motivated by spec-permitted behavior and to avoid incorrect parsing when such headers do appear, but I agree this is rare in practice.
Given the complexity concerns and the direction of #35440, I’m happy to defer this and continue the discussion there.
