Opened 9 years ago

Closed 8 years ago

Last modified 7 years ago

#12197 closed (invalid)

parse_accept_lang_header parses HTTP_ACCEPT_LANGUAGE with wrong quality factors

Reported by:…
Owned by: nobody
Component: Internationalization
Version: 1.1
Severity:
Keywords:
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description (last modified by Adrian Holovaty)

According to RFC 2616, HTTP_ACCEPT_LANGUAGE can list several language options, such as

da, en-gb;q=0.8, en;q=0.7

Django parses such a string with the parse_accept_lang_header function and orders the options by their 'q' quality factor.

But re.split seems to return a list with an empty string as its first element, such as

s = 'da, en-gb;q=0.8, en;q=0.7'


['', 'da', None, '', 'en-gb', '0.8', '', 'en', '0.7', '']

But parse_accept_lang_header reads the list from index 0, which means each value is aligned to the wrong index.
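
The split result quoted above can be reproduced with a short sketch. The pattern below is an assumption: it approximates the accept_language_re that Django 1.1 used and is not copied from this ticket. Grouping the pieces in threes shows how each language entry lines up with the empty strings around it.

```python
import re

# Assumption: an approximation of Django 1.1's accept_language_re,
# written out here for illustration only.
accept_language_re = re.compile(r'''
    ([A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*|\*)      # language tag such as "en", "en-gb", or "*"
    (?:;q=(0|0\.[0-9]{1,3}|1(?:\.0{1,3})?))?   # optional quality factor such as ";q=0.8"
    (?:\s*,\s*|$)                              # separator between entries, or end of string
    ''', re.VERBOSE)

s = 'da, en-gb;q=0.8, en;q=0.7'

# re.split returns the text between matches plus every captured group.
pieces = accept_language_re.split(s)
print(pieces)

# Group the pieces in threes; the final element is the text after the last match.
triples = [tuple(pieces[i:i + 3]) for i in range(0, len(pieces) - 1, 3)]
print(triples)
```

Each triple has the form (text before the match, language, quality), so the language and its quality factor sit at offsets 1 and 2 of every group of three.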

To fix the issue, you could modify accept_language_re, or read the returned list starting from index 1.

def parse_accept_lang_header(lang_string):
    """
    Parses the lang_string, which is the body of an HTTP Accept-Language
    header, and returns a list of (lang, q-value), ordered by 'q' values.

    Any format errors in lang_string results in an empty list being returned.
    """
    result = []
    pieces = accept_language_re.split(lang_string)
    if pieces[-1]:
        return []
    for i in range(0, len(pieces) - 1, 3):
        first, lang, priority = pieces[i : i + 3]
        if first:
            return []
        priority = priority and float(priority) or 1.0
        result.append((lang, priority))
    result.sort(lambda x, y: -cmp(x[1], y[1]))
    return result

Change History (4)

comment:1 Changed 9 years ago by anonymous

comment:2 Changed 8 years ago by Adrian Holovaty

Description: modified (diff)

comment:3 Changed 8 years ago by Ramiro Morales

Has patch: unset
Resolution: invalid
Status: new → closed

There is a test case in the Django test suite that tests a header value almost identical to the one you posted, and it demonstrates that the algorithm is working as expected:

By design, splitting the header value with the RE yields three consecutive items for each language specification element. The first item of each group (the first variable) is used to detect incorrectly formatted values: if it is not empty the header is rejected, and otherwise it is discarded.
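
The stride-of-three walk can be sketched in isolation as follows. The helper name pairs_from_pieces is hypothetical, and it takes an already split list as input so that no regular expression is needed. The "good" list is the one quoted in the description; the "bad" list is a hand-built example of what a split might leave behind when stray text sits between two entries.

```python
def pairs_from_pieces(pieces):
    """Turn a split result into (lang, q) pairs, highest q first.

    Hypothetical helper for illustration. Returns [] as soon as any
    leftover text shows that the header was malformed.
    """
    # Text after the last match means the header did not parse cleanly.
    if pieces[-1]:
        return []
    result = []
    for i in range(0, len(pieces) - 1, 3):
        first, lang, priority = pieces[i:i + 3]
        # 'first' is the text between matches; it must be empty.
        if first:
            return []
        # A missing q-value defaults to 1.0.
        result.append((lang, float(priority) if priority else 1.0))
    # The sort is stable, so entries with equal q keep their header order.
    result.sort(key=lambda pair: pair[1], reverse=True)
    return result


good = ['', 'da', None, '', 'en-gb', '0.8', '', 'en', '0.7', '']
bad = ['', 'da', None, '!!, ', 'en', None, '']  # hypothetical malformed split

print(pairs_from_pieces(good))
print(pairs_from_pieces(bad))
```

With the list from the description, 'da' is paired with the default 1.0, 'en-gb' with 0.8, and 'en' with 0.7, which suggests that starting at index 0 does not misalign the values.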

So I don't know whether your report stems from an issue you are seeing in real-world usage or from a theoretical analysis. If it's the former, please post more details. In the meantime, I'm marking this ticket as invalid.

comment:4 Changed 7 years ago by Jacob

milestone: 1.2

Milestone 1.2 deleted
