Opened 2 years ago

Closed 22 months ago

Last modified 22 months ago

#20530 closed Bug (fixed)

Incorrect QUERY_STRING handling on Python 3

Reported by: mitsuhiko
Owned by: aaugustin
Component: Core (URLs)
Version: 1.5
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Certain browsers (IE, cough) do not fully percent-encode the path in the request in all situations, so non-ASCII characters can show up in the request line. QueryDict currently does not handle that properly. It also means that the WSGI QUERY_STRING variable needs to be handled the same way as PATH_INFO and SCRIPT_NAME.

Here is what is necessary to handle the case properly:

  1. the environ['QUERY_STRING'] value needs to go through the PEP 3333 dance on Python 3 (re-encoding the native string as ISO-8859-1) to get back a bytes object
  2. unquoting happens on the bytes
  3. finally everything is decoded using the intended encoding (UTF-8)
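
The three steps above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the code that went into Django: the helper name decode_query_string is hypothetical, and splitting on "&" and "=" and treating "+" as a space are simplifications made for this example.

```python
from urllib.parse import unquote_to_bytes


def decode_query_string(environ, encoding="utf-8"):
    """Hypothetical helper: return (name, value) pairs decoded from QUERY_STRING."""
    # Step 1: PEP 3333 exposes the raw bytes as a str decoded with ISO-8859-1,
    # so re-encoding with ISO-8859-1 recovers the original bytes.
    raw = environ.get("QUERY_STRING", "").encode("iso-8859-1")

    pairs = []
    for chunk in raw.split(b"&"):
        if not chunk:
            continue
        name, _, value = chunk.partition(b"=")
        # Step 2: unquote while still working on bytes.
        name = unquote_to_bytes(name.replace(b"+", b" "))
        value = unquote_to_bytes(value.replace(b"+", b" "))
        # Step 3: only now decode to text with the intended encoding.
        pairs.append((name.decode(encoding, "replace"),
                      value.decode(encoding, "replace")))
    return pairs


# "caf\u00c3\u00a9" is how a WSGI server would expose the raw UTF-8 bytes of "café".
example_environ = {"QUERY_STRING": "q=caf\u00c3\u00a9&x=%C3%A9+1"}
example_pairs = decode_query_string(example_environ)
```

Here the first parameter arrives unencoded (the case described in this ticket) and the second arrives percent-encoded; both should come out as proper text because decoding happens last.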

The logic currently employed by QueryDict in combination with the WSGIRequest object is doubly wrong:

  1. the WSGIRequest object is not properly doing the dance and passes a (potentially mangled) unicode string to QueryDict
  2. QueryDict decodes that incorrectly formatted unicode string (WSGI on 3.x intentionally incorrectly encodes information), causing invalid data to show up in request.GET
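
To make the mangling concrete, here is a small sketch of what a PEP 3333 server hands to the application on Python 3 and how the value looks before and after the dance. The function names wsgi_str and undo_wsgi_str are made up for this illustration; they are not Django or WSGI APIs.

```python
def wsgi_str(raw: bytes) -> str:
    """Mimic how a PEP 3333 server exposes raw request bytes on Python 3."""
    return raw.decode("iso-8859-1")


def undo_wsgi_str(value: str, encoding: str = "utf-8") -> str:
    """Reverse the ISO-8859-1 step, then decode with the intended encoding."""
    return value.encode("iso-8859-1").decode(encoding, "replace")


raw = "q=\u00e9".encode("utf-8")   # the browser sent an unencoded "é" as UTF-8
mangled = wsgi_str(raw)            # what environ['QUERY_STRING'] would contain
repaired = undo_wsgi_str(mangled)  # what the application actually wants

print(repr(mangled))   # two code points where one character was sent
print(repr(repaired))
```

Using the mangled string directly as text is the failure described in the two points above: each non-ASCII byte shows up as its own ISO-8859-1 character.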

Independently of that, if bytes are passed to QueryDict, it does not decode them properly unless the bytes are a subset of ASCII.

Change History (7)

comment:1 Changed 2 years ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Owner changed from nobody to aaugustin
  • Patch needs improvement unset
  • Status changed from new to assigned
  • Triage Stage changed from Unreviewed to Accepted

Thanks for the report. I'll take care of that.

comment:3 Changed 22 months ago by Aymeric Augustin <aymeric.augustin@…>

In 7bb627936034c1b9500a8d250cce75b30f980b23:

Fixed an encoding issue in the test client.

Fixed
comment_tests.tests.test_comment_view.CommentViewTests.testCommentPostRedirectWithInvalidIntegerPK.

Refs #20530.

comment:4 Changed 22 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 65b6eff322a4a3331601e111934dee95c090961c:

Fixed #20530 -- Properly decoded non-ASCII query strings on Python 3.

Thanks mitsuhiko for the report.

This commit just adds a test since the problem was fixed in 8aaca651.

comment:5 Changed 22 months ago by Aymeric Augustin <aymeric.augustin@…>

In 9244447cc4a91c22f8f2668f9667e92a1b2de958:

[1.6.x] Fixed an encoding issue in the test client.

Refs #20530.

Backport of 7bb62793 and 476b0764 from master.

Conflicts:

django/test/client.py

comment:6 Changed 22 months ago by Aymeric Augustin <aymeric.augustin@…>

In 7fcd6aa6695b39370154d6993cdbb3ba4363de91:

[1.6.x] Fixed #20530 -- Properly decoded non-ASCII query strings on Python 3.

Thanks mitsuhiko for the report.

Backport of 65b6eff3 and adaptation of 8aaca65 from master.

comment:7 Changed 22 months ago by Aymeric Augustin <aymeric.augustin@…>

In 63b95ca452ea7ef1103e599f8dd733b67278c8dc:

[1.6.x] Fixed 9244447c -- incomplete backport.

The test client had been refactored in the meantime. This commit
de-factors the fix. Refs #20530.
