Opened 11 months ago

Closed 8 months ago

Last modified 8 months ago

#20530 closed Bug (fixed)

Incorrect QUERY_STRING handling on Python 3

Reported by: mitsuhiko
Owned by: aaugustin
Component: Core (URLs)
Version: 1.5
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Certain browsers (IE, cough) do not fully percent-encode the URL in the request in all situations. As a result, you will encounter non-ASCII characters in the request line. Currently QueryDict does not handle that properly. It also means that the WSGI QUERY_STRING variable needs to be handled the same way as PATH_INFO and SCRIPT_NAME.

Here is what is necessary to handle the case properly:

  1. the environ['QUERY_STRING'] value needs to go through the PEP 3333 dance on Python 3 (re-encoding the str as ISO-8859-1), which yields a bytes object
  2. unquoting happens on the bytes
  3. finally everything is decoded using the intended encoding (UTF-8)
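The three steps above can be sketched with the standard library alone. This is an illustration, not Django's actual fix: the function name `decode_query_string` and the choice of `errors="replace"` are assumptions made for the example.

```python
from urllib.parse import unquote_to_bytes


def decode_query_string(environ, encoding="utf-8"):
    """Hypothetical helper: return (name, value) pairs as text."""
    # Step 1: the PEP 3333 dance. WSGI on Python 3 hands over a str whose
    # code points are really bytes, so ISO-8859-1 recovers them exactly.
    raw = environ.get("QUERY_STRING", "").encode("iso-8859-1")

    pairs = []
    for chunk in raw.split(b"&"):
        if not chunk:
            continue
        name, _, value = chunk.partition(b"=")
        # Step 2: unquote while still working on bytes.
        name = unquote_to_bytes(name.replace(b"+", b" "))
        value = unquote_to_bytes(value.replace(b"+", b" "))
        # Step 3: decode to the intended encoding (assumed UTF-8 here).
        pairs.append((name.decode(encoding, "replace"),
                      value.decode(encoding, "replace")))
    return pairs
```

The same function handles both a properly percent-encoded query string and one where the browser sent raw UTF-8 bytes, because unquoting and decoding both happen after the bytes have been recovered.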

The logic currently employed by QueryDict in combination with the WSGIRequest object is doubly wrong:

  1. the WSGIRequest object does not do the dance properly and passes a (potentially mangled) unicode string to QueryDict
  2. QueryDict decodes that incorrectly formatted unicode string (WSGI on 3.x intentionally decodes the raw bytes as ISO-8859-1), causing invalid data to show up in request.GET
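To make the mangling concrete, here is a small sketch of what happens to a non-ASCII value on its way through a WSGI server on Python 3, and how re-encoding undoes it. The variable names are illustrative only.

```python
# Bytes the browser actually sent for the value "café" (unencoded UTF-8).
sent = "caf\u00e9".encode("utf-8")          # b'caf\xc3\xa9'

# PEP 3333 requires the server to expose them as a str decoded as ISO-8859-1.
wsgi_value = sent.decode("iso-8859-1")      # 'cafÃ©' (mojibake)

# Treating wsgi_value as real text is the bug; reversing the decode
# recovers the original bytes, which can then be decoded as UTF-8.
recovered = wsgi_value.encode("iso-8859-1").decode("utf-8")
```

ISO-8859-1 maps every byte to exactly one code point, so the round trip is lossless for any input.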

Independently of that, if bytes are passed to QueryDict, it does not decode them properly unless they contain only ASCII.

Attachments (0)

Change History (7)

comment:1 Changed 11 months ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Owner changed from nobody to aaugustin
  • Patch needs improvement unset
  • Status changed from new to assigned
  • Triage Stage changed from Unreviewed to Accepted

Thanks for the report. I'll take care of that.

comment:3 Changed 8 months ago by Aymeric Augustin <aymeric.augustin@…>

In 7bb627936034c1b9500a8d250cce75b30f980b23:

Fixed an encoding issue in the test client.

Fixed
comment_tests.tests.test_comment_view.CommentViewTests.testCommentPostRedirectWithInvalidIntegerPK.

Refs #20530.

comment:4 Changed 8 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 65b6eff322a4a3331601e111934dee95c090961c:

Fixed #20530 -- Properly decoded non-ASCII query strings on Python 3.

Thanks mitsuhiko for the report.

This commit just adds a test since the problem was fixed in 8aaca651.

comment:5 Changed 8 months ago by Aymeric Augustin <aymeric.augustin@…>

In 9244447cc4a91c22f8f2668f9667e92a1b2de958:

[1.6.x] Fixed an encoding issue in the test client.

Refs #20530.

Backport of 7bb62793 and 476b0764 from master.

Conflicts:

django/test/client.py

comment:6 Changed 8 months ago by Aymeric Augustin <aymeric.augustin@…>

In 7fcd6aa6695b39370154d6993cdbb3ba4363de91:

[1.6.x] Fixed #20530 -- Properly decoded non-ASCII query strings on Python 3.

Thanks mitsuhiko for the report.

Backport of 65b6eff3 and adaptation of 8aaca65 from master.

comment:7 Changed 8 months ago by Aymeric Augustin <aymeric.augustin@…>

In 63b95ca452ea7ef1103e599f8dd733b67278c8dc:

[1.6.x] Fixed 9244447c -- incomplete backport.

The test client had been refactored in the meantime. This commit
de-factors the fix. Refs #20530.
