Opened 7 years ago

Last modified 7 years ago

#28796 closed Cleanup/optimization

reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to the ouput — at Version 7

Reported by: Nick Owned by: nobody
Component: Documentation Version: 2.0
Severity: Normal Keywords:
Cc: nathompson7@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Tim Graham)

Suppose we have:

import hmac
from hashlib import sha384
from django.utils.http import urlsafe_base64_encode

hashed_msg = hmac.new(b"secret", msg=b"qwerty", digestmod=sha384)
url = urlsafe_base64_encode(hashed_msg.digest())

Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like objects could be used in reverse(), creating (say) localhost:8000/asdfwerqwefa

However, in Django 2.0, the url becomes (say)

localhost:8000/b'asdfwerqwefa'

which seems less 'urlsafe', as the quotes and the 'b' are added into the url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls now retain the quotes?

I have created a full project which reproduces the bug here:

https://github.com/NAThompson/urlsafe_bug

To reproduce the bug, type:

$ ./manage.py shell
>>> import reproduce

This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458, but much before that the 'django.urls.path' doesn't exist, so bisection doesn't work cleanly.

Change History (7)

comment:1 by Tim Graham, 7 years ago

Component: UncategorizedUtilities
Summary: urlsafe_base64_encode broken on Django 2.0django.utils.http.urlsafe_base64_encode() broken in Django 2.0
Type: UncategorizedBug

Could you please edit the description to include a complete code snippet that reproduces the problem?

Ideally, you could also bisect to find the commit where the behavior changed.

comment:2 by Claude Paroz, 7 years ago

I would be all for making urlsafe_base64_encode return a string, but it would be plainly backwards incompatible, as much existing code is already doing a decode() on the result of this function. See usage in django/contrib/auth/forms.py for example.

comment:3 by Nick, 7 years ago

Description: modified (diff)

comment:4 by Nick, 7 years ago

Description: modified (diff)

comment:5 by Tim Graham, 7 years ago

Component: UtilitiesCore (URLs)
Summary: django.utils.http.urlsafe_base64_encode() broken in Django 2.0reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to the ouput

I completed the bisect by replacing path(...) with url('(?P<token>.*)', ... in the sample project. The commit where the behavior changed is 301de774c21d055e9e5a7073e5bffdb52bc71079.

The change in behavior is in reverse(). Args and kwargs have str() applied rather than force_text() which adds the b prefix.
https://github.com/django/django/commit/301de774c21d055e9e5a7073e5bffdb52bc71079#diff-b2f90a810a553411112f412de2a26617R424

Do you consider that a bug, Claude?

comment:6 by Claude Paroz, 7 years ago

I would say that reverse is not supposed to handle arbitrary bytestrings, so generally speaking, it's developer's role to decode any bytestring to a string (e.g. force_text(b'caf\xe9') will still produce a DjangoUnicodeDecodeError).

Now I could understand that we privilege backwards compatibility and still accept ASCII bytesstrings by reintroducing force_text (at the expense of a slight overhead). It's some sort of design choice.

comment:7 by Tim Graham, 7 years ago

Description: modified (diff)

I think we could document the backwards incompatibility and point out the solution to add .decode(), which should also work without older versions of Django. Does it seem okay, Nick?

Note: See TracTickets for help on using tickets.
Back to Top