Opened 9 years ago
Last modified 9 years ago
#28121 closed Bug
force_text incorrectly handles SafeBytes under PY3 — at Version 7
| Reported by: | Thomas Achtemichuk | Owned by: | nobody |
|---|---|---|---|
| Component: | Utilities | Version: | 1.11 |
| Severity: | Normal | Keywords: | |
| Cc: | tom@… | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Under Python 3 with Django 1.8.18, 1.9.13, 1.10.7, 1.11 and master, calling force_text on an instance of SafeBytes returns a str rather than an instance of SafeText.
>>> from django.utils.safestring import SafeBytes, SafeText
>>> from django.utils.encoding import force_text
>>> type(force_text(SafeText('')))
django.utils.safestring.SafeText
>>> type(force_text(SafeBytes(b'')))
str
This causes byte strings run through mark_safe and rendered in a template to be incorrectly escaped.
>>> from django.template import Template, Context
>>> from django.utils.safestring import mark_safe
>>> Template('{{ x }}').render(Context({'x': mark_safe(b'&')}))
'&amp;'
>>> Template('{{ x }}').render(Context({'x': mark_safe('&')}))
'&'
Edit: This behavior differs from the same code run under PY2:
>>> type(force_text(SafeBytes(b'&')))
django.utils.safestring.SafeText
And disagrees with the comment in force_text:
# Note: We use .decode() here, instead of six.text_type(s, encoding,
# errors), so that if s is a SafeBytes, it ends up being a
# SafeText at the end.
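One plausible reading of the mismatch, sketched without Django: the comment relies on the bytes object's own decode() method being called, and the behavior reported above is what would happen if the bytes were instead passed to the text constructor. MarkedText and MarkedBytes below are hypothetical stand-ins for SafeText and SafeBytes, not Django's classes, and the sketch does not reproduce force_text itself.

```python
class MarkedText(str):
    """Hypothetical stand-in for SafeText: a str subclass used as a marker."""


class MarkedBytes(bytes):
    """Hypothetical stand-in for SafeBytes: decode() preserves the marker."""

    def decode(self, *args, **kwargs):
        return MarkedText(super().decode(*args, **kwargs))


value = MarkedBytes(b"&")

# Calling the method goes through the override and keeps the marker.
via_decode = value.decode("utf-8", "strict")

# The str constructor decodes the underlying bytes directly and does not
# call the overridden decode(), so the marker is lost.
via_constructor = str(value, "utf-8", "strict")

print(type(via_decode).__name__)
print(type(via_constructor).__name__)
```

If force_text takes the constructor route for bytes on Python 3, the result is a plain str, which the template engine would then escape as usual.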
Change History (11)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
Added some patches against various stable branches and master. Not sure of the process for submitting PRs - is one per branch OK?
I also see that SafeBytes has been deprecated for internal use in 2.0, so it is perhaps best to just ignore the patch against master.
comment:3 by , 9 years ago
| Resolution: | → wontfix |
|---|---|
| Status: | new → closed |
Based on the supported versions policy, the patch doesn't seem to qualify for a backport to the stable branches, so closing as wontfix since the issue isn't really applicable on master which supports Python 3 only.
comment:4 by , 9 years ago
Tim,
This came up when bootstrapping a SPA's template with the output of DRF's JSONRenderer which produces utf-8 encoded json. Something like the following:
def app_home(request):
    return render(
        request,
        'app_base.html',
        {'init_data': mark_safe(JSONRenderer().render(SomeSerializer.data))}
    )
We're preparing to switch over to Python 3, and this bug has led to a fairly extensive audit of everywhere we use mark_safe and pass values into templates.
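A possible way to sidestep the difference in a view like the one above is to decode the rendered bytes to text before marking them safe, so that no bytes value reaches the template. The sketch below is only an illustration: render_json_bytes is a hypothetical stand-in for a renderer that returns UTF-8 encoded JSON, and as_text is a hypothetical helper, neither of which is part of Django or DRF.

```python
import json


def render_json_bytes(data):
    """Hypothetical stand-in for a renderer returning UTF-8 encoded JSON."""
    return json.dumps(data).encode("utf-8")


def as_text(value, encoding="utf-8"):
    """Decode bytes to str; return str values unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


payload = as_text(render_json_bytes({"name": "A & B"}))

# payload is a str at this point, so in a Django view it could be wrapped
# with mark_safe(payload) before being placed in the template context.
print(type(payload).__name__, payload)
```

Whether the decoded text deserves to be marked safe at all is a separate question, and this sketch does not answer it.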
Is it certain that the text version of an arbitrary bytestring is also safe?
If it isn't, then the way that force_text has behaved under PY2 for the last 5+ years should be examined:
>>> type(force_text(SafeBytes(b'&')))
django.utils.safestring.SafeText
comment:5 by , 9 years ago
| Resolution: | wontfix |
|---|---|
| Status: | closed → new |
Tim,
Reopening as I didn't make clear in my initial report that the behavior differs between PY3:
>>> type(force_text(SafeBytes(b'&')))
str
and PY2:
>>> type(force_text(SafeBytes(b'&')))
django.utils.safestring.SafeText
If this behavior is incorrect under PY2, let me know and I'll open another ticket to address it. But it definitely seems one of the above is incorrect.
comment:6 by , 9 years ago
Also, there is this fairly explicit comment in force_text (added 10 years ago) that makes me believe that the behavior under PY3 is wrong:
# Note: We use .decode() here, instead of six.text_type(s, encoding,
# errors), so that if s is a SafeBytes, it ends up being a
# SafeText at the end.
comment:7 by , 9 years ago
| Description: | modified (diff) |
|---|---|
| Version: | master → 1.11 |
Could you give a use case where the current behavior causes a problem? Is it certain that the text version of an arbitrary bytestring is also safe?