Opened 7 years ago

Last modified 7 years ago

#28121 closed Bug

force_text incorrectly handles SafeBytes under PY3 — at Version 7

Reported by: Thomas Achtemichuk
Owned by: nobody
Component: Utilities
Version: 1.11
Severity: Normal
Keywords:
Cc: tom@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Thomas Achtemichuk)

Under Python 3 with Django 1.8.18, 1.9.13, 1.10.7, 1.11, and master, calling force_text on an instance of SafeBytes returns a plain str rather than an instance of SafeText.

>>> from django.utils.safestring import SafeBytes, SafeText
>>> from django.utils.encoding import force_text
>>> type(force_text(SafeText('')))
django.utils.safestring.SafeText
>>> type(force_text(SafeBytes(b'')))
str

As a result, a byte string passed through mark_safe and rendered in a template is escaped even though it was marked safe.

>>> from django.template import Template, Context
>>> from django.utils.safestring import mark_safe
>>> Template('{{ x }}').render(Context({'x': mark_safe(b'&')}))
'&amp;'
>>> Template('{{ x }}').render(Context({'x': mark_safe('&')}))
'&'
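
The escaping described above can be illustrated without Django. This is a minimal sketch under stated assumptions: SafeText here is a stand-in class and conditional_escape_sketch is a hypothetical helper, neither is Django's actual code. It relies only on the `__html__` convention, where a value that exposes `__html__` is treated as already safe and anything else is escaped.

```python
import html


class SafeText(str):
    """Stand-in for a 'safe' text type (hypothetical, not Django's class)."""

    def __html__(self):
        # Signals that the value is already safe for HTML output.
        return self


def conditional_escape_sketch(value):
    """Hypothetical helper: escape only values that are not marked safe."""
    if hasattr(value, '__html__'):
        return value.__html__()
    return html.escape(value)


marked = conditional_escape_sketch(SafeText('&'))
unmarked = conditional_escape_sketch('&')
print(marked)    # the safe value passes through unchanged
print(unmarked)  # the plain str is escaped
```

Under these assumptions, a value that loses its safe type on the way to the escaping step is treated like any other string, which matches the symptom reported here.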

Edit: This behavior differs from the same code run under PY2:

>>> type(force_text(SafeBytes(b'&')))
django.utils.safestring.SafeText

And disagrees with the comment in force_text:

            # Note: We use .decode() here, instead of six.text_type(s, encoding,
            # errors), so that if s is a SafeBytes, it ends up being a
            # SafeText at the end.
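
The distinction that comment draws can be reproduced in plain Python 3. The following is a minimal sketch, assuming stand-in classes: the SafeText and SafeBytes defined here are hypothetical simplifications, not Django's implementations. It shows that building a str directly from a bytes subclass bypasses any overridden decode() method, while calling .decode() lets the subclass return a marked type.

```python
class SafeText(str):
    """Stand-in for a 'safe' text type (hypothetical, not Django's class)."""


class SafeBytes(bytes):
    """Stand-in for a 'safe' byte string type (hypothetical)."""

    def decode(self, *args, **kwargs):
        # Overridden so that decoding keeps the "safe" marker.
        return SafeText(super().decode(*args, **kwargs))


raw = SafeBytes(b'&')

# The str constructor decodes the underlying buffer and does not
# call SafeBytes.decode, so the marker is lost.
via_constructor = str(raw, 'utf-8', 'strict')

# Calling .decode() goes through the override and keeps the marker.
via_decode = raw.decode('utf-8', 'strict')

print(type(via_constructor).__name__)
print(type(via_decode).__name__)
```

Whether force_text takes the constructor path for bytes on Python 3 is not confirmed by this sketch. It only shows that the two approaches can differ in the way the comment describes.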

Change History (11)

comment:1 by Tim Graham, 7 years ago

Could you give a use case where the current behavior causes a problem? Is it certain that the text version of an arbitrary bytestring is also safe?

by Thomas Achtemichuk, 7 years ago

Attachment: 28121_1_8.patch added

Test and patch for 1.8

by Thomas Achtemichuk, 7 years ago

Attachment: 28121_1_10.patch added

Test and patch for 1.10

by Thomas Achtemichuk, 7 years ago

Attachment: 28121_1_11.patch added

Test and patch for 1.11

by Thomas Achtemichuk, 7 years ago

Attachment: 28121_master.patch added

Test and patch for master

comment:2 by Thomas Achtemichuk, 7 years ago

Added some patches against various stable branches and master. Not sure of the process for submitting PRs - is one per branch OK?

I also see that SafeBytes has been deprecated for internal use in 2.0, so it is perhaps best to ignore the patch against master.

comment:3 by Tim Graham, 7 years ago

Resolution: wontfix
Status: new → closed

Based on the supported versions policy, the patch doesn't seem to qualify for a backport to the stable branches, so closing as wontfix since the issue isn't really applicable on master which supports Python 3 only.

comment:4 by Thomas Achtemichuk, 7 years ago

Tim,

This came up when bootstrapping an SPA's template with the output of DRF's JSONRenderer, which produces UTF-8 encoded JSON. Something like the following:

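from django.shortcuts import render
from django.utils.safestring import mark_safe
from rest_framework.renderers import JSONRenderer

def app_home(request):
    # SomeSerializer is a placeholder; JSONRenderer().render() returns UTF-8 encoded bytes.
    return render(
        request,
        'app_base.html',
        {'init_data': mark_safe(JSONRenderer().render(SomeSerializer.data))}
    )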

We're preparing to switch over to Python 3, and this bug has led to a fairly extensive audit of everywhere we use mark_safe and pass values into templates.
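
One way such an audit might sidestep the difference is to decode rendered bytes to text before marking them safe. This is a minimal sketch, assuming the payload is UTF-8: ensure_text is a hypothetical helper, and json.dumps(...).encode() stands in for the renderer output so the example runs without Django or DRF.

```python
import json


def ensure_text(payload, encoding='utf-8'):
    """Hypothetical helper: return payload as str, decoding bytes first."""
    if isinstance(payload, (bytes, bytearray)):
        return bytes(payload).decode(encoding)
    return payload


# Stand-in for renderer output: UTF-8 encoded JSON bytes.
rendered = json.dumps({'name': 'A & B'}).encode('utf-8')

# Decode first; the resulting str is what would then be passed to mark_safe.
init_data = ensure_text(rendered)
print(type(init_data).__name__)
print(init_data)
```

With this approach the value handed to mark_safe is already text, so the result no longer depends on how force_text treats byte strings.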

"Is it certain that the text version of an arbitrary bytestring is also safe?"

If it isn't, then the way that force_text has behaved under PY2 for the last 5+ years should be examined:

>>> type(force_text(SafeBytes(b'&')))
django.utils.safestring.SafeText

comment:5 by Thomas Achtemichuk, 7 years ago

Resolution: wontfix (removed)
Status: closed → new

Tim,

Reopening as I didn't make clear in my initial report that the behavior differs between PY3:

>>> type(force_text(SafeBytes(b'&')))
str

and PY2:

>>> type(force_text(SafeBytes(b'&')))
django.utils.safestring.SafeText

If this behavior is incorrect under PY2, let me know and I'll open another ticket to address it. But it definitely seems that one of the two behaviors above is incorrect.

comment:6 by Thomas Achtemichuk, 7 years ago

Also, there is this fairly explicit comment in force_text (added 10 years ago) that makes me believe that the behavior under PY3 is wrong:

            # Note: We use .decode() here, instead of six.text_type(s, encoding,
            # errors), so that if s is a SafeBytes, it ends up being a
            # SafeText at the end.
Last edited 7 years ago by Thomas Achtemichuk

comment:7 by Thomas Achtemichuk, 7 years ago

Description: modified
Version: master → 1.11