Opened 2 years ago

Closed 2 years ago

#33567 closed Bug (fixed)

Builtin csrf_failure() view uses wrong charset

Reported by: MrVichr Owned by: Claude Paroz
Component: CSRF Version: 4.0
Severity: Normal Keywords: csrf
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

When Django detects wrong CSRF token, it shows an error using view django.views.csrf.csrf_failure. That file ends with

    return HttpResponseForbidden(t.render(c), content_type="text/html;")

When the template (CSRF_FAILURE_TEMPLATE_NAME) is written using UTF-8, it is rendered incorrectly. I suggest changing that line to

    return HttpResponseForbidden(t.render(c), content_type="text/html;"+
                                              f" charset={settings.DEFAULT_CHARSET};")

or perhaps leaving out the content_type entirely.

Currently I'm using a workaround, by adding

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

to the template's HEAD, but it seems to me that the suggested fix is a better solution.

Change History (9)

comment:1 by Carlton Gibson, 2 years ago

Resolution: needsinfo
Status: newclosed

Hi. Can you provide a working example of the issue please, and exactly what you mean by " it is rendered incorrectly".

Creating a test project with a UTF8 403_csrf.html works as expected, so I'm guessing there's something else going on.

Thanks.

comment:2 by Claude Paroz, 2 years ago

I think I'd rather omit content_type when it is text/html (several other locations are affected, too). I'll try a patch to see what's the outcome in tests.

comment:3 by Claude Paroz, 2 years ago

Here's the PR.

comment:4 by Carlton Gibson, 2 years ago

Hey Claude,

Yes, I took it out in that one place, with no effect.

But, if there is an error, it would be good to see how that could come up.
(It should be a no-op right? Maybe some browser default charset if it's not specified issue... 🤔)

Last edited 2 years ago by Carlton Gibson (previous) (diff)

comment:5 by Claude Paroz, 2 years ago

When the charset is missing from a response, the browser has to guess (either using a default, a user preference, or sniffing the content). We should avoid that. Read https://www.w3.org/International/articles/http-charset/index.en

comment:6 by Carlton Gibson, 2 years ago

Has patch: set
Resolution: needsinfo
Status: closednew
Triage Stage: UnreviewedAccepted

Good link!

It is very important to always label Web documents explicitly. HTTP 1.1 says that the default charset is ISO-8859-1. ...

So we can presume the doc is interpreted as ISO-8859-1, rather than UTF-8, and so an error (although exact example...)

>>> r = HttpResponse("Hello 🎉", content_type="text/html")
>>> r.charset
'utf-8'
>>> r.serialize_headers()
b'Content-Type: text/html'

>>> r = HttpResponse("Hello 🎉")
>>> r.serialize_headers()
b'Content-Type: text/html; charset=utf-8'

Claude's, of removing the half-specified content_type automatically picks up MrVichr's idea to add the explicit charset.

comment:7 by Carlton Gibson, 2 years ago

Owner: changed from nobody to Claude Paroz
Status: newassigned

comment:8 by Carlton Gibson, 2 years ago

Triage Stage: AcceptedReady for checkin

comment:9 by Carlton Gibson <carlton@…>, 2 years ago

Resolution: fixed
Status: assignedclosed

In 93803a1:

Fixed #33567 -- Avoided setting default text/html content type on responses.

Note: See TracTickets for help on using tickets.
Back to Top