﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
36483	IntegerField will accept non-ASCII digits, which leads to the same page appearing at many URLs	Morgan Wahl		"Hello,

I was recently surprised to find that a simple detail view URL with a model ID in it was also accessible at a URL using ""fullwidth"" digit characters. For example, the page at ""/pizza/123"" could also be returned from ""/pizza/１２３"", where the digits are the Unicode characters U+FF11 U+FF12 U+FF13. It turns out this is ultimately because the model `IntegerField` uses `int` to get an integer from the string that was originally in the URL, and Python's `int` constructor uses `unicodedata.decimal` (or some equivalent) to translate the characters in a string to decimal digits.
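
A minimal sketch of the behavior described above, using only the Python standard library. The variable name `fullwidth` is just for illustration, and the mixed-script string is an extra case that does not come from the original URL:

```python
import unicodedata

# The characters U+FF11 U+FF12 U+FF13 from the example URL.
fullwidth = '\uff11\uff12\uff13'

# int() accepts any Unicode decimal digit, not only ASCII 0-9.
assert int(fullwidth) == 123

# unicodedata.decimal() reports the digit value of a single character.
assert [unicodedata.decimal(ch) for ch in fullwidth] == [1, 2, 3]

# Digits from different scripts can be mixed within one string:
# ASCII 1, fullwidth 2 (U+FF12), Arabic-Indic 3 (U+0663).
assert int('1\uff12\u0663') == 123
```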

That was a cool accidental feature to discover, but now I'm concerned about URL canonicalization. Python 3.13.3 accepts _68_ different characters for each digit. This means the same content is hypothetically accessible from many, many URLs. I've heard that this can make a site look spammy to search engines. It could also be an element of a security hole if something assumes there is only one URL for a given page.
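
One way to check the per-digit count on a given interpreter is to scan every code point, assuming that `int` accepts exactly the characters for which `unicodedata.decimal` returns a value. The helper name `decimal_digit_counts` is made up for this sketch, and the result depends on the Unicode database version bundled with the Python build:

```python
import sys
import unicodedata
from collections import Counter


def decimal_digit_counts():
    # Count how many code points map to each decimal digit value 0-9.
    counts = Counter()
    for code_point in range(sys.maxunicode + 1):
        # The second argument is returned for characters with no decimal value.
        value = unicodedata.decimal(chr(code_point), None)
        if value is not None:
            counts[value] += 1
    return counts


counts = decimal_digit_counts()
print('Unicode database version:', unicodedata.unidata_version)
for digit in range(10):
    print(digit, counts[digit])
```

With k characters per digit, an n-digit ID has k**n spellings, which is why even short IDs end up reachable from a very large number of URLs.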

The SEO problem could be addressed by setting a `<link rel=canonical>` in the page that points to `Pizza.objects.get(pk=id).get_absolute_url()` or some similar logic. Alternatively, the problem as a whole could be addressed by setting up redirects or 404 responses. However, all of those approaches require a separate implementation for every view, since the view code ultimately doesn't know which parts of the URL are going to be treated as values of an `IntegerField`.

Possible solutions I can think of are either:

1. Provide a mechanism to easily canonicalize URLs, by allowing users to mark this situation explicitly in the URLconf, so that Django can set a property on the request object with the ""canonicalized"" URL. Then redirects, 404s, or `<link>` tags could be implemented just once for all such URLs. (Redirects and 404s in a middleware, `<link>` tags in a base template.)
2. Don't just pass strings to `int` in the model `IntegerField`. Instead only allow strings with ASCII digits to be used.
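
To illustrate option 2, here is a rough sketch of an ASCII-only conversion. `parse_ascii_int` and `ASCII_INT` are hypothetical names for this example, not existing Django API, and where such a check would live inside `IntegerField` is left open:

```python
import re

# Explicit ASCII range. On str patterns, \d would also match non-ASCII
# decimal digits unless the re.ASCII flag is passed.
ASCII_INT = re.compile(r'-?[0-9]+')


def parse_ascii_int(value):
    '''Hypothetical helper: like int(), but rejects non-ASCII digits in strings.'''
    if isinstance(value, str) and not ASCII_INT.fullmatch(value):
        raise ValueError(f'not an ASCII integer: {value!r}')
    return int(value)


assert parse_ascii_int('123') == 123
assert parse_ascii_int(123) == 123

try:
    parse_ascii_int('\uff11\uff12\uff13')  # fullwidth 123
except ValueError:
    print('rejected fullwidth digits')
```

Note that this sketch is stricter than `int` in other ways too: it rejects surrounding whitespace, a leading `+`, and underscore separators, all of which `int` accepts. Whether those should also be refused is a separate decision.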
"	Bug	new	Uncategorized	5.2	Normal			Morgan Wahl	Unreviewed	0	0	0	0	0	0
