Any backslash-escaped character in a URL doesn't get unescaped by the {% url ...%} tag (and presumably by other methods of view reversing). For example,
urlpatterns = patterns('',
(r'^prices/less_than_\$(?P<price>\d+)/$', 'cost_less'),
(r'^headlines/(?P<year>\d+)\.(?P<month>\d+)\.(?P<day>\d+)/$', 'daily_headlines'),
(r'^priests/(?P<name>\w+)\+/$', 'priest_homepage'),
(r'^windows_path/(?P<drive_name>[A-Z]):\\\\(?P<path>.+)', 'windows_path'),
)
The dollar sign, dot, plus, and backslash in each of the URL patterns match a single character, but don't get converted back to that character by the reverse function.
It seems that there aren't that many of these. Any escape sequence that doesn't match a constant string (i.e. something like \s or \d or \w) had better be part of a pattern so that it can be replaced with the right string to get the URL you're expecting. That leaves the following, I think.
| Pattern | Replacement |
| \A | '' (equivalent to ^) |
| \Z | '' (equivalent to $) |
| \b and \B | '' (these shouldn't appear in URLs, but can only match the empty string) |
| \., \^, \$, \*, \+, \?, \(, \), \{, \}, \[, \], and \\ | the same character, without a backslash |
As a first stab, I'd just get rid of \A, \Z, \b, and \B, just as the current code does for ^ and $. This is actually kind of complicated, because you have to make sure that the \ in front isn't the second half of a pair of backslashes. In other words, \\b (an escaped backslash followed by a literal b) should become \b, but \\\b (an escaped backslash followed by a word boundary) should just become \. Also, the current code removes all ^ and $. That's wrong if they're preceded by a backslash and meant to be the actual character.
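Here is a minimal sketch of that unescaping step, assuming the named groups have already been replaced with their values. The names `unescape_pattern`, `ZERO_WIDTH`, and `LITERALS` are hypothetical and not taken from any patch on this ticket. The point it illustrates is that scanning left to right and consuming each backslash together with the character after it handles paired backslashes correctly, and leaves an escaped ^ or $ alone.

```python
import re

# Escapes that can only ever match the empty string (hypothetical names).
ZERO_WIDTH = {"A", "Z", "b", "B"}
# Escaped metacharacters that stand for the character itself.
LITERALS = set(".^$*+?(){}[]\\")

# Either a backslash plus the character after it, or a bare ^ or $.
# Matching the backslash pair first means an escaped ^ or $ is never
# mistaken for an anchor, and \\b is read as "\\" followed by "b".
_TOKEN = re.compile(r"\\(.)|[\^$]")


def unescape_pattern(pattern):
    """Turn a regex whose groups are already filled in into a plain URL."""

    def replace(match):
        escaped = match.group(1)
        if escaped is None:
            return ""  # bare ^ or $ anchor
        if escaped in ZERO_WIDTH:
            return ""  # \A, \Z, \b, \B
        if escaped in LITERALS:
            return escaped  # \. becomes ., \\ becomes \, and so on
        return match.group(0)  # \d, \w, etc. are left for a later check

    return _TOKEN.sub(replace, pattern)


print(unescape_pattern(r"^prices/less_than_\$5/$"))  # prices/less_than_$5/
print(unescape_pattern(r"\\b"))   # \b
print(unescape_pattern(r"\\\b"))  # \
```

Doing this in a single pass, rather than with a chain of string replacements, is what avoids the backslash-pair problem.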
There are some gotchas--when you insert values, you have to escape characters that you'll be unescaping later. I do check for character classes that don't map to a single definite character (e.g., \d and \w) and raise an exception if they're still there when we finish (since the reverse lookup can't work). I don't check for things like [a-z] or a{2,3}, but that will almost guarantee the reversing fails, too.
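The two gotchas can be sketched roughly as follows. This is an illustration under my own assumptions, not the code from the patch: `escape_value` and `check_resolved` are hypothetical names, and the set of character classes checked (\d, \D, \s, \S, \w, \W) is just the obvious subset.

```python
import re

# The same metacharacters that get unescaped later on.
_META = re.compile(r"([.^$*+?(){}\[\]\\])")
# A backslash plus the character after it, scanned left to right.
_ESCAPE_PAIR = re.compile(r"\\(.)")


def escape_value(value):
    """Backslash-escape a value before inserting it into the pattern,
    so that the later unescaping pass restores it unchanged."""
    return _META.sub(r"\\\1", str(value))


def check_resolved(pattern):
    """Raise if a character class with no single definite value remains."""
    for match in _ESCAPE_PAIR.finditer(pattern):
        if match.group(1) in "dDsSwW":
            raise ValueError(
                "cannot reverse %r: %s is not inside a group"
                % (pattern, match.group(0))
            )


print(escape_value("a.b$"))  # a\.b\$
check_resolved(r"prices/\$5/")  # passes silently
```

Because the check walks backslash pairs in order, r"a\\d" (an escaped backslash followed by a literal d) is not flagged, while r"a\d" is.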
Note that #2977 also addresses this problem, but it does other things, too. Also, I think that code may not handle some corner cases correctly. Meanwhile, my patch may be overly aggressive and might include handling for characters that will never appear in a URL.
Give SmileyChris and me some time to work this out.