Opened 9 years ago
Closed 9 years ago
#26077 closed Cleanup/optimization (wontfix)
Change latin map in urlify to correctly translate Umlauts
Reported by: | Christian Peters | Owned by: | nobody |
---|---|---|---|
Component: | Internationalization | Version: | 1.9 |
Severity: | Normal | Keywords: | |
Cc: | | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The latin map translates the umlauts ä, ü and ö to a, u and o:
https://github.com/django/django/blob/master/django/contrib/admin/static/admin/js/urlify.js#L5-L16
The correct transliteration would be ae, ue and oe (that is the correct spelling, and it is preferred for SEO reasons).
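To make the difference concrete, here is a minimal Python sketch of the two behaviours being compared. It is not the code from urlify.js: `GERMAN_MAP`, `strip_diacritics` and `german_transliterate` are hypothetical names introduced only for illustration, and the sketch uses only the standard library.

```python
import unicodedata

# Hypothetical German-specific map; NOT the table used by urlify.js.
GERMAN_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def strip_diacritics(text):
    """Fold accented letters to their base letter (ä -> a)."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks (such as the diaeresis) are dropped; base letters stay.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def german_transliterate(text):
    """Replace umlauts with digraphs (ä -> ae), then fold anything left."""
    # Normalize to NFC first so precomposed umlauts match the map keys.
    composed = unicodedata.normalize("NFC", text)
    replaced = "".join(GERMAN_MAP.get(ch, ch) for ch in composed)
    return strip_diacritics(replaced)


for word in ("Äpfel", "Rätsel"):
    print(word, strip_diacritics(word), german_transliterate(word))
```

With base-letter folding, "Äpfel" and "Apfel" collapse to the same string; with the digraph map they stay distinct.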
Change History (7)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
I'll look for references, but the basic point is that the URL version of a German umlaut should be as proposed.
If you have a website with mydomain.de/apple and mydomain.de/apples, then the German version of it should be mydomain.de/apfel and mydomain.de/aepfel.
In other cases, where there is no singular/plural ambiguity, the slugified version is simply a misspelling of the original word, and Google notices this (for the German word rätsel, i.e. riddle):
https://www.google.com/search?q=ratsel -> Google tries to autocorrect you to rätsel
https://www.google.com/search?q=raetsel -> Google accepts the query as rätsel
This link describes the issue: http://blog.webcertain.com/do-umlauts-matter-how-to-handle-the-most-annoying-characters-in-german-seo-2/10/04/2014/
TL;DR: Converting ä to a results in misspelled words.
comment:3 by , 9 years ago
Among the Western languages I'm familiar with, ä, ö and ü are most common in German and this is indeed the proper way to transliterate them.
However, it would look rather weird for the handful of French, Spanish and Brazilian Portuguese words that include those letters.
I quickly checked the Scandinavian languages and it looks like there's no general rule. Per https://en.wikipedia.org/wiki/Finnish_orthography:
The Germanic umlaut or convention of considering digraph ae equivalent to ä, and oe equivalent to ö is inapplicable in Finnish.
comment:4 by , 9 years ago
It's a judgement call, but I'm -0 on this change. It would involve doing something more complicated that sometimes doesn't make sense, rather than doing something simple that isn't always optimal.
I won't stand in the way if we consider that German usage of umlauts is so dominant that we should ignore the edge cases in other languages.
comment:5 by , 9 years ago
I get the point.
Maybe URLify could expose an API through which one could add to or override the maps? One could then add some logic based on settings.py to configure the correct language.
Wagtail uses it very heavily for slug generation, and at the moment I override the entire URLify.
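The idea of overridable maps could be sketched roughly as follows. This is written in Python for illustration only; URLify itself is JavaScript and exposes no such API in this ticket. `DEFAULT_MAP`, `LANGUAGE_OVERRIDES`, `build_map` and `transliterate` are hypothetical names, and passing in a language code such as the value of Django's `LANGUAGE_CODE` setting is an assumption about how the configuration might be wired up.

```python
# Hypothetical defaults: umlauts fold to their base letter.
DEFAULT_MAP = {"ä": "a", "ö": "o", "ü": "u"}

# Hypothetical per-language overrides, keyed by primary language subtag.
LANGUAGE_OVERRIDES = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue"},
}


def build_map(language_code, extra=None):
    """Merge defaults, language overrides, then caller-supplied entries."""
    primary = language_code.split("-")[0].lower()  # "de-at" -> "de"
    merged = dict(DEFAULT_MAP)
    merged.update(LANGUAGE_OVERRIDES.get(primary, {}))
    merged.update(extra or {})
    return merged


def transliterate(text, language_code="en", extra=None):
    """Replace each mapped character; leave everything else untouched."""
    char_map = build_map(language_code, extra)
    return "".join(char_map.get(ch, ch) for ch in text)


print(transliterate("rätsel", "de"))   # German override applies
print(transliterate("pöytä", "fi"))    # no override, defaults apply
```

Layering the maps this way keeps the simple default for languages such as Finnish, where the digraph convention does not apply, while letting a German site opt in without replacing the whole utility.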
comment:6 by , 9 years ago
For better or worse, that's pretty much the expected solution if you need something more tailored than Django's default utilities :-|
There's the same situation with the Python slugify function. Django has a naive, four-line version that works fine for Western languages. If you want something more advanced, there's https://github.com/mozilla/unicode-slugify.
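For readers unfamiliar with that style of slugify, the following is a sketch in the same spirit, using only the Python standard library. It is an approximation for illustration, not Django's actual source; `naive_slugify` and its `pre_map` parameter are hypothetical, with `pre_map` added here only to show where a language-specific map could hook in.

```python
import re
import unicodedata


def naive_slugify(value, pre_map=None):
    """ASCII-only slug: fold accents, drop punctuation, hyphenate spaces."""
    if pre_map:
        # Optional, hypothetical hook: apply a language-specific map first.
        value = unicodedata.normalize("NFC", value)
        value = "".join(pre_map.get(ch, ch) for ch in value)
    # Decompose accented letters, then discard every non-ASCII code point.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove anything that is not a word character, whitespace or hyphen.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(naive_slugify("Das Rätsel"))
print(naive_slugify("Das Rätsel", pre_map={"ä": "ae"}))
```

Without a map the umlaut is reduced to its base letter, which is the behaviour the ticket objects to; supplying a map beforehand produces the digraph form.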
comment:7 by , 9 years ago
Component: | Uncategorized → Internationalization |
---|---|
Resolution: | → wontfix |
Status: | new → closed |
Type: | Uncategorized → Cleanup/optimization |
I guess the original proposal is a "wontfix" unless a discussion on the DevelopersMailingList yields a different consensus.
Could you provide a reference for the claim or point to another system that has the proposed behavior? Thanks.