Opened 3 months ago

Last modified 3 months ago

#31540 new Bug

i18n URLs are not matched against the fallback language.

Reported by: osxisl
Owned by: nobody
Component: Internationalization
Version: 3.0
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by osxisl)

http://example.com/de/id-button/ - 200 OK
http://example.com/id/id-button/ - 200 OK
http://example.com/any-other-slug/ - 200 OK
http://example.com/id-button/ - 404 error:

Using the URLconf defined in example.urls, Django tried these URL patterns, in this order:
id/
The current path, id-button/, didn't match any of these.
urls.py file:
urlpatterns = i18n_patterns(
    path('admin/', admin.site.urls),
    path('', cache_page(cache_homepage)(homepage_views.index), name='index'),
    path('search/', search_views.search, name='search'),
    path('<slug:slug>/', item_views.item, name='item'),
    prefix_default_language=False,
)

The item has the slug "id-button" in its DB slug field. If I rename it to "idbutton", the URL works: http://example.com/idbutton/ - 200 OK

Change History (10)

comment:1 Changed 3 months ago by osxisl

Description: modified (diff)

comment:2 Changed 3 months ago by osxisl

Description: modified (diff)

comment:3 Changed 3 months ago by felixxm

Component: Uncategorized → Internationalization
Resolution: → invalid
Status: new → closed
Summary: A 404 error with “id-” in slug on multilingual website with Indonesian language → A 404 error with “id-” in slug on multilingual website with Indonesian language.

Thanks for this ticket, however I cannot reproduce this issue. IMO it's not an issue in Django, you can try to ask on one of support channels.

comment:4 in reply to:  3 Changed 3 months ago by osxisl

Replying to felixxm:

Thanks for this ticket, however I cannot reproduce this issue. IMO it's not an issue in Django, you can try to ask on one of support channels.

To me it looks like the problem is the regex here: https://github.com/django/django/blob/92507bf3ea4dc467f68edf81a686548fac7ff0e9/django/utils/translation/trans_real.py#L46
It activates the "id" language when it meets "id-button" in the slug, even though my settings only list "id", not an "id-button" locale. And I'm sure there is no such locale in Unicode either.
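
To make the behaviour described above concrete, here is a minimal standard-library sketch. The pattern is assumed to have the same shape as the prefix regex at the linked line, and SUPPORTED and language_from_path are hypothetical stand-ins for settings.LANGUAGES and Django's lookup, not Django's actual code.

```python
import re

# Assumed to have the same shape as the prefix regex at the linked line.
LANGUAGE_CODE_PREFIX_RE = re.compile(r"^/(\w+([@-]\w+)?)(/|$)")

# Hypothetical stand-in for the codes configured in settings.LANGUAGES.
SUPPORTED = {"en", "de", "id"}


def language_from_path(path):
    """Return the language a path prefix resolves to, or None."""
    match = LANGUAGE_CODE_PREFIX_RE.match(path)
    if match is None:
        return None
    code = match.group(1).lower()
    if code in SUPPORTED:
        return code
    # Lenient fallback: "id-button" is reduced to its generic part "id".
    generic = code.split("-")[0]
    return generic if generic in SUPPORTED else None


for path in ("/id-button/", "/idbutton/", "/any-other-slug/", "/de/id-button/"):
    print(path, "->", language_from_path(path))
```

Under these assumptions only /id-button/ is picked up as Indonesian. /idbutton/ has no hyphen to split on, and /any-other-slug/ does not match the pattern at all.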

comment:5 in reply to:  3 Changed 3 months ago by osxisl

Replying to felixxm:

Thanks for this ticket, however I cannot reproduce this issue. IMO it's not an issue in Django, you can try to ask on one of support channels.

I think there is a design problem here: if someone needs a custom locale like 'id-button', it should be set explicitly in settings.py.


comment:6 Changed 3 months ago by osxisl

Resolution: invalid (removed)
Status: closed → new

comment:7 Changed 3 months ago by Carlton Gibson

Summary: A 404 error with “id-” in slug on multilingual website with Indonesian language. → i18n URLs are not matched against the fallback language.
Triage Stage: Unreviewed → Accepted

I'm going to accept this. I think there is an issue. I don't see that we can do much about slugs that match the language code regex, here id-... but the 404 isn't optimal...

id-button correctly falls back to id as the language code, but Django then tries to match the URL against id/, which it doesn't match.

But either we should use the submitted URL as the prefix, or redirect to the fallback URL.

The id-button example seems wrong, but the same thing occurs with en-us. That would fall back to en, except it 404s.

More details and a reproduction are in the forum thread here:
https://forum.djangoproject.com/t/i-get-a-404-error-if-slug-begins-with-id/2272/7

Clean project, enable LocaleMiddleware:

from django.conf.urls.i18n import i18n_patterns
from django.urls import path
from django.http import HttpResponse


def hello(request):
    return HttpResponse("hello")


urlpatterns = i18n_patterns(
    path("", hello),
    prefix_default_language=True,
)

/en/ 200
/en-gb/ 200
/en-us/ 404

Using the URLconf defined in ticket_31540.urls, Django tried these URL patterns, in this order:
    en/
The current path, en-us/, didn't match any of these.

So this is because en-us isn’t a configured language, but when determining the language we fell back to en correctly. We then neither resolved the URL using that same fallback nor redirected to the fallback language URL.

It seems like that’s something we should do, or… ? It seems suboptimal.
(Very happy if someone can correct me as to why not...🙂)
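
The two steps that disagree can be modelled without Django. This is a simplified sketch, not the real middleware or resolver: LANGUAGES, DEFAULT_LANGUAGE, detect_language and resolve are hypothetical names, the regex is assumed to have the same shape as Django's prefix regex, and redirects are left out.

```python
import re

LANGUAGES = {"en", "en-gb"}  # hypothetical configured languages
DEFAULT_LANGUAGE = "en"      # hypothetical default
# Assumed to have the same shape as Django's language prefix regex.
PREFIX_RE = re.compile(r"^/(\w+([@-]\w+)?)(/|$)")


def detect_language(path):
    """Step 1: pick a language from the path, falling back to the generic code."""
    match = PREFIX_RE.match(path)
    if match is None:
        return None
    code = match.group(1).lower()
    if code in LANGUAGES:
        return code
    generic = code.split("-")[0]
    return generic if generic in LANGUAGES else None


def resolve(path):
    """Step 2: match the path against the prefix of the detected language."""
    language = detect_language(path) or DEFAULT_LANGUAGE
    prefix = f"{language}/"
    # The prefix comes from the fallback language, not from the submitted URL.
    return 200 if path.lstrip("/").startswith(prefix) else 404


for path in ("/en/", "/en-gb/", "/en-us/"):
    print(path, detect_language(path), resolve(path))
```

In this model /en-us/ is detected as en, but the only prefix tried is en/, so it 404s just like the output above.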

It seems like the complement of #17734, #27402 &co.

comment:8 in reply to:  7 Changed 3 months ago by osxisl

Replying to Carlton Gibson:

id-button correctly falls back to id as the language code, but Django then tries to match the URL against id/, which it doesn't match.

Are you sure that falling back from id-button to id as the language code is correct behaviour when prefix_default_language=False is set?
I think it may be reasonable behaviour when prefix_default_language=True, but when we hide the default language we get a big problem: we have to check that no item in the DB has a slug that begins with any of our language codes followed by "-" and any word.

For me it looks like the principle of encapsulation is broken.

For example: when a website editor adds content to the English version of a website, they shouldn't be prevented from using a slug such as id-button or in-love if the website doesn't have a language id-button or in-love, even if it has the "id" and "in" languages.
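
The check that editors would currently need could look roughly like the following. This is only a sketch of the workaround described above: LANGUAGE_CODES and slug_collides_with_language are hypothetical names, and the rule is the literal one stated here (language code, then "-", then any word).

```python
# Hypothetical set of codes; in a project this would mirror settings.LANGUAGES.
LANGUAGE_CODES = {"en", "de", "id", "in"}


def slug_collides_with_language(slug, language_codes=LANGUAGE_CODES):
    """Return True if the slug looks like '<language code>-<anything>'."""
    head, sep, rest = slug.lower().partition("-")
    # Without a hyphen there is nothing that can be read as a language variant.
    return sep == "-" and rest != "" and head in language_codes


for slug in ("id-button", "in-love", "idbutton", "any-other-slug"):
    print(slug, slug_collides_with_language(slug))
```

Having to run something like this against every slug is exactly the limitation I mean.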

comment:9 Changed 3 months ago by Sjoerd Job Postmus

The way I see it, there are two different concerns that need attention:

  • Make sure that when id-gibberish/ is used in the path, the knowledge about id-gibberish is passed from LocaleMiddleware to LocalePrefixPattern. I think this is at least a bug, and I was also able to reproduce this locally.
  • Make it configurable whether or not LocaleMiddleware should be lenient in parsing the language, that is, whether it may treat id-gibberish as "looks enough like id to me". If we changed this outright from lenient to "must match strictly", it might break current applications. See the en-us vs en-gb example.

The prefix_default_language setting does not actually seem to influence the behaviour. Both an-example (Aragonese) and as-always (Assamese) would be affected.

comment:10 in reply to:  9 Changed 3 months ago by osxisl

Replying to Sjoerd Job Postmus:

  • Make it configurable whether or not LocaleMiddleware should be lenient in parsing the language from id-gibberish to "looks enough like id to me". If we'd change this "hard" from lenient to "must match strictly" this might break current applications. See the en-us vs en-gb example.

I see the solution in implementing something like a standardised dictionary https://unicode-org.github.io/cldr-staging/charts/37/supplemental/language_territory_information.html where language codes and territories are predefined, but anyone can add custom languages and territories to their apps via settings.py.

I think it won't affect the majority of applications, and the affected ones can add their custom languages in settings. Or, for better backward compatibility, a Boolean setting could be added to settings.py that selects lenient or strict matching.
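
A rough sketch of what such a switch could do, using only plain Python. pick_language and its lenient flag are hypothetical names for illustration. They are not an existing Django setting or API, and the fallback rule is a simplification.

```python
def pick_language(code, supported, lenient=True):
    """Return the supported language for `code`, or None if there is none.

    `lenient` is a hypothetical switch: when False, only exact matches
    against the configured languages are accepted.
    """
    code = code.lower()
    if code in supported:
        return code
    if not lenient:
        return None
    # Lenient mode: fall back from e.g. "en-us" to the generic "en".
    generic = code.split("-")[0]
    return generic if generic in supported else None


supported = {"en", "en-gb", "id"}
print(pick_language("en-us", supported))                     # lenient fallback
print(pick_language("id-button", supported, lenient=False))  # strict: no match
```

With lenient=False a slug like id-button would no longer be read as Indonesian, while sites that rely on en-us falling back to en could keep the default.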
