Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#27063 closed Bug (fixed)

i18n_patterns() matches too much of the URL as the language code

Reported by: Keith Hackbarth Owned by: nobody
Component: Internationalization Version: 1.10
Severity: Normal Keywords:
Cc: Krzysztof Urbaniak Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I'm having two issues on using a regex based url pattern with "prefix_default_language" that I was hoping to shed some insight on.

My url.py file:

urlpatterns += i18n_patterns(
    url(r'^(?P<city>.+-parking)/?$', CityView.as_view()),
    prefix_default_language=False
)

Issue 1: Regex that also match language prefixs

If a city that matches my regex but begins with a language prefix, it shows a 404.
For example: "/de-moines-parking" shows a 404. But "de/de-moines-parking" returns a 200. Also, cities that don't match one of my enabled languages match as well "/boise-parking" for example.

Issue 2: Stacking

Also the urls will stack languages, for example, "/de/fr/it/de-moines-parking" is loading a successful page instead of a 404.

Any insight? Is this a bug or am I missing something obvious in my regex? Prior to Django 1.10, I used solid_i18n_patterns plugin and did not have these issues. I posted on #django-users but no one responded.

Thanks in advance,

Keith

Change History (17)

comment:1 by Claude Paroz, 8 years ago

May I ask you to create a new ticket for the second issue? It's better practice to have one issue per ticket.

comment:2 by Tim Graham, 8 years ago

Cc: Krzysztof Urbaniak added
Component: UncategorizedInternationalization
Type: UncategorizedBug

Krzysztof, could you triage this ticket since you worked on the prefix_default_language feature (#25933)?

comment:3 by Krzysztof Urbaniak, 8 years ago

The language prefix is separated from the rest of the url with / instead of -.

So the url for english (with default language en) will be /moines-parking/ and the same page for german will have url /de/moines-parking/.

You can also translate the -parking part using _/gettext.

About the /de/fr/it/de-moines-parking issue - it looks like a german page for city named fr/it/de-moines, can you replace the .* in city match with something like \w or [\w-] to not catch the / as part of the city name? Then I think it should return 404.

comment:4 by Keith Hackbarth, 8 years ago

@urbaniak - Not sure I understand the feedback.

Using the below

urlpatterns += i18n_patterns(
    url(r'^(?P<city>[\w-]+-parking)', CityView.as_view()),
    prefix_default_language=False
)

Still reproduces the same two issues. I understand that in theory, the language prefix is separated from the rest of the url with / instead of -. But in practice, with a regex, this does not appear to be the case?

Thanks for your help!

comment:5 by Krzysztof Urbaniak, 8 years ago

Language prefix always ends with slash - https://github.com/django/django/blob/bc1e2d8e8edde6cc7d2657c68242a13ee65a15b8/django/urls/resolvers.py#L419. It's added before your url regex.

I cannot reproduce the issue with /de/fr/whatever, I've made a testcase here https://github.com/urbaniak/i18n_patterns_test - can you make a fork of it and try to reproduce this issue?

There's a tests.py file which assumes 404 on those "nested" urls.

response = self.client.get('/de/pl/test-parking/')
self.assertEqual(response.status_code, 404)
self.assertEqual(response.wsgi_request.LANGUAGE_CODE, 'de')

response = self.client.get('/pl/de/test-parking/')
self.assertEqual(response.status_code, 404)
self.assertEqual(response.wsgi_request.LANGUAGE_CODE, 'pl')

Last edited 8 years ago by Krzysztof Urbaniak (previous) (diff)

comment:6 by Keith Hackbarth, 8 years ago

@urbaniak - The issue I'm having is the one described in the ticket

Failing test case:

response = self.client.get('/de-moines-parking/')
self.assertEqual(response.status_code, 200)
self.assertEqual(response.wsgi_request.LANGUAGE_CODE, 'en')

I'm sure I'm missing something obvious. But not working for me.

comment:7 by Krzysztof Urbaniak, 8 years ago

Do you want here the german page for 'moines' city? Or english page for 'de-moines' city?

comment:8 by Keith Hackbarth, 8 years ago

English page

comment:9 by Tim Graham, 8 years ago

Triage Stage: UnreviewedAccepted

The regular expression to extract the language code from the URLconf interprets "de-moines-parking" as a language. Maybe it makes sense to validate that against the actual list of languages that the project uses?

comment:10 by Krzysztof Urbaniak, 8 years ago

We can validate agains the project languages or make sure that language prefix ends with /.

I'll try to submit a pull request with fix this weekend, will try language validation approach first.

comment:11 by Krzysztof Urbaniak, 8 years ago

I've submited pull request - https://github.com/django/django/pull/7162

Waiting for comments.

@keithhackbarth, can you test django version from this pull request with your project?

From my tests it's fixing the issue, but want to be sure.

comment:12 by Keith Hackbarth, 8 years ago

Thanks @urbaniak. I'll test and get comments sometime this week.

comment:13 by Jay Kumar, 8 years ago

@urbaniak - As discussed, I added a test for the case in which the language code is prefixed with a - (i.e de-test-parking). With your updated code and new language_code_prefix_re, the test passes now as expected results of 200 with language code of 'en'. The case of stacking language prefixes now returns the expected 404 as well.

Thanks for the help!

comment:14 by Claude Paroz, 8 years ago

Patch needs improvement: set

comment:15 by Tim Graham, 7 years ago

Has patch: set
Patch needs improvement: unset
Summary: Regex url routing + Localization (prefix_default_language)i18n_patterns() matches too much of the URL as the language code
Triage Stage: AcceptedReady for checkin

comment:16 by Tim Graham <timograham@…>, 7 years ago

Resolution: fixed
Status: newclosed

In a01d887a:

Fixed #27063 -- Prevented i18n_patterns() from using too much of the URL as the language.

comment:17 by Tim Graham <timograham@…>, 7 years ago

In a7dabe42:

[1.10.x] Fixed #27063 -- Prevented i18n_patterns() from using too much of the URL as the language.

Backport of a01d887a3ad9029ed198b6af974534fec46223f9 from master

Note: See TracTickets for help on using tickets.
Back to Top