Opened 6 weeks ago

Closed 6 weeks ago

Last modified 4 weeks ago

#28688 closed Bug (fixed)

Unicode slugs are not properly slugified due to javascript limitations

Reported by: Sævar Öfjörð Magnússon
Owned by: Sævar Öfjörð Magnússon
Component: contrib.admin
Version: 1.11
Severity: Normal
Keywords:
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

When using Unicode slugs, if the slug contains a non-ASCII character immediately followed by a word from the removelist in urlify.js (e.g. the letter "a"), JavaScript treats the non-ASCII character as a word boundary, and the text that matches the removelist is stripped from the slug.
The removelist contains the following words:

var removelist = [
"a", "an", "as", "at", "before", "but", "by", "for", "from", "is",
"in", "into", "like", "of", "off", "on", "onto", "per", "since",
"than", "the", "this", "that", "to", "up", "via", "with"
];

When I try to slugify the text "Kaupa miða", the result is "kaupa-mið": the trailing "a" is stripped.

This can be tested using the following line in console:
URLify("Kaupa miða", 255, true)

(only when urlify.js is loaded, of course)
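
The word-boundary behaviour described above can be reproduced without urlify.js. The following is a minimal sketch, assuming the removal is done with a `\b(...)\b` regular expression; the shortened `removelist` and the variable names are illustrative and are not taken from urlify.js.

```javascript
// Hypothetical, shortened stand-in for the removelist in urlify.js.
var removelist = ["a", "an", "the", "to"];

// \b matches between a \w character ([A-Za-z0-9_]) and a non-\w character.
// "ð" is not part of \w, so a boundary is seen between "ð" and the final "a",
// which makes the trailing "a" look like a standalone word.
var pattern = new RegExp("\\b(" + removelist.join("|") + ")\\b", "gi");

var stripped = "Kaupa miða".replace(pattern, "");
console.log(stripped); // "Kaupa mið"

// With ASCII-only text, the final "a" stays part of the word "mida".
var ascii = "Kaupa mida".replace(pattern, "");
console.log(ascii); // "Kaupa mida"
```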

Change History (7)

comment:1 Changed 6 weeks ago by Tim Graham

Triage Stage: Unreviewed → Accepted

comment:2 Changed 6 weeks ago by Claude Paroz

The solution could be to check if the string contains any non-ASCII char and simply skip the removelist removal, as the language is probably not English in that case.
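
A rough sketch of this idea, for illustration only: `stripEnglishWords` is a hypothetical helper and not the code from the actual patch, and the word list is shortened.

```javascript
// Hypothetical helper illustrating the suggestion; not the code in urlify.js.
function stripEnglishWords(text, removelist) {
    // Any character outside the ASCII range (0x00-0x7F) suggests that the
    // text is probably not English, so the removelist is skipped entirely.
    var hasNonAscii = /[^\x00-\x7F]/.test(text);
    if (hasNonAscii) {
        return text;
    }
    var pattern = new RegExp("\\b(" + removelist.join("|") + ")\\b", "gi");
    return text.replace(pattern, "");
}

var words = ["a", "an", "the", "to"];

// Contains "ð", so nothing is removed.
console.log(stripEnglishWords("Kaupa miða", words)); // "Kaupa miða"

// ASCII only, so "The", "to" and "a" are removed (whitespace cleanup omitted).
console.log(stripEnglishWords("The way to a fix", words));
```

The trade-off is that an English title containing a single non-ASCII character also keeps its stop words, which seems acceptable since the removal is only a cosmetic shortening.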

comment:3 Changed 6 weeks ago by Sævar Öfjörð Magnússon

Owner: changed from nobody to Sævar Öfjörð Magnússon
Status: new → assigned

comment:4 Changed 6 weeks ago by Sævar Öfjörð Magnússon

Has patch: set

I've created a pull request that implements Claude's suggestion: https://github.com/django/django/pull/9219

Last edited 6 weeks ago by Sævar Öfjörð Magnússon

comment:5 Changed 6 weeks ago by Claude Paroz

Triage Stage: Accepted → Ready for checkin

comment:6 Changed 6 weeks ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assigned → closed

In f90be0a8:

Fixed #28688 -- Made admin's URLify.js skip removal of English words if non-ASCII chars are present.

comment:7 Changed 4 weeks ago by Tim Graham <timograham@…>

In 8b9a163:

Refs #28688 -- Updated a selenium test for admin's URLify.js change.

English words aren't removed if non-ASCII chars are present.
