Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#23558 closed Cleanup/optimization (fixed)

document slugify limitations

Reported by: Mikhail Korobov
Owned by: David Hoffman
Component: Documentation
Version: 1.7
Severity: Normal
Keywords:
Cc: mmitar@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: yes
UI/UX: no


Currently slugify docs say:

slugify Converts to lowercase, removes non-word characters (alphanumerics and underscores) and converts spaces to hyphens. Also strips leading and trailing whitespace.

In Python 3:

>>> 'вася'.isalnum()
True

but slugify doesn't work as documented for such strings. Isn't it a bug if something doesn't work as documented? An earlier ticket about this was closed as wontfix. If there is no intention to make slugify work better, the documentation should say when it works and when people should look for alternative solutions.
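A minimal sketch of the mismatch, using only the standard library (it does not call Django). `str.isalnum()` is Unicode-aware, while discarding everything that cannot be encoded as ASCII leaves nothing of the word. The ASCII-only step is an assumption standing in for what slugify does internally.

```python
word = "вася"

# str.isalnum() is Unicode-aware in Python 3, so Cyrillic letters count.
is_alnum = word.isalnum()

# Assumed stand-in for an ASCII-only step:
# discard every character that cannot be encoded as ASCII.
ascii_only = word.encode("ascii", "ignore").decode("ascii")

print(is_alnum)          # True
print(repr(ascii_only))  # ''
```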

Change History (11)

comment:1 Changed 3 years ago by Mitar

Cc: mmitar@… added

comment:2 Changed 3 years ago by Aymeric Augustin

Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Cleanup/optimization

s/alphanumerics/ASCII alphanumerics/ should do the job :-)

More seriously, let's take this opportunity to expand the documentation a bit and point to unicode-slugify -- which produces Unicode slugs, while Django produces ASCII slugs.
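To illustrate what a "Unicode slug" could look like, here is a rough standard-library sketch. It is not the unicode-slugify API; `unicode_slug_sketch` is a hypothetical helper that simply skips the ASCII conversion and relies on `\w` being Unicode-aware for `str` patterns in Python 3.

```python
import re
import unicodedata


def unicode_slug_sketch(value):
    """Hypothetical helper: build a slug while keeping non-ASCII letters."""
    # NFKC composes characters instead of splitting off accents.
    value = unicodedata.normalize("NFKC", value)
    # Drop anything that is not a word character, whitespace, or hyphen.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(unicode_slug_sketch("Вася Пупкин"))
print(unicode_slug_sketch("Māori"))
```

Under these assumptions the Cyrillic and accented letters survive, which is the difference from an ASCII slug.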

comment:3 Changed 3 years ago by Mitar

You can also point to slugify2. ;-)

comment:4 Changed 3 years ago by David Hoffman

Owner: changed from nobody to David Hoffman
Status: new → assigned

comment:5 Changed 3 years ago by David Hoffman

Has patch: set

comment:6 Changed 3 years ago by David Hoffman

I created a new pull request based on the feedback to my previous one:

comment:7 Changed 3 years ago by Chris Beaven

The problem is that non-ASCII alphanumerics aren't *just* removed, so the suggested description is still incorrect:

Python 3.4.0 (default, Apr 11 2014, 13:05:11) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from django.utils.text import slugify
>>> slugify('Māori')
'maori'

I think if the wording is to be improved, the initial step of trying to "ASCIIify" the Unicode characters also needs to be explicitly documented.
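The "ASCIIify first, then filter" behaviour can be approximated with the standard library alone. This is a sketch under stated assumptions, not Django's source: `slugify_sketch` is a hypothetical name, and the use of NFKD normalization for the ASCII step is an assumption.

```python
import re
import unicodedata


def slugify_sketch(value):
    """Hypothetical approximation of an ASCII-only slugify."""
    # Step 1 (assumed): decompose characters, then drop whatever is not ASCII.
    # 'ā' becomes 'a' plus a combining macron, and only the 'a' survives.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Step 2: remove characters that are not word characters, whitespace, or hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Step 3: convert runs of whitespace and hyphens to a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(repr(slugify_sketch("Māori")))
print(repr(slugify_sketch("вася")))
print(repr(slugify_sketch("  Hello, World!  ")))
```

With this sketch, 'Māori' is transliterated rather than stripped, while 'вася' comes out empty because none of its letters decompose to ASCII.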

Last edited 3 years ago by Chris Beaven

comment:8 Changed 3 years ago by David Hoffman

Ah, good point.

I have updated with another pull request:

comment:9 Changed 3 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assigned → closed

In 03467368dbd6a427985f86463faa61619f08c833:

Fixed #23558 -- documented slugify limitations

comment:10 Changed 3 years ago by Tim Graham <timograham@…>

In d9bb7128fe7bc565a2dbbd1e89bb0345711b46f6:

[1.7.x] Fixed #23558 -- documented slugify limitations

Backport of 03467368db from master

comment:11 Changed 3 years ago by Håken Lid

Is this the new documentation?

Converts to ASCII. Converts spaces to hyphens. Removes characters that aren't alphanumerics, underscores, or hyphens. Converts to lowercase. Also strips leading and trailing whitespace.

Where can I find out which alphanumerics are deleted, and which alphanumerics are converted to ASCII?
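One way to explore this question, assuming the "Converts to ASCII" step is NFKD normalization followed by dropping non-ASCII code points (an assumption about the implementation, not something this ticket states). `ascii_fate` is a hypothetical helper that shows what remains of a single character.

```python
import unicodedata


def ascii_fate(ch):
    """Hypothetical helper: what is left of ch after NFKD + ASCII-only encoding."""
    decomposed = unicodedata.normalize("NFKD", ch)
    return decomposed.encode("ascii", "ignore").decode("ascii")


for ch in ["ā", "å", "ø", "ß", "я"]:
    print(repr(ch), "->", repr(ascii_fate(ch)))
```

Under that assumption, letters whose Unicode decomposition starts with an ASCII base letter ('ā', 'å') are converted, while letters with no such decomposition ('ø', 'ß', 'я') are deleted.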
