Opened 10 years ago

Closed 10 years ago

Last modified 6 years ago

#23558 closed Cleanup/optimization (fixed)

document slugify limitations

Reported by: Mikhail Korobov
Owned by: David Hoffman
Component: Documentation
Version: 1.7
Severity: Normal
Keywords:
Cc: mmitar@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: yes
UI/UX: no

Description

Currently slugify docs say:

slugify Converts to lowercase, removes non-word characters (alphanumerics and underscores) and converts spaces to hyphens. Also strips leading and trailing whitespace.

In Python 3:

>>> 'вася'.isalnum()
True

but slugify doesn't work like documented for such strings. Isn't it a bug if something doesn't work as documented?

https://code.djangoproject.com/ticket/8391 was closed as wontfix. If there is no intention to make slugify work better it should be documented when it works and when people should find alternative solutions.

Change History (12)

comment:1 by Mitar, 10 years ago

Cc: mmitar@… added

comment:2 by Aymeric Augustin, 10 years ago

Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Cleanup/optimization

s/alphanumerics/ASCII alphanumerics/ should do the job :-)

More seriously, let's take this opportunity to expand the documentation a bit and point to unicode-slugify -- which produces Unicode slugs, while Django produces ASCII slugs.

comment:3 by Mitar, 10 years ago

You can also point to slugify2. ;-)

comment:4 by David Hoffman, 10 years ago

Owner: changed from nobody to David Hoffman
Status: new → assigned

comment:5 by David Hoffman, 10 years ago

Has patch: set

comment:6 by David Hoffman, 10 years ago

I created a new pull request based on the feedback to my previous one: https://github.com/django/django/pull/3357

comment:7 by Chris Beaven, 10 years ago

The problem is that it's not *just* non-ASCII alphanumerics that are removed, so the suggested description is still incorrect:

Python 3.4.0 (default, Apr 11 2014, 13:05:11) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from django.utils.text import slugify
>>> slugify('Māori')
'maori'

I think if the wording is to be improved, the initial step of trying to "ASCIIify" the Unicode characters needs to be explicitly documented too.
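For illustration, the "ASCIIify" step described here can be approximated with the standard library alone. This is a minimal sketch that assumes the step is an NFKD normalization followed by dropping every non-ASCII code point; `ascii_fold` is a hypothetical helper name, not a Django function.

```python
import unicodedata


def ascii_fold(text):
    """Decompose characters (NFKD), then drop anything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# "ā" decomposes into "a" plus a combining macron; only the macron is dropped.
print(ascii_fold("Māori"))
# Cyrillic letters have no ASCII decomposition, so nothing survives.
print(ascii_fold("вася"))
```

Under that assumption, accented Latin letters keep their base letter, while letters from other scripts disappear entirely.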

Last edited 10 years ago by Chris Beaven

comment:8 by David Hoffman, 10 years ago

Ah, good point.

I have updated with another pull request: https://github.com/django/django/pull/3358

comment:9 by Tim Graham <timograham@…>, 10 years ago

Resolution: fixed
Status: assigned → closed

In 03467368dbd6a427985f86463faa61619f08c833:

Fixed #23558 -- documented slugify limitations

comment:10 by Tim Graham <timograham@…>, 10 years ago

In d9bb7128fe7bc565a2dbbd1e89bb0345711b46f6:

[1.7.x] Fixed #23558 -- documented slugify limitations

Backport of 03467368db from master

comment:11 by Håken Lid, 10 years ago

Is this the new documentation?

Converts to ASCII. Converts spaces to hyphens. Removes characters that aren't alphanumerics, underscores, or hyphens. Converts to lowercase. Also strips leading and trailing whitespace.
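One possible reading of the description quoted above, written as a standalone sketch. It uses only `unicodedata` and `re` from the standard library and is an approximation of the documented behavior, not necessarily Django's own code; `slugify_sketch` is a hypothetical name.

```python
import re
import unicodedata


def slugify_sketch(value):
    """Approximate the documented steps; an illustration, not Django's code."""
    # Convert to ASCII: decompose, then drop every non-ASCII code point.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove characters that aren't alphanumerics, underscores, hyphens,
    # or whitespace; strip the ends; convert to lowercase.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Convert runs of whitespace and hyphens to a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(slugify_sketch("  Hello, World!  "))
print(slugify_sketch("Crème brûlée"))
print(slugify_sketch("вася"))
```

In this sketch a string made up only of non-Latin letters yields an empty slug, which is the limitation the ticket asks to have documented.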

Where can I find out which alphanumerics are deleted, and which alphanumerics are converted to ASCII?
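One way to explore that question empirically is to check each character against Unicode's NFKD decomposition. The sketch below assumes the conversion is NFKD normalization followed by discarding non-ASCII code points; `classify` is a hypothetical helper, and the sample characters are chosen only for illustration.

```python
import unicodedata


def classify(text):
    """Report, per character, what NFKD plus ASCII filtering leaves of it."""
    result = {}
    for ch in text:
        folded = unicodedata.normalize("NFKD", ch)
        folded = folded.encode("ascii", "ignore").decode("ascii")
        if folded == ch:
            result[ch] = "kept"
        elif folded:
            result[ch] = "converted to " + repr(folded)
        else:
            result[ch] = "dropped"
    return result


# Sample: a, e-acute, o-with-stroke, sharp s, Cyrillic ya.
sample = "a\u00e9\u00f8\u00df\u044f"
for ch, outcome in classify(sample).items():
    print(ch, outcome)
```

Under this assumption, a character is converted only when Unicode defines a decomposition containing ASCII characters; letters such as "ø" and "ß" have no such decomposition and are dropped along with non-Latin scripts.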

comment:12 by Ülgen Sarıkavak, 6 years ago

The documentation is still misleading. Why is this issue closed?
