Opened 10 years ago

Closed 9 years ago

#2276 closed enhancement (duplicate)

[patch] Make the slugify filter Unicode-aware

Reported by: nkeric
Owned by: Adrian Holovaty
Component: Template system
Version: master
Severity: normal
Keywords:
Cc:
Triage Stage: Unreviewed
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

Currently the slugify filter only works with English alphanumeric characters. Here is a simple patch that makes it "match the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database".

References:

http://docs.python.org/lib/re-syntax.html

http://docs.python.org/lib/node115.html

Regards,

Eric
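
The idea in the description can be sketched roughly as follows. This is a minimal illustration in modern Python 3, not the attached patch: the function name `unicode_slugify` and the exact patterns are assumptions made for the example. In Python 3, `\w` already matches Unicode word characters for `str` patterns, so `re.UNICODE` is passed only to make the intent explicit.

```python
import re


def unicode_slugify(value: str) -> str:
    """Hypothetical Unicode-aware slugify (illustrative, not the attached patch)."""
    # Drop everything that is not a word character, whitespace, or a hyphen.
    # With re.UNICODE, \w covers [0-9_] plus anything Unicode treats as alphanumeric.
    value = re.sub(r"[^\w\s-]", "", value, flags=re.UNICODE).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value, flags=re.UNICODE)
```

For instance, `unicode_slugify("Hello, World!")` yields `hello-world`, while Chinese text keeps its characters and loses only its punctuation.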

Attachments (6)

defaultfilters.py.diff (600 bytes) - added by nkeric 10 years ago.
defaultfilters.py.2.diff (606 bytes) - added by nkeric 10 years ago.
new diff with re.U replaced with re.UNICODE
validators.py.diff (603 bytes) - added by nkeric 10 years ago.
patch the validator isSlug
defaultfilters.py.3.diff (740 bytes) - added by nkeric 10 years ago.
output string should be utf-8 encoded as input string
validators.py.2.diff (1.0 KB) - added by nkeric 10 years ago.
decode utf-8 encoded string for doing re search
defaultfilters.py.patch (770 bytes) - added by Jonas 9 years ago.
Normalizes string


Change History (13)

Changed 10 years ago by nkeric

Attachment: defaultfilters.py.diff added

comment:1 Changed 10 years ago by nkeric

I tested it and it handles Chinese characters properly (it should work with other languages' characters too), and all Chinese punctuation is removed as expected :)

Changed 10 years ago by nkeric

Attachment: defaultfilters.py.2.diff added

new diff with re.U replaced with re.UNICODE

Changed 10 years ago by nkeric

Attachment: validators.py.diff added

patch the validator isSlug

Changed 10 years ago by nkeric

Attachment: defaultfilters.py.3.diff added

output string should be utf-8 encoded as input string

Changed 10 years ago by nkeric

Attachment: validators.py.2.diff added

decode utf-8 encoded string for doing re search
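
A rough sketch of what a Unicode-aware slug check with UTF-8 decoding might look like in modern Python 3. The names `SLUG_RE` and `is_unicode_slug` and the pattern are assumptions for illustration; this is not the code in the attached validators.py diffs.

```python
import re

# Hypothetical pattern: one or more Unicode word characters or hyphens.
SLUG_RE = re.compile(r"[-\w]+", re.UNICODE)


def is_unicode_slug(value) -> bool:
    """Return True if value (str or UTF-8 encoded bytes) looks like a slug."""
    if isinstance(value, bytes):
        # Decode first so the regex sees characters rather than raw bytes.
        try:
            value = value.decode("utf-8")
        except UnicodeDecodeError:
            return False
    # fullmatch requires the whole string to consist of slug characters.
    return SLUG_RE.fullmatch(value) is not None
```

Decoding before matching matters because a pattern applied to raw UTF-8 bytes cannot tell which byte sequences form alphanumeric characters.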

comment:2 Changed 10 years ago by Adrian Holovaty

Resolution: duplicate
Status: new → closed

Closing in favor of #2489.

comment:3 Changed 10 years ago by (none)

milestone: Version 1.0

Milestone Version 1.0 deleted

comment:4 Changed 9 years ago by Jonas

Resolution: duplicate
Status: closed → reopened

I found this program: Slughifi, a version of slugify with support for international characters.

http://amisphere.com/contrib/python-django/

http://amisphere.com/contrib/python-django/slughifi.py

comment:5 Changed 9 years ago by Malcolm Tredinnick

Resolution: duplicate
Status: reopened → closed

Somebody brought that up on the mailing list at one point. Further investigation revealed it was under the GPL (no license is mentioned in the file itself, which is even worse), so we cannot use it.

We have a different solution in the works over at #4365 which I'll resolve in the near future.

comment:6 Changed 9 years ago by Jonas

Resolution: duplicate
Status: closed → reopened

I have now found a solution.

It comes from http://www.djangosnippets.org/snippets/98/ (the _string_to_slug method), which points to
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/251871 (Aaron Bentley).

For example:
It translates: Á È ï ô ü ñ
to:            A E i o u n

So it's very useful for a lot of languages.

I have attached a patch.
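
The translation shown above can be approximated with Unicode normalization from the standard library. This is only a sketch under that assumption: the helper name `strip_accents` is made up for the example, and it is not necessarily how the linked recipe or the attached patch works.

```python
import unicodedata


def strip_accents(value: str) -> str:
    """Hypothetical helper: remove diacritics by decomposing characters."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Characters that have no decomposition, such as Chinese characters, pass through unchanged, so this step only helps for scripts that use diacritics.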

Changed 9 years ago by Jonas

Attachment: defaultfilters.py.patch added

Normalizes string

comment:7 Changed 9 years ago by Jonas

Resolution: duplicate
Status: reopened → closed

Closing in favor of #4365
