Opened 8 years ago

Closed 7 years ago

#2276 closed enhancement (duplicate)

[patch] make slugify filter Unicode-aware

Reported by: nkeric Owned by: adrian
Component: Template system Version: master
Severity: normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Currently the slugify filter only works with English alphanumeric characters. Here is a simple patch that makes it "match the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database".

References:

http://docs.python.org/lib/re-syntax.html

http://docs.python.org/lib/node115.html

Regards,

  • Eric
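
For readers without the attachments, the idea can be sketched roughly as follows. This is not the attached patch: it is a minimal Python 3 illustration, and the function name `unicode_slugify` is hypothetical. It assumes the filter strips everything that is not a word character, whitespace, or hyphen, and relies on `re.UNICODE` so that `\w` covers non-English letters.

```python
import re


def unicode_slugify(value):
    """Hypothetical sketch of a Unicode-aware slugify (not Django's code)."""
    # With re.UNICODE, \w matches [0-9_] plus anything the Unicode database
    # classifies as alphanumeric, so non-English letters survive this step.
    # (Python 3 already does this for str patterns; the flag is explicit here.)
    value = re.sub(r"[^\w\s-]", "", value, flags=re.UNICODE).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value, flags=re.UNICODE)


print(unicode_slugify("Django 你好！ test"))
```

In this sketch the Chinese letters are kept while the full-width punctuation is dropped, since punctuation is not classified as alphanumeric.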

Attachments (6)

defaultfilters.py.diff (600 bytes) - added by nkeric 8 years ago.
defaultfilters.py.2.diff (606 bytes) - added by nkeric 8 years ago.
new diff with re.U replaced with re.UNICODE
validators.py.diff (603 bytes) - added by nkeric 8 years ago.
patch the validator isSlug
defaultfilters.py.3.diff (740 bytes) - added by nkeric 8 years ago.
output string should be utf-8 encoded as input string
validators.py.2.diff (1.0 KB) - added by nkeric 8 years ago.
decode utf-8 encoded string for doing re search
defaultfilters.py.patch (770 bytes) - added by Jonas 7 years ago.
Normalizes string

Change History (13)

Changed 8 years ago by nkeric

comment:1 Changed 8 years ago by nkeric

I tested it and it handles Chinese characters properly (it should work with other languages' characters too), and all Chinese punctuation is removed as expected :)

Changed 8 years ago by nkeric

new diff with re.U replaced with re.UNICODE

Changed 8 years ago by nkeric

patch the validator isSlug

Changed 8 years ago by nkeric

output string should be utf-8 encoded as input string

Changed 8 years ago by nkeric

decode utf-8 encoded string for doing re search

comment:2 Changed 8 years ago by adrian

  • Resolution set to duplicate
  • Status changed from new to closed

Closing in favor of #2489.

comment:3 Changed 7 years ago by anonymous

  • milestone Version 1.0 deleted

comment:4 Changed 7 years ago by Jonas

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I found this program: Slughifi, a slugify with support for international characters.

http://amisphere.com/contrib/python-django/

http://amisphere.com/contrib/python-django/slughifi.py

comment:5 Changed 7 years ago by mtredinnick

  • Resolution set to duplicate
  • Status changed from reopened to closed

Somebody brought that up on the mailing list at one point. Further investigation revealed it was under the GPL license (no license is mentioned in the file, which is even worse), so we cannot use it.

We have a different solution in the works over at #4365 which I'll resolve in the near future.

comment:6 Changed 7 years ago by Jonas

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I have now found a solution.

Taken from http://www.djangosnippets.org/snippets/98/ (_string_to_slug method), which points to
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/251871 (Aaron Bentley)

E.g.:
It translates: Á È ï ô ü ñ
to:            A E i o u n

so it's very useful for a lot of languages.

I have added a patch.
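
One common way to get this kind of translation is Unicode normalization. The sketch below is not the attached patch or the linked recipe: it is a minimal Python 3 illustration, the name `strip_accents` is hypothetical, and it assumes the approach is to decompose each character (NFKD) and then drop whatever cannot be encoded as ASCII.

```python
import unicodedata


def strip_accents(value):
    """Hypothetical sketch: reduce accented Latin letters to plain ASCII."""
    # NFKD splits a character such as "Á" into the base letter "A" followed
    # by a separate combining accent mark.
    decomposed = unicodedata.normalize("NFKD", value)
    # Encoding to ASCII with errors="ignore" silently drops the combining
    # marks, along with any other character that has no ASCII form.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents("Á È ï ô ü ñ"))
```

Note the trade-off in this sketch: characters with no ASCII decomposition, such as Chinese, are removed entirely rather than preserved.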

Changed 7 years ago by Jonas

Normallizes string

comment:7 Changed 7 years ago by Jonas

  • Resolution set to duplicate
  • Status changed from reopened to closed

Closing in favor of #4365
