Opened 18 years ago

Closed 17 years ago

Last modified 13 years ago

#4365 closed (fixed)

[unicode] slug generation can be enhanced

Reported by: Malcolm Tredinnick Owned by: Malcolm Tredinnick
Component: Internationalization Version: unicode
Severity: Keywords: slug unicode i18n
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It is worthwhile enhancing both the slugify filter and admin/media/js/admin/urlify.js to handle some foreign letters a bit better. We aren't looking to create a transliteration function, but some lightweight improvements are certainly possible.

A start was made by Bill de hÓra in this thread (django-developers) and there is also some good data in #2716.

Attachments (4)

greek.js (828 bytes ) - added by orestis@… 18 years ago.
greek map for urlify.js
greek.2.js (828 bytes ) - added by orestis@… 18 years ago.
defaultfilters.py.patch (770 bytes ) - added by Jonas von Poser 18 years ago.
Normallizes string
turkish.js (273 bytes ) - added by Ahmet 18 years ago.
Turkish map for urlify.js

Download all attachments as: .zip

Change History (12)

comment:1 by Simon G. <dev@…>, 18 years ago

Triage Stage: UnreviewedAccepted

by orestis@…, 18 years ago

Attachment: greek.js added

greek map for urlify.js

by orestis@…, 18 years ago

Attachment: greek.2.js added

comment:2 by orestis@…, 18 years ago

Attached the GREEK_MAP, as requested from Bill de hÓra. Sorry for the double attachment, Trac barfed. They are the same. Note that it is encoded in UTF-8, you have to download the original and open it with an UTF-8 aware editor to see the greek characters.

comment:3 by Jonas von Poser, 18 years ago

Has patch: set
Keywords: slug unicode i18n added

I found a solution for characters with accents, and anothers used on Europe.

Got from http://www.djangosnippets.org/snippets/98/ (_string_to_slug method) that points to http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/251871 (Aaron Bentley)

E.g:
It translates: Á È ï ô ü ñ
to:            A E i o u n

so it's very usefull for a lot of languages.

A patch is added.

by Jonas von Poser, 18 years ago

Attachment: defaultfilters.py.patch added

Normallizes string

by Ahmet, 18 years ago

Attachment: turkish.js added

Turkish map for urlify.js

comment:4 by Malcolm Tredinnick, 17 years ago

Resolution: fixed
Status: newclosed

(In [5608]) unicode: Added unicode-aware slugify filter (in Python) and better non-ASCII
handling for the Javascript slug creator in admin. Can never be perfect here,
but this is more tolerant in many cases. Fixed #4365. Thanks, Bill de h?\195?\147ra,
Baptiste, orestis@…, Ahmet and Jonas for contributions to this.

comment:5 by boobsd@…, 17 years ago

Has patch: unset
Needs tests: set
Resolution: fixed
Status: closedreopened
Version: other branchunicode

Slug don't work with Opera-9.20 browser.

comment:6 by Malcolm Tredinnick, 17 years ago

Fixed (again) in [5610].

comment:7 by John Shaffer <jshaffer2112@…>, 17 years ago

Needs tests: unset
Resolution: fixed
Status: reopenedclosed

comment:8 by Martin v. Löwis, 13 years ago

In [16948]:

Dummy merge.
Merged revisions 5609-5612,5614-5626,5629-5632,5636,5638-5646,5649-5654,5658-5660,5662-5700 via svnmerge from
https://code.djangoproject.com/svn/django/trunk

........

r5609 | mtredinnick | 2007-07-04 14:11:04 +0200 (Mi, 04 Jul 2007) | 5 lines


Merged Unicode branch into trunk (r4952:5608). This should be fully
backwards compatible for all practical purposes.


Fixed #2391, #2489, #2996, #3322, #3344, #3370, #3406, #3432, #3454, #3492, #3582, #3690, #3878, #3891, #3937, #4039, #4141, #4227, #4286, #4291, #4300, #4452, #4702

........

r5610 | mtredinnick | 2007-07-04 14:25:43 +0200 (Mi, 04 Jul 2007) | 3 lines


Fixed Javascript syntax from [5608] that was causing a problem in Opera. Fixed
#4365.

........

r5611 | mtredinnick | 2007-07-04 14:31:19 +0200 (Mi, 04 Jul 2007) | 3 lines


Fixed #4766 -- Added Russian support to Javascript slug creation. Thanks,
boobsd@….

........

r5612 | mtredinnick | 2007-07-04 14:48:12 +0200 (Mi, 04 Jul 2007) | 2 lines


Fixed some ReST errors.

........

r5614 | mtredinnick | 2007-07-05 03:25:05 +0200 (Do, 05 Jul 2007) | 3 lines


Form encoding should be changed only via HttpRequest, not on GET and POST
directly.

........

r5615 | mtredinnick | 2007-07-05 05:25:11 +0200 (Do, 05 Jul 2007) | 2 lines


Fixed #4717 -- Updated Catalan translation. Thanks, marc.garcia@….

........

r5616 | mtredinnick | 2007-07-05 05:29:18 +0200 (Do, 05 Jul 2007) | 3 lines


Fixed #4753 -- Updated Spanish translation. Also move translators' names out of
PO file and into AUTHORS.

........

r5617 | mtredinnick | 2007-07-05 12:27:22 +0200 (Do, 05 Jul 2007) | 3 lines


Added a test that shows the problem in #4470. This fails only for the mysql_old
backend. Refs #4470.

........

r5618 | mtredinnick | 2007-07-05 13:08:40 +0200 (Do, 05 Jul 2007) | 3 lines


Added CACHE_MIDDLEWARE_SECONDS to global settings and documentation (it's
used by the cache middleware). Refs #1015.

........

r5619 | mtredinnick | 2007-07-05 13:10:27 +0200 (Do, 05 Jul 2007) | 5 lines


Fixed #1015 -- Fixed decorator_from_middleware to return a real decorator even
when arguments are given. This looks a bit ugly, but it's fully backwards
compatible and all the extra work is done at import time, so it shouldn't have
any real performance impact.

........

r5620 | russellm | 2007-07-05 14:54:42 +0200 (Do, 05 Jul 2007) | 2 lines


Fixed minor typo in assertion message.

........

r5621 | gwilson | 2007-07-06 06:04:42 +0200 (Fr, 06 Jul 2007) | 2 lines


Fixed #4779 -- Fixed a couple typos in the test_client_regress tests that surfaced when typo was corrected in [5620]. Thanks ferringb@….

........

r5622 | mtredinnick | 2007-07-06 08:53:27 +0200 (Fr, 06 Jul 2007) | 2 lines


Fixed #4781 -- Typo fix. Pointed out by Simon Litchfield.

........

r5623 | mtredinnick | 2007-07-06 10:04:04 +0200 (Fr, 06 Jul 2007) | 4 lines


Fixed #4770 -- Fixed some Unicode conversion problems in the mysql_old backend
with old MySQLdb versions. Tested against 1.2.0, 1.2.1 and 1.2.1p2 with only
expected failures.

........

r5624 | mtredinnick | 2007-07-06 10:35:25 +0200 (Fr, 06 Jul 2007) | 3 lines


Fixed #4782 -- Updated Slovenian translation. Thanks, Gasper Koren. Also moved
contributor names into AUTHORS file.

........

r5625 | mtredinnick | 2007-07-06 12:21:14 +0200 (Fr, 06 Jul 2007) | 4 lines


Fixed #4776 -- Fixed a problem with handling of upload_to attributes. The new
solution still works with non-ASCII filenames. Based on a patch from
mike.j.thompson@….

........

r5626 | russellm | 2007-07-07 04:16:23 +0200 (Sa, 07 Jul 2007) | 2 lines


Added some uncredited authors that worked on the Oracle branch.

........

r5629 | mtredinnick | 2007-07-07 19:15:54 +0200 (Sa, 07 Jul 2007) | 8 lines


Changed HttpRequest.path to be a Unicode object. It has already been
URL-decoded by the time we see it anyway, so keeping it as a UTF-8 bytestring
was causing unnecessary problems.


Also added handling for non-ASCII URL fragments in feed creation (the portion
that was outside the control of the Feed class was messed up).

........

r5630 | mtredinnick | 2007-07-07 20:24:27 +0200 (Sa, 07 Jul 2007) | 4 lines


Fixed #4772 -- Fixed reverse URL creation to work with non-ASCII arguments.
Also included a test for non-ASCII strings in URL patterns, although that
already worked correctly.

........

r5631 | mtredinnick | 2007-07-07 20:39:23 +0200 (Sa, 07 Jul 2007) | 3 lines


Corrected misleading comment from [5619]. Not sure what I was smoking at the
time.

........

r5632 | mtredinnick | 2007-07-08 02:39:32 +0200 (So, 08 Jul 2007) | 5 lines



Fixed reverse URL lookup using functions when the original URL pattern was a
string. This is now just as fragile as it was prior to [5609], but works in a
few cases that people were relying on, apparently.

........

r5636 | mtredinnick | 2007-07-08 13:22:53 +0200 (So, 08 Jul 2007) | 4 lines


Fixed #4798-- Made sure that function keyword arguments are strings (for the
keywords themselves) when using Unicode URL patterns.

........

r5638 | gwilson | 2007-07-10 04:34:42 +0200 (Di, 10 Jul 2007) | 2 lines


Fixed #4817 -- Removed leading forward slashes from some urlconf examples in the documentation.

........

r5639 | gwilson | 2007-07-10 04:45:11 +0200 (Di, 10 Jul 2007) | 2 lines


Fixed #4814 -- Fixed some whitespace issues in tutorial01, thanks John Shaffer.

........

r5640 | gwilson | 2007-07-10 05:26:26 +0200 (Di, 10 Jul 2007) | 2 lines


Fixed #4812 -- Fixed an octal escape in regular expression that is used in the isValidEmail validator, thanks batchman@….

........

r5641 | mtredinnick | 2007-07-10 14:02:06 +0200 (Di, 10 Jul 2007) | 3 lines


Fixed #4823 -- Fixed a Python 2.3 incompatibility from [5636] (it was even
demonstrated by existing tests, so I really screwed this up).

........

r5642 | mtredinnick | 2007-07-10 14:03:36 +0200 (Di, 10 Jul 2007) | 3 lines


Fixed #4804 -- Fixed a problem when validating choice lists with non-ASCII
data. Thanks, django@….

........

r5643 | mtredinnick | 2007-07-10 14:33:55 +0200 (Di, 10 Jul 2007) | 4 lines


Fixed #3760 -- Added the ability to manually set feed- and item-level id
elements in Atom feeds. This is fully backwards compatible. Based on a patch
from spark343@….

........

r5644 | mtredinnick | 2007-07-11 08:55:12 +0200 (Mi, 11 Jul 2007) | 3 lines


Fixed #4815 -- Fixed decoding of request parameters when the input encoding is
not UTF-8. Thanks, Jordan Dimov.

........

r5645 | mtredinnick | 2007-07-11 09:00:27 +0200 (Mi, 11 Jul 2007) | 3 lines


Fixed #4802 -- Updated French translation. Combined contribution from
baptiste.goupil@… and rocherl@….

........

r5646 | mtredinnick | 2007-07-11 09:12:50 +0200 (Mi, 11 Jul 2007) | 2 lines


Fixed #4753 -- Small update to Spanish translation from Mario Gonzalez.

........

r5649 | jacob | 2007-07-12 02:33:44 +0200 (Do, 12 Jul 2007) | 1 line


Fixed #4615: corrected reverse URL resolution examples in tutorial 4. Thanks for the patch, simeonf.

........

r5650 | adrian | 2007-07-12 06:43:29 +0200 (Do, 12 Jul 2007) | 1 line


Added 'New in Django development version' note to docs/syndication_feeds.txt changes from [5643]

........

r5651 | adrian | 2007-07-12 06:44:45 +0200 (Do, 12 Jul 2007) | 1 line


Edited changes to docs/tutorial04.txt from [5649]

........

r5652 | adrian | 2007-07-12 07:23:47 +0200 (Do, 12 Jul 2007) | 1 line


Added helpful error message to SiteManager.get_current() if the user hasn't set SITE_ID

........

r5653 | adrian | 2007-07-12 07:28:04 +0200 (Do, 12 Jul 2007) | 1 line


Added RequestSite class to sites framework

........

r5654 | adrian | 2007-07-12 07:29:32 +0200 (Do, 12 Jul 2007) | 1 line


Improved syndication feed framework to use RequestSite if the sites framework is not installed -- i.e., the sites framework is no longer required to use the syndication feed framework. This is backwards incompatible if anybody has subclassed Feed and overridden init(), because the second parameter is now expected to be an HttpRequest object instead of request.path

........

r5658 | russellm | 2007-07-12 09:45:35 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4459 -- Added 'raw' argument to save method, to override any pre-save processing, and modified serializers to use a raw-save. This enables serialization of DateFields with auto_now/auto_now_add. Also modified serializers to invoke save() directly on the model baseclass, to avoid any (potentially order-dependent, data modifying) behavior in a custom save() method.

........

r5659 | russellm | 2007-07-12 13:24:16 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #3770 -- Remove null=True tag from OneToOne serialization test. OneToOne fields can't have a value of null.

........

r5660 | russellm | 2007-07-12 13:27:38 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #3768 -- Disabled NullBooleanField PK serialization test. We can't and don't test null PK values.

........

r5662 | russellm | 2007-07-12 14:33:24 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4837 -- Updated Debian packaging details. Thanks for the suggestion, Yasushi Masuda <whosaysni@…>.

........

r5663 | russellm | 2007-07-12 14:44:05 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4808 -- Added Chilean regions in localflavor. Thanks, Marijn Vriens <marijn@…>.

........

r5664 | russellm | 2007-07-12 14:48:27 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4745 -- Updated docs to point out that 0 is not a valid SITE_ID when running the tests. Thanks for the suggestion, Lars Stavholm <stava@…>.

........

r5665 | russellm | 2007-07-12 14:50:02 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4763 -- Minor typo in cache documentations. Thanks, dan@….

........

r5666 | russellm | 2007-07-12 14:55:28 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4627 -- Added details on MacPorts packaging of Django. Thanks, Paul Bissex.

........

r5667 | russellm | 2007-07-12 15:23:11 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4640 -- Fixed import to stringfilter in docs. Proposed solution to move stringfilter into django.template.init introduces a circular import problem.

........

r5668 | russellm | 2007-07-12 15:32:00 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4722 -- Clarified discussion about PYTHONPATH in modpython docs. Thanks for the suggestion, Collin Grady <cgrady@…>.

........

r5669 | russellm | 2007-07-12 15:37:59 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4755 -- Modified newforms MultipleChoiceField to use list comprehension, rather than iteration.

........

r5670 | russellm | 2007-07-12 15:41:27 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4764 -- Added reference to Locale middleware in middleware docs. Thanks, dan@….

........

r5671 | russellm | 2007-07-12 15:55:19 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4768 -- Converted timesince and dateformat to use explicit floor division (pre-emptive avoidance of Python 3000 compatibility problem), and removed a redundant millisecond check. Thanks, John Shaffer <jshaffer2112@…>.

........

r5672 | russellm | 2007-07-12 16:00:13 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4775 -- Added some missing Hungarian accents to the urlify.js LATIN_MAP. Thanks, Pistahh <szekeres@…>.

........

r5673 | russellm | 2007-07-12 16:05:16 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4502 -- Clarified reference to view in tutorial. Thanks for the suggestion, Carl Karsten <carl@…>.

........

r5674 | russellm | 2007-07-12 16:11:41 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4522 -- Clarified the allowed filter arguments on the time and date filters. Thanks for the suggestion, admackin@….

........

r5675 | russellm | 2007-07-12 16:21:51 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4525 -- Fixed mistaken documentation on arguments to runfcgi. Thanks, Johan Bergstrom <bugs@…>.

........

r5676 | russellm | 2007-07-12 16:41:32 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4538 -- Split the installation instructions to differentiate between installing a distribution package and installing an official release. Thanks to Carl Karsten for the idea, and Paul Bissex for the patch.

........

r5677 | russellm | 2007-07-12 17:26:37 +0200 (Do, 12 Jul 2007) | 2 lines


Fixed #4526 -- Modified the test Client login method to fail when a user is inactive. Thanks, marcin@….

........

r5678 | russellm | 2007-07-13 07:03:33 +0200 (Fr, 13 Jul 2007) | 2 lines


Fixed #3505 -- Added handling for the error raised when the user forgets the comma in a single element tuple when defining AUTHENTICATION_BACKENDS. Thanks for the help identifying this problem, Mario Gonzalez <gonzalemario@…>.

........

r5679 | mtredinnick | 2007-07-13 10:52:07 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #2591 -- Fixed a problem with inspectdb with psycopg2 (only). Patch from
Gary Wilson.

........

r5680 | mtredinnick | 2007-07-13 11:09:59 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4807 -- Fixed a couple of corner cases in decimal form input validation.
Based on a suggestion from Chriss Moffit.

........

r5681 | mtredinnick | 2007-07-13 11:14:51 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4839 -- Added repr methods to URL classes that show the pattern they
contain. Thanks, Thomas Güttler.

........

r5682 | mtredinnick | 2007-07-13 12:56:30 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4842 -- Added slightly more robust error reporting. Thanks, Thomas
Güttler.

........

r5683 | mtredinnick | 2007-07-13 13:05:01 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4846 -- Fixed some Python 2.3 encoding problems in the admin interface.
Based on a patch from daybreaker12@….

........

r5684 | mtredinnick | 2007-07-13 14:03:20 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4861 -- Removed some duplicated logic from the newforms RegexField by
making it a subclass of CharField. Thanks, Collin Grady.

........

r5685 | mtredinnick | 2007-07-13 15:15:35 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4865 -- Replaced a stray generator comprehension with a list
comprehension so that we don't break Python 2.3.

........

r5686 | mtredinnick | 2007-07-13 16:13:35 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4469 -- Added slightly more informative error messages to max- and
min-length newform validation. Based on a patch from A. Murat Eren.

........

r5687 | mtredinnick | 2007-07-13 16:14:47 +0200 (Fr, 13 Jul 2007) | 2 lines


Added author credit for [5686]. Refs #4469.

........

r5688 | mtredinnick | 2007-07-13 16:33:46 +0200 (Fr, 13 Jul 2007) | 3 lines


Fixed #4484 -- Fixed APPEND_SLASH handling to handle an empty path value.
Thanks, VesselinK.

........

r5689 | mtredinnick | 2007-07-13 16:40:39 +0200 (Fr, 13 Jul 2007) | 2 lines


Fixed #4556 -- Stylistic changes to [5500]. Thanks, glin@….

........

r5690 | gwilson | 2007-07-13 22:36:01 +0200 (Fr, 13 Jul 2007) | 2 lines


Refs #2591 -- Removed int conversion and try/except since the value in the single-item list is already an int. I overlooked this in my original patch, which was applied in [5679].

........

r5691 | adrian | 2007-07-13 23:20:07 +0200 (Fr, 13 Jul 2007) | 1 line


Documented the 'commit' argument to save() methods on forms created via form_for_model() or form_for_instance()

........

r5692 | mtredinnick | 2007-07-14 07:27:22 +0200 (Sa, 14 Jul 2007) | 3 lines


Fixed #4869 -- Added a note that syncdb does not alter existing tables. Thanks,
James Bennett.

........

r5693 | mtredinnick | 2007-07-14 14:48:24 +0200 (Sa, 14 Jul 2007) | 3 lines


Fixed #4863 -- Removed comment references to a no-longer present link. Pointed
out by Thomas Güttler.

........

r5694 | mtredinnick | 2007-07-14 15:14:28 +0200 (Sa, 14 Jul 2007) | 2 lines


Fixed #4862 -- Fixed invalid Javascript creation in popup windows in admin.

........

r5695 | mtredinnick | 2007-07-14 15:39:41 +0200 (Sa, 14 Jul 2007) | 2 lines


Fixed a problem with translatable strings from [5686].

........

r5696 | mtredinnick | 2007-07-14 16:47:14 +0200 (Sa, 14 Jul 2007) | 3 lines


Fixed #4731 -- Changed management.setup_environ() so that it no longer assumes
the settings module is called "settings". Patch from SmileyChris.

........

r5697 | mtredinnick | 2007-07-14 16:50:35 +0200 (Sa, 14 Jul 2007) | 3 lines


Fixed #4870 -- Removed unneeded import and fixed a docstring in an example.
Thanks, Collin Grady.

........

r5698 | adrian | 2007-07-14 18:58:54 +0200 (Sa, 14 Jul 2007) | 1 line


Edited docs/db-api.txt changes from [5658]

........

r5699 | adrian | 2007-07-14 19:04:30 +0200 (Sa, 14 Jul 2007) | 1 line


Negligible capitalization fix in test/client.py docstring

........

r5700 | russellm | 2007-07-15 06:41:59 +0200 (So, 15 Jul 2007) | 2 lines


Clarified the documentation on the steps that happen during a save, and how raw save affects those steps.

........

Note: See TracTickets for help on using tickets.
Back to Top