#16878 closed New feature (fixed)
Improve the "intword" filter
Reported by: | Rohan Jain | Owned by: | Rohan Jain |
---|---|---|---|
Component: | contrib.humanize | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | yes | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
The current code for intword function is repetitive. For each name,
same code block is re-written.
The patch from this pull request,
makes the code more DRY and also allowes the use words which
represent much larger numbers then trillion. In the code, if the lots of
zeroes don't look too good, they can be replaced to exponent form and used
as 10**n
. I used zeroes thinking that might save some compution. The
test in other commit can be neglected if not needed.
Attachments (1)
Change History (15)
comment:1 by , 13 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
Version: | 1.3 → SVN |
comment:2 by , 13 years ago
I understand this is not of much importance. I was going through the code and found this so wrote a quick patch.
I have fixed the code to pass the translation regression tests and updated the tests to cover some new larger numbers. Also to make it look cleaner updated the code so that in the conversions list the exponent to 10 is specified instead of numbers with large number of zeroes.
comment:3 by , 13 years ago
Needs documentation: | set |
---|---|
Needs tests: | set |
The proposed patch (on Github) is incorrect in a number of ways:
- even if they are very big numbers, they still have a singular and a plural in some languages (e.g, "Septillion" and "Septillionen" in German)
- it doesn't actually mark the number words for translation since you using string concatenation to create the string to passed to
ungettext
- this needs more thorough tests with another (non-English) language
comment:4 by , 13 years ago
Type: | Cleanup/optimization → New feature |
---|
comment:5 by , 13 years ago
Updated the code to user lamda functions for each large number from the list.
Commit (on gihub): [d81c5566e](https://github.com/crodjer/django/commit/d81c5566e28dba8330798bed7f8f4ce83001c738)
comment:6 by , 13 years ago
The patch still needs improvement:
- The
ungettext
call shouldn't only contain the number word (e.g.ungettext('billion', 'billion', n)
) but also a placeholder for the value (e.g.ungettext('%(value)s billion', '%(value)s billion', n)
. The reason is simple: there are languages which don't follow the same order of words of "<value> <numberword>". That's why It should be left to the translator to decide it.
- You don't have to update all the po files manually, updating the base translation file (the English po file) is enough since that's what the Django project on transifex.net will pickup automatically.
follow-up: 8 comment:7 by , 13 years ago
To take care of the "<value> <numberword>" issue exactly, I have kept another two ungettext calls '%(value)s %(suffix)s', '%(value)s %(suffix)s' (lines 98,99) which generate a translation for specifying that in the translation files. So apart from the definations of large number words, the format can also be specified with it.
In the commit, I have manually edited only one language translation file (de) for use in the i18n regressions tests. Rest are all autogenerated.
follow-up: 9 comment:8 by , 13 years ago
Replying to crodjer:
To take care of the "<value> <numberword>" issue exactly, I have kept another two ungettext calls '%(value)s %(suffix)s', '%(value)s %(suffix)s' (lines 98,99) which generate a translation for specifying that in the translation files. So apart from the definations of large number words, the format can also be specified with it.
That's not enough, the string needs to be one and only one string to be translated, not something that consists of two separate strings.
In the commit, I have manually edited only one language translation file (de) for use in the i18n regressions tests. Rest are all autogenerated.
Right, please revert the changes done to the other po files. You can prevent updating all those files by passing the specific language you want makemessages to update with the -l
option.
comment:9 by , 13 years ago
Replying to jezdez:
That's not enough, the string needs to be one and only one string to be translated, not something that consists of two separate strings.
In the commit, I have manually edited only one language translation file (de) for use in the i18n regressions tests. Rest are all autogenerated.
Right, please revert the changes done to the other po files. You can prevent updating all those files by passing the specific language you want makemessages to update with the
-l
option.
Updated the code according to your suggestions. Now no changes in existing translations would be needed, just additions maybe for largeer number strings.
comment:10 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
In [16897]:
(The changeset message doesn't reference this ticket)
comment:11 by , 13 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
The humanize
tests are failing, most likely because intword_converters.items()
returns the items in the wrong order. The attached patch makes used of SortedDict
to enforce a deterministic order.
by , 13 years ago
Attachment: | 16878.diff added |
---|
comment:12 by , 13 years ago
Resolution: | → worksforme |
---|---|
Status: | reopened → closed |
I don't see how this patch would fix the error we see on the buildbout: http://ci.djangoproject.com/job/Django/296/database=sqlite3,python=python2.6/console
I think this is related to stale files on the buildbot due to moving the test directory, not the order. At least I can't reproduce a situation in which the order (as tried to be fixed by @julienphalip's patch) would be of importance to the intword filter.
comment:13 by , 13 years ago
Resolution: | worksforme |
---|---|
Status: | closed → reopened |
So yeah, after having tested this on Python 2.5 I see the error after all:
Creating test database for alias 'default'... Creating test database for alias 'other'... ..F.F..... ====================================================================== FAIL: test_i18n_intword (django.contrib.humanize.tests.HumanizeTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/jezdez/Code/git/django/django/contrib/humanize/tests.py", line 82, in test_i18n_intword self.humanize_tester(test_list, result_list, 'intword') File "/Users/jezdez/Code/git/django/django/contrib/humanize/tests.py", line 22, in humanize_tester msg="%s test failed, produced '%s', should've produced '%s'" % (method, rendered, result)) AssertionError: u'0,0 decillion' != u'1,0 Million' : intword test failed, produced '0,0 decillion', should've produced '1,0 Million' ====================================================================== FAIL: test_intword (django.contrib.humanize.tests.HumanizeTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/jezdez/Code/git/django/django/contrib/humanize/tests.py", line 64, in test_intword self.humanize_tester(test_list, result_list, 'intword') File "/Users/jezdez/Code/git/django/django/contrib/humanize/tests.py", line 22, in humanize_tester msg="%s test failed, produced '%s', should've produced '%s'" % (method, rendered, result)) AssertionError: u'0.0 decillion' != u'1.0 million' : intword test failed, produced '0.0 decillion', should've produced '1.0 million' ---------------------------------------------------------------------- Ran 10 tests in 0.043s FAILED (failures=2)
Re-re-opening :)
comment:14 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Fixed in r16900, thanks! :)
Accepting, though I think this is pretty low priority.
There should be tests added to cover the newly-supported bigger numbers, but they should go in
regressiontests/humanize/tests.py
with the existing intword tests.