Opened 9 years ago

Closed 5 years ago

Last modified 5 years ago

#5025 closed New feature (fixed)

Add a "truncate" template filter

Reported by: Chris Beaven Owned by: nobody
Component: Template system Version: master
Severity: Normal Keywords:
Cc: semente@…, paulschreiber@…, Gabriel Hurley, mikexstudios, floguy@…, timdumol Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

A truncate filter seems to tie in nicely with the truncatewords and truncatewords_html filters (and it's something I needed recently).

The patch adds a truncate text util which is shared for the new truncate and the existing urlizetrunc filter.

Attachments (5)

truncate.patch (4.3 KB) - added by Chris Beaven 9 years ago.
updated-truncatechars-5075.diff (4.9 KB) - added by floguy 6 years ago.
Updated truncatechars patch
5025.diff (6.7 KB) - added by Chris Beaven 6 years ago.
5025.2.diff (17.9 KB) - added by Chris Beaven 6 years ago.
5025.3.diff (755 bytes) - added by timdumol 5 years ago.
To be applied on top of 5025.2.diff. Fixes the issues mentioned below.

Download all attachments as: .zip

Change History (49)

Changed 9 years ago by Chris Beaven

Attachment: truncate.patch added

comment:1 Changed 9 years ago by Adrian Holovaty

Summary: truncate filterAdd a "truncate" template filter

comment:2 Changed 9 years ago by Chris Beaven

Triage Stage: UnreviewedDesign decision needed

I'll push to DD since it's a complete patch, but inclusion needs approval.

comment:3 Changed 9 years ago by Jacob

Resolution: fixed
Status: newclosed
Triage Stage: Design decision neededAccepted

comment:4 Changed 9 years ago by Chris Beaven

Huh? Fixed where?

comment:5 Changed 9 years ago by Chris Beaven

Resolution: fixed
Status: closedreopened

comment:6 Changed 8 years ago by Renato Alves

This patch is more than two years old now, was this added at all?

comment:8 Changed 8 years ago by Chris Beaven

Triage Stage: AcceptedDesign decision needed

But you're right, it has sat in Accepted for with a patch for long enough - it needs a design decision.

comment:9 Changed 8 years ago by Renato Alves

I tried slice, but it doesn't do exactly the same thing. The ... in the end give it a quite different meaning.
At the moment I just implemented this filter as a custom filter on my application.
I couldn't find any other built-in feature that would do the same task.
I will be glad to drop the custom filter if this patch gets in though.

Thanks for the reply.

comment:10 Changed 7 years ago by EmilStenstrom

There are more differences between slice and truncate: slice doesn't work on things that are not strings, while truncate do. So trying to do {{ article|slice:":20" }} (hoping to limit the unicode_ representation of the article), will not work. {{ article|truncate:20 }} does work like intended.

The patch works fine and fills a real need. I'm voting +1 on adding this.

comment:11 Changed 7 years ago by Chris Beaven

PS: Bringing it up on the django dev list makes more sense than adding a +1 here.

comment:12 Changed 7 years ago by Guilherme Gondim <semente@…>

Cc: semente@… added

I made a similar and simple filter, but your patch is better to Django.

For reference: http://www.djangosnippets.org/snippets/1715/.

comment:13 Changed 7 years ago by Paul Schreiber

The proposal in #12200 uses a unicode ellipsis and provides for head, tail and middle truncation. It also avoids breaking in the middle of words.

comment:14 Changed 7 years ago by Paul Schreiber

Cc: paulschreiber@… added

comment:15 Changed 7 years ago by Gabriel Hurley

Cc: Gabriel Hurley added

comment:16 Changed 6 years ago by mikexstudios

Cc: mikexstudios added

comment:17 Changed 6 years ago by Luke Plant

One problem with this filter is how to correctly handle unicode, where grapheme != code point. Perhaps some library can do it, I don't think that Python bundles such a library. I don't think this is as much of a problem with truncatewords because you rarely have a combining character preceding a space (is that ever valid?).

comment:18 Changed 6 years ago by Malcolm Tredinnick

Triage Stage: Design decision neededAccepted

Design decision (based on in-person discussion between some maintainers and "some" mailing list dicussion): we will add such a filter. It will be written like utils.text.truncatewords so that the truncate filter can be replaced with a user's own, but with different ellipsis sequence. The default ellipsis sequence will be marked as translatable. It must operate on unicode strings, not treat them as bytes. In fact, it must use Python's unicodedata library to evalute combining characters correctly w.r.t. length.

comment:19 Changed 6 years ago by Gabriel Grant

Needs documentation: set

marking as needs improvement, as per mtredinnick's comment, above

comment:20 Changed 6 years ago by Gabriel Grant

Needs documentation: unset
Patch needs improvement: set

comment:21 Changed 6 years ago by floguy

Patch needs improvement: unset

comment:22 Changed 6 years ago by Luke Plant

Patch needs improvement: set

This implementation does not correctly deal with the situation where you have combining characters that do not have precomposed forms:

>>> print truncate_chars(u"__B\u030A____", 6) 
__B...                                             
                                                   
>>> print truncate_chars(u"__B\u030A____", 7) 
__B̊...                                             

(Contrast u"__A\u03A____", which is dealt with correctly because there is a precomposed Å character).

This kind of thing does matter - http://gizmodo.com/382026/a-cellphones-missing-dot-kills-two-people-puts-three-more-in-jail

I don't know how easy it is to fix - standing on the sidelines and cheering/booing is more my style today :-)

There are also bugs where length - len_end_text < 0.

Changed 6 years ago by floguy

Updated truncatechars patch

comment:23 Changed 6 years ago by floguy

Patch needs improvement: unset

Thanks for the good catches Luke, I've updated the implementation to handle your two issues, as well as added tests to ensure it doesn't regress.

comment:24 Changed 6 years ago by Chris Beaven

I'm not sure that _(end_text) makes sense: the default ellipsis should just use ugettext_lazy instead.

comment:25 Changed 6 years ago by floguy

Can't do ugettext_lazy because we need to know how long it is.

comment:26 Changed 6 years ago by Chris Beaven

Sure you can. It'll still be translated when it's used.

comment:27 Changed 6 years ago by floguy

Sorry, I'm not sure I follow what's being suggested.

comment:28 Changed 6 years ago by floguy

Cc: floguy@… added

comment:29 Changed 6 years ago by EmilStenstrom

floguy: I understand the suggestion as: "translate "..." directly instead of the variable". Makes sense?

comment:30 Changed 6 years ago by Gabriel Hurley

Severity: Normal
Type: New feature

Changed 6 years ago by Chris Beaven

Attachment: 5025.diff added

comment:31 Changed 6 years ago by Chris Beaven

Easy pickings: unset

The length of the truncated string in floguy's previous patch wasn't taking into account the length of the string when it had combining characters with no precomposed form.

I'm pretty happy with the patch I uploaded - I wonder though that if instead of having ugettext_lazy('...') we should use: ugettext_lazy('%(truncated_text)s...')

comment:32 Changed 6 years ago by Jannis Leidel

Patch needs improvement: set

A few things:

  • Use pgettext_lazy instead of ugettext_lazy to give the rather ambiguous string '...' some context. (e.g. end_text=pgettext_lazy('truncatechars end text', '...'). This will explain the translators better what this is about.
  • The naming of truncate_chars and truncatechars seems weird. How does truncate_chars relate to truncate_words? Can they be combined? Why isn't the truncate filter extended in the first place?
  • Don't use inline imports in the template filter.

comment:33 Changed 6 years ago by Jannis Leidel

Ah, and using pgettext_lazy('%(truncated_text)s...') seems like a good idea, indeed.

comment:34 Changed 6 years ago by Chris Beaven

Thanks for the review, Jannis.

  • do we still need pgettext if it's '%(truncated_text)s...'?
  • the naming seems to reflect the current naming of truncate_words/truncatewords, why is that weird?
  • i'm not sure what you mean by extending the truncate filter - what truncate filter?
  • just about every template filter in defaultfilters has an inline import - should this patch refactor them all or are we setting a new precident?

comment:35 in reply to:  34 Changed 6 years ago by Jannis Leidel

Replying to SmileyChris:

Thanks for the review, Jannis.

  • do we still need pgettext if it's '%(truncated_text)s...'?

Yes, pgettext is still useful in the sense that it gives a hint to the translators where it's going to be used.

  • the naming seems to reflect the current naming of truncate_words/truncatewords, why is that weird?

Adding two new functions with almost the same name for a feature that is almost exactly implemented in a different tag and function doesn't make it an awful naming to you?

  • i'm not sure what you mean by extending the truncate filter - what truncate filter?

Apologies, I meant the truncatewords filter. IMO, both serve the same purpose: truncate a string. Couldn't we combine the utility function?

  • just about every template filter in defaultfilters has an inline import - should this patch refactor them all or are we setting a new precident?

Nope, fixing that is a matter of another ticket, I assume this was once done because of import dependencies, but there is no reason to keep following that path, as far as I can see it.

Changed 6 years ago by Chris Beaven

Attachment: 5025.2.diff added

comment:36 Changed 6 years ago by Chris Beaven

Patch needs improvement: unset

New patch uses pgettext and takes on Jannis' suggestion of using a common utility for truncation of strings (I've called the class Truncator, but Truncatorminator would probably have been cooler :P).

For now, I've left the behavior of the truncate end-text to be backwards compatible for the truncatewords and truncatewords_html filters (that is, hardcoded strings of ' ...'). There's a slight behaviour change otherwise: we're dropping the space between the last truncated word and the ellipsis.

Also, truncatewords_html doesn't really make sense to allow for the translated ellipsis anywhere but the end, so it will always be appended to the end, regardless of where it contains '%(truncated_text)s'

comment:37 Changed 5 years ago by Jannis Leidel

Triage Stage: AcceptedReady for checkin
UI/UX: unset

This looks great!

comment:38 Changed 5 years ago by Jannis Leidel

Resolution: fixed
Status: reopenedclosed

In [16542]:

Fixed #5025 -- Add a "truncatechars" template filter. Many thanks to Chris Beaven.

comment:39 Changed 5 years ago by timdumol

Cc: timdumol added

It seems the contributor forgot to register the new tag. Should I open a new ticket, or reopen this?

Changed 5 years ago by timdumol

Attachment: 5025.3.diff added

To be applied on top of 5025.2.diff. Fixes the issues mentioned below.

comment:40 Changed 5 years ago by timdumol

Also, L257 has an extra argument to Truncator.chars (.chars(value, length) should just be .chars(length)) which causes a ValueError and weird behavior. The patch I just attached fixes both.

comment:41 Changed 5 years ago by Aymeric Augustin

A closed ticket is unlikely to get any attention. Could you please open a new ticket?

comment:42 Changed 5 years ago by timdumol

Done. #16510 addresses the issue.

comment:43 Changed 5 years ago by Haisheng HU

Will this be checked in for 1.4?

comment:44 in reply to:  43 Changed 5 years ago by Claude Paroz

Replying to hanson2010:

Will this be checked in for 1.4?

It is, see https://docs.djangoproject.com/en/1.4/ref/templates/builtins/#truncatechars

Note: See TracTickets for help on using tickets.
Back to Top