Opened 15 years ago

Closed 10 years ago

Last modified 10 years ago

#7806 closed Cleanup/optimization (fixed)

django.template refactoring

Reported by: Johannes Dollinger Owned by: Aymeric Augustin
Component: Template system Version: dev
Severity: Normal Keywords: tplrf
Cc: Carl Meyer, research@…, Alexander Koshelev Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no


Why should django.template be refactored?

Filter and Variable parsing is inconsistent. Many ad hoc parsers in defaulttags are fragile.
Whitespace handling is ungraceful.

The patch provided splits into separate modules and introduces a TokenStream
class that allows parsing of literals, vars (called lookups) and filter expressions. No docs yet,
have a look at the source (it all happens in ~150 loc).

The patch only touches django.template and django.templatetags.
Tests are separate so you can run them with the old code.

Modules in django.template

  • context (no dependencies)

  • expressions (no dependencies)
    Contains Lookup, Literal, and FilterExpression. All are subclasses of a marker class Expression. These replace Variable and the old FilterExpression class. A Variable equivalent is provided in compat.Variable.

  • nodes (depends on expressions)
    Contains Node, NodeList, TextNode, and ExpressionNode. The latter is the old VariableNode renamed. A VariableNode alias is provided in compat.

  • compiler (depends on context, expressions, nodes)
    Contains Parser, Lexer, Token, TemplateSyntaxError, Origin, StringOrgin, and compile_string(). Those are mostly untouched. Additionally there are TokenStream, and TokenSyntaxError. Those provide a safe way to parse Tokens containing expressions.

  • library (depends on compiler, nodes, context)
    Contains InvalidTemplateLibrary, Library, get_library(), and add_to_builtins(). Mostly untouched. Should be called libraries for consistency, but that would clash with the legacy dict.

  • loader
    Untouched. Moved TemplateDoesNotExist here.

  • utils
    ConditionalNode, base class for IfNode, IfEqualNode. EmptyNode. Helper functions: parse_conditional_nodelists(parser, name), parse_as(bits) parse_args_and_kwargs(bits), resolve_args_and_kwargs(args, kwargs)

  • compat
    Provides backwards compatibility for Variable, VariableNode, TokenParser, and resolve_variable()

  • defaulttags, loader_tags
    Mostly refactored to use TokenStream where appropriate.


  • #4746 - allow whitespace before and after filter separator
    fixed. tests are there (filter-syntax03, filter-syntax04) but expect TemplateSyntaxError.

  • #5270 - empty string literals
    fixed. tests included.

  • #5756 - accept filters (almost) everywhere
    fixed. a few tests included.

  • #5971 - token parser bug
    TokenParser will be deprecated.

  • #6271 - filter arguments with spaces
    fixed. test included. (seems to be fixed in trunk already)

  • #6296 - expressions for ifequal
    fixed. tests included (dup of #5756 ?)

  • #6510 - get_nodes_by_type() problem
    probably fixed.

  • #6535 - negative numeric literals
    fixed. tests included.

  • #7295 - quotes, escaping and translation of string literals handled inconsistently in templates


Comparing templates measures, the refactored version is a bit slower.


{% if not %} is currently valid. The refactored implementation is greedy and reports a syntax error.
Could be fixed, not sure it's worth it. Broken tests are: if-tag-not02, if-tag-not03. DDN.

Numeric literals ending with "." are now considered a syntax error.
Broken test: ifequal-numeric07. DDN.

{% url %} currently accepts an arbitrary unquoted unicode string for the view name.
The refactored implementation only accepts bare python identifiers or quoted string literals.
Broken test: url05. DDN.

TemplateSyntaxError messages need some work. Eventually shortcut methods on TokenStream.

Find better names for TokenSyntaxError and TokenStream.

Provide a decorator that handles the boilerplate bits=parser.token_stream(token)
and bits.assert_consumed() calls.

Investigate performance impact.

Docs and more tests.

Attachments (5)

tplrf.diff (109.6 KB) - added by Johannes Dollinger 15 years ago.
tplrf-tests.diff (6.3 KB) - added by Johannes Dollinger 15 years ago.
tplrf.2.diff (116.9 KB) - added by Johannes Dollinger 15 years ago.
tplrf-r8970.diff (129.2 KB) - added by Johannes Dollinger 15 years ago.
with tests
7806.tplrf-r11593.diff (129.1 KB) - added by Johannes Dollinger 14 years ago.

Download all attachments as: .zip

Change History (29)

Changed 15 years ago by Johannes Dollinger

Attachment: tplrf.diff added

Changed 15 years ago by Johannes Dollinger

Attachment: tplrf-tests.diff added

comment:1 Changed 15 years ago by Johannes Dollinger

Has patch: set
Needs documentation: set
Needs tests: set
Patch needs improvement: set

I'm going to add a tplrf-fixed keyword to #4746, #5270, #5756, #5862, #6271, #6296, #6510, #6535, and #7295 - unless someone tells me that's a bad idea.

Changed 15 years ago by Johannes Dollinger

Attachment: tplrf.2.diff added

comment:2 Changed 15 years ago by Johannes Dollinger

  • made decorators 2.3 compatible
  • added a (clusily named) decorator for parser functions using TokenStream: token_stream_parsed.
  • moved some common syntax error handling to TokenStream.
  • converted some remaining compile_filter() calls

Notes: {% cycle %} is still missing. The docstring/code ratio in defaultags starts to get ugly. A DebugTokenStream should capture the exact token positions for nicer error messages.

comment:3 Changed 15 years ago by Simon Greenhill

milestone: post-1.0
Triage Stage: UnreviewedAccepted

Accepted as post-1.0 following discussion on #7799.

comment:4 Changed 15 years ago by Johannes Dollinger

This would also fix #7377.

Changed 15 years ago by Johannes Dollinger

Attachment: tplrf-r8970.diff added

with tests

comment:5 Changed 15 years ago by Carl Meyer

Cc: carl@… added

comment:6 Changed 15 years ago by anonymous

Cc: research@… added

comment:7 Changed 15 years ago by Chris Beaven

This is all incredibly impressive, emulbreh.

I've got a tag token parser which has a different tact - you're more than welcome to a look. Send me an email if you're interested - [me]

comment:8 Changed 15 years ago by Johannes Dollinger

This would also fix #9315.

comment:9 Changed 15 years ago by Jacob

This looks really good! I've got a few questions:

  • Have you done any benchmarking? I'd be interested in any speed implications.
  • The @uses_token_stream decorator seems a bit clumsy (as you note), but I see why you did it. Any thoughts on a more elegant way of getting the new functionality in in a backwards-compatible way? What about making Token.__iter__ return a TokenStream? That'd allow a pretty natural for x in token: approach in custom tags.
  • There's no documentation of the new TokenStream class, either as docs or docstrings. This needs to happen, if only so I can figure out what the hell pop_lexm means.

I'm sure I'll have more questions later, but for now that'll do. Thanks!

comment:10 Changed 15 years ago by Johannes Dollinger

I haven't done any real benchmarking. Comparing time spent in the "templates" test, it's about 1.18 times slower.

There's a tiny bit of documentation in this thread.
Hopefully enough to dicuss the API. I'll add docstrings in the next patch.

Token.__iter__ would seldomly be useful as you'd typically use the higher level parser functions in custom tags. Perhaps TokenStream should be named TokenParser.
Simon Willison proposed @register.tag(token_stream=True).

comment:11 Changed 14 years ago by Carl Meyer

Cc: Carl Meyer added; carl@… removed

comment:12 Changed 14 years ago by (none)

milestone: post-1.0

Milestone post-1.0 deleted

comment:13 Changed 14 years ago by Alexander Koshelev

Cc: Alexander Koshelev added

comment:14 Changed 14 years ago by Johannes Dollinger

Changed 14 years ago by Johannes Dollinger

Attachment: 7806.tplrf-r11593.diff added

comment:15 Changed 14 years ago by Johannes Dollinger

Resolution: wontfix
Status: newclosed

For the sake of completeness: The current (outdated) version of my git branch.
I'm also closing this ticket as wontfix as I will do no further work on it.

comment:16 Changed 13 years ago by Alexander Koshelev

Resolution: wontfix
Status: closedreopened

I think it must be completed anyway.

comment:17 Changed 12 years ago by Luke Plant

Severity: Normal
Type: Cleanup/optimization

comment:18 Changed 11 years ago by Aymeric Augustin

Easy pickings: unset
Owner: changed from nobody to Aymeric Augustin
Status: reopenednew
UI/UX: unset

comment:19 Changed 10 years ago by Aymeric Augustin

Resolution: fixed
Status: newclosed

All tickets mentioned in the summary have been fixed.

Last week-end, during the sprint, Baptiste Mispelon wrote patches to use Token.split_contents() consistently — that was my goal when assigning this ticket to myself.

I think most of the goals of this ticket have been achieved — well, it's difficult to tell for sure because the patch is large and has been out of date for a few years.

If I've missed something, please re-open a new ticket describing remaining problems.

comment:20 in reply to:  19 Changed 10 years ago by Johannes Dollinger

My original motivation for this refactoring was the poor API for custom tag parsers. As a side effect, the new parser API fixed a bunch of open bugs with builtin tags.
While the primary goal for this ticket hasn't been achieved, I agree that it should be closed. A better API for custom tags should have its own ticket (if it doesn't already).

comment:21 Changed 10 years ago by Aymeric Augustin

Isn't @register.simple_tag the "better API" for custom tags?

comment:22 in reply to:  21 Changed 10 years ago by anonymous

Almost. But I believe that the API should be powerful enough to parse all builtin tags.

comment:23 Changed 10 years ago by Aymeric Augustin

Or maybe the syntax of the builtin tags should be normalized so they can all be parsed with the generic API ;)

comment:24 Changed 10 years ago by jonathanslenders

@aaugustin and @emulbreh: Could some of you have a look at ticket? #20434

I finished the refactoring of a better API of custom tags. See patch2 in that ticket about how I used the "normalized syntax" to rewrite all of the builtin tags.

Note: See TracTickets for help on using tickets.
Back to Top