Opened 16 years ago

Closed 11 years ago

Last modified 11 years ago

#7806 closed Cleanup/optimization (fixed)

django.template refactoring

Reported by: Johannes Dollinger Owned by: Aymeric Augustin
Component: Template system Version: dev
Severity: Normal Keywords: tplrf
Cc: Carl Meyer, research@…, Alexander Koshelev Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no


Why should django.template be refactored?

Filter and Variable parsing is inconsistent. Many ad hoc parsers in defaulttags are fragile.
Whitespace handling is ungraceful.

The patch provided splits into separate modules and introduces a TokenStream
class that allows parsing of literals, vars (called lookups) and filter expressions. No docs yet,
have a look at the source (it all happens in ~150 loc).

The patch only touches django.template and django.templatetags.
Tests are separate so you can run them with the old code.

Modules in django.template

  • context (no dependencies)

  • expressions (no dependencies)
    Contains Lookup, Literal, and FilterExpression. All are subclasses of a marker class Expression. These replace Variable and the old FilterExpression class. A Variable equivalent is provided in compat.Variable.

  • nodes (depends on expressions)
    Contains Node, NodeList, TextNode, and ExpressionNode. The latter is the old VariableNode renamed. A VariableNode alias is provided in compat.

  • compiler (depends on context, expressions, nodes)
    Contains Parser, Lexer, Token, TemplateSyntaxError, Origin, StringOrgin, and compile_string(). Those are mostly untouched. Additionally there are TokenStream, and TokenSyntaxError. Those provide a safe way to parse Tokens containing expressions.

  • library (depends on compiler, nodes, context)
    Contains InvalidTemplateLibrary, Library, get_library(), and add_to_builtins(). Mostly untouched. Should be called libraries for consistency, but that would clash with the legacy dict.

  • loader
    Untouched. Moved TemplateDoesNotExist here.

  • utils
    ConditionalNode, base class for IfNode, IfEqualNode. EmptyNode. Helper functions: parse_conditional_nodelists(parser, name), parse_as(bits) parse_args_and_kwargs(bits), resolve_args_and_kwargs(args, kwargs)

  • compat
    Provides backwards compatibility for Variable, VariableNode, TokenParser, and resolve_variable()

  • defaulttags, loader_tags
    Mostly refactored to use TokenStream where appropriate.


  • #4746 - allow whitespace before and after filter separator
    fixed. tests are there (filter-syntax03, filter-syntax04) but expect TemplateSyntaxError.

  • #5270 - empty string literals
    fixed. tests included.

  • #5756 - accept filters (almost) everywhere
    fixed. a few tests included.

  • #5971 - token parser bug
    TokenParser will be deprecated.

  • #6271 - filter arguments with spaces
    fixed. test included. (seems to be fixed in trunk already)

  • #6296 - expressions for ifequal
    fixed. tests included (dup of #5756 ?)

  • #6510 - get_nodes_by_type() problem
    probably fixed.

  • #6535 - negative numeric literals
    fixed. tests included.

  • #7295 - quotes, escaping and translation of string literals handled inconsistently in templates


Comparing templates measures, the refactored version is a bit slower.


{% if not %} is currently valid. The refactored implementation is greedy and reports a syntax error.
Could be fixed, not sure it's worth it. Broken tests are: if-tag-not02, if-tag-not03. DDN.

Numeric literals ending with "." are now considered a syntax error.
Broken test: ifequal-numeric07. DDN.

{% url %} currently accepts an arbitrary unquoted unicode string for the view name.
The refactored implementation only accepts bare python identifiers or quoted string literals.
Broken test: url05. DDN.

TemplateSyntaxError messages need some work. Eventually shortcut methods on TokenStream.

Find better names for TokenSyntaxError and TokenStream.

Provide a decorator that handles the boilerplate bits=parser.token_stream(token)
and bits.assert_consumed() calls.

Investigate performance impact.

Docs and more tests.

Attachments (5)

tplrf.diff (109.6 KB ) - added by Johannes Dollinger 16 years ago.
tplrf-tests.diff (6.3 KB ) - added by Johannes Dollinger 16 years ago.
tplrf.2.diff (116.9 KB ) - added by Johannes Dollinger 16 years ago.
tplrf-r8970.diff (129.2 KB ) - added by Johannes Dollinger 16 years ago.
with tests
7806.tplrf-r11593.diff (129.1 KB ) - added by Johannes Dollinger 14 years ago.

Download all attachments as: .zip

Change History (29)

by Johannes Dollinger, 16 years ago

Attachment: tplrf.diff added

by Johannes Dollinger, 16 years ago

Attachment: tplrf-tests.diff added

comment:1 by Johannes Dollinger, 16 years ago

Has patch: set
Needs documentation: set
Needs tests: set
Patch needs improvement: set

I'm going to add a tplrf-fixed keyword to #4746, #5270, #5756, #5862, #6271, #6296, #6510, #6535, and #7295 - unless someone tells me that's a bad idea.

by Johannes Dollinger, 16 years ago

Attachment: tplrf.2.diff added

comment:2 by Johannes Dollinger, 16 years ago

  • made decorators 2.3 compatible
  • added a (clusily named) decorator for parser functions using TokenStream: token_stream_parsed.
  • moved some common syntax error handling to TokenStream.
  • converted some remaining compile_filter() calls

Notes: {% cycle %} is still missing. The docstring/code ratio in defaultags starts to get ugly. A DebugTokenStream should capture the exact token positions for nicer error messages.

comment:3 by Simon Greenhill, 16 years ago

milestone: post-1.0
Triage Stage: UnreviewedAccepted

Accepted as post-1.0 following discussion on #7799.

comment:4 by Johannes Dollinger, 16 years ago

This would also fix #7377.

by Johannes Dollinger, 16 years ago

Attachment: tplrf-r8970.diff added

with tests

comment:5 by Carl Meyer, 16 years ago

Cc: carl@… added

comment:6 by anonymous, 16 years ago

Cc: research@… added

comment:7 by Chris Beaven, 16 years ago

This is all incredibly impressive, emulbreh.

I've got a tag token parser which has a different tact - you're more than welcome to a look. Send me an email if you're interested - [me]

comment:8 by Johannes Dollinger, 16 years ago

This would also fix #9315.

comment:9 by Jacob, 16 years ago

This looks really good! I've got a few questions:

  • Have you done any benchmarking? I'd be interested in any speed implications.
  • The @uses_token_stream decorator seems a bit clumsy (as you note), but I see why you did it. Any thoughts on a more elegant way of getting the new functionality in in a backwards-compatible way? What about making Token.__iter__ return a TokenStream? That'd allow a pretty natural for x in token: approach in custom tags.
  • There's no documentation of the new TokenStream class, either as docs or docstrings. This needs to happen, if only so I can figure out what the hell pop_lexm means.

I'm sure I'll have more questions later, but for now that'll do. Thanks!

comment:10 by Johannes Dollinger, 16 years ago

I haven't done any real benchmarking. Comparing time spent in the "templates" test, it's about 1.18 times slower.

There's a tiny bit of documentation in this thread.
Hopefully enough to dicuss the API. I'll add docstrings in the next patch.

Token.__iter__ would seldomly be useful as you'd typically use the higher level parser functions in custom tags. Perhaps TokenStream should be named TokenParser.
Simon Willison proposed @register.tag(token_stream=True).

comment:11 by Carl Meyer, 15 years ago

Cc: Carl Meyer added; carl@… removed

comment:12 by (none), 15 years ago

milestone: post-1.0

Milestone post-1.0 deleted

comment:13 by Alexander Koshelev, 15 years ago

Cc: Alexander Koshelev added

comment:14 by Johannes Dollinger, 15 years ago

by Johannes Dollinger, 14 years ago

Attachment: 7806.tplrf-r11593.diff added

comment:15 by Johannes Dollinger, 14 years ago

Resolution: wontfix
Status: newclosed

For the sake of completeness: The current (outdated) version of my git branch.
I'm also closing this ticket as wontfix as I will do no further work on it.

comment:16 by Alexander Koshelev, 14 years ago

Resolution: wontfix
Status: closedreopened

I think it must be completed anyway.

comment:17 by Luke Plant, 13 years ago

Severity: Normal
Type: Cleanup/optimization

comment:18 by Aymeric Augustin, 12 years ago

Easy pickings: unset
Owner: changed from nobody to Aymeric Augustin
Status: reopenednew
UI/UX: unset

comment:19 by Aymeric Augustin, 11 years ago

Resolution: fixed
Status: newclosed

All tickets mentioned in the summary have been fixed.

Last week-end, during the sprint, Baptiste Mispelon wrote patches to use Token.split_contents() consistently — that was my goal when assigning this ticket to myself.

I think most of the goals of this ticket have been achieved — well, it's difficult to tell for sure because the patch is large and has been out of date for a few years.

If I've missed something, please re-open a new ticket describing remaining problems.

in reply to:  19 comment:20 by Johannes Dollinger, 11 years ago

My original motivation for this refactoring was the poor API for custom tag parsers. As a side effect, the new parser API fixed a bunch of open bugs with builtin tags.
While the primary goal for this ticket hasn't been achieved, I agree that it should be closed. A better API for custom tags should have its own ticket (if it doesn't already).

comment:21 by Aymeric Augustin, 11 years ago

Isn't @register.simple_tag the "better API" for custom tags?

in reply to:  21 comment:22 by anonymous, 11 years ago

Almost. But I believe that the API should be powerful enough to parse all builtin tags.

comment:23 by Aymeric Augustin, 11 years ago

Or maybe the syntax of the builtin tags should be normalized so they can all be parsed with the generic API ;)

comment:24 by jonathanslenders, 11 years ago

@aaugustin and @emulbreh: Could some of you have a look at ticket? #20434

I finished the refactoring of a better API of custom tags. See patch2 in that ticket about how I used the "normalized syntax" to rewrite all of the builtin tags.

Note: See TracTickets for help on using tickets.
Back to Top