#7806 closed Cleanup/optimization (fixed)
django.template refactoring
| Reported by: | Johannes Dollinger | Owned by: | Aymeric Augustin |
|---|---|---|---|
| Component: | Template system | Version: | dev |
| Severity: | Normal | Keywords: | tplrf |
| Cc: | Carl Meyer, research@…, Alexander Koshelev | Triage Stage: | Accepted |
| Has patch: | yes | Needs documentation: | yes |
| Needs tests: | yes | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description
Why should django.template be refactored?
Filter and Variable parsing is inconsistent. Many ad hoc parsers in defaulttags are fragile.
Whitespace handling is ungraceful.
The patch provided splits __init__.py into separate modules and introduces a TokenStream
class that allows parsing of literals, vars (called lookups) and filter expressions. No docs yet,
have a look at the source (it all happens in ~150 loc).
The patch only touches django.template and django.templatetags.
Tests are separate so you can run them with the old code.
Modules in django.template
context(no dependencies)
Untouched.
expressions(no dependencies)
ContainsLookup,Literal, andFilterExpression. All are subclasses of a marker classExpression. These replaceVariableand the oldFilterExpressionclass. AVariableequivalent is provided incompat.Variable.
nodes(depends on expressions)
ContainsNode,NodeList,TextNode, andExpressionNode. The latter is the oldVariableNoderenamed. AVariableNodealias is provided incompat.
compiler(depends on context, expressions, nodes)
ContainsParser,Lexer,Token,TemplateSyntaxError,Origin,StringOrgin, andcompile_string(). Those are mostly untouched. Additionally there areTokenStream, andTokenSyntaxError. Those provide a safe way to parseTokenscontaining expressions.
library(depends on compiler, nodes, context)
ContainsInvalidTemplateLibrary,Library,get_library(), andadd_to_builtins(). Mostly untouched. Should be calledlibrariesfor consistency, but that would clash with the legacy dict.
loader
Untouched. MovedTemplateDoesNotExisthere.
utils
ConditionalNode, base class forIfNode,IfEqualNode.EmptyNode. Helper functions:parse_conditional_nodelists(parser, name),parse_as(bits)parse_args_and_kwargs(bits),resolve_args_and_kwargs(args, kwargs)
compat
Provides backwards compatibility forVariable,VariableNode,TokenParser, andresolve_variable()
defaulttags,loader_tags
Mostly refactored to useTokenStreamwhere appropriate.
Tickets
- #4746 - allow whitespace before and after filter separator
fixed. tests are there (filter-syntax03, filter-syntax04) but expect TemplateSyntaxError.
- #5270 - empty string literals
fixed. tests included.
- #5756 - accept filters (almost) everywhere
fixed. a few tests included.
- #5971 - token parser bug
TokenParserwill be deprecated.
- #6271 - filter arguments with spaces
fixed. test included. (seems to be fixed in trunk already)
- #6510 -
get_nodes_by_type()problem
probably fixed.
- #6535 - negative numeric literals
fixed. tests included.
- #7295 - quotes, escaping and translation of string literals handled inconsistently in templates
fixed.
Performance
Comparing runtests.py templates measures, the refactored version is a bit slower.
Todo
{% if not %} is currently valid. The refactored implementation is greedy and reports a syntax error.
Could be fixed, not sure it's worth it. Broken tests are: if-tag-not02, if-tag-not03. DDN.
Numeric literals ending with "." are now considered a syntax error.
Broken test: ifequal-numeric07. DDN.
{% url %} currently accepts an arbitrary unquoted unicode string for the view name.
The refactored implementation only accepts bare python identifiers or quoted string literals.
Broken test: url05. DDN.
TemplateSyntaxError messages need some work. Eventually shortcut methods on TokenStream.
Find better names for TokenSyntaxError and TokenStream.
Provide a decorator that handles the boilerplate bits=parser.token_stream(token)
and bits.assert_consumed() calls.
Investigate performance impact.
Docs and more tests.
Attachments (5)
Change History (29)
by , 17 years ago
| Attachment: | tplrf.diff added |
|---|
by , 17 years ago
| Attachment: | tplrf-tests.diff added |
|---|
comment:1 by , 17 years ago
| Has patch: | set |
|---|---|
| Needs documentation: | set |
| Needs tests: | set |
| Patch needs improvement: | set |
by , 17 years ago
| Attachment: | tplrf.2.diff added |
|---|
comment:2 by , 17 years ago
- made decorators 2.3 compatible
- added a (clusily named) decorator for parser functions using
TokenStream:token_stream_parsed. - moved some common syntax error handling to
TokenStream. - converted some remaining
compile_filter()calls
Notes: {% cycle %} is still missing. The docstring/code ratio in defaultags starts to get ugly. A DebugTokenStream should capture the exact token positions for nicer error messages.
comment:3 by , 17 years ago
| milestone: | → post-1.0 |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
Accepted as post-1.0 following discussion on #7799.
comment:5 by , 17 years ago
| Cc: | added |
|---|
comment:6 by , 17 years ago
| Cc: | added |
|---|
comment:7 by , 17 years ago
This is all incredibly impressive, emulbreh.
I've got a tag token parser which has a different tact - you're more than welcome to a look. Send me an email if you're interested - [me]@gmail.com
comment:9 by , 17 years ago
This looks really good! I've got a few questions:
- Have you done any benchmarking? I'd be interested in any speed implications.
- The
@uses_token_streamdecorator seems a bit clumsy (as you note), but I see why you did it. Any thoughts on a more elegant way of getting the new functionality in in a backwards-compatible way? What about makingToken.__iter__return aTokenStream? That'd allow a pretty naturalfor x in token:approach in custom tags.
- There's no documentation of the new
TokenStreamclass, either as docs or docstrings. This needs to happen, if only so I can figure out what the hellpop_lexmmeans.
I'm sure I'll have more questions later, but for now that'll do. Thanks!
comment:10 by , 17 years ago
I haven't done any real benchmarking. Comparing time spent in the "templates" test, it's about 1.18 times slower.
There's a tiny bit of documentation in this thread.
Hopefully enough to dicuss the API. I'll add docstrings in the next patch.
Token.__iter__ would seldomly be useful as you'd typically use the higher level parser functions in custom tags. Perhaps TokenStream should be named TokenParser.
Simon Willison proposed @register.tag(token_stream=True).
comment:11 by , 17 years ago
| Cc: | added; removed |
|---|
comment:13 by , 17 years ago
| Cc: | added |
|---|
by , 16 years ago
| Attachment: | 7806.tplrf-r11593.diff added |
|---|
comment:15 by , 16 years ago
| Resolution: | → wontfix |
|---|---|
| Status: | new → closed |
For the sake of completeness: The current (outdated) version of my git branch.
I'm also closing this ticket as wontfix as I will do no further work on it.
comment:16 by , 16 years ago
| Resolution: | wontfix |
|---|---|
| Status: | closed → reopened |
I think it must be completed anyway.
comment:17 by , 15 years ago
| Severity: | → Normal |
|---|---|
| Type: | → Cleanup/optimization |
comment:18 by , 14 years ago
| Easy pickings: | unset |
|---|---|
| Owner: | changed from to |
| Status: | reopened → new |
| UI/UX: | unset |
follow-up: 20 comment:19 by , 13 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
All tickets mentioned in the summary have been fixed.
Last week-end, during the sprint, Baptiste Mispelon wrote patches to use Token.split_contents() consistently — that was my goal when assigning this ticket to myself.
I think most of the goals of this ticket have been achieved — well, it's difficult to tell for sure because the patch is large and has been out of date for a few years.
If I've missed something, please re-open a new ticket describing remaining problems.
comment:20 by , 13 years ago
My original motivation for this refactoring was the poor API for custom tag parsers. As a side effect, the new parser API fixed a bunch of open bugs with builtin tags.
While the primary goal for this ticket hasn't been achieved, I agree that it should be closed. A better API for custom tags should have its own ticket (if it doesn't already).
follow-up: 22 comment:21 by , 13 years ago
Isn't @register.simple_tag the "better API" for custom tags?
comment:22 by , 13 years ago
Almost. But I believe that the API should be powerful enough to parse all builtin tags.
comment:23 by , 13 years ago
Or maybe the syntax of the builtin tags should be normalized so they can all be parsed with the generic API ;)
comment:24 by , 12 years ago
@aaugustin and @emulbreh: Could some of you have a look at ticket? #20434
I finished the refactoring of a better API of custom tags. See patch2 in that ticket about how I used the "normalized syntax" to rewrite all of the builtin tags.
I'm going to add a tplrf-fixed keyword to #4746, #5270, #5756, #5862, #6271, #6296, #6510, #6535, and #7295 - unless someone tells me that's a bad idea.