Opened 18 years ago
Closed 18 years ago
#4164 closed (fixed)
Inaccurate tags recognition
Reported by: | Owned by: | anonymous | |
---|---|---|---|
Component: | Template system | Version: | dev |
Severity: | Keywords: | ||
Cc: | tom@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | UI/UX: |
Description
Template lexer's check for tags is inaccurate - it checks only the beginning of the tag, not the ending and / or content.
Sample template:
{{ "some text" #}
This will result:
some text
But it should produce:
{{ "some text" #}
Attachments (1)
Change History (8)
by , 18 years ago
Attachment: | template_lexer.diff added |
---|
comment:1 by , 18 years ago
Cc: | added |
---|
comment:2 by , 18 years ago
Has patch: | unset |
---|---|
Triage Stage: | Unreviewed → Accepted |
There is an interesting problem going on here, but it's a little bit bigger than this patch. The patch only works if the broken template is the whole "token_string" contents. Having any extra trailing characters would land you right back in the initial problem. Removing the "has patch" designator for that reason.
The bug itself is only triggered if the very first characters of a string of characters that do not contain a real tag match could be the beginning of a match. Again, putting any random character at the start of the string hides the problem. There is also a problem with typos like
{{ foo #} {{ blah }}
That will try to resolve a variable called "foo #} {{ blah", rather than just spitting out the first bit unchanged and then treating the {{ blah }}
portion as a well-formed tag.
I'm working on a slight change to the matching expression that should fix all of these problems without any real performance impact.
Thomas: it's a bit of a coin toss whether this should be treated as a syntax error or not. One problem with doing so is that it would make passing the (somewhat improbable) string "}}" as an argument to a tag problematic. At the moment we display any malformed input like this literally in the resulting template, so it's easy to pick up by eye. I'm not completely unhappy with that behaviour.
comment:3 by , 18 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
@Thomas Steinacher:
No. This template will be processed:
- Splitted by tag_re regular expression (result: ['{{ "some text" #}',]
- Parsed by Lexer.tokenize (result: {{ "some text")
- Parsed by Lexer.create_token
Code for step 3:
if token_string.startswith(VARIABLE_TAG_START): token = Token(TOKEN_VAR, token_string[len(VARIABLE_TAG_START):-len(VARIABLE_TAG_END)].strip()) elif token_string.startswith(BLOCK_TAG_START): token = Token(TOKEN_BLOCK, token_string[len(BLOCK_TAG_START):-len(BLOCK_TAG_END)].strip()) elif token_string.startswith(COMMENT_TAG_START): token = Token(TOKEN_COMMENT, '') else: token = Token(TOKEN_TEXT, token_string)
As you see - only the beginning is checked, and tag's end is cut (any 2 chars).
Provided example is edge case, but this bug can cause very strange errors. I can provide move examples if needed.
comment:4 by , 18 years ago
@mtredinnick:
Yes, in your example "foo #} {{ blah" is tried to resolve as a variable, but the code related to this resilving should handle it and generate an error.
comment:5 by , 18 years ago
Yes, it does handle that case. It would be nice to less greedy, though. However, may not be possible. Fixing the initial problem is the real goal.
comment:6 by , 18 years ago
Before making a patch I thought about better solution, but its implementation takes a lot of code change (update lexer, tags evaluating, etc.) that can cause new bugs, so I made a simple patch that covers the initial defect.
Anyway, the better solution is better ;-)
comment:7 by , 18 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Shouldn't your sample give a syntax error, as you're mixing {{ and #}?