Django

Code

Ticket #4164 (closed: fixed)

Opened 2 years ago

Last modified 2 years ago

Inaccurate tags recognition

Reported by: tonnzor <tonn81@gmail.com> Assigned to: anonymous
Milestone: Component: Template system
Version: SVN Keywords:
Cc: tom@eggdrop.ch Triage Stage: Accepted
Has patch: 0 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

Template lexer's check for tags is inaccurate - it checks only the beginning of the tag, not the ending and / or content.

Sample template:

{{ "some text" #}

This will result:

some text

But it should produce:

{{ "some text" #}

Attachments

template_lexer.diff (1.1 kB) - added by tonnzor <tonn81@gmail.com> on 04/26/07 12:41:39.

Change History

04/26/07 12:41:39 changed by tonnzor <tonn81@gmail.com>

  • attachment template_lexer.diff added.

04/27/07 05:41:49 changed by Thomas Steinacher <tom@eggdrop.ch>

  • cc set to tom@eggdrop.ch.
  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

Shouldn't your sample give a syntax error, as you're mixing {{ and #}?

04/27/07 06:24:53 changed by mtredinnick

  • has_patch deleted.
  • stage changed from Unreviewed to Accepted.

There is an interesting problem going on here, but it's a little bit bigger than this patch. The patch only works if the broken template is the whole "token_string" contents. Having any extra trailing characters would land you right back in the initial problem. Removing the "has patch" designator for that reason.

The bug itself is only triggered if the very first characters of a string of characters that do not contain a real tag match could be the beginning of a match. Again, putting any random character at the start of the string hides the problem. There is also a problem with typos like

{{ foo #} {{ blah }}

That will try to resolve a variable called "foo #} {{ blah", rather than just spitting out the first bit unchanged and then treating the {{ blah }} portion as a well-formed tag.

I'm working on a slight change to the matching expression that should fix all of these problems without any real performance impact.

Thomas: it's a bit of a coin toss whether this should be treated as a syntax error or not. One problem with doing so is that it would make passing the (somewhat improbable) string "}}" as an argument to a tag problematic. At the moment we display any malformed input like this literally in the resulting template, so it's easy to pick up by eye. I'm not completely unhappy with that behaviour.

04/27/07 06:29:31 changed by tonnzor <tonn81@gmail.com>

  • owner changed from adrian to anonymous.
  • status changed from new to assigned.

@Thomas Steinacher:

No. This template will be processed:

  1. Splitted by tag_re regular expression (result: ['{{ "some text" #}',]
  2. Parsed by Lexer.tokenize (result: {{ "some text" #}?)
  3. Parsed by Lexer.create_token

Code for step 3:

        if token_string.startswith(VARIABLE_TAG_START):
            token = Token(TOKEN_VAR, token_string[len(VARIABLE_TAG_START):-len(VARIABLE_TAG_END)].strip())
        elif token_string.startswith(BLOCK_TAG_START):
            token = Token(TOKEN_BLOCK, token_string[len(BLOCK_TAG_START):-len(BLOCK_TAG_END)].strip())
        elif token_string.startswith(COMMENT_TAG_START):
            token = Token(TOKEN_COMMENT, '')
        else:
            token = Token(TOKEN_TEXT, token_string)

As you see - only the beginning is checked, and tag's end is cut (any 2 chars).

Provided example is edge case, but this bug can cause very strange errors. I can provide move examples if needed.

04/27/07 06:33:32 changed by tonnzor <tonn81@gmail.com>

@mtredinnick:

Yes, in your example "foo #} {{ blah" is tried to resolve as a variable, but the code related to this resilving should handle it and generate an error.

04/27/07 06:38:50 changed by mtredinnick

Yes, it does handle that case. It would be nice to less greedy, though. However, may not be possible. Fixing the initial problem is the real goal.

04/27/07 06:55:50 changed by tonnzor <tonn81@gmail.com>

Before making a patch I thought about better solution, but its implementation takes a lot of code change (update lexer, tags evaluating, etc.) that can cause new bugs, so I made a simple patch that covers the initial defect.

Anyway, the better solution is better ;-)

04/27/07 07:16:22 changed by mtredinnick

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [5104]) Fixed #4164, #4171 -- Reworked some of the template lexer logic to ensure we don't get caught out by a couple of corner cases.


Add/Change #4164 (Inaccurate tags recognition)




Change Properties
Action