#34521 closed Cleanup/optimization (wontfix)

Use __slots__ for template Node classes

Reported by: Adam Johnson Owned by: Adam Johnson
Component: Template system Version: dev
Severity: Normal Keywords:
Cc: Keryn Knight, Carlton Gibson Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Adam Johnson)

Declaring slots reduces the memory footprint of a class, and can lead to performance gains due to less data fetching by the CPU.

#12826 proposed adding slots to many classes but was closed with the proposal to do so on a case-by-case basis.

#33474 added slots to Variable and related classes, showing memory reductions and performance gains.

I propose adding slots to more template classes to further save memory (Template, Token, Lexer, and the Node classes). This change leads to 1-2% improvement in template rendering time, using this benchmark script:

import django
from django.conf import settings
from django.template import Context
from django.template import Template

settings.configure(
    TEMPLATES=[{"BACKEND": "django.template.backends.django.DjangoTemplates"}]
)
django.setup()

template = Template("{% if x %}{{ x }}{% endif %}" * 1_000)
context = Context({"x": "abc"})
for i in range(1_000):
    template.render(context)

And invoking [hyperfine](https://github.com/sharkdp/hyperfine) like so, with Python 3.11.2:

$ hyperfine \
  -N --warmup 1 \
  --prepare 'git switch -d 7d0e566208' \
  'python benchmark.py' \
  --prepare 'git switch ticket_34521_optimize_templates' \
  'python benchmark.py'
Benchmark 1: python benchmark.py
  Time (mean ± σ):      2.028 s ±  0.042 s    [User: 1.990 s, System: 0.035 s]
  Range (min … max):    1.997 s …  2.116 s    10 runs

Benchmark 2: python benchmark.py
  Time (mean ± σ):      1.985 s ±  0.007 s    [User: 1.950 s, System: 0.033 s]
  Range (min … max):    1.976 s …  1.997 s    10 runs

Summary
  'python benchmark.py' ran
    1.02 ± 0.02 times faster than 'python benchmark.py'

(I decided not to use djangobench since it's not well maintained, and doesn't do such great stats as hyperfine.)

Change History (8)

comment:1 by Adam Johnson, 12 months ago

Description: modified (diff)

comment:2 by Adam Johnson, 12 months ago

Description: modified (diff)

comment:3 by Mariusz Felisiak, 12 months ago

Owner: changed from nobody to Adam Johnson
Status: newassigned
Triage Stage: UnreviewedAccepted

Thanks. Tentatively accepted.

comment:4 by Keryn Knight, 12 months ago

Cc: Keryn Knight added

comment:5 by Carlton Gibson, 12 months ago

Cc: Carlton Gibson added

comment:6 by Mariusz Felisiak, 12 months ago

Patch needs improvement: set

Marking as "Needs improvement" while waiting for follow up.

comment:7 by Carlton Gibson, 12 months ago

I'd like to argue for wontfixing this, at least for a couple of release cycles...

Copying my comment from the PR:

Can I ask that we pause on this please?

I'm currently investigating possible implementations for _Template Fragments_ (or _Partials_), as per the [HTMX essay on that theme](https://htmx.org/essays/template-fragments/).

Whilst custom template tags have long been able to write to the context, this route isn't available for (an as yet in-development) _partial nodes_ that are not rendered, for example because they're declared outside a block, or are rendered after they're required for output, because of varying rendering order due to inheritance, say.

In order to work around this, and in order to allow partials to resolvable from a specific template, I'm currently assigning a mapping to the template Origin to serve as a lookup store.

This isn't particularly _nice™_, but it functions, and allows work on the prototype to continue.

My current implementation works on Django 4/2, but is broken by this PR.

  • Can no longer create weak references to template objects. (This could be worked around but...)
  • Can no longer assign a mapping to any object that's available at both parsing and rendering time, and so there's no (immediately) clear way to pass partial references out from the parser.
    • A module global might work here, but it's difficult to maintain the correct per-template mapping, and (more) even using weakref for the storer making sure other references are dropped has proved non-trivial in my experiments thus far.

Beyond seriously putting in doubt my current work, I think forcing slots here makes experimenting with the DTL much harder, and I don't think that's paid for by the speed increase on offer. We're only now beginning to come back to _What's possible with templates?_, after a long period of JSON APIs being primary. I'd like to ask that we don't handicap that by overly restricting what's easy to do with template objects.

I also note that in researching this project, there is a long history of people trying all sorts of customisations (and I was only looking at output capture). Given that this PR breaks my WIP, I'd be surprised if it didn't lead to breakages more widely.

I appreciate that it's not part of the official contract for template object not to use slots, but I think it's a pretty major change.

tl;dr: With fresh attention, there are plenty of gains to be made to the DTL. Making everything use slots makes experimenting with that harder.

Thanks.

Last edited 12 months ago by Carlton Gibson (previous) (diff)

comment:8 by Mariusz Felisiak, 12 months ago

Resolution: wontfix
Status: assignedclosed
Triage Stage: AcceptedUnreviewed

Closing as "wontifx", as a deeper discussion is needed due to the backward incompatibility of this change.

Thanks y'all.

Note: See TracTickets for help on using tickets.
Back to Top