#33474 closed Cleanup/optimization (fixed)

Introduce slots definition to Variable and FilterExpression

Reported by: Keryn Knight Owned by: Keryn Knight
Component: Template system Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Another attempt at a case-by-case of the idea in #12826, given #33465 was OK'd.

In the Pull-Request for #33465 there was some discussion about adding slots defs to Node subclasses, which isn't as straight-forward as one might hope (though I don't think it's impossible, just a bit more work, and a bit more of a sell to make :)) because of class attributes and mutation during parsing.

However, internal to VariableNode are 2 layers of "indirection" to actually resolve a value, FilterExpression and Variable. Neither of these is a Node and neither inherits from anything nor is ordinarily inherited by anything. Additionally they don't make use of class attributes, assigning all of their values in __init__

This makes them (IMHO) a pretty decent candidate for removing the implicit __dict__ on the instance, because any user subclasses would still inherit and get a dict, and there's not any real complications (AFAIK - please say if you think of any) in tightening the attributes assignable.

Running the test suite (Ran 14963 tests in 438.336s using --parallel=1) locally, and patching VariableNode.__init__ to track the pympler.asizeof.asizeof(self) at that point and increment a global int suggests that for me (using python3.10), the baseline space taken up by VariableNode instances is 39.3 MB (using filesizeformat on the byte count, remember every VariableNode will contain a FilterExpression and most will contain a Variable). After patching in __slots__ definitions for each of FilterExpression and Variable, the cost across the same number of tests goes down to 21.4 MB, "saving" roughly nearly 50% off the cost of storing them.

If we look at the individual pieces, before applying __slots__ to either class:

In [1]: from django.template.base import TVariable, FilterExpression, Parser
In [2]: from pympler.asizeof import asizeof
In [3]: p = Parser("test")
In [4]: v = Variable("test")
In [5]: f = FilterExpression("test", p)
In [6]: asizeof(v)
Out[6]: 592
In [7]: asizeof(f)
Out[7]: 1000

and after applying to both:

...
In [6]: asizeof(v)
Out[6]: 216
In [7]: asizeof(f)
Out[7]: 368

Note that the Parser instance given to the FilterExpression isn't changed and doesn't count towards the byte count, because it's not stored onto the FilterExpression at any point (though individual filters are).

For good measure, a bit of the ole' timeit magic to check the runtime performance profile doesn't markedly change. AFAIK it ought to be every so slightly faster, but given the difference is probably in nanoseconds and the timings are in microseconds ...

Before patches:

In [3]: %timeit Variable("test")
3.47 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: p = Parser("test")

In [5]: %timeit FilterExpression("test", p)
6.12 µs ± 196 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [6]: %timeit Variable("test").resolve({'test': 1})
4.02 µs ± 40.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [7]: %timeit FilterExpression("test", p).resolve({'test': 1})
6.84 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [8]: v = Variable("test")
In [9]: f = FilterExpression("test", p)

In [10]: %timeit v.resolve({'test': 1})
387 ns ± 2.24 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [11]: %timeit f.resolve({'test': 1})
569 ns ± 7.82 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

After patches:

In [3]: %timeit Variable("test")
3.47 µs ± 52.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: p = Parser("test")

In [5]: %timeit FilterExpression("test", p)
6.01 µs ± 61.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [6]: %timeit Variable("test").resolve({'test': 1})
4.09 µs ± 52.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [7]: %timeit FilterExpression("test", p).resolve({'test': 1})
6.57 µs ± 83.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [8]: v = Variable("test")
In [9]: f = FilterExpression("test", p)

In [10]: %timeit v.resolve({'test': 1})
365 ns ± 2.13 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [11]: %timeit f.resolve({'test': 1})
521 ns ± 3.03 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

As one might expect, I have a branch locally with these changes that I can make into a PR, and all the tests pass.

As a minor aside and sanity check for "odd behaviours in the wild", given Malcolm's comment on #12826 about monkey-patching possibly presenting an issue, I've checked one of my monkeypatches on the builtin Variable class and that still appears to all work fine, otherwise I would definitely not be suggesting this ;)

Change History (5)

comment:1 Changed 10 months ago by Keryn Knight

Owner: changed from nobody to Keryn Knight

comment:2 Changed 10 months ago by Keryn Knight

Has patch: set

comment:3 Changed 10 months ago by Mariusz Felisiak

Triage Stage: UnreviewedAccepted

comment:4 Changed 10 months ago by Mariusz Felisiak

Triage Stage: AcceptedReady for checkin

comment:5 Changed 10 months ago by Mariusz Felisiak <felisiak.mariusz@…>

Resolution: fixed
Status: assignedclosed

In 84418ba3:

Fixed #33474 -- Added slots to Variable and FilterExpression.

Note: See TracTickets for help on using tickets.
Back to Top