Opened 9 months ago

Closed 9 months ago

#35270 closed Cleanup/optimization (fixed)

Optimize Model._meta._property_names

Reported by: Adam Johnson Owned by: Adam Johnson
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Keryn Knight Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Adam Johnson)

Continuing my project to optimize the system checks, I found some optimizations for Options._meta._property_names, which I found to take ~4% of the total runtime for checks.

Most of this function’s runtime was being spent running inspect.getattr_static(). This is not surprising as it jumps through many hoops in order to avoid triggering attribute access.

I added use of getattr_static() back in #28269 / ed244199c72f5bbf33ab4547e06e69873d7271d0 to fix a bug with instance descriptors. But I think it’s overly cautious, and we can assume that accessing the __dict__ of the model class will work fine.

Two optimizations make the function run in negligible time:

  1. Changing the function to use __dict__ directly
  2. Caching on a per-class basis. This requires using a weak-reference to classes, as we shouldn’t mutate base classes in the MRO, some of which can be non-model subclasses, like Model itself for the pk property, object, or any mixins.

Before optimization stats:

106 calls to _property_names took 26ms, or ~4% of the total runtime of system checks.

After optimization:

The same calls take 1ms, or ~0.2% of the total runtime. (The real runtime may be <1ms, but shows as 1 due to rounding up by cProfile.)

Change History (8)

comment:1 by Adam Johnson, 9 months ago

Description: modified (diff)

comment:2 by Natalia Bidart, 9 months ago

I'm not so sure about this one, particularly after having read the history in the relevant PRs (the original optimization in this code and its regression fix).

I wonder, would using the solution proposed for "1.11.x" be an option for getting rid of inspect.getattr_static? I'm not a fan of the custom weak key cache, it feels like an unnecessary adding to the framework only for optimization purposes.

comment:3 by Keryn Knight, 9 months ago

Cc: Keryn Knight added

👋 Adam, do you happen to know the overall percentage of the "win" each of the 2 optimisations does? i.e. is 80% of it the change to use the klass.__dict__ or is 95% of it the weak_key_cache, etc?

Is there a world where we get enough of a benefit from just Changing the function to use __dict__ directly that we don't need the weak_key_cache?

I ask because the weak_key_cache is the kind of deep magic that I don't fully understand immediately, and because you mentioned caching on a per-class basis but I'd have assumed (from my naive understanding of these innards) that was already approaching done, by virtue of the cached_property existing on the Options per model? (i.e. Options is a singleton per class)

comment:4 by Adam Johnson, 9 months ago

I didn’t fully explain the caching. What I meant by “per-class” is per-*any*-class, not per *model class* - that means caching values for all the property names in object (none), in models.Model (just pk, at current), any mixing classes, and so on. Yes, the caching on Options means it’s cached per *model class*, but just relying on that caching means we don’t get to reuse the work done checking everything defined on mixins, Model, or object.

weak_key_cache implements a pattern I’ve played with before to associate extra data with an object without attaching arbitrary attributes, since they might clash or affect operations like serialization or repr. django-upgrade uses a bunch of WeakKeyDictionary instances to keep fixers modular to their own files.

It doesn’t make sense to use @weak_key_cache without the __dict__ optimization, because it requires one call per class, whilst the old dir() approach checks attributes defined in the class *and* all superclasses.

I profiled __dict__ without @weak_key_cache though, using this implementation:

@cached_property
def _property_names(self):
    """Return a set of the names of the properties defined on the model."""
    names = set()
    for klass in self.model.__mro__:
        names.update(
            {
                name
                for name, value in klass.__dict__.items()
                if isinstance(value, property)
            }
        )
    return frozenset(names)

The result was that the calls took 2ms, keeping most of the savings. That said, the project I’m using doesn’t have deep model inheritance or many mixins, so we wouldn’t expect the caching to do so much.

If you’d both prefer this version, sure, we can go for it. Best to keep things maintainable for all, and we can always add @weak_key_cache or similar in the future.

in reply to:  4 comment:5 by Natalia Bidart, 9 months ago

Triage Stage: UnreviewedAccepted

Replying to Adam Johnson:
[...]

The result was that the calls took 2ms, keeping most of the savings. That said, the project I’m using doesn’t have deep model inheritance or many mixins, so we wouldn’t expect the caching to do so much.

If you’d both prefer this version, sure, we can go for it. Best to keep things maintainable for all, and we can always add @weak_key_cache or similar in the future.

I'm very much in favor of a simpler optimization. I agree with Keryn that @weak_key_cache is the kind of deep magic is not fully understood immediately.
Accepting following this simplification proposal.

comment:6 by Adam Johnson, 9 months ago

Alright, updated.

comment:7 by Mariusz Felisiak, 9 months ago

Triage Stage: AcceptedReady for checkin

comment:8 by Mariusz Felisiak, 9 months ago

Resolution: fixed
Status: assignedclosed
Note: See TracTickets for help on using tickets.
Back to Top