Opened 2 years ago

Last modified 2 years ago

#26481 new New feature

Add a "strict mode" for defer()/only() to prevent queries on unfetched field access

Reported by: Adam (Chainz) Johnson Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: only, defer
Cc: me@…, Florian Apolloner Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In our project at YPlan we had some highly optimized queries using only() to restrict the fields being fetched.

Some time later the downstream code of these queries was changed to access extra fields not in the only() list. This triggered the DeferredAttribute implicit queries to go fill in those fields with single queries, a classic N+1 queries problem.

Whilst better testing on the queries would have revealed this, I can't help but feel that it's a somewhat surprising behaviour that only() (and by extension, defer()) can bite back this badly.

I'd like some way of making only() and defer() strict, and to raise an exception on deferred fields.

Currently we've implemented this with a patchy.patch to insert a raise into DeferredAttribute:

from django.db.models.query_utils import DeferredAttribute

patchy.patch(DeferredAttribute.__get__, """\
    @@ -20,6 +20,7 @@ def __get__(self, instance, owner):
             # might be able to reuse the already loaded value. Refs #18343.
             val = self._check_parent_chain(instance, name)
             if val is None:
    +            raise AttributeError("Deferred field '{}' fetched from database!".format(self.field_name))
                 instance.refresh_from_db(fields=[self.field_name])
                 val = getattr(instance, self.field_name)
             data[self.field_name] = val
""")

I can imagine this being a Django-wide setting, or maybe per-model so it wouldn't affect third party apps so badly.

Change History (6)

comment:1 Changed 2 years ago by Tim Graham

Triage Stage: UnreviewedAccepted

The API that comes to mind to me would be something like only(..., strict=True). Possible APIs should be discussed on the DevelopersMailingList before doing the implementation.

comment:2 Changed 2 years ago by Tim Graham

Summary: Add a "strict mode" for defer()/only()Add a "strict mode" for defer()/only() to prevent queries on unfetched field access

comment:3 Changed 2 years ago by Simon Charette

FWIW Adrian proposed adding another keyword argument to only()/defer() that results in loading all fields.

Since we might want to add different kind of actions when a deferred field is accessed I suggest to use another name for the kwarg.

For example, we could issue a warning (warnings.warn) or log one (logging.warning) instead of raising an exception to prevent the code from crashing while exposing a way to monitor such instances.

comment:4 Changed 2 years ago by Anssi Kääriäinen

Overriding Model.refresh_from_db() allows raising failure when deferred field is accessed - is this enough for you use case?.

Last edited 2 years ago by Tim Graham (previous) (diff)

comment:5 in reply to:  3 Changed 2 years ago by Florian Apolloner

Replying to charettes:

FWIW Adrian proposed adding another keyword argument to only()/defer() that results in loading all fields.

Loading all, or loading in groups as suggested by me in that thread and done in sqlalchemy (http://docs.sqlalchemy.org/en/latest/orm/loading_columns.html#deferred-loading-with-multiple-entities) seems like a good idea.

For example, we could issue a warning (warnings.warn) or log one (logging.warning) instead of raising an exception to prevent the code from crashing while exposing a way to monitor such instances.

Since this feature would be optional for backwards compat anyways I do not see the need for logging.

comment:6 Changed 2 years ago by Florian Apolloner

Cc: Florian Apolloner added
Note: See TracTickets for help on using tickets.
Back to Top