Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#30083 closed Bug (wontfix)

Model instance state not correctly set when initializing a model with Model.from_db()

Reported by: Sebastiaan ten Pas Owned by: Nasir Hussain
Component: Database layer (models, ORM) Version: 2.1
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When loading a model instance through from_db, it first makes a new instance and after sets the state:

    @classmethod
    def from_db(cls, db, field_names, values):
        if len(values) != len(cls._meta.concrete_fields):
            values_iter = iter(values)
            values = [
                next(values_iter) if f.attname in field_names else DEFERRED
                for f in cls._meta.concrete_fields
            ]
        new = cls(*values)
        new._state.adding = False
        new._state.db = db
        return new

This means that during the class initialization, the model instance state is equal to _state.adding = True and _state.db = None. This can cause troubles with related queries in the pre-/post-init methods. For example:

class Parent(models.Model):
    pass


class Child(models.Model):
    parent = models.ForeignKey(Parent)


def foo(instance, **kwargs)
    print(instance._state.adding)
    print(instance._state.db)

    # It could be that we initialize the model while it does not exist in the
    # database, so we check whether `instance.parent` exists first.
    if getattr(instance, 'parent', None):
        # Since we try to access instance.parent, it passes the `instance`
        # as hint to the router, but `instance._state.db` is `None`, so it defaults
        # to the `DEFAULT_DB_ALIAS`.
        assert instance._state.db == instance.parent._state.db


signals.post_init.connect(foo, sender=Child)

Trying to fetch an instance of Child, will now return:

>>> child = Child.objects.using('second').get(pk=1)
True
None
AssertionError

The current behavior indicates that one should never do related queries in the pre_init or post_init hooks, because it cannot be guaranteed that the related queries are done on the same database you're retrieving the instance from. Currently there's also no way of getting to know the on which database the query was done for in the pre_init and post_init hooks because the state is set after those signals are fired.

One related issue is #18532, which is only talking about the _state.adding flag. There's a suggested solution over there, but I'm not sure if that's the best way to go. I think there should be at least a note in the Django documentation at the pre_init and post_init hooks saying that related queries should be avoided, because it cannot be guaranteed that they will be run on the same database you're getting the instance from.

Change History (11)

comment:1 by Tim Graham, 5 years ago

Summary: Model instance state not correctly set when initializing a model with from_dbModel instance state not correctly set when initializing a model with Model.from_db()
Triage Stage: UnreviewedAccepted

comment:2 by Nasir Hussain, 5 years ago

Owner: changed from nobody to Nasir Hussain
Status: newassigned

comment:3 by Nasir Hussain, 5 years ago

Has patch: set
Last edited 5 years ago by Tim Graham (previous) (diff)

comment:4 by Simon Charette, 5 years ago

Patch needs improvement: set

I left comments for improvements on the PR. After a read through of #18532 I'm still not convinced the addition in complexity is worth it. I never add to use init signals though but performing queries in receivers of such signals seems like a recipe for disaster.

The suggested implementation will also have the side effect of exposing instances with a ._state attribute in pre_init signal while it wasn't the case before. If we wanted to maintain backward compatibility it could be fired from Model.__new__ through which is probably where it should belong in the first place.

Last edited 5 years ago by Simon Charette (previous) (diff)

comment:5 by Nasir Hussain, 5 years ago

Patch needs improvement: unset

comment:6 by Mariusz Felisiak, 5 years ago

Resolution: wontfix
Status: assignedclosed

When we take into account performance doubts, __new__() tricky, additional complexity, and that we can encourage bad practices to perform queries in receivers of pre_init or post_init signals (hidden N+1 queries on every queryset iteration), I think we shouldn't fix this (see also Anssi's comment and Simon's comment). It's just not worth it. I'm going to add notes to the signal documentation.

comment:7 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In a2e1c17f:

Refs #30083 -- Clarified database state of instances in signals.pre_init docs.

comment:8 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In fc1182af:

Refs #30083 -- Added a warning about performing queries in pre/post_init receivers.

Thanks Carlton Gibson the review.

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In fa3ae44:

[2.2.x] Refs #30083 -- Clarified database state of instances in signals.pre_init docs.

Backport of a2e1c17f193f5017e1f6fac7d860f1f9e34d7892 from master

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In 506f800e:

[2.2.x] Refs #30083 -- Added a warning about performing queries in pre/post_init receivers.

Thanks Carlton Gibson the review.

Backport of fc1182af01c391ce33d7fcf51c756829c6a11d5b from master

comment:11 by Sebastiaan ten Pas, 5 years ago

Thanks for the PR Mariusz Felisiak and your initial PR Mariusz Felisiak. I would prefer to have the code change, as I still consider it a bug, but I can live with only the documentation change for now.

Note: See TracTickets for help on using tickets.
Back to Top