Opened 11 months ago

Closed 9 months ago

#35044 closed Bug (fixed)

Accessing a deferred field clears reverse relations

Reported by: Adam Johnson Owned by: Giannis Terzopoulos
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

DeferredAttribute.__get__ calls Model.refresh_from_db() to load the deferred field. Whilst refresh_from_db() does load the intended field, it also clears a lot of cached data, including reverse relations. This can causes extra queries when those relations are accessed before and after a deferred field.

For example, take these models:

class Book(models.Model):
    title = models.TextField()
    ...


class BookText(models.Model):
    book = models.OneToOneField(Book, on_delete=models.CASCADE)
    text = models.TextField()

With query debug logging on, we can see that a second access to booktext after accessing the deferred title causes an unnecessary extra query:

In [1]: from example.models import *

In [2]: b=Book.objects.defer('title').earliest('id')
(0.000) SELECT "example_book"."id", "example_book"."author_id" FROM "example_book" ORDER BY "example_book"."id" ASC LIMIT 1; args=(); alias=default

In [3]: b.booktext
(0.000) SELECT "example_booktext"."id", "example_booktext"."book_id", "example_booktext"."text" FROM "example_booktext" WHERE "example_booktext"."book_id" = 1 LIMIT 21; args=(1,); alias=default
Out[3]: <BookText: BookText object (1)>

In [4]: b.title
(0.000) SELECT "example_book"."id", "example_book"."title" FROM "example_book" WHERE "example_book"."id" = 1 LIMIT 21; args=(1,); alias=default
Out[4]: 'A 1'

In [5]: b.booktext
(0.000) SELECT "example_booktext"."id", "example_booktext"."book_id", "example_booktext"."text" FROM "example_booktext" WHERE "example_booktext"."book_id" = 1 LIMIT 21; args=(1,); alias=default
Out[5]: <BookText: BookText object (1)>

This is due to refresh_from_db() clearing the reverse-related object cache.

Spotted whilst working on #28586. My implementation for that will probably fix this issue, but I thought it best to report this separately.

Change History (9)

comment:1 by Simon Charette, 11 months ago

Triage Stage: UnreviewedAccepted
Type: Cleanup/optimizationBug

Feels like a bug to me.

When refresh_from_db(fields) is specified it should likely not clear reverse relationships and I think the same can be said about

Regression in a7b5ad8b19a08d7d57302ece74f6e26d2887fd9f #27846 for reverse relations and possibly 123b1d3fcf79f091573c40be6da7113a6ef35b62 #34137 for private fields. The private field situation is more complex though as they might be composed of other fields (e.g. GenericForeignKey) but the field APIs doesn't expose a generic way of introspecting that so I would assume we'd still want to clear even if fields is specified.

An alternative here would be to accept another kwarg to denote field cache clearing that would default to True and that DeferredAttribute.__get__ would pass False to.

Version 1, edited 11 months ago by Simon Charette (previous) (next) (diff)

comment:2 by Giannis Terzopoulos, 10 months ago

Owner: changed from nobody to Giannis Terzopoulos

Hey, I'm interested in taking this over unless @Adam, you might want to fix this as part of #28586?

Spotted whilst working on #28586. My implementation for that will probably fix this issue

What I'm thinking to do is follow Simon's plan above and not clear cached relations when refresh_from_db(fields) is specified, unless a relation is part of fields explicitly. Not sure how I would deal with the private fields though to be honest, or if we actually need to handle that.

comment:3 by Simon Charette, 10 months ago

Not sure how I would deal with the private fields though to be honest, or if we actually need to handle that.

yeah not sure either, I suggest trying the same approach that avoids clearing if fields is specified and the private field name is not part of them. My only concern is that I'm not sure how .only will behave when passed private field names. In all cases the only regression test existing for #34137 doesn't make use of fields so it's in a grey area.

comment:4 by Adam Johnson, 10 months ago

Giannis, feel free to take this ticket. It’s now looking like my approach to #28586 will go in a different direction.

comment:5 by Giannis Terzopoulos, 10 months ago

Thank you both! I opened a PR for this PR

I've added tests for defer() and only() for relations/private fields and updated the refresh_from_db() accordingly. I also added a couple of tests with the fields kwarg specified.

comment:6 by Giannis Terzopoulos, 10 months ago

Has patch: set

comment:7 by Mariusz Felisiak, 9 months ago

Triage Stage: AcceptedReady for checkin

comment:8 by Mariusz Felisiak <felisiak.mariusz@…>, 9 months ago

In 0c690c6:

Refs #35044 -- Added Model.refresh_from_db(fields=...) test for clearing reverse relations.

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 9 months ago

Resolution: fixed
Status: assignedclosed

In 73df8b54:

Fixed #35044 -- Avoided clearing reverse relations and private fields when accessing deferred fields.

Regression in a7b5ad8b19a08d7d57302ece74f6e26d2887fd9f for reverse
relations and possibly in 123b1d3fcf79f091573c40be6da7113a6ef35b62 for
private fields.

Note: See TracTickets for help on using tickets.
Back to Top