Opened 12 years ago

Closed 11 years ago

#19170 closed New feature (wontfix)

Add a way to control related fields reverse cache

Reported by: dirleyrls
Owned by: nobody
Component: Database layer (models, ORM)
Version: 1.4
Severity: Normal
Keywords: orm related cache
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

I have a very specific use case where we use a database view to represent some data. I've mapped it using an unmanaged model like this:

# concrete table
class Something(models.Model):
  # ...


class SomethingHelperView(models.Model):
  something = models.OneToOneField(Something, related_name='helper')
  counter = models.PositiveIntegerField()
  
  class Meta:
    managed = False
    db_table = 'something_helper_view'

When I do:

>>> something = Something.objects.get(pk=1)
>>> something.helper.counter
10
>>> # some operation that will update the counter
>>> # ...
>>> something.helper.counter
10

This is not the behavior I wanted. Since I'm using a database view, I expect it to always be "fresh". Every time I access my helper, I want it to hit the database and bring back a fresh value. That's the point of this view.

Poking around through Django's code, I realized that related fields have their back references cached. It's ok, I understand the purpose of that cache. But I got frustrated when I discovered I can't just turn it off. It would be good to be able to turn it off.
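The caching described above can be illustrated with a small pure-Python sketch. This is not Django's implementation; `CachedReverseRelation`, `load_counter`, `COUNTERS`, and `QUERIES` are hypothetical names used only to show why the second access returns the stale value.

```python
# Illustrative stand-in only; this is not Django's implementation.
class CachedReverseRelation:
    """Descriptor that runs `loader` once per instance and caches the result."""

    def __init__(self, loader, cache_name):
        self.loader = loader          # callable(instance) -> related value
        self.cache_name = cache_name  # key used to store the cached value

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        try:
            # Later accesses are served from the per-instance cache.
            return instance.__dict__[self.cache_name]
        except KeyError:
            # First access: run the (simulated) query and remember the result.
            value = self.loader(instance)
            instance.__dict__[self.cache_name] = value
            return value


# Hypothetical contents of the database view, keyed by primary key.
COUNTERS = {1: 10}
QUERIES = []  # records each simulated database hit


def load_counter(instance):
    QUERIES.append(instance.pk)
    return COUNTERS[instance.pk]


class Something:
    helper_counter = CachedReverseRelation(load_counter, "_helper_counter_cache")

    def __init__(self, pk):
        self.pk = pk


something = Something(pk=1)
first = something.helper_counter   # triggers the simulated query
COUNTERS[1] = 11                   # the underlying data changes
second = something.helper_counter  # served from the cache, no new query
```

In this sketch only one simulated query runs, so `second` still holds the value read on the first access.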

For the record, I'm currently working around this issue like this:

Something.fresh_helper = property(lambda self: SomethingHelperView.objects.get(something=self))

Change History (5)

comment:1 by Aymeric Augustin, 12 years ago

Resolution: needsinfo
Status: new → closed

Caching is the expected behavior in the general case. See previous tickets on this topic.

What do you mean by "be able to turn it off"? What if a request temporarily "turns it off", crashes, and it's never turned back on?


comment:2 by dirleyrls, 12 years ago (in reply to: comment 1)

Replying to aaugustin:

Caching is the expected behavior in the general case. See previous tickets on this topic.

What do you mean by "be able to turn it off"? What if a request temporarily "turns it off", crashes, and it's never turned back on?

Sorry, I wasn't clear about what I meant by "turn it off". I expect to be able to do something like this:

class SomethingHelperView(models.Model):
  something = models.OneToOneField(Something, related_name='helper', backref_cache=False)
  # ...

Now, whenever I access the Something.helper attribute, I'll get a fresh SomethingHelperView instance.

comment:3 by Aymeric Augustin, 12 years ago

Resolution: needsinfo
Status: closed → reopened

comment:4 by Luke Plant, 12 years ago

In Python in general, properties that might be expensive to calculate are often cached, so I don't think this is surprising behaviour.

Personally, I think this is a corner case that can be easily addressed in the manner you specified, or simply by doing SomethingHelperView.objects.get(something=the_thing). I suspect the backref_cache idea would be hard enough to implement correctly that it is not worth it.

I also don't think it is a nice API to define it on the model. It's entirely possible that some types of access of a certain model would benefit from the caching behaviour, and for other types of access on the same model you wouldn't want it. An API like select_related() allows you to tune a similar performance related behaviour and turn it on and off at will, no matter what the defaults on the model are, but this API would not allow it, since we are talking about properties on instances and not methods on a queryset.

comment:5 by Aymeric Augustin, 11 years ago

Resolution: wontfix
Status: reopened → closed

In addition to the points made by Luke, I would add that this API creates "action at a distance". A developer would have to look at the model definition to understand the performance characteristics of the code he's working on (SQL queries are a huge factor in the performance of a Django application).

I find it safer to always cache (I fixed several bugs in this area of Django), thereby minimizing SQL queries, and reload objects explicitly (or write your own APIs).
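As a rough illustration of "always cache, reload explicitly", here is a pure-Python sketch. It does not use Django; the `reload_helper` method, the `loader` argument, and the `ROWS`/`CALLS` stand-ins are all hypothetical. The point is that the extra query is visible at the call site rather than hidden in the model definition.

```python
class Something:
    """Plain-Python stand-in for a model whose related value is cached."""

    def __init__(self, pk, loader):
        self.pk = pk
        self._loader = loader        # callable(pk) -> related value (simulated query)
        self._helper_cache = None
        self._helper_loaded = False

    @property
    def helper(self):
        # Default behaviour: query once, then serve the cached value.
        if not self._helper_loaded:
            self._helper_cache = self._loader(self.pk)
            self._helper_loaded = True
        return self._helper_cache

    def reload_helper(self):
        # Explicit opt-in: drop the cache and query again.
        self._helper_loaded = False
        return self.helper


# Hypothetical data source and query log for the demonstration.
ROWS = {1: 10}
CALLS = []


def load(pk):
    CALLS.append(pk)
    return ROWS[pk]


something = Something(1, load)
cached_first = something.helper       # first simulated query
ROWS[1] = 11                          # the underlying data changes
cached_second = something.helper      # still the cached value
reloaded = something.reload_helper()  # second simulated query, fresh value
```

Code that is happy with the cached value keeps using `helper`, while code that needs fresh data calls `reload_helper()` and pays for the query knowingly.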

For these reasons, I'm going to reject the API proposed in comment 2.

PS: this isn't limited to backwards relations, the exact same arguments also apply to forward relations.
