Django

Ticket #7338 (assigned)

Opened 6 months ago

Last modified 3 months ago

Method .cache(timeout) in QuerySet

Reported by: marinho
Assigned to: marinho (accepted)
Milestone:
Component: Database layer (models, ORM)
Version: SVN
Keywords:
Cc: semente@taurinus.org, ross@rossp.org
Triage Stage: Design decision needed
Has patch: 1
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 0

Description

I made these changes to the QuerySet and sql.Query classes in the database component of the newforms-admin branch.

I added a new method called .cache(timeout), used like filter(), exclude(), select_related(), etc. When it is used, the sql.Query class stores the SQL query result in the cache.

This improved my systems considerably, because it avoids sending several queries with the same SQL statement to the database at the same time. As we know, memory is faster than the database or filesystem, so if you use the memcached or locmem backends, you can probably speed up your system with this method.

As my english is the best, I published a snippet that may make this clearer for anyone who did not understand what I meant: http://www.djangosnippets.org/snippets/777/

Attachments

method_cache.diff (7.7 kB) - added by marinho on 07/25/08 07:59:44.
Updated patch with cache method

Change History

05/30/08 15:58:57 changed by marinho

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

I wrote this part wrong: "As my english is the best". I forgot the NOT part of the sentence... hehe ;)

06/03/08 12:20:35 changed by Guilherme M. Gondim <semente@taurinus.org>

  • cc set to semente@taurinus.org.

06/04/08 15:02:33 changed by marinho

  • owner changed from nobody to marinho.
  • status changed from new to assigned.

Fixed the patch to use a key based on a hash, because the cache doesn't accept keys with 250+ characters.
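The hashed-key idea can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the function name make_cache_key, the "queryset" prefix, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def make_cache_key(sql, params=(), prefix="queryset"):
    """Build a short, fixed-length cache key from a SQL string and its parameters.

    Hypothetical helper: the name, prefix, and hash function are illustrative.
    """
    raw = "%s|%r" % (sql, tuple(params))
    # A SHA-256 hex digest is always 64 characters, however long the SQL is.
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return "%s:%s" % (prefix, digest)


# A query whose raw SQL text is far longer than 250 characters.
long_sql = "SELECT * FROM video WHERE id IN (%s)" % ", ".join(["%s"] * 200)
key = make_cache_key(long_sql, range(200))
```

Because the key length depends only on the prefix and the digest, it stays under the limit for any query, and the same SQL with the same parameters always maps to the same key.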

06/07/08 12:30:37 changed by brosner

  • version changed from newforms-admin to SVN.

06/20/08 06:39:20 changed by marinho

Since I lost the detailed text that I wrote in the Django snippet mentioned above (which has been deleted), I am writing examples of how to use the method below.

Overriding get_query_set

This is an example of making every request through the manager cacheable.

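from django.db import models

class CategoryManager(models.Manager):
    def get_query_set(self):
        q = super(CategoryManager, self).get_query_set()
        # .cache() is the method added by this ticket's patch;
        # every query made through this manager uses a timeout of 300.
        return q.cache(300)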

Single use

This is an example of making the request cacheable for that line only.

lista = Video.objects.all().cache(300)

06/20/08 06:45:26 changed by telenieko

  • stage changed from Unreviewed to Accepted.
  • milestone set to post-1.0.

You should:

  • Remove the "#-----"
  • Add test cases to test this new behaviour, like:
    • Run a cached query,
    • Modify objects,
    • Rerun the cached query, see it's not changed
    • Rerun without .cache(), see it's changed
  • Patch the documentation to describe this method.

Anyway I think that should go in post-1.0 ;)

And I'm not sure whether running a query without the cache should invalidate cached copies of that same query, but that could waste resources checking whether a cache entry exists for every query, so better to forget it.
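The four test steps suggested above could look something like this. It is a minimal sketch that uses an in-memory stand-in rather than Django's ORM or the patch itself: FakeCache, fetch_names, and the "names" key are all hypothetical names invented for the illustration.

```python
import time


class FakeCache:
    """Tiny in-memory stand-in for a cache backend (illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._store[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value, timeout):
        self._store[key] = (value, time.time() + timeout)


def fetch_names(rows, cache, timeout=None):
    """Return the names in `rows`, going through the cache only if a timeout is given."""
    if timeout is None:
        return [row["name"] for row in rows]  # uncached: always reads the "table"
    cached = cache.get("names")
    if cached is None:
        cached = [row["name"] for row in rows]
        cache.set("names", cached, timeout)
    return cached


rows = [{"name": "news"}, {"name": "sports"}]  # stands in for a database table
cache = FakeCache()

first = fetch_names(rows, cache, timeout=300)   # 1. run a cached query
rows.append({"name": "music"})                  # 2. modify objects
second = fetch_names(rows, cache, timeout=300)  # 3. rerun cached: result unchanged
third = fetch_names(rows, cache)                # 4. rerun uncached: result changed
```

The third step returning the old result is exactly the staleness that a real test for .cache() would need to assert on.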

07/16/08 06:04:14 changed by marinho

@telenieko

Thanks for the advice :) I know that I need to improve the patch, but I'm waiting for the merge of 1.0-final (with newforms-admin) to do this :)

07/25/08 07:59:44 changed by marinho

  • attachment method_cache.diff added.

Updated patch with cache method

07/25/08 08:01:37 changed by marinho

Tests and documentation done. Also added the proxy method in the Manager class.

07/28/08 01:31:48 changed by rossp

  • cc changed from semente@taurinus.org to semente@taurinus.org, ross@rossp.org.

09/15/08 08:53:11 changed by marinho

Two other approaches to do that:

but I still prefer the approach of this ticket; in my opinion it is simpler and more powerful :)

09/16/08 22:26:47 changed by mtredinnick

  • stage changed from Accepted to Design decision needed.
  • milestone deleted.

I don't think this has a place in core at the moment. There are a lot of trade-offs you have to know about when doing caching like this (e.g. it assumes the data isn't changing rapidly enough for the caching to get unacceptably stale). It's already pretty easy to cache queries when you want them to be cached (and caching querysets with a key so they can be looked up later is probably better in most cases).
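For comparison, caching a result under an explicit key, as mentioned above, can be sketched like this. The helper cached_result and the DictCache stand-in are hypothetical and exist only so the example runs on its own; in a Django project the cache object would instead be something exposing get(key) and set(key, value, timeout), such as django.core.cache.cache.

```python
class DictCache:
    """Minimal stand-in exposing get/set like a cache backend; it ignores timeouts."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


def cached_result(cache, key, compute, timeout=300):
    """Return the value stored under `key`, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:  # note: a legitimately cached None is treated as a miss
        value = compute()
        cache.set(key, value, timeout)
    return value


calls = []


def load_categories():
    calls.append(1)  # count how often the "database" is hit
    return ["news", "sports"]


cache = DictCache()
a = cached_result(cache, "category_list", load_categories)
b = cached_result(cache, "category_list", load_categories)  # served from the cache
```

With an explicit key such as "category_list", other code can look up or delete the entry when the underlying data changes, which is harder when the key is derived from the SQL text.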

My feeling is that this is probably better done for now as a third-party thing that is invoked via a manager that provides a custom QuerySet class. Making it work smoothly with other custom QuerySet / Query classes might require a little work and that's something you can bring up on django-dev if it turns out to be hard (it would be nice to make it possible). But tying the caching stuff into the SQL producing stuff so tightly doesn't sit comfortably with me.

Leaving open for the time being in the hopes that another core developer will comment one way or the other, but I'm giving it a -1 at the moment.

