Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#29340 closed Cleanup/optimization (invalid)

"cache.get_or_set()" extra hits on database

Reported by: hematinik Owned by: nobody
Component: Core (Cache system) Version: 2.0
Severity: Normal Keywords: cache
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by hematinik)

I've found out that using cache.get_or_set() causes extra SQL queries while using the alternative approach to get a key’s value or set a value if the key isn’t in the cache (using "if" statement) works fine without extra hits on database.
it seems the problem appears when you use the cache.get_or_set() inside a loop . this is the code to reproduce the problem:

THE MODEL:

class EventManager(models.Manager):

    def get_location(self,event):

        output = cache.get_or_set(
            'event_location_eID{}'.format(event.pk),
            event.location,
            None)

        return output

class Event(models.Model):

    objects = EventManager()
    location = models.ForeignKey(Location,on_delete = models.CASCADE)
    

THE VIEW:

def home(request):

    events = cache.get_or_set('all_events',Event.objects.all(),None) # this line works fine without extra hits on db
    locations = []

    for event in events:
        x = Event.objects.get_location(event) # this line causes extra SQL queries while the cache keys already exist
        locations.append(x)

    return render(request,'index.html',{'locations':locations})

THE TEMPLATE:

{% for location in locations %}
    
    {{location.title}} <br>

{% endfor %}

now if I change the get_location() method on EventManager to this:

def get_location(self,event):

    output = cache.get('event_location_eID{}'.format(event.pk)) 
    if output is None:
        output = event.location
        cache.set('event_location_eID{}'.format(event.pk),output,None)

    return output

there will be no more unnecessarily SQL queries.

I'm using django-debug-toolbar to monitor my database requests and the information about queries is based on the statistics that django-debug-toolbar provides. I'll also attach some screenshots from django-debug-toolbar to clarify the situation.

sql queries when using get_or_set(): https://imgur.com/a/dydWEuL
sql queries when using the alternative approach : https://imgur.com/a/vYpfUZ3

Regards.

Change History (6)

comment:1 by hematinik, 6 years ago

Description: modified (diff)

comment:2 by Tim Graham, 6 years ago

Can you explain why Django is at fault? I haven't debugged your project but I think it's more likely that you've made a mistake somewhere. You should at least try to simplify the steps to reproduce to something more minimal.

Last edited 6 years ago by Tim Graham (previous) (diff)

in reply to:  2 comment:3 by hematinik, 6 years ago

Replying to Tim Graham:

Can you explain why Django is at fault? I haven't debugged your project but I think it's more likely that you've made a mistake somewhere.

django is expected to use cached data and not to create sql queries when the required cache keys are present. as I explained my information is based on the statistics django-debug-toolbar provides so "maybe" it is not the django's fault. I don't think there is any problem with the code itself. I simplified the code for clearance but I'm sure it resembles the situation of my project's behavior.

comment:4 by Tim Graham, 6 years ago

Resolution: invalid
Status: newclosed

The extra queries are caused by accessing event.location which isn't cached. You could cache that as well using Event.objects.select_related('location'). Next time, please use our support channels if you need help confirming whether or not Django is at fault.

in reply to:  4 comment:5 by hematinik, 6 years ago

Replying to Tim Graham:

The extra queries are caused by accessing event.location which isn't cached. You could cache that as well using Event.objects.select_related('location'). Next time, please use our support channels if you need help confirming whether or not Django is at fault.

Sir I believe you didn't get my point. these two block of code should work just alike while they don't when certain conditions meet:

    def get_location(self,event):
    
        output = cache.get('event_location_eID{}'.format(event.pk)) 
        if output is None:
            output = event.location
            cache.set('event_location_eID{}'.format(event.pk),output,None)

        return output

V.S

    def get_location(self,event):
    
        output = cache.get_or_set(
            'event_location_eID{}'.format(event.pk),
            event.location,
            None)

        return output

I've already cached the values of event.location so get_or_set() shouldn't make any sql queries. both of the variations are working fine outside of the loop but they behave differently when used inside a for loop. In sorry for the inconvenience if I'm mistaken.

Version 1, edited 6 years ago by hematinik (previous) (next) (diff)

comment:6 by Tim Graham, 6 years ago

The second block evaluates event.location (which causes a database query) because it's an argument to cache.get_or_set(). The first block delays that evaluation until the cache is checked (cache.get()). This is how Python works and not a bug in Django.

Note: See TracTickets for help on using tickets.
Back to Top