Opened 3 days ago

Closed 3 days ago

Last modified 3 days ago

#36137 closed Bug (invalid)

Simple performance test using timeit and django.test.Client leads to using all available memory

Reported by: Vinay Sajip
Owned by:
Component: Testing framework
Version: 5.1
Severity: Normal
Keywords:
Cc: Vinay Sajip
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Running this script:

#!/usr/bin/env bash
rm -rf minimal env
python3 -m venv env
env/bin/pip install django
env/bin/django-admin startproject minimal
cd minimal
../env/bin/python manage.py startapp basic
cat << EOF > basic/tests.py
import timeit

from django.test import TestCase, Client

class MinimalTestCase(TestCase):
    def test_render_performance(self):
        n = 2000000
        t = timeit.timeit(setup="from django.test import Client; c = Client(headers={'user-agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0'})",
                          stmt="c.get('/admin/')", number=n)
        print(f'{int(t * 1000/n)} msecs')
EOF
../env/bin/python manage.py test

causes all of the memory in the machine to be used up. This is unexpected, as the response returned from the c.get('/admin/') call isn't stored anywhere and should be garbage collected, and it's not clear where the memory leak is. Once the memory usage goes to near 100%, the swap starts going up and the test grinds to a crawl.

Tested with Django 5.1.5, Python 3.10.12 on a Linux Mint system with 4GB of memory.

Change History (2)

comment:1 by Simon Charette, 3 days ago

Resolution: invalid
Status: new → closed

Hello Vinay, there is unfortunately not much we can do from this report as you've not provided details on how Django is at fault.

If you rewrite your test like the following:

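    def test_render_performance(self):
        n = 2000000
        start = time.perf_counter()  # requires "import time" at the top of tests.py
        for _ in range(n):
            c = Client(headers={'user-agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0'})
            c.get('/admin/')
        t = time.perf_counter() - start  # total elapsed seconds, so the print below has a defined t
        print(f'{int(t * 1000/n)} msecs')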

you'll notice that memory usage no longer grows without bound, which likely means that the following statement

the response returned from the c.get('/admin/') call isn't stored anywhere and should be garbage collected, and it's not clear where the memory leak is.

is likely a bad assumption. Judging from the output of tracemalloc, it appears that timeit.timeit keeps references to frames, which prevents c and its weakref-associated signal receivers registered on a request from being garbage collected.
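For anyone who wants to repeat that kind of inspection, here is a minimal sketch of comparing two tracemalloc snapshots around a workload. It does not involve Django; the helper `top_growth` and the deliberately leaky `leaky_workload` are hypothetical names used only for illustration, and the real investigation may have been done differently.

```python
import tracemalloc


def top_growth(workload, limit=5):
    """Run workload() and return the source lines whose allocations changed most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() returns StatisticDiff objects, largest change first.
    return after.compare_to(before, "lineno")[:limit]


retained = []  # hypothetical stand-in for whatever keeps objects alive


def leaky_workload():
    # Roughly 1 MB that stays reachable after the workload returns.
    retained.extend(bytearray(1024) for _ in range(1000))


stats = top_growth(leaky_workload)
for stat in stats:
    print(stat)
```

Swapping `leaky_workload` for the loop under test should point at the lines whose allocations survive.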

If you can reproduce without involving timeit.timeit or demonstrate how Django is at fault please re-open.

Last edited 3 days ago by Simon Charette

comment:2 by Simon Charette, 3 days ago

Per the Python docs on timeit:

By default, timeit() temporarily turns off garbage collection during the timing. The advantage of this approach is that it makes independent timings more comparable. The disadvantage is that GC may be an important component of the performance of the function being measured. If so, GC can be re-enabled as the first statement in the setup string.
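The documented behaviour can be checked without Django. The sketch below records `gc.isenabled()` from inside the timed statement, once with a no-op setup and once with `gc.enable()` as the setup; the helper name `gc_enabled_during_timing` is hypothetical.

```python
import gc
import timeit


def gc_enabled_during_timing(setup):
    """Return whether the garbage collector is enabled while stmt runs."""
    observed = []
    timeit.timeit(
        stmt="observed.append(gc.isenabled())",
        setup=setup,
        number=1,
        # Make gc and the result list visible to the timed code.
        globals={"gc": gc, "observed": observed},
    )
    return observed[0]


default_state = gc_enabled_during_timing("pass")
reenabled_state = gc_enabled_during_timing("gc.enable()")
print(default_state, reenabled_state)
```

The first call should report the collector as off during timing and the second as on, which matches the quoted paragraph.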

Enabling garbage collection, which is a necessity if you're going to be creating 2M requests, also bounds the memory usage:

    def test_render_performance(self):
        n = 2000000
        t = timeit.timeit(setup="gc.enable();from django.test import Client; c = Client(headers={'user-agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0'})",
                          stmt="c.get('/admin/')", number=n)
        print(f'{int(t * 1000/n)} msecs')
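Why disabling the collector matters can be sketched in isolation, assuming (as the discussion above suggests) that the per-request objects are held in reference cycles. Reference counting alone never frees a cycle, so with GC off the cycles pile up until a collection runs. The `Node` class and `cycles_left_behind` helper are hypothetical and stand in for the request objects.

```python
import gc


class Node:
    """Object that refers to itself, so reference counting alone cannot free it."""

    def __init__(self):
        self.me = self  # reference cycle


def cycles_left_behind(n):
    """Create n cyclic objects with GC disabled and count what a later collection finds."""
    was_enabled = gc.isenabled()
    gc.collect()  # start from a clean slate
    gc.disable()  # mimic what timeit does during timing
    try:
        for _ in range(n):
            Node()  # dropped immediately, but the cycle keeps it alive
        # An explicit collect() still works while automatic GC is disabled;
        # it returns the number of unreachable objects it found.
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()


found = cycles_left_behind(10_000)
print(found)
```

With automatic collection enabled, the same loop would be cleaned up periodically instead of accumulating every object until the end.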