Opened 4 years ago

Closed 4 years ago

#31419 closed Bug (invalid)

Django Postgres memory leak

Reported by: Francesco Meli Owned by: nobody
Component: Database layer (models, ORM) Version: 3.0
Severity: Normal Keywords: postgresl memory-leak
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Note: I also wrote the detailed issue here: https://stackoverflow.com/questions/60972577/django-postgres-memory-leak

I have a custom Django (v 2.0.0) management command that starts multi-threaded background job executors, and it seems to leak memory.

The command can be started like so:

./manage.py start_job_executer --thread=1

Each thread has a while True loop that picks up jobs from a PostgreSQL table.

In order to pick up the job and change the status atomically I used transactions:

# Atomic transaction to temporarily lock DB access and
# fetch the oldest job from the DB whose status column is pending
with transaction.atomic():
    job = Job.objects.select_for_update() \
        .filter(status=Job.STATUS['pending']) \
        .order_by('created_at').first()
    if job:
        job.status = Job.STATUS['executing']
        job.save()

It looks like the memory allocated by this custom Django command keeps growing.

Using tracemalloc I tried to find what is causing the memory leak by creating a background thread that checks the memory allocation:

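def check_memory(self):
    # Assumes tracemalloc.start() was called earlier and that tracemalloc,
    # time.sleep and a logger named log are available in this module.
    while True:
        s1 = tracemalloc.take_snapshot()
        sleep(10)
        s2 = tracemalloc.take_snapshot()
        # Log the ten source lines whose allocations changed the most.
        for alog in s2.compare_to(s1, 'lineno')[:10]:
            log.info(alog)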

This produced the following log:

01.04.20 13:50:06   operations.py:222: size=23.7 KiB (+23.7 KiB), count=66 (+66), average=367 B
01.04.20 13:50:36   operations.py:222: size=127 KiB (+43.7 KiB), count=353 (+122), average=367 B
01.04.20 13:51:04   operations.py:222: size=251 KiB (+66.7 KiB), count=699 (+186), average=367 B
01.04.20 13:51:31   operations.py:222: size=379 KiB (+68.9 KiB), count=1056 (+192), average=367 B
01.04.20 13:51:57   operations.py:222: size=495 KiB (+60.3 KiB), count=1380 (+168), average=367 B

It looks like /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:222 does not release memory.
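One thing worth keeping in mind when reading these numbers: tracemalloc reports the line where an object was allocated, not the place that still holds a reference to it. The following is a minimal standalone sketch of that effect, not Django code. The names query_log, render_query, run_queries and top_growth are hypothetical and exist only for this illustration.

```python
import tracemalloc
from collections import deque

# Hypothetical stand-in for some long-lived container; not Django code.
query_log = deque(maxlen=1000)


def render_query(i):
    # The string is allocated on this line, so tracemalloc attributes it here.
    return ("SELECT * FROM job WHERE id = %d" % i).encode().decode()


def run_queries(start, n):
    for i in range(start, start + n):
        # The reference that keeps each string alive is held here instead.
        query_log.append(render_query(i))


def top_growth(before, after, limit=3):
    """Return the largest positive size differences, grouped by source line."""
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()
run_queries(0, 500)
after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
for stat in growth:
    print(stat)
tracemalloc.stop()
```

In this sketch the growth should be reported against the return line of render_query, even though the strings stay alive only because query_log references them. A growing line in the report therefore points at where to start looking, and the container holding the objects may live elsewhere.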

The leak is slow with 1 thread, but with 8 threads it grows much faster:

01.04.20 13:07:51   operations.py:222: size=68.3 KiB (+68.3 KiB), count=191 (+191), average=366 B
01.04.20 13:08:56   operations.py:222: size=770 KiB (+140 KiB), count=2151 (+390), average=367 B
01.04.20 13:10:07   operations.py:222: size=1476 KiB (+138 KiB), count=4122 (+386), average=367 B

01.04.20 13:36:22   operations.py:222: size=17.3 MiB (+138 KiB), count=49506 (+385), average=367 B

01.04.20 13:48:16   operations.py:222: size=24.5 MiB (+136 KiB), count=69993 (+379), average=367 B

This is the code around line 222 in /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:

def last_executed_query(self, cursor, sql, params):
    # http://initd.org/psycopg/docs/cursor.html#cursor.query
    # The query attribute is a Psycopg extension to the DB API 2.0.
    if cursor.query is not None:
        return cursor.query.decode()  # this is line 222!
    return None

I have no clue how to attack this problem. Any ideas at all?

Thanks in advance

UPDATE

I was using Django 2.0 and upgraded to Django 3.0.5 (the latest stable release), but unfortunately the problem is still there.

The new logs are below:

01.04.20 20:15:06   operations.py:235: size=977 KiB (+53.9 KiB), count=2750 (+152), average=364 B
01.04.20 20:15:28   operations.py:235: size=1070 KiB (+50.1 KiB), count=3012 (+141), average=364 B
01.04.20 20:15:53   operations.py:235: size=1156 KiB (+43.7 KiB), count=3255 (+123), average=364 B
01.04.20 20:16:19   operations.py:235: size=1245 KiB (+44.7 KiB), count=3507 (+126), average=364 B

01.04.20 20:20:23   operations.py:235: size=2154 KiB (+44.3 KiB), count=6065 (+125), average=364 B

Change History (1)

comment:1 by Simon Charette, 4 years ago

Resolution: invalid
Status: new → closed

Django keeps a reference to all executed queries in a ring buffer when settings.DEBUG = True.

From the DEBUG documentation:

It is also important to remember that when running with DEBUG turned on, Django will remember every SQL query it executes. This is useful when you’re debugging, but it’ll rapidly consume memory on a production server.

Please reopen if you can reproduce without settings.DEBUG and please avoid using this bug tracker as a second tier support channel.
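To illustrate the ring buffer behaviour described in this comment, here is a minimal sketch using only the standard library. It is not Django's implementation: QUERIES_LIMIT, queries_log and record_query are hypothetical names, and the limit of 5 is chosen only to keep the example small.

```python
from collections import deque

# Assumed limit for illustration only; the real limit is not stated in this ticket.
QUERIES_LIMIT = 5

# A deque with maxlen acts as a ring buffer: once full, each append
# silently drops the oldest entry.
queries_log = deque(maxlen=QUERIES_LIMIT)


def record_query(sql):
    # Each recorded entry keeps the rendered SQL string alive.
    queries_log.append({"sql": sql})


for i in range(8):
    record_query("SELECT %d" % i)

print(len(queries_log))        # 5
print(queries_log[0]["sql"])   # SELECT 3
```

With a buffer like this, memory grows while the buffer fills and then levels off, and entries stay referenced until they are pushed out. In a long-running command, running with DEBUG = False or periodically calling django.db.reset_queries() should keep such a log from accumulating, although that was not confirmed in this ticket.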
