Opened 3 years ago

Last modified 8 days ago

#34746 assigned Cleanup/optimization

High CPU/memory consumption when a 5XX is raised with large local variables

Reported by: Rémi Dupré Owned by: Abdullah Mujahid
Component: Error reporting Version: 4.2
Severity: Normal Keywords:
Cc: jaffar Khan Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: yes UI/UX: no

Description (last modified by Rémi Dupré)

Hi!

In a view with large variable in its scope, if an exception is raised, the worker will freeze and its memory usage will raise. Here is a minimal example:

import sys
from django.urls import path

# 300MB
large_data = list(range(40 * 1024 * 1024))

def compute_lines(request):
    local_data = large_data
    raise ValueError

urlpatterns = [
    path("compute-lines/", compute_lines)
]


When calling /compute-lines/ you will notice the issue : the server will take a few seconds to responds while its memory raises by ~1GB.


After a bit of investigations, it turned out it comes from django.utils.log.AdminEmailHandler in the default logging config. This handler will eventually pretty print the whole context of each trace frame from here : https://github.com/django/django/blob/main/django/views/debug.py#L355

This implementation ensures that no more than 4kB are included in the formatted result, but the entire variable will still have to be rendered first. I'm not sure how it could be solved, but maybe https://docs.python.org/3/library/reprlib.html can be a clue?

Change History (27)

comment:1 by Rémi Dupré, 3 years ago

Description: modified (diff)
Summary: High CPU/memory consumtion when a 5xx is raised with a large local variableHigh CPU/memory consumption when a 5XX is raised with large local variables

comment:2 by Natalia Bidart, 3 years ago

Easy pickings: set
Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization

I have managed to reproduce the issue, thank you for the example. Indeed if v = pprint(v) is replaced by something that does not render the whole variable, the memory and CPU usage do not spike.

Accepting so we can evaluate whether an alternative could be used instead of pprint that would account for the size limit before printing.

comment:3 by Natalia Bidart, 3 years ago

After some experimentation following the suggested link to reprlib, perhaps something like this could be a good alternative? (it needs better code formatting, more reprlib limits sets and tests, of course)

  • django/views/debug.py

    a b class ExceptionReporter:  
    347347            self.template_does_not_exist = True
    348348            self.postmortem = self.exc_value.chain or [self.exc_value]
    349349
     350        import reprlib
     351        repr_instance = reprlib.Repr()
     352        repr_instance.maxstring = 5000
     353        repr_instance.maxlist = 1000
     354
    350355        frames = self.get_traceback_frames()
    351356        for i, frame in enumerate(frames):
    352357            if "vars" in frame:
    353358                frame_vars = []
    354359                for k, v in frame["vars"]:
    355                     v = pprint(v)
     360                    v = repr_instance.repr(v)
    356361                    # Trim large blobs of data
    357362                    if len(v) > 4096:
    358363                        v = "%s… <trimmed %d bytes string>" % (v[0:4096], len(v))

comment:4 by Vishesh Garg, 3 years ago

Owner: set to Vishesh Garg
Status: newassigned

comment:5 by Vishesh Garg, 3 years ago

Owner: Vishesh Garg removed

comment:6 by Vishesh Garg, 3 years ago

Owner: set to Vishesh Garg

comment:7 by Natalia Bidart, 3 years ago

Has patch: set
Needs tests: set
Patch needs improvement: set

comment:8 by Adam Johnson, 3 years ago

I think using reprlib is acceptable, this is exactly the kind of situation it’s designed for. It will be a shame to lose the pretty-printing aspect, but perhaps reprlib/pprint will grow appropriate functionality in the future.

comment:9 by Yash Kumar Verma, 2 years ago

Owner: changed from Vishesh Garg to Yash Kumar Verma

Hello folks, I saw the two pull requests that were attached in the ticket are stale, and the original author have both closed. I'm interesting in picking this as my first ticket, can I proceed?

comment:10 by Yash Kumar Verma, 2 years ago

I was:

  • able to verify the following
    • when exception is raised after declaring a large variable, memory usage shoots up.
    • when exception is not raised after declaring a large variable, memory remains in live with normal operations.

Setup I used to reproduce the error and suggested solution

  • added a middleware which logs the change in memory after each request.
  • created two views, one which throws error and one that does not.
  • observed the memory usage in both the cases.

Current

https://github.com/YashKumarVerma/django-pocs/blob/master/poc01/poc01/2023-12-03-14-18-23.png

After using reprlib

https://github.com/YashKumarVerma/django-pocs/blob/master/poc01/poc01/2023-12-03-14-35-23.png

comment:11 by Keerthi Vasan S A, 2 years ago

Owner: changed from Yash Kumar Verma to Keerthi Vasan S A

comment:12 by Keerthi Vasan S A, 2 years ago

Triage Stage: AcceptedReady for checkin

comment:13 by Mariusz Felisiak, 2 years ago

Triage Stage: Ready for checkinAccepted

You should not mark your own PRs as "ready for checkin".

comment:14 by Keerthi Vasan S A, 2 years ago

Ah sorry, I wanted to set this ticket as ready to be reviewed as mentioned by a contributor in the PR. How do I do that?

in reply to:  14 comment:15 by Natalia Bidart, 2 years ago

Replying to Keerthi Vasan S A:

Ah sorry, I wanted to set this ticket as ready to be reviewed as mentioned by a contributor in the PR. How do I do that?

Hi!

The proper procedure to get a re-review is described in the PR review checklist. Basically you need to unset the relevant flags in this ticket so the PR gets added back to the "patches needing review" section of the Django Developer Dahsboard.

comment:16 by Keerthi Vasan S A, 2 years ago

Gotcha, thank you!

comment:17 by Keerthi Vasan S A, 2 years ago

Needs tests: unset
Patch needs improvement: unset

comment:18 by Natalia Bidart, 2 years ago

Needs tests: set
Patch needs improvement: set

comment:19 by Ahmed Ibrahim, 18 months ago

Is this still being worked on or should I handle t?

in reply to:  19 comment:20 by Natalia Bidart, 18 months ago

Replying to Ahmed Ibrahim:

Is this still being worked on or should I handle t?

Hello Ahmed! Thanks for your eagerness to help. As we already discussed in the PR, this is still being work on and I'll provide more feedback in the PR.

comment:21 by Jitesh Nair, 9 months ago

Hi,

Is this ticket open to be taken over?

comment:22 by Abdullah Mujahid, 13 days ago

Owner: changed from Keerthi Vasan S A to Abdullah Mujahid

comment:23 by Abdullah Mujahid, 13 days ago

Hi, I'm taking over this ticket. I'll submit a fresh PR using reprlib from the standard library as suggested in the recent discussion. Since Django 6.0+ supports only Python 3.12+ we can use reprlib directly without compatibility concerns.

I'll include tests and address the previous feedback about consistent output representation.

comment:24 by Abdullah Mujahid, 13 days ago

Needs tests: unset

I've submitted a fresh PR addressing this issue: https://github.com/django/django/pull/20578

The implementation uses the standard library's reprlib.Repr() to limit variable representation size during generation, as suggested in the previous PR discussion. Since Django 6.0+ supports only Python 3.12+, we can use reprlib directly without compatibility concerns.

Changes include:

  • Replaced pprint with reprlib.Repr() in django/views/debug.py
  • Updated related tests
  • Added release note entry for 6.1

comment:25 by jaffar Khan, 9 days ago

As the latest PR has failures and no one is working on this ticket so I like to open a new PR.

comment:26 by jaffar Khan, 9 days ago

Cc: jaffar Khan added

comment:27 by Abdullah Mujahid, 9 days ago

I am already working on it

Version 0, edited 9 days ago by Abdullah Mujahid (next)
Note: See TracTickets for help on using tickets.
Back to Top