Context Navigation

← Previous Ticket
Next Ticket →

#34746 assigned Cleanup/optimization

High CPU/memory consumption when a 5XX is raised with large local variables

Reported by:	Rémi Dupré	Owned by:	Keerthi Vasan S A
Component:	Error reporting	Version:	4.2
Severity:	Normal	Keywords:
Cc:		Triage Stage:	Accepted
Has patch:	yes	Needs documentation:	no
Needs tests:	yes	Patch needs improvement:	yes
Easy pickings:	yes	UI/UX:	no

Description (last modified by Rémi Dupré)

Hi!

In a view with large variable in its scope, if an exception is raised, the worker will freeze and its memory usage will raise. Here is a minimal example:

import sys
from django.urls import path

# 300MB
large_data = list(range(40 * 1024 * 1024))

def compute_lines(request):
    local_data = large_data
    raise ValueError

urlpatterns = [
    path("compute-lines/", compute_lines)
]

When calling /compute-lines/ you will notice the issue : the server will take a few seconds to responds while its memory raises by ~1GB.

After a bit of investigations, it turned out it comes from django.utils.log.AdminEmailHandler in the default logging config. This handler will eventually pretty print the whole context of each trace frame from here : https://github.com/django/django/blob/main/django/views/debug.py#L355

This implementation ensures that no more than 4kB are included in the formatted result, but the entire variable will still have to be rendered first. I'm not sure how it could be solved, but maybe https://docs.python.org/3/library/reprlib.html can be a clue?

Change History (21)

comment:1 by Rémi Dupré, 2 years ago

Description:	modified (diff)
Summary:	High CPU/memory consumtion when a 5xx is raised with a large local variable → High CPU/memory consumption when a 5XX is raised with large local variables

comment:2 by Natalia Bidart, 2 years ago

Easy pickings:	set
Triage Stage:	Unreviewed → Accepted
Type:	Bug → Cleanup/optimization

I have managed to reproduce the issue, thank you for the example. Indeed if v = pprint(v) is replaced by something that does not render the whole variable, the memory and CPU usage do not spike.

Accepting so we can evaluate whether an alternative could be used instead of pprint that would account for the size limit before printing.

comment:3 by Natalia Bidart, 2 years ago

After some experimentation following the suggested link to reprlib, perhaps something like this could be a good alternative? (it needs better code formatting, more reprlib limits sets and tests, of course)

django/views/debug.py

                class ExceptionReporter:
             self.template_does_not_exist = True
             self.postmortem = self.exc_value.chain or [self.exc_value]
+        import reprlib
+        repr_instance = reprlib.Repr()
+        repr_instance.maxstring = 5000
+        repr_instance.maxlist = 1000
         frames = self.get_traceback_frames()
         for i, frame in enumerate(frames):
             if "vars" in frame:
                 frame_vars = []
                 for k, v in frame["vars"]:
                     v = pprint(v)
+                    v = repr_instance.repr(v)
                     # Trim large blobs of data
                     if len(v) > 4096:
                         v = "%s… <trimmed %d bytes string>" % (v[0:4096], len(v))

comment:4 by Vishesh Garg, 2 years ago

Owner:	set to Vishesh Garg
Status:	new → assigned

comment:5 by Vishesh Garg, 2 years ago

Owner:	Vishesh Garg removed

comment:6 by Vishesh Garg, 2 years ago

Owner:	set to Vishesh Garg

comment:7 by Natalia Bidart, 2 years ago

Has patch:	set
Needs tests:	set
Patch needs improvement:	set

comment:8 by Adam Johnson, 2 years ago

I think using reprlib is acceptable, this is exactly the kind of situation it’s designed for. It will be a shame to lose the pretty-printing aspect, but perhaps reprlib/pprint will grow appropriate functionality in the future.

comment:9 by Yash Kumar Verma, 20 months ago

Owner:	changed from Vishesh Garg to Yash Kumar Verma

Hello folks, I saw the two pull requests that were attached in the ticket are stale, and the original author have both closed. I'm interesting in picking this as my first ticket, can I proceed?

comment:10 by Yash Kumar Verma, 20 months ago

I was:

able to verify the following
- when exception is raised after declaring a large variable, memory usage shoots up.
- when exception is not raised after declaring a large variable, memory remains in live with normal operations.

Setup I used to reproduce the error and suggested solution

added a middleware which logs the change in memory after each request.
created two views, one which throws error and one that does not.
observed the memory usage in both the cases.

Current

After using reprlib

comment:11 by Keerthi Vasan S A, 18 months ago

Owner:	changed from Yash Kumar Verma to Keerthi Vasan S A

comment:12 by Keerthi Vasan S A, 18 months ago

Triage Stage:	Accepted → Ready for checkin

comment:13 by Mariusz Felisiak, 18 months ago

Triage Stage:	Ready for checkin → Accepted

You should not mark your own PRs as "ready for checkin".

follow-up: 15 comment:14 by Keerthi Vasan S A, 18 months ago

Ah sorry, I wanted to set this ticket as ready to be reviewed as mentioned by a contributor in the PR. How do I do that?

in reply to: 14 comment:15 by Natalia Bidart, 17 months ago

Replying to Keerthi Vasan S A:

Ah sorry, I wanted to set this ticket as ready to be reviewed as mentioned by a contributor in the PR. How do I do that?

Hi!

The proper procedure to get a re-review is described in the PR review checklist. Basically you need to unset the relevant flags in this ticket so the PR gets added back to the "patches needing review" section of the Django Developer Dahsboard.

comment:16 by Keerthi Vasan S A, 17 months ago

Gotcha, thank you!

comment:17 by Keerthi Vasan S A, 17 months ago

Needs tests:	unset
Patch needs improvement:	unset

comment:18 by Natalia Bidart, 17 months ago

Needs tests:	set
Patch needs improvement:	set

follow-up: 20 comment:19 by Ahmed Ibrahim, 12 months ago

Is this still being worked on or should I handle t?

in reply to: 19 comment:20 by Natalia Bidart, 11 months ago

Replying to Ahmed Ibrahim: