Opened 16 months ago

Last modified 3 months ago

#34746 assigned Cleanup/optimization

High CPU/memory consumption when a 5XX is raised with large local variables

Reported by: Rémi Dupré Owned by: Keerthi Vasan S A
Component: Error reporting Version: 4.2
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: yes
Easy pickings: yes UI/UX: no

Description (last modified by Rémi Dupré)

Hi!

In a view with large variable in its scope, if an exception is raised, the worker will freeze and its memory usage will raise. Here is a minimal example:

import sys
from django.urls import path

# 300MB
large_data = list(range(40 * 1024 * 1024))

def compute_lines(request):
    local_data = large_data
    raise ValueError

urlpatterns = [
    path("compute-lines/", compute_lines)
]


When calling /compute-lines/ you will notice the issue : the server will take a few seconds to responds while its memory raises by ~1GB.


After a bit of investigations, it turned out it comes from django.utils.log.AdminEmailHandler in the default logging config. This handler will eventually pretty print the whole context of each trace frame from here : https://github.com/django/django/blob/main/django/views/debug.py#L355

This implementation ensures that no more than 4kB are included in the formatted result, but the entire variable will still have to be rendered first. I'm not sure how it could be solved, but maybe https://docs.python.org/3/library/reprlib.html can be a clue?

Change History (20)

comment:1 by Rémi Dupré, 16 months ago

Description: modified (diff)
Summary: High CPU/memory consumtion when a 5xx is raised with a large local variableHigh CPU/memory consumption when a 5XX is raised with large local variables

comment:2 by Natalia Bidart, 16 months ago

Easy pickings: set
Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization

I have managed to reproduce the issue, thank you for the example. Indeed if v = pprint(v) is replaced by something that does not render the whole variable, the memory and CPU usage do not spike.

Accepting so we can evaluate whether an alternative could be used instead of pprint that would account for the size limit before printing.

comment:3 by Natalia Bidart, 16 months ago

After some experimentation following the suggested link to reprlib, perhaps something like this could be a good alternative? (it needs better code formatting, more reprlib limits sets and tests, of course)

  • django/views/debug.py

    a b class ExceptionReporter:  
    347347            self.template_does_not_exist = True
    348348            self.postmortem = self.exc_value.chain or [self.exc_value]
    349349
     350        import reprlib
     351        repr_instance = reprlib.Repr()
     352        repr_instance.maxstring = 5000
     353        repr_instance.maxlist = 1000
     354
    350355        frames = self.get_traceback_frames()
    351356        for i, frame in enumerate(frames):
    352357            if "vars" in frame:
    353358                frame_vars = []
    354359                for k, v in frame["vars"]:
    355                     v = pprint(v)
     360                    v = repr_instance.repr(v)
    356361                    # Trim large blobs of data
    357362                    if len(v) > 4096:
    358363                        v = "%s… <trimmed %d bytes string>" % (v[0:4096], len(v))

comment:4 by Vishesh Garg, 16 months ago

Owner: set to Vishesh Garg
Status: newassigned

comment:5 by Vishesh Garg, 16 months ago

Owner: Vishesh Garg removed

comment:6 by Vishesh Garg, 16 months ago

Owner: set to Vishesh Garg

comment:7 by Natalia Bidart, 16 months ago

Has patch: set
Needs tests: set
Patch needs improvement: set

comment:8 by Adam Johnson, 16 months ago

I think using reprlib is acceptable, this is exactly the kind of situation it’s designed for. It will be a shame to lose the pretty-printing aspect, but perhaps reprlib/pprint will grow appropriate functionality in the future.

comment:9 by Yash Kumar Verma, 12 months ago

Owner: changed from Vishesh Garg to Yash Kumar Verma

Hello folks, I saw the two pull requests that were attached in the ticket are stale, and the original author have both closed. I'm interesting in picking this as my first ticket, can I proceed?

comment:10 by Yash Kumar Verma, 12 months ago

I was:

  • able to verify the following
    • when exception is raised after declaring a large variable, memory usage shoots up.
    • when exception is not raised after declaring a large variable, memory remains in live with normal operations.

Setup I used to reproduce the error and suggested solution

  • added a middleware which logs the change in memory after each request.
  • created two views, one which throws error and one that does not.
  • observed the memory usage in both the cases.

Current

https://github.com/YashKumarVerma/django-pocs/blob/master/poc01/poc01/2023-12-03-14-18-23.png

After using reprlib

https://github.com/YashKumarVerma/django-pocs/blob/master/poc01/poc01/2023-12-03-14-35-23.png

comment:11 by Keerthi Vasan S A, 10 months ago

Owner: changed from Yash Kumar Verma to Keerthi Vasan S A

comment:12 by Keerthi Vasan S A, 10 months ago

Triage Stage: AcceptedReady for checkin

comment:13 by Mariusz Felisiak, 10 months ago

Triage Stage: Ready for checkinAccepted

You should not mark your own PRs as "ready for checkin".

comment:14 by Keerthi Vasan S A, 10 months ago

Ah sorry, I wanted to set this ticket as ready to be reviewed as mentioned by a contributor in the PR. How do I do that?

in reply to:  14 comment:15 by Natalia Bidart, 10 months ago

Replying to Keerthi Vasan S A:

Ah sorry, I wanted to set this ticket as ready to be reviewed as mentioned by a contributor in the PR. How do I do that?

Hi!

The proper procedure to get a re-review is described in the PR review checklist. Basically you need to unset the relevant flags in this ticket so the PR gets added back to the "patches needing review" section of the Django Developer Dahsboard.

comment:16 by Keerthi Vasan S A, 10 months ago

Gotcha, thank you!

comment:17 by Keerthi Vasan S A, 10 months ago

Needs tests: unset
Patch needs improvement: unset

comment:18 by Natalia Bidart, 9 months ago

Needs tests: set
Patch needs improvement: set

comment:19 by Ahmed Ibrahim, 4 months ago

Is this still being worked on or should I handle t?

in reply to:  19 comment:20 by Natalia Bidart, 3 months ago

Replying to Ahmed Ibrahim:

Is this still being worked on or should I handle t?

Hello Ahmed! Thanks for your eagerness to help. As we already discussed in the PR, this is still being work on and I'll provide more feedback in the PR.

Note: See TracTickets for help on using tickets.
Back to Top