Opened 7 years ago

Closed 2 years ago

Last modified 2 years ago

#12581 closed Cleanup/optimization (fixed)

connection.queries should use a ring buffer to avoid large memory consumption under runserver

Reported by: Rob Hudson Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: rob@…, bmispelon@…, pztrick44@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Currently, when running using runserver, each query is appended to a list. For long running runserver processes, this list can grow quite large.

Attachments (1)

12581.diff (3.0 KB) - added by Rob Hudson 7 years ago.
RingBuffer, based on connections.deque (Python 2.4+)

Download all attachments as: .zip

Change History (24)

Changed 7 years ago by Rob Hudson

Attachment: 12581.diff added

RingBuffer, based on connections.deque (Python 2.4+)

comment:1 Changed 7 years ago by Rob Hudson

Has patch: set
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset

comment:2 Changed 7 years ago by Rob Hudson

FYI: Original patch by Alex Gaynor, tests and ticket added by Rob Hudson.

comment:3 Changed 7 years ago by Russell Keith-Magee

milestone: 1.2
Triage Stage: UnreviewedAccepted
Version: 1.1

comment:4 Changed 7 years ago by James Bennett

milestone: 1.2

1.2 is feature-frozen, moving this feature request off the milestone.

comment:5 Changed 7 years ago by Andy Durdin

Note that the django debug toolbar’s sql panel does not work with this patch.

comment:6 Changed 6 years ago by Aymeric Augustin

This is a duplicate of #3711 (closed as wontfix) and #6734 (closed as duplicate).

Since the ticket was accepted by a core dev, I won't close it, though.

comment:7 Changed 5 years ago by Matt McClanahan

Severity: Normal
Type: New feature

comment:8 Changed 5 years ago by Julien Phalip

Patch needs improvement: set
Type: New featureCleanup/optimization

The tests would need to be rewritten using unittests since this is now Django's preferred way. Also, the datastructures tests seem to have been moved to source:django/trunk/tests/regressiontests/utils/datastructures.py

comment:9 Changed 5 years ago by Aymeric Augustin

UI/UX: unset

Change UI/UX from NULL to False.

comment:10 Changed 5 years ago by Aymeric Augustin

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:11 Changed 3 years ago by Baptiste Mispelon

Cc: bmispelon@… added
Needs tests: set
Patch needs improvement: unset
Version: master

The situation has changed a bit since the original patch was written. collections.deque now takes a maxlen argument in its constructor, making the need of a custom RingBuffer implementation obsolete.

The patch becomes much simpler too: only two lines need to be changed: https://github.com/bmispelon/django/compare/master...ticket-12581

Regarding the potential problems with django debug toolbar, I haven't looked into it very thoroughly but it seemed to work on my first attempt.

The original patch had tests for the RingBuffer implementation which are not needed anymore since we rely 100% on python's stdlib now.
We should still test the new feature somehow but I'm not quite sure how to achieve this.

I'll look into this a bit deeper but if anyone has ideas on how to implement the tests, or some test failures with django-debug-toolbar, I'd love some feedback.

comment:12 Changed 3 years ago by Baptiste Mispelon

I had to tweak django.test.utils.CapturedQueriesContext in order to get the test suite to run but after that, all the tests pass without further modification.

I also took a closer look at how this affects DDT and it really seems to run without any problem. It's not actually affected by the change since it has its own logic for counting queries (I tried setting the maxlen to 1 and DDT would still show the correct number of queries).

One thing I'm concerned about is hardcoding this maxlen value to 100, but I don't know how to do it in a more flexible manner (creating another setting is likely out of question).

I'll look into adding some tests and maybe some documentation in the upcoming days.

comment:13 Changed 3 years ago by Simon Charette

A maxlen of 100 looks like sensible default to me.

comment:14 Changed 3 years ago by Aymeric Augustin

Summary: connection.queries should use a ring buffer to avoid large memory consumption under runserverconnection.queries should use a ring buffer to avoid large memory consumption under runserverontrib.comments inline on Postgres 8.3 fails to cast integer to text

At least since magic-removal (which is the horizon of events as far as I am concerned), connections.queries is reset at the end of each request-response cycle by django.db.reset_queries. I fail to see how it could grow indefinitely. I must have missed something; could someone explain what problem this ticket is trying to solve and how to reproduce it?

comment:15 Changed 3 years ago by Aymeric Augustin

Summary: connection.queries should use a ring buffer to avoid large memory consumption under runserverontrib.comments inline on Postgres 8.3 fails to cast integer to textconnection.queries should use a ring buffer to avoid large memory consumption under runserver

Restore the summary...

comment:16 Changed 3 years ago by Carl Meyer

It can grow indefinitely when the Django ORM is used outside the request/response cycle, in long-running data-processing tasks.

comment:17 Changed 3 years ago by Aymeric Augustin

I understand that, however, the summary talks of runserver which doesn't exhibit this behavior.

If we want to protect against memory exhaustion only in long running processes, can we put a limit that's outside the realm of plausibility for a single request eg. 10 000?

I'm not wild about changing the type of connection.queries, but as long as the DDT works...

comment:18 Changed 3 years ago by Claude Paroz

#21245 has been marked as a duplicate (and includes a valid use case).

comment:19 Changed 3 years ago by Patrick Paul

Cc: pztrick44@… added

comment:20 Changed 2 years ago by Aymeric Augustin

The self.queries = deque(maxlen=100) doesn't survive when the query log is reset with connection.queries = []. The bulk_create tests do that.

comment:21 Changed 2 years ago by Aymeric Augustin

Needs tests: unset
Triage Stage: AcceptedReady for checkin

I'm going to complete this patch. However, an important aspect hasn't been considered yet. connection.queries is documented in faq/models.txt as "a list of dictionaries in order of query execution". We cannot change its type.

To preserve it, we can use a deque internally and provide a queries property that turns the deque into a list. The property should also emit a warning when the limit is reached and queries are truncated.

This requires making connection.queries read-only. I suspect some people use connection.queries = [] to reset queries, even though the documentation only talks about db.reset_queries(). But since there's a documented API, I don't feel too bad about breaking undocumented usage.

I've set a high limit to minimize chances that people will hit it -- if it's over 9000, you're really in trouble -- and I've made it a class attribute to make it possible to change.

Pull request: https://github.com/django/django/pull/2772

comment:22 Changed 2 years ago by Aymeric Augustin <aymeric.augustin@…>

Resolution: fixed
Status: newclosed

In cfcca7ccce3dc527d16757ff6dc978e50c4a2e61:

Fixed #3711, #6734, #12581 -- Bounded connection.queries.

Prevented unlimited memory consumption when running background tasks
with DEBUG=True.

Thanks Rob, Alex, Baptiste, and others.

comment:23 Changed 2 years ago by Aymeric Augustin <aymeric.augustin@…>

In fc31319752945d4d39f70c8278ecd42af240bc64:

Fixed test again. Refs #12581.

Note: See TracTickets for help on using tickets.
Back to Top