Opened 14 years ago

Closed 10 years ago

Last modified 10 years ago

#12581 closed Cleanup/optimization (fixed)

connection.queries should use a ring buffer to avoid large memory consumption under runserver

Reported by: Rob Hudson Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: rob@…, bmispelon@…, pztrick44@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Currently, when running using runserver, each query is appended to a list. For long running runserver processes, this list can grow quite large.

Attachments (1)

12581.diff (3.0 KB ) - added by Rob Hudson 14 years ago.
RingBuffer, based on connections.deque (Python 2.4+)

Download all attachments as: .zip

Change History (24)

by Rob Hudson, 14 years ago

Attachment: 12581.diff added

RingBuffer, based on connections.deque (Python 2.4+)

comment:1 by Rob Hudson, 14 years ago

Has patch: set

comment:2 by Rob Hudson, 14 years ago

FYI: Original patch by Alex Gaynor, tests and ticket added by Rob Hudson.

comment:3 by Russell Keith-Magee, 14 years ago

milestone: 1.2
Triage Stage: UnreviewedAccepted
Version: 1.1

comment:4 by James Bennett, 14 years ago

milestone: 1.2

1.2 is feature-frozen, moving this feature request off the milestone.

comment:5 by Andy Durdin, 14 years ago

Note that the django debug toolbar’s sql panel does not work with this patch.

comment:6 by Aymeric Augustin, 13 years ago

This is a duplicate of #3711 (closed as wontfix) and #6734 (closed as duplicate).

Since the ticket was accepted by a core dev, I won't close it, though.

comment:7 by Matt McClanahan, 13 years ago

Severity: Normal
Type: New feature

comment:8 by Julien Phalip, 13 years ago

Patch needs improvement: set
Type: New featureCleanup/optimization

The tests would need to be rewritten using unittests since this is now Django's preferred way. Also, the datastructures tests seem to have been moved to source:django/trunk/tests/regressiontests/utils/datastructures.py

comment:9 by Aymeric Augustin, 12 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:10 by Aymeric Augustin, 12 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:11 by Baptiste Mispelon, 11 years ago

Cc: bmispelon@… added
Needs tests: set
Patch needs improvement: unset
Version: master

The situation has changed a bit since the original patch was written. collections.deque now takes a maxlen argument in its constructor, making the need of a custom RingBuffer implementation obsolete.

The patch becomes much simpler too: only two lines need to be changed: https://github.com/bmispelon/django/compare/master...ticket-12581

Regarding the potential problems with django debug toolbar, I haven't looked into it very thoroughly but it seemed to work on my first attempt.

The original patch had tests for the RingBuffer implementation which are not needed anymore since we rely 100% on python's stdlib now.
We should still test the new feature somehow but I'm not quite sure how to achieve this.

I'll look into this a bit deeper but if anyone has ideas on how to implement the tests, or some test failures with django-debug-toolbar, I'd love some feedback.

comment:12 by Baptiste Mispelon, 11 years ago

I had to tweak django.test.utils.CapturedQueriesContext in order to get the test suite to run but after that, all the tests pass without further modification.

I also took a closer look at how this affects DDT and it really seems to run without any problem. It's not actually affected by the change since it has its own logic for counting queries (I tried setting the maxlen to 1 and DDT would still show the correct number of queries).

One thing I'm concerned about is hardcoding this maxlen value to 100, but I don't know how to do it in a more flexible manner (creating another setting is likely out of question).

I'll look into adding some tests and maybe some documentation in the upcoming days.

comment:13 by Simon Charette, 11 years ago

A maxlen of 100 looks like sensible default to me.

comment:14 by Aymeric Augustin, 11 years ago

Summary: connection.queries should use a ring buffer to avoid large memory consumption under runserverconnection.queries should use a ring buffer to avoid large memory consumption under runserverontrib.comments inline on Postgres 8.3 fails to cast integer to text

At least since magic-removal (which is the horizon of events as far as I am concerned), connections.queries is reset at the end of each request-response cycle by django.db.reset_queries. I fail to see how it could grow indefinitely. I must have missed something; could someone explain what problem this ticket is trying to solve and how to reproduce it?

comment:15 by Aymeric Augustin, 11 years ago

Summary: connection.queries should use a ring buffer to avoid large memory consumption under runserverontrib.comments inline on Postgres 8.3 fails to cast integer to textconnection.queries should use a ring buffer to avoid large memory consumption under runserver

Restore the summary...

comment:16 by Carl Meyer, 11 years ago

It can grow indefinitely when the Django ORM is used outside the request/response cycle, in long-running data-processing tasks.

comment:17 by Aymeric Augustin, 11 years ago

I understand that, however, the summary talks of runserver which doesn't exhibit this behavior.

If we want to protect against memory exhaustion only in long running processes, can we put a limit that's outside the realm of plausibility for a single request eg. 10 000?

I'm not wild about changing the type of connection.queries, but as long as the DDT works...

comment:18 by Claude Paroz, 10 years ago

#21245 has been marked as a duplicate (and includes a valid use case).

comment:19 by patrick, 10 years ago

Cc: pztrick44@… added

comment:20 by Aymeric Augustin, 10 years ago

The self.queries = deque(maxlen=100) doesn't survive when the query log is reset with connection.queries = []. The bulk_create tests do that.

comment:21 by Aymeric Augustin, 10 years ago

Needs tests: unset
Triage Stage: AcceptedReady for checkin

I'm going to complete this patch. However, an important aspect hasn't been considered yet. connection.queries is documented in faq/models.txt as "a list of dictionaries in order of query execution". We cannot change its type.

To preserve it, we can use a deque internally and provide a queries property that turns the deque into a list. The property should also emit a warning when the limit is reached and queries are truncated.

This requires making connection.queries read-only. I suspect some people use connection.queries = [] to reset queries, even though the documentation only talks about db.reset_queries(). But since there's a documented API, I don't feel too bad about breaking undocumented usage.

I've set a high limit to minimize chances that people will hit it -- if it's over 9000, you're really in trouble -- and I've made it a class attribute to make it possible to change.

Pull request: https://github.com/django/django/pull/2772

comment:22 by Aymeric Augustin <aymeric.augustin@…>, 10 years ago

Resolution: fixed
Status: newclosed

In cfcca7ccce3dc527d16757ff6dc978e50c4a2e61:

Fixed #3711, #6734, #12581 -- Bounded connection.queries.

Prevented unlimited memory consumption when running background tasks
with DEBUG=True.

Thanks Rob, Alex, Baptiste, and others.

comment:23 by Aymeric Augustin <aymeric.augustin@…>, 10 years ago

In fc31319752945d4d39f70c8278ecd42af240bc64:

Fixed test again. Refs #12581.

Note: See TracTickets for help on using tickets.
Back to Top