Code

Opened 19 months ago

Closed 19 months ago

Last modified 19 months ago

#19519 closed Bug (fixed)

Django fails to close db connections at end of request-response cycle

Reported by: amosonn@… Owned by: aaugustin
Component: Database layer (models, ORM) Version: master
Severity: Release blocker Keywords: connection, managment, closing, mysql, innodb, signal, generator
Cc: apollo13 Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Under some circumstances (I'll get to guesses about this in the end), the db connection isn't closed at the end of the request, and resurfaces later on to service another request. Thus, one request writes to the database, and then the next request tries to look for that data and fails, because it uses an outdated connection that hasn't yet been notified of the db change (per pep-249, different connections aren't required to see the same data.) The workaround solution for this is to add a django.db.connection.close() before that query, which brings to mind bug #13533. Similarly, This bug manifests on mysql innodb; but I haven't checked other databases, so this might just be coincidence. It is also possible that the non-closing connection bug exists for all back-ends, but that auto-commit isn't enough to prevent the damage only on mysql innodb.

The situation under which I've seen these connections retained is when sending an HttpResponse with a generator for content. It might be that when sending such content, the following lines (django.db:44-47)

def close_connection(**kwargs):
    for conn in connections.all():
        conn.close()
signals.request_finished.connect(close_connection)

which are supposed to make sure the connection closes, don't work, because the signal isn't raised. It is, however, still a riddle how the old connection gets attached to a new request. Bug #19117 might also be related to this.

Attachments (3)

testing.tar (50.0 KB) - added by apollo13 19 months ago.
fix_19519.diff (1.6 KB) - added by apollo13 19 months ago.
19519-fix-live-server-test-case.patch (3.5 KB) - added by aaugustin 19 months ago.

Download all attachments as: .zip

Change History (28)

comment:1 Changed 19 months ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

I remember making some fixes in this area when I introduced explicit support for streaming responses. If this is indeed related to streaming responses it's most likely fixed in master.

comment:2 Changed 19 months ago by apollo13

Raising this issue to "release blocker" since streaming responses are new in 1.5 and somewhat broken in that regard! I attached a project which demonstrates the issue, see the following server log:

<sqlite3.Connection object at 0x17234b8>
------------------------
() {'signal': <django.dispatch.dispatcher.Signal object at 0x149e1d0>, 'sender': <class 'django.core.handlers.wsgi.WSGIHandler'>}
------------------------
[29/Dec/2012 03:54:12] "GET /testing/ HTTP/1.1" 200 15


<sqlite3.Connection object at 0x17234b8>
------------------------
() {'signal': <django.dispatch.dispatcher.Signal object at 0x149e1d0>, 'sender': <class 'django.core.handlers.wsgi.WSGIHandler'>}
------------------------
None
request already finished: True
[29/Dec/2012 03:54:16] "GET /testing_generator/ HTTP/1.1" 200 18

What does this mean? When using StreamingHttpResponse the request_finished signal is sent before the response is actually generated, meaning database connections are already closed and a new query inside the iterator would cause a new connection to open.

A fix might be something along the lines of http://werkzeug.pocoo.org/docs/wsgi/#werkzeug.wsgi.ClosingIterator -- eg returning the response and issuing the request_finished signal after the upstream wsgi gateway exhausted the iterator.

Sidenote: The attached project is a sample project on purpose and not a testcase against Django's testsuite since I didn't check what the TestClient does and how it would affect things.

Last edited 19 months ago by apollo13 (previous) (diff)

Changed 19 months ago by apollo13

comment:3 Changed 19 months ago by apollo13

  • Cc apollo13 added
  • Severity changed from Normal to Release blocker
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.4 to master

comment:4 Changed 19 months ago by amosonn@…

It is probably best to make sure this works for iterators passed to regular HttpResponse objects as well, even though it is deprecated.

comment:5 Changed 19 months ago by aaugustin

The request_finished signal should be hooked on the close() method of the WSGI iterable.

As pointed out in comment 4 this was already broken in the old-style streaming responses.

comment:6 follow-up: Changed 19 months ago by akaariai

Do we have a problem with transactions here? The transaction used for the request could be closed before the request is actually generated. If this is the case, then documenting this with "if using transaction middleware the transaction used for the request will be closed before the request is generated" seems like the easy way out. Another option is to add hooks for request finished with success/exception, but this is out of scope for 1.5.

comment:7 Changed 19 months ago by apollo13

Yes, the transaction middleware will also commit the transaction before the response is generated. Same goes for decorators like commit_on_success. But independent of that, the whole connection is closed before the response is generated.

comment:8 Changed 19 months ago by apollo13

Ok, so I attached an initial patch. Anssi provided some valuable feedback: He suggested that we only do the .close dance if we are actually working with a StreamingResponse to keep the patch backwards-compatible (although we don't know if it is backwards-incompatible yet). I somewhat agree with that, but on the other hand I'd like to handle both cases in the same way.

Next question is how to handle streaming responses in the test client. Currently they are not consumed at all and Anssi suggested to consume them always (probably add a flag later on to prevent consuming them, but that's probably not needed for 1.5).

Last thing: I am still thinking how to test that since the TestClient uses it's own WSGI handler which does stuff a bit different (it actually disconnects the close_connections receiver before it fires request_finished ;)).

Any thoughts from someone with more knowledge of the WSGI protocol than me would be appreciated :)

Version 0, edited 19 months ago by apollo13 (next)

comment:9 follow-up: Changed 19 months ago by apollo13

Related: http://bugs.python.org/issue16220 -- We either have to raise the required python versions or fix it in our wsgiref subclass (probably the later since Python didn't fix it in 2.6, unless it wasn't broken which I doubt).

comment:10 in reply to: ↑ 6 Changed 19 months ago by aaugustin

Replying to akaariai:

Do we have a problem with transactions here? The transaction used for the request could be closed before the request is actually generated.

This is less likely to bite people in real life. There are obvious uses cases for reading from the database while generating a streaming response (eg. huge CSV exports); I don't see writing to the database as being as common.

It's tracked by #5241, and I don't think it is a release blocker.

comment:11 in reply to: ↑ 9 Changed 19 months ago by aaugustin

Replying to apollo13:

Related: http://bugs.python.org/issue16220 -- We either have to raise the required python versions or fix it in our wsgiref subclass (probably the later since Python didn't fix it in 2.6, unless it wasn't broken which I doubt).

This fix will be available in Python 2.7.4, which isn't released yet.

If we want to fix this by triggering request_finished in the WSGI iterable's close() method we must ensure it's reliably closed.

I've re-opened #16241.

Changed 19 months ago by apollo13

comment:12 Changed 19 months ago by aaugustin

  • Has patch set
  • Needs tests set
  • Owner changed from nobody to aaugustin
  • Status changed from new to assigned

comment:13 Changed 19 months ago by aaugustin

  • Needs tests unset
  • Triage Stage changed from Accepted to Ready for checkin

This isn't testable within Django's test framework because the test client explicitly disables closing the database connection at the end of each request (since #8138).

I've used Florian's test project to validate manually that the pull request fixes the issue.

Per PEP 3333, "the close() method requirement is to support resource release by the application". Currently, Django uses the request_finished signal to close the connections to the databases and caches. That's exactly the purpose of close(), and therefore I think it's a good idea to send the signal from there.

comment:14 follow-up: Changed 19 months ago by akaariai

I am wondering if pushing the request_finished signal to late stages is actually safe to do. Question no. 2 from pep 3333 http://www.python.org/dev/peps/pep-3333/#questions-and-answers says that:

If we supported only the iteration approach, then current frameworks
that assume the availability of "push" suffer. But, if we only support
pushing via write(), then server performance suffers for transmission
of e.g. large files (if a worker thread can't begin work on a new
request until all of the output has been sent). Thus, this compromise
allows an application framework to support both approaches, as
appropriate, but with only a little more burden to the server
implementor than a push-only approach would require.

To me this suggest it is possible a different thread will finish the request and/or there is another request processed at the same time by the same thread. This would mean that we could be closing connections for different threads or different requests. I hope I am 100% wrong here.

If the above is the case, this is going to get a lot more complex. The connection handling is bound to threads, not requests. What we could do is detach the connection from the thread at the old request_finished point, and then when the request is actually finished (the new request_finished point) we could close the connection.

comment:15 in reply to: ↑ 14 Changed 19 months ago by aaugustin

The real question is: does any WSGI server actually used to serve real websites sport this behavior?

comment:16 Changed 19 months ago by aaugustin

Summary of an IRC discussion:

  • We believe that current implementations of this idea (eg. gunicorn's async workers) use green threads that honor Python's threading API; the database connection is a green thread local.
  • If that weren't true, Django would break horribly anyway, independently of streaming responses.
  • The way gevent solves blocking I/O makes the PEP's point moot; it makes the write() API just as "pseudo-non-blocking" as the iterator API.

comment:17 Changed 19 months ago by aaugustin

If we want to avoid changing the semantic of request_finished we can also introduce a new signal. We'd have to find a not-too-confusing name.

comment:18 Changed 19 months ago by apollo13

-1 on a new signal, I'd say ship it as is.

comment:19 Changed 19 months ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In fd1279a44df3b9a837453cd79fd0fbcf81bae39d:

[1.5.x] Fixed #19519 -- Fired request_finished in the WSGI iterable's close().

Backport of acc5396.

comment:20 Changed 19 months ago by Aymeric Augustin <aymeric.augustin@…>

In acc5396e6d0ac49ae9dc6abc08903b81e6553199:

Fixed #19519 -- Fired request_finished in the WSGI iterable's close().

comment:21 Changed 19 months ago by aaugustin

  • Resolution fixed deleted
  • Status changed from closed to new

My fix makes servers.LiveServerDatabase.test_fixtures_loaded hang under PostgreSQL.

comment:22 Changed 19 months ago by aaugustin

LiveServerTestCase needs request_finished to close connections; disabling it globally for tests was wrong.

I'm attaching a patch that's supposed to fix the problem...

Changed 19 months ago by aaugustin

comment:23 Changed 19 months ago by apollo13

That's at least here not the case:

Installed 0 object(s) from 0 fixture(s)
test_test_test (regressiontests.servers.tests.LiveServerAddress) ... ok
test_database_writes (regressiontests.servers.tests.LiveServerDatabase) ... ok
test_fixtures_loaded (regressiontests.servers.tests.LiveServerDatabase) ... ok
test_404 (regressiontests.servers.tests.LiveServerViews) ... ok
test_environ (regressiontests.servers.tests.LiveServerViews) ... ok
test_media_files (regressiontests.servers.tests.LiveServerViews) ... ok
test_static_files (regressiontests.servers.tests.LiveServerViews) ... ok
test_view (regressiontests.servers.tests.LiveServerViews) ... ok

----------------------------------------------------------------------
Ran 8 tests in 2.356s

comment:24 Changed 19 months ago by apollo13

  • Resolution set to fixed
  • Status changed from new to closed

So the tests did work for me cause I already had the fix which github fails to display in the history *rage*
https://github.com/django/django/commit/56e54727661bc34bd2b6f9fa6a75f5370149256e and https://github.com/django/django/commit/bacb097ac31af2479c43934747b34fad7c91f55c

Did I already say that this is absolutely githubs fault? :þ

comment:25 Changed 19 months ago by apollo13

Note: This also broke MySQL but the tests never got this far since the Postgres builds blocked the queue.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.