Opened 4 years ago

Closed 4 years ago

#17251 closed Cleanup/optimization (fixed)

select_for_update tests threads should close connections manually

Reported by: akaariai Owned by: aaugustin
Component: Uncategorized Version:
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This is a small patch which closes two connections taken in threads.

It is good to close the conncetions manually, as they are thread locals in DatabaseWrapper, and thus new connections are created for new threads.

The connections will be closed by GC eventually, but IMHO it is good practice to close the connections manually, as they will not be used again anyways. They will be idle in transaction until they happen to be GCed.

Attachments (1)

17251.diff (1.1 KB) - added by akaariai 4 years ago.

Download all attachments as: .zip

Change History (7)

Changed 4 years ago by akaariai

comment:1 Changed 4 years ago by akaariai

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

The GCed part of the description is somewhat wrong. Threadlocals do not get GCed when the thread goes away. However it seems they do get GCed if the thread has gone away and _another_ thread is then accessing the same thread-local object. Or at least this is my working hypothesis based on the fact that the connection gets closed on this line in the transactions.py, managed() [L111]:

     connection.managed(flag)

And this is ran from main thread! I assume this is the first time the default_db_alias' DatabaseWrapper object (which is thread-local) is used after the another thread has gone away. I assume Python GCes the state of the other thread because it has gone away, and it sees it only when the object is accessed.

If you want to try tracking this down, I have a branch where you can run:

python runtests.py --settings=test_pgsql select_for_update.SelectForUpdateTests.test_nowait_raises_error_on_block

Then wait until you get into PDB. Begin stepping through the code. Note when the 'I got deleted' for thread 1 is printed. Note also which thread you are currently in. It is also good to use postgresql and set log_connections / log_disconnections to ON, and also log at least the backend pid in the log_line_prefix (%p).

I think this enforces the point that it would be good to close the connection in the tester threads. The current behavior seems to be based on pure luck :) I am also beginning to see why some of the core developers don't like thread-locals...

The test code for this is at:
https://github.com/akaariai/django/tree/connection_close

comment:2 Changed 4 years ago by akaariai

I did some more digging into when separate thread connections are closed. It happens to be so that the current behaviour relies on DatabaseWrapper being thread-local. The problem is that the chain from django.db.connection to the real underlying connection (for example psycopg2 connection) contains cycles. Because of the cycles, the connections will not be GCed (and thus closed) immediately when last reference to them is closed. It seems that an object which is threading.local subclass has a bit different rules to when it will be GCed (or more precisely, when its __dict__ will be collected). This makes select_for_update tests leave open connections behind if you move the threading.local from DatabaseWrapper to the ConnectionHandler._connections dictionary. This also has the possibility to make for some really hard to debug problems if Python (or PyPy or Jython or...) decides to change the rules how GCing is implemented.

Anyways, it would be a good idea to always close connections in separate threads. The abovementioned patch does this for select_for_update tests. Some method decorators would be useful so that you could easily ensure your connections are closed when running command line tasks, or when creating separate threads. I think it would also be a good idea to break the cyclic references in DatabaseWrapper (for example DatabaseWrapper -> DatabaseFeatures -> DatabaseWrapper). This will however break every external backend... If this is not done, at least document the gotchas in this area of the code.

Additional note: do not use __del__ to check when the objects are GCed, this will make any cyclic reference uncollectable by the GC. I happen to know this isn't nice to debug... :)

But most importantly: it seems it is essential connections for separate threads are closed manually in the current Django implementation. So, lets do that for select_for_update tests.

comment:3 Changed 4 years ago by aaugustin

  • Triage Stage changed from Unreviewed to Accepted

comment:4 Changed 4 years ago by akaariai

I don't think there is much to do in this ticket. The monologue above about breaking cycles in DatabaseWrapper is not relevant, as it would break every external backend, and it doesn't belong into this ticket anyways.

The patch is pretty trivial, so I think this is ready for commit. But as it is my patch, I can't set ready for checkin...

comment:5 Changed 4 years ago by aaugustin

  • Owner changed from nobody to aaugustin

Yes, I noticed the patch is quite straightforward. I'll review it.

comment:6 Changed 4 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

In [17195]:

Fixed #17251 -- In the select_for_update tests, close manually database connections made in threads, so they don't stay "idle in transaction" until the GC deletes them. Thanks Anssi Kääriäinen for the report and patch.

Note: See TracTickets for help on using tickets.
Back to Top