Opened 4 years ago
Closed 8 months ago
#32409 closed New feature (worksforme)
TestCase async tests are not transaction-aware
Reported by: | David | Owned by: | |
---|---|---|---|
Component: | Testing framework | Version: | 3.1 |
Severity: | Normal | Keywords: | TestCase, async, transaction |
Cc: | Andrew Godwin, Carlton Gibson | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
SimpleTestCase
wraps each async test with async_to_sync
(see commit).
When using TestCase
this results in problems while accessing the database via sync_to_async
(or with channels database_sync_to_async
). Since under-the-hood aync_to_sync
runs the async code in an other thread and savepoints (aka: transactions) are thread-bound what happens is that those tests do not interact with the actual ("expected") database which is wrapped in a transaction bound to main thread.
Change History (22)
comment:1 by , 4 years ago
Cc: | added |
---|
follow-up: 12 comment:2 by , 4 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Type: | Bug → New feature |
comment:4 by , 4 years ago
Hi Carlton,
I did not think about nested sync_to_async
and async_to_sync
could cause issue.
My use case is the following: I have a websocket users use to acquire and release "lock" on a specific database row, the "lock" is also a database table which connects user and the other one. Database access is required by websocket since it needs to verify that the requested element was not already locked by an other user (and this is the failing test).
My backend was build with Django v2.2 and Channels v1 (using also multiplexing).
I am now moving to Django v3 which has forced me also to update Channels since otherwise there would be a conflict with asgiref
dependency.
Since from Channels v1 to v2 (also v3) there are many breaking changes I have been forced to rewrite my unittests. Reading Django 3.1 supporting async testing I thought to try it out (since I am not familiar with pytest fixtures and database presets) and migrate my existing tests.
I had a similar issue with an other websocket, which was due to authorization (since I use a permission-check mechanism to accept websocket connections), which was work-arounded by pre-fetching permissions and test user in TestCase
:
class ConsumerTestCase(TestCase): fixtures = ['test_authorization', ] @classmethod def setUpTestData(cls): user = User.objects.get_by_natural_key('other') assign_perm('app.model', user) # from guardian.shortcuts import assign_perm # force perm cache to be evaluated here # without this even if DB is in sync there could be a miss-read due to transactions user.has_perm('app.model') cls.user = user async def test_connect(self): application = AuthMiddlewareStack(WebsocketConsumer.as_asgi()) communicator = WebsocketCommunicator(application, '/test') communicator.scope['user'] = self.user connected, _subprotocol = await communicator.connect() self.assertTrue(connected)
comment:5 by , 4 years ago
Using debugger I noticed this behaviour:
TestCase | | async_to_sync-> Thread (test) |--------------------| | | | | (database_)sync_to_async ->Thread (ORM) | |--------------------------------|
Since DB connection (along with savepoint/transaction references) resides in the main thread it is hidden from re-synced code.
This behaviour could be fine if there is not any open transaction, since every thread can hit the database independently. However, like in this case, when the main thread opens a transaction this info is not trasmitted through async-thread-chain thus resulting in reading a "wrong" state from the database.
It could be addressed by making async_to_sync
and sync_to_async
transaction-aware. I do not know if this can be achieved without an async-ready database client.
comment:6 by , 4 years ago
I do not know if this can be achieved without an async-ready database client.
Yes, That's kind of where I'm at with it. :| — This is a Difficult Problem™.
In traditional request-response scenarios we can reasonably assume that there's a single transaction in a single thread and not worry about what goes on outside that. Once there are multiple threads in play you start wanting (that's not the right word perhaps) to think about isolation levels, how you'll handle conflicts and so on, and what's right in that space for your application (which may not be general). At the moment we're constrained by the ORM not having been built with any in mind at all, so the required safety wheels do impinge.
For now, can you try writing this particular test with pytest-asyncio? That should skip the outer async_to_sync(), which should/may hopefully mean that the sync_to_async call is run on the outermost thread. (Without sitting down with the exact test and the debugger it's hard to say exactly.)
comment:7 by , 4 years ago
I will try to move this test under pytest.
Then I will try to build a sample project with a simplified version of my actual test for further analysis.
comment:8 by , 4 years ago
Here is a demo project with the use-case I described before: https://github.com/sevdog/test-websocker-user-lock.
I have setup both py-test (however something is still missing here, since they work if run one-by-one but fails when run in sequence, I should learn more about this library) and django testcase.
comment:9 by , 4 years ago
I'm not sure quite why this is happening - any recent release of asgiref
ensures that every sync_to_async call runs in the same thread as we now default thread-sensitive mode to on; precisely to stop this kind of behaviour.
Can you verify you're seeing this with the most recent asgiref release?
comment:10 by , 4 years ago
Hi Andrew, in both my main project and test-one I use asgiref==3.3.1
which, as of now, is the latest available on pypi (Nov 2020).
I will give a try with latest commit from github to see if this issue persists in the main project.
Thank you.
comment:11 by , 4 years ago
No, that's plenty new, the change happened in 3.3.0. In that case, I think the best course of investigation here is why the outer sync thread with the transaction in it, and the inner sync thread the ORM runs in, are different. I can take a look at that in a few weeks, but unfortunately I'm pretty busy at work right now so I can't do it immediately.
If anyone else is interested in doing that investigation, the thing I normally do is go edit the Django source code at relevant points and make it print the current thread ID, then run a test and see what thread IDs get printed out during it.
comment:12 by , 4 years ago
I have added thead-id logging to my test app and unittests.
Looking at connect test It seems that in some way with django TestCase
an extra thread gets spawned.
test-setup 140537656747840 test 140537592018688 db-connection-2 140537583625984 --- error gets thrown
While the corresponding test with pytest does not.
fixture-sync 140584466261824 fixture-sync 140584466261824 fixture-async 140584466261824 db-connection-2 140584401794816 connect 140584466261824 db-connection-1 140584401794816 test 140584466261824 db-connection-4 140584401794816 disconnect 140584466261824 --- test successfull
I checked against the test case provided by Carlton and seems that it passes with asgiref==3.3.1
. I do not know if ApplicationCommunicator
from asgiref
are bound in some way to this issue.
I will investigate where the source of the problem lies and create an appropriate test case for it.
comment:13 by , 4 years ago
Looking deeper at pytest integration I found that in my case it does not behave like TestCase
but like TransactionTestCase
.
When using TransactionTestCase
all my unit-tests pass.
I have also add the "simple-fetch" test suggested by Carlton to my sample project tests along with automated testing to better see what is going on (results can be see here).
Usin sqlit3 tests fail due to this error: django.db.utils.OperationalError: database table is locked: ws_lock_grouptypevisibility
. From what I have found this error is due to a concurrent connection with an uncommitted-transaction.
So the whole problem seems to be bound to transactions and how they behave with async code. In a normal/production working flow this may not be a problem, but while running tests it is.
comment:14 by , 4 years ago
I cannot reproduce this issue with Django's TestCase
or TransactionTestCase
. I tried with SQLite and PostgreSQL, and asgiref 3.3.0 and 3.3.1. Puzzled.
follow-up: 16 comment:15 by , 4 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
Closing as "needsinfo", since we don't have a reproducible scenario.
comment:16 by , 4 years ago
Replying to Mariusz Felisiak:
Closing as "needsinfo", since we don't have a reproducible scenario.
Hi Mariusz, sorry to not have replied earlier but I had provided a complete test scenario in this repository: https://github.com/sevdog/test-websocker-user-lock
Replying to David:
Here is a demo project with the use-case I described before: https://github.com/sevdog/test-websocker-user-lock.
I have setup both py-test (however something is still missing here, since they work if run one-by-one but fails when run in sequence, I should learn more about this library) and django testcase.
comment:17 by , 4 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
comment:18 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:19 by , 3 years ago
I'm faced with a similar OperationalError
whilst testing an async middleware using TestCase
.
The db accessing method in the middleware is decorated with @database_sync_to_async
from Channels. Package versions are django==3.2.8
channels==3.0.4
, asgiref==3.4.1
and pytest==6.2.5
.
Decorating the async test case with both @pytest.mark.asyncio
and @async_to_sync
seems to make the problem go away.
comment:20 by , 3 years ago
I have faced the same issue when writing tests for API implemented using strawberry-graphql-django
.
Strawberry converts django calls using sync_to_async(func, thread_sensitive=True)
.
Initially, I've converted tests to fully async ones, for example:
@pytest.mark.django_db async def test_currencies(setup_for_currency): schema = gql.Schema(query=Query) result = await schema.execute( """ query { currency { currencies { name } } } """ )
As a result, tests based on TestCase
don't work, while TransactionTestCase
based do work. This happens because fixture setup_for_currency
and TestCase
setup and tear-down are executed against a different DB connection.
However, if I change upper-level test-case to synchronous one, everything works!
@pytest.mark.django_db def test_currencies(setup_for_currency): schema = gql.Schema(query=Query) result = async_to_sync(schema.execute)( """ query { currency { currencies { name } } } """ )
Looks like this happens according to asgiref
documentation:
If the outermost program is async (i.e. SyncToAsync is outermost), then
this will be a dedicated single sub-thread that all sync code runs in,
one after the other. If the outermost program is sync (i.e. AsyncToSync is
outermost), this will just be the main thread. This is achieved by idling
with a CurrentThreadExecutor while AsyncToSync is blocking its sync parent,
rather than just blocking.
However, this behavior is still very confusing, I have no idea if this could be fixed.
Using TransactionTestCase
everywhere is a no-no for large projects, but my solution still looks like a hack.
comment:21 by , 2 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:22 by , 8 months ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
Comimg back here after some time and now I have tested that thanks to asgiref
and django
updates this problems seems to have been solved: my example repo now shows how everithing works fine in latest run.
At the moment I do not have enough time to deep investigate how or when it was solved, i belive this can be closed now.
Hi David. Thanks for the report.
The following diff gives an
OperationalError
database table is locked, so yes.I'm inclined to think this a documentation issue — that nested sync_to_async calls are not yet supported — plus a new feature request to add that support.
i.e. that the usage (whilst presumably reasonable) is beyond where Django's async support has yet reached. I will provisionally accept on that basis, but happy if we want to adjust that — the alternative is to take it as a release blocker on 3.1, but without a particular fix in mind, I'm not sure if that's realistic.
Can I ask what the use-case is? What are you trying to do using this pattern?
It looks like a tough one to actually address until we can get the async wrapper layer around the ORM. 🤔