Opened 3 years ago

Last modified 16 months ago

#32409 new New feature

TestCase async tests are not transaction-aware

Reported by: David Owned by:
Component: Testing framework Version: 3.1
Severity: Normal Keywords: TestCase, async, transaction
Cc: Andrew Godwin, Carlton Gibson Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

SimpleTestCase wraps each async test with async_to_sync (see commit).

When using TestCase this results in problems while accessing the database via sync_to_async (or with channels database_sync_to_async). Since under-the-hood aync_to_sync runs the async code in an other thread and savepoints (aka: transactions) are thread-bound what happens is that those tests do not interact with the actual ("expected") database which is wrapped in a transaction bound to main thread.

Change History (21)

comment:1 by Mariusz Felisiak, 3 years ago

Cc: Andrew Godwin Carlton Gibson added

comment:2 by Carlton Gibson, 3 years ago

Triage Stage: UnreviewedAccepted
Type: BugNew feature

Hi David. Thanks for the report.

The following diff gives an OperationalError database table is locked, so yes.

(django) ~/Documents/Django-Stack/django/tests (master)$ GIT_PAGER="" git diff
diff --git a/tests/async/tests.py b/tests/async/tests.py
index 783fbdd080..076cd9bc1d 100644
--- a/tests/async/tests.py
+++ b/tests/async/tests.py
@@ -2,11 +2,11 @@ import os
 import sys
 from unittest import mock, skipIf
 
-from asgiref.sync import async_to_sync
+from asgiref.sync import async_to_sync, sync_to_async
 
 from django.core.cache import DEFAULT_CACHE_ALIAS, caches
 from django.core.exceptions import SynchronousOnlyOperation
-from django.test import SimpleTestCase
+from django.test import SimpleTestCase, TestCase
 from django.utils.asyncio import async_unsafe
 
 from .models import SimpleModel
@@ -61,3 +61,19 @@ class AsyncUnsafeTest(SimpleTestCase):
             self.dangerous_method()
         except SynchronousOnlyOperation:
             self.fail('SynchronousOnlyOperation should not be raised.')
+
+
+class SyncToAsyncTestCase(TestCase):
+
+    @classmethod
+    def setUpTestData(cls):
+        SimpleModel.objects.create(field=1)
+
+    async def test_using_sync_to_async(self):
+        @sync_to_async
+        def fetch():
+            return list(SimpleModel.objects.all())
+
+        data = await fetch()
+        self.assertEqual(len(data), 1)
+        self.assertEqual(data[0].field, 1)

I'm inclined to think this a documentation issue — that nested sync_to_async calls are not yet supported — plus a new feature request to add that support.
i.e. that the usage (whilst presumably reasonable) is beyond where Django's async support has yet reached. I will provisionally accept on that basis, but happy if we want to adjust that — the alternative is to take it as a release blocker on 3.1, but without a particular fix in mind, I'm not sure if that's realistic.

Can I ask what the use-case is? What are you trying to do using this pattern?

It looks like a tough one to actually address until we can get the async wrapper layer around the ORM. 🤔

Last edited 3 years ago by Carlton Gibson (previous) (diff)

comment:3 by Carlton Gibson, 3 years ago

I've opened a possible PR just noting the limitation as it is now.

comment:4 by David, 3 years ago

Hi Carlton,

I did not think about nested sync_to_async and async_to_sync could cause issue.

My use case is the following: I have a websocket users use to acquire and release "lock" on a specific database row, the "lock" is also a database table which connects user and the other one. Database access is required by websocket since it needs to verify that the requested element was not already locked by an other user (and this is the failing test).

My backend was build with Django v2.2 and Channels v1 (using also multiplexing).
I am now moving to Django v3 which has forced me also to update Channels since otherwise there would be a conflict with asgiref dependency.

Since from Channels v1 to v2 (also v3) there are many breaking changes I have been forced to rewrite my unittests. Reading Django 3.1 supporting async testing I thought to try it out (since I am not familiar with pytest fixtures and database presets) and migrate my existing tests.

I had a similar issue with an other websocket, which was due to authorization (since I use a permission-check mechanism to accept websocket connections), which was work-arounded by pre-fetching permissions and test user in TestCase:

class ConsumerTestCase(TestCase):
    fixtures = ['test_authorization', ]

    @classmethod
    def setUpTestData(cls):
        user = User.objects.get_by_natural_key('other')
        assign_perm('app.model', user)  # from guardian.shortcuts import assign_perm
        # force perm cache to be evaluated here
        # without this even if DB is in sync there could be a miss-read due to transactions
        user.has_perm('app.model')
        cls.user = user

    async def test_connect(self):
        application = AuthMiddlewareStack(WebsocketConsumer.as_asgi())
        communicator = WebsocketCommunicator(application, '/test')
        communicator.scope['user'] = self.user
        connected, _subprotocol = await communicator.connect()
        self.assertTrue(connected)
Last edited 3 years ago by David (previous) (diff)

comment:5 by David, 3 years ago

Using debugger I noticed this behaviour:

TestCase
   |
   | async_to_sync-> Thread (test)
   |--------------------|
   |                    |
   |                    | (database_)sync_to_async ->Thread (ORM)
   |                    |--------------------------------|

Since DB connection (along with savepoint/transaction references) resides in the main thread it is hidden from re-synced code.

This behaviour could be fine if there is not any open transaction, since every thread can hit the database independently. However, like in this case, when the main thread opens a transaction this info is not trasmitted through async-thread-chain thus resulting in reading a "wrong" state from the database.

It could be addressed by making async_to_sync and sync_to_async transaction-aware. I do not know if this can be achieved without an async-ready database client.

comment:6 by Carlton Gibson, 3 years ago

I do not know if this can be achieved without an async-ready database client.

Yes, That's kind of where I'm at with it. :| — This is a Difficult Problem™.

In traditional request-response scenarios we can reasonably assume that there's a single transaction in a single thread and not worry about what goes on outside that. Once there are multiple threads in play you start wanting (that's not the right word perhaps) to think about isolation levels, how you'll handle conflicts and so on, and what's right in that space for your application (which may not be general). At the moment we're constrained by the ORM not having been built with any in mind at all, so the required safety wheels do impinge.

For now, can you try writing this particular test with pytest-asyncio? That should skip the outer async_to_sync(), which should/may hopefully mean that the sync_to_async call is run on the outermost thread. (Without sitting down with the exact test and the debugger it's hard to say exactly.)

comment:7 by David, 3 years ago

I will try to move this test under pytest.

Then I will try to build a sample project with a simplified version of my actual test for further analysis.

comment:8 by David, 3 years ago

Here is a demo project with the use-case I described before: https://github.com/sevdog/test-websocker-user-lock.
I have setup both py-test (however something is still missing here, since they work if run one-by-one but fails when run in sequence, I should learn more about this library) and django testcase.

comment:9 by Andrew Godwin, 3 years ago

I'm not sure quite why this is happening - any recent release of asgiref ensures that every sync_to_async call runs in the same thread as we now default thread-sensitive mode to on; precisely to stop this kind of behaviour.

Can you verify you're seeing this with the most recent asgiref release?

comment:10 by David, 3 years ago

Hi Andrew, in both my main project and test-one I use asgiref==3.3.1 which, as of now, is the latest available on pypi (Nov 2020).

I will give a try with latest commit from github to see if this issue persists in the main project.

Thank you.

comment:11 by Andrew Godwin, 3 years ago

No, that's plenty new, the change happened in 3.3.0. In that case, I think the best course of investigation here is why the outer sync thread with the transaction in it, and the inner sync thread the ORM runs in, are different. I can take a look at that in a few weeks, but unfortunately I'm pretty busy at work right now so I can't do it immediately.

If anyone else is interested in doing that investigation, the thing I normally do is go edit the Django source code at relevant points and make it print the current thread ID, then run a test and see what thread IDs get printed out during it.

in reply to:  2 comment:12 by David, 3 years ago

I have added thead-id logging to my test app and unittests.

Looking at connect test It seems that in some way with django TestCase an extra thread gets spawned.

test-setup 140537656747840
test 140537592018688
db-connection-2 140537583625984
--- error gets thrown

While the corresponding test with pytest does not.

fixture-sync 140584466261824
fixture-sync 140584466261824
fixture-async 140584466261824
db-connection-2 140584401794816
connect 140584466261824
db-connection-1 140584401794816
test 140584466261824
db-connection-4 140584401794816
disconnect 140584466261824
--- test successfull

I checked against the test case provided by Carlton and seems that it passes with asgiref==3.3.1. I do not know if ApplicationCommunicator from asgiref are bound in some way to this issue.

I will investigate where the source of the problem lies and create an appropriate test case for it.

Last edited 3 years ago by David (previous) (diff)

comment:13 by David, 3 years ago

Looking deeper at pytest integration I found that in my case it does not behave like TestCase but like TransactionTestCase.

When using TransactionTestCase all my unit-tests pass.

I have also add the "simple-fetch" test suggested by Carlton to my sample project tests along with automated testing to better see what is going on (results can be see here).

Usin sqlit3 tests fail due to this error: django.db.utils.OperationalError: database table is locked: ws_lock_grouptypevisibility. From what I have found this error is due to a concurrent connection with an uncommitted-transaction.

So the whole problem seems to be bound to transactions and how they behave with async code. In a normal/production working flow this may not be a problem, but while running tests it is.

comment:14 by Mariusz Felisiak, 3 years ago

I cannot reproduce this issue with Django's TestCase or TransactionTestCase. I tried with SQLite and PostgreSQL, and asgiref 3.3.0 and 3.3.1. Puzzled.

comment:15 by Mariusz Felisiak, 3 years ago

Resolution: needsinfo
Status: newclosed

Closing as "needsinfo", since we don't have a reproducible scenario.

in reply to:  15 comment:16 by David, 3 years ago

Replying to Mariusz Felisiak:

Closing as "needsinfo", since we don't have a reproducible scenario.

Hi Mariusz, sorry to not have replied earlier but I had provided a complete test scenario in this repository: https://github.com/sevdog/test-websocker-user-lock

Replying to David:

Here is a demo project with the use-case I described before: https://github.com/sevdog/test-websocker-user-lock.
I have setup both py-test (however something is still missing here, since they work if run one-by-one but fails when run in sequence, I should learn more about this library) and django testcase.

comment:17 by Mariusz Felisiak, 3 years ago

Resolution: needsinfo
Status: closednew

Sorry for missing a sample project. Let's reopen this ticket for further investigation. I still cannot reproduce it with a small set of tests without channels etc. (see also similar not-async tickets #32416 and #29062).

comment:18 by KangDo96, 3 years ago

Owner: changed from nobody to KangDo96
Status: newassigned

comment:19 by Aleksi Hoffman, 2 years ago

I'm faced with a similar OperationalError whilst testing an async middleware using TestCase.

The db accessing method in the middleware is decorated with @database_sync_to_async from Channels. Package versions are django==3.2.8 channels==3.0.4, asgiref==3.4.1 and pytest==6.2.5.

Decorating the async test case with both @pytest.mark.asyncio and @async_to_sync seems to make the problem go away.

comment:20 by Oleg, 22 months ago

I have faced the same issue when writing tests for API implemented using strawberry-graphql-django.
Strawberry converts django calls using sync_to_async(func, thread_sensitive=True).
Initially, I've converted tests to fully async ones, for example:

@pytest.mark.django_db
async def test_currencies(setup_for_currency):
    schema = gql.Schema(query=Query)
    result = await schema.execute(
        """
        query {
          currency {
            currencies {
              name
            }
          }
        }
        """
    )

As a result, tests based on TestCase don't work, while TransactionTestCase based do work. This happens because fixture setup_for_currency and TestCase setup and tear-down are executed against a different DB connection.
However, if I change upper-level test-case to synchronous one, everything works!

@pytest.mark.django_db
def test_currencies(setup_for_currency):
    schema = gql.Schema(query=Query)
    result = async_to_sync(schema.execute)(
        """
        query {
          currency {
            currencies {
              name
            }
          }
        }
        """
    )

Looks like this happens according to asgiref documentation:

If the outermost program is async (i.e. SyncToAsync is outermost), then
this will be a dedicated single sub-thread that all sync code runs in,
one after the other. If the outermost program is sync (i.e. AsyncToSync is
outermost), this will just be the main thread. This is achieved by idling
with a CurrentThreadExecutor while AsyncToSync is blocking its sync parent,
rather than just blocking.

However, this behavior is still very confusing, I have no idea if this could be fixed.
Using TransactionTestCase everywhere is a no-no for large projects, but my solution still looks like a hack.

comment:21 by Carlton Gibson, 16 months ago

Owner: KangDo96 removed
Status: assignednew
Note: See TracTickets for help on using tickets.
Back to Top