Opened 4 years ago

Closed 23 months ago

#31920 closed New feature (fixed)

ASGI/ASYNC SessionMiddleware - SynchronousOnlyOperation exception if request.user is not unwrapped in sync code

Reported by: Michael Galler Owned by: Jon Janzen
Component: contrib.sessions Version: 3.1
Severity: Normal Keywords: async
Cc: Andrew Godwin, Carlton Gibson, Jon Janzen, Adam Johnson Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

class AuthenticationMiddleware(MiddlewareMixin):
    def process_request(self, request):
        request.user = SimpleLazyObject(lambda: get_user(request))

AuthenticationMiddleware wraps the request.user in a SimpleLazyObject, so the Database Query is called lazy

If i start now a project via django-admin and I add a user and log in and request the following view:

from asgiref.sync import sync_to_async
from django.core.exceptions import SynchronousOnlyOperation
from django.http import HttpResponse


@sync_to_async
def get_user_from_request(request):
    return request.user if bool(request.user) else None


async def async_test(request):
    # CASE 1
    try:
        print(request.user.is_authenticated)
    except SynchronousOnlyOperation:
        print('error 1')

    # CASE 2
    try:
        user = await sync_to_async(request.user)
        print(user)
    except SynchronousOnlyOperation:
        print('error 2')

    # CASE 3
    try:
        user = await sync_to_async(request.user)()
        print(user)
    except SynchronousOnlyOperation:
        print('error 3')

    # CASE 4
    try:
        user = await sync_to_async(bool)(request.user)
        print(request.user)
    except SynchronousOnlyOperation:
        print('error 4')

    # CASE 5
    try:
        user = await get_user_from_request(request)
        print(user)
    except SynchronousOnlyOperation:
        print('error 5')

    return HttpResponse('Hello, async world!')

Only case 4 and 5 works, i find it a little awkward that i have to do one of this cases.

Change History (22)

comment:1 by Carlton Gibson, 4 years ago

Cc: Andrew Godwin added
Triage Stage: UnreviewedAccepted
Type: Cleanup/optimizationNew feature

HI Michael, I’m going to accept this as a New Feature.

I accept totally that you just want to do request.user, and that’s clearly where we need to go, but I don’t think that’s entailed by the view and middleware support in 3.1, so this is something we need to think about and address.

(If you want to help there, that would be cool.)

Maybe we can treat it as a bug in a new feature, but I worry we have to draw the line somewhere between what was there for 3.1 and what wasn’t.

It might be that we can document this somehow in the meantime? (If that’s too rough, then I would then lean towards making it a Release Blocker.)

Last edited 4 years ago by Carlton Gibson (previous) (diff)

comment:2 by Andrew Godwin, 4 years ago

So the problem here is that attribute access is always synchronous, so lazy loading simply isn't possible. SynchronousOnlyOperation was added to at least defend against this being done by accident, but if we want request.user to simply work during async views, it will need to be greedily loaded at middleware execution time.

This is tricky, as the middleware doesn't know if the view below it is sync or async. There's a variety of solutions to this - a setting to always greedily load it, a nice-ish async-compatible function on user/SimpleLazyObject that loads it, trying to escape to the async loop using greenlets - but I'm not sure which one I prefer right now.

comment:3 by Michael Galler, 4 years ago

would not something like the following be easier?

class AuthenticationMiddleware(MiddlewareMixin):
    def process_view(self, request, view_func, args, kwargs):
        hasattr(request.user, 'foobar')

This unpacks the user after all process_requests calls and before any views are called, also there are no conflicts with other middlewares like RemoteUserMiddleware.

in reply to:  3 ; comment:4 by Andrew Godwin, 4 years ago

Right, that's just removing the lazy-loading aspect of it for every request ("greedily loading it"). We can't ship that in default Django as it would be a performance regression from now, essentially removing the lazy loading part permanently. This is why I suggested we maybe do this with a setting.

However, it's a useful workaround if you want to write your own middleware.

Replying to Michael Galler:

would not something like the following be easier?

class AuthenticationMiddleware(MiddlewareMixin):
    def process_view(self, request, view_func, args, kwargs):
        hasattr(request.user, 'foobar')

This unpacks the user after all process_requests calls and before any views are called, also there are no conflicts with other middlewares like RemoteUserMiddleware.

in reply to:  4 comment:5 by Michael Galler, 4 years ago

Replying to Andrew Godwin:

Right, that's just removing the lazy-loading aspect of it for every request ("greedily loading it"). We can't ship that in default Django as it would be a performance regression from now, essentially removing the lazy loading part permanently. This is why I suggested we maybe do this with a setting.

However, it's a useful workaround if you want to write your own middleware.

Wouldn't it make sense to have a decorator that automatically executes the function synchronously thread sensitive when it is executed in a async context?

What i came up with after some time is the following based on async_unsafe:

async def run_in_sync(func, current_executor, *args, **kwargs):
    if current_executor:
        AsyncToSync.executors.current = current_executor
    return await sync_to_async(func, thread_sensitive=True)(*args, **kwargs)


def ensure_sync_environment(message):
    def decorator(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            try:
                loop = asyncio.get_running_loop()
            except RuntimeError:
                return func(*args, **kwargs)
            else:
                current_executor = getattr(AsyncToSync.executors, "current", None)
                with ThreadPoolExecutor(max_workers=1) as executor:
                    return executor.submit(asyncio.run, run_in_sync(func, current_executor, *args, **kwargs)).result()

        return inner

    # If the message is actually a function, then be a no-arguments decorator.

    if callable(message):
        func = message
        message = ''
        return decorator(func)
    else:
        return decorator

If I now decrorate the function get_user with it

@ensure_sync_environment
def get_user(request):
    if not hasattr(request, '_cached_user'):
        request._cached_user = auth.get_user(request)
    return request._cached_user

The problem is gone.
This works with asgi and wsgi and does not break any thread sensitivity, even the database connections are correctly cleared.
I know that this will start a new loop on every wrap, but I have not found any other solutions.
If it is not called too often, I think it should be no problem.

Last edited 4 years ago by Michael Galler (previous) (diff)

comment:6 by Carlton Gibson, 4 years ago

Cc: Carlton Gibson added
Keywords: async added

comment:7 by Andrew Godwin, 4 years ago

Unfortunately that only appears to work reliably - you're still running synchronous blocking code inside of a coroutine, so you're holding up the main loop. There's unfortunately literally no way around this from inside attribute access, as you have no way to signal the outer event loop that it can pause the current coroutine and continue with other work.

If we put this into Django, you'd end up with every view that accessed user locking the entire async loop while it did so (which given it usually calls a database, is not going to be super quick).

in reply to:  2 comment:8 by Ben Lomax, 4 years ago

Replying to Andrew Godwin:

This is tricky, as the middleware doesn't know if the view below it is sync or async. There's a variety of solutions to this - a setting to always greedily load it, a nice-ish async-compatible function on user/SimpleLazyObject that loads it, trying to escape to the async loop using greenlets - but I'm not sure which one I prefer right now.

Just checking to see if the current state of Django lent itself to any of the proposed ideas here (setting to greedily load / async-compatible function / greenlets / something else)? I'd be keen to have a crack at this but wanted to see what the best approach might be before doing a deep dive on a badly chosen one. Given the above choices I guess my next questions would be:

  1. For the greedy load, I would imagine the greedy load would be for all of the middleware? I think AuthenticationMiddleware is the only one that lazy loads an object (although GZipMiddleware has a _lazy_re_compile which I've not wrapped my head around yet to know if that would need to be updated too).
  1. The user / SimpleLazyObject async function sounds like it might potentially be the neatest, but I don't really understand how that could be implemented. Do you have any existing examples that do similar things (or could you elaborate on how I might approach this)?
  1. I've not encountered "greenlets" before. Are you referencing this library? https://github.com/python-greenlet/greenlet
Last edited 4 years ago by Ben Lomax (previous) (diff)

comment:9 by lirontb, 2 years ago

Owner: changed from nobody to lirontb
Status: newassigned

comment:10 by Andrew Godwin, 2 years ago

Chatted with lirontb about this at the DCUS2022 sprints, and we believe we can just implement the __await__ method on SimpleLazyObject, roughly like this:

class X:
    def __await__(self):
        if isinstance(self.wrapped, coroutine):
            async def inner():
                result = await coroutine
                self.wrapped = result
                return result
            return inner().__await__()
        else:
            async def result():
                return self.wrapped
            return result().__await__()

That would mean we could simply await request.user (or any other SimpleLazyObject) and get the value out. If this works, we could even consider this for the related object descriptors for foreign keys...!

comment:11 by lirontb, 2 years ago

Has patch: set

comment:12 by Mariusz Felisiak, 2 years ago

Needs documentation: set
Patch needs improvement: set

comment:13 by Adam Johnson, 2 years ago

I don’t like the suggestion of adding SimpleLazyObject.__await__. It’s too “magical” and it makes the type of request.user complex.

Currently django-stubs types request.user as User | AnonymousUser (with a documented technique to use just User for logged in requests). If we wanted to support the __await__ method in type hints, then the type would become User | AnonymousUser | Awaitable[[], User | AnonymousUser]. This would require type narrowing everywhere request.user is used... so, we probably wouldn’t want to do that in django-stubs, but that would make it harder to have a correctly type-checked async Django app.

I propose instead making the middleware add a coroutine request.auser() that fetches from request.user, used like:

user = await request.auser()
if user.is_authenticated:
    ...

Yes, it’s a little more code, but it follows Django’s other async API’s and it looks like a normal coroutine call.

comment:14 by Jon Janzen, 2 years ago

Cc: Jon Janzen added

comment:15 by Carlton Gibson, 2 years ago

I think Adam is likely right here. We should likely have a separate auser(), similarly to elsewhere.

To await, or not await, 'tis the question

comment:16 by Jon Janzen, 2 years ago

Should auser cache the results of the DB query? AIUI that's how SimpleLazyObject works right now so it should probably be conistent.

For myself, I know there are certain codepaths in my personal installation that read the equivalent of request.user several times during a request.

If it doesn't need to be cached, I see this as a fairly trivial and I'd be happy to do it. But I'll wait for guidance on my question before moving forward.

Also, while I'm commenting here: I posted about a larger issue about asyncifying the auth app overall which touches on this ticket if anyone is interested in that: https://forum.djangoproject.com/t/asyncifying-django-contrib-auth-and-signals-and-maybe-sessions/18770

comment:17 by Jon Janzen, 2 years ago

Owner: changed from lirontb to Jon Janzen

I went ahead and created a PR with a very simple cache:

PR

comment:18 by Carlton Gibson, 2 years ago

Needs tests: set

comment:19 by Carlton Gibson, 23 months ago

Cc: Adam Johnson added

Just to follow-up, there was a mypy proposal to allow `typing.overload` to work with `T | Awaitable[T type examples], but that was closed as wontfix`.

No, this isn't something mypy is going to do. Separate methods … is an easy solution.

So, I take it that the auser() approach is going with the wind.

Current PR is looking OK. I just want to investigate if/how sharing the cache between .user and .auser() might work — but any other reviews or input welcomed.

comment:20 by Jon Janzen, 23 months ago

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset

comment:21 by Mariusz Felisiak, 23 months ago

Triage Stage: AcceptedReady for checkin

comment:22 by Mariusz Felisiak <felisiak.mariusz@…>, 23 months ago

Resolution: fixed
Status: assignedclosed

In e846c5e7:

Fixed #31920 -- Made AuthenticationMiddleware add request.auser().

Note: See TracTickets for help on using tickets.
Back to Top