Opened 3 years ago

Last modified 7 hours ago

#33573 assigned New feature

Add native async support to redis cache backend

Reported by: Christopher Bailey Owned by: Ahmed Ibrahim
Component: Core (Cache system) Version: dev
Severity: Normal Keywords:
Cc: Andrew Chen Wang, Andrew Godwin, Carlton Gibson, Ülgen Sarıkavak, amirreza Triage Stage: Someday/Maybe
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Christopher Bailey)

The latest version of redis-py will support both sync and async clients so sync_to_async will no longer be necessary so it would be nice if the out of the box Redis cache backend supports both natively as well.

https://github.com/redis/redis-py/releases/tag/v4.2.0rc1

It may be a bit premature since 4.2.0 is still RC, but I wanted to get a ticket out there so it can be on someones radar.

Change History (20)

comment:1 by Mariusz Felisiak, 3 years ago

Cc: Andrew Chen Wang Andrew Godwin Carlton Gibson added
Triage Stage: UnreviewedAccepted

Tentatively accepted. Do you have an idea for implementation? As far as I'm aware this would require initializing two redis clients: async and non-async.

comment:2 by Mariusz Felisiak, 3 years ago

Triage Stage: AcceptedSomeday/Maybe

Let's wait for the final release first.

comment:3 by Andrew Chen Wang, 3 years ago

We are currently re-implementing the async client (one last time) in a way that allows for aioredis 1.3.1 performance. https://github.com/seandstewart/redis-py-sansio/blob/main/benchmark/conftest.py The RedisLab team will need to approve it first, so I would wait until our new client is either approved/denied https://github.com/redis/redis-py/issues/2039. This won't affect the non-async client as the non-async client will still default on the current redis-py interface with the option to also use the new API.

This isn't to say no one should try implementing the async client now; I have high confidence the new client will be merged in with Redis-py as we're gunning for the much-needed performance gain. Just don't be surprised if a few things shift around.

comment:4 by Christopher Bailey, 3 years ago

Description: modified (diff)

comment:5 by Christopher Bailey, 3 years ago

Description: modified (diff)

Do you have an idea for implementation? As far as I'm aware this would require initializing two redis clients: async and non-async.

I kind of started to implement it, but then realized it was likely going to be a bigger issue then I though. But there is already an abstract RedisClient class that is a wrapper around the base redis.Redis class. It maintains connection polls and handles all of the core commands. It has methods for get, set, add, touch, incr that just calls get_client and makes a new client using the existing connection pull any time it is called.

For Async we can just add something like get_async_client and follow the same pattern within the class. I believe the connection poll should be able to be reused between the sync/async clients, but I am not an expert on Redis.

Last edited 3 years ago by Christopher Bailey (previous) (diff)

comment:6 by Christopher Bailey, 3 years ago

4.2.0 is fully out now.

comment:7 by Andrew Chen Wang, 3 years ago

It doesn't seem like the Sans-IO approach will be adopted anytime soon as work and review has stalled.

I believe the connection poll should be able to be reused between the sync/async clients, but I am not an expert on Redis.

Sorry if I misunderstood, but the ConnectionPool classes for the sync and async versions are different, and the Connection classes are also different.

As far as I'm aware this would require initializing two redis clients: async and non-async.

Assuming you mean initialize redis.Redis and redis.asyncio.Redis in one class, I disagree; I think we should have two RedisCache classes, one for sync and another for async.

  1. Some users may want both a sync and async Redis client. However, the options presented to devs cater to only one of the variants. For instance, if you need to pass in a custom ConnectionPool or provide pool options for the default pools, those could be different for the different environments like an async enabled middleware uses the async Redis but a sync view uses the sync Redis. If a custom Connection class is provided to the pool, then we'd have a problem since the Connection class for the sync and async Redis are different.
  2. The actual network calls like get and set_many look like they will be copy-paste because of the numerous awaits required, so there won't be much if any re-use of code
  3. Not all clients have upgraded to at least Redis 4.2.0.
  4. It sets a paradigm for future client implementations. Regardless of whether packages like psycopg(3) implement both a sync and async variant or just one type like the memcached packages implementing just sync, there should be a separation of clients based on environment to set future expectations of clients not necessarily having a counterpart variant like an async package not having sync support.

Let me know your thoughts. Thanks!

comment:8 by Carlton Gibson, 3 years ago

Hey Andrew, thanks for the follow-up!

If I've understood you correctly, I think the best way forward would be to prototype the async backend in a external module, so we can see what it would look like and install it and give it a run. I think we'd need to have a some discussion as to whether/how we'd adopt that then into Django, and the messaging we'd put around that.

Make sense?

comment:9 by Andrew Chen Wang, 3 years ago

Makes sense. Thanks for the comment!

You can take a look at a practically broken but conceptual implementation here: https://github.com/Andrew-Chen-Wang/django-async-redis/pull/6

The repository itself is designed to be a django-redis replacement that I recently migrated from aioredis to redis-py; in regards to this convo, the places to look are django_async_redis.client.core.py and the test file tests/test_core.py

comment:10 by Ülgen Sarıkavak, 16 months ago

Cc: Ülgen Sarıkavak added

comment:11 by Ahmed Ibrahim, 9 months ago

Owner: changed from nobody to Ahmed Ibrahim
Status: newassigned

I can take this over, any ideas on what's the final solution? I will propose a solution soon as well after investigation

comment:12 by Andrew Chen Wang, 9 months ago

@Ahmed I'm forgetting context since it's been awhile, but my draft seems to have some ideas: https://github.com/Andrew-Chen-Wang/django-async-redis/pull/6. Specifically:

  1. A lot of the code seems to be copying the sync to async prefixed classes and methods
  2. There are some packages that use the default cache client. It needs to know whether the client is async/sync capable, so you may need to add some variable (similar to how middleware classes are async/sync capable via a variable). The reason I made two classes, one for sync and another for async, is so that the client caller can know whether to execute as sync or async. I believe work stalled on my end trying to determine how to resolve the what context the client caller is coming from.

comment:13 by Christopher Bailey, 9 months ago

I do not think have two separate clients is a good approach. It is actually a really bad one for apps that have a lot of caches. If you have a legacy WSGI app you want to migrate to ASGI and you have 20 caches (not as crazy as it sounds for larger apps with very high throughput). Now you have to duplicate all of them, one for sync and one for async, and have 40 caches?

If a cache is capable of both sync and async, it should raise an exception when used in the wrong context rather than forcing it to be a configuration/usage issue.

Django also automatically applies sync_to_async to caches as well. So, this will make it so the default sync cache will automatically wrap it to be async (inefficiently) if used in the wrong context and the async native cache will raise a NotImplemented error if used in the wrong context. That is really inconsistent between the two and could definitely lead to confusion.

Last edited 9 months ago by Christopher Bailey (previous) (diff)

comment:14 by Sarah Boyce, 7 months ago

Note that #36047 was raised while working with an async cache client

comment:15 by amirreza, 7 months ago

Cc: amirreza added

in reply to:  13 comment:16 by amirreza, 7 months ago

hi, i'd like to add that the above issue is not redis exclusive, and happens with any cache backend, if async is used

in reply to:  13 comment:17 by amirreza, 7 months ago

Replying to Christopher Bailey:

I do not think have two separate clients is a good approach. It is actually a really bad one for apps that have a lot of caches. If you have a legacy WSGI app you want to migrate to ASGI and you have 20 caches (not as crazy as it sounds for larger apps with very high throughput). Now you have to duplicate all of them, one for sync and one for async, and have 40 caches?

i'm not sure why we would keep the sync version when the async one is added, but merging both sync and async into one client is not easy, unless asgiref is used.
cache clients such as redis-py have different clients for sync and async, combining them into one class needs hard work to implement, and to maintain.
also, what if i want my client to be async/sync only?

for your issue i'd suggest a wrapper around sync and async clients, so both clients exists, and duplication won't happen.

comment:18 by amirreza, 6 months ago

hi
i have a feeling this is abandoned, but if not, i'd like to take over

the way i will do it is by using seprate classes for sync and async, using redis's native support for async, as i explained in my previous message

this way the issue i riased at #36047 will also be easy to fix (i think)
i'll tackle #36047 afterwards
the way i'll do is to check if the backend has an async aclose method, if so, call async_to_sync() on it
tho this one is a raw idea and not totaly sure about it

let me know if this is ok with you

comment:19 by amirreza, 3 months ago

hi 🙌🏼
so after a good amount of time to investigate things, i'm back to talk more on the subject
in this commecnt i'll try to share what an async client would look like in django, and leave it to you to decide if this is desired or not

so the first problem that needs solving is the cleaup discussed in #36047

this is the main problem that we're dealing with, the rest are mostly just new features

since this is a signal receiver and signal receiver can be async since django 5, i can see this fixed like this:

  1. add a is_async attribute to cache clients, similar to middlewares
  1. change the receiver to be async, like this:
    async def close_caches(**kwargs):
        # Some caches need to do a cleanup at the end of a request cycle. If not
        # implemented in a particular backend cache.close() is a no-op.
        await caches.close_all()
    

where close_all() is as follows:

    async def close_all(self):
        for conn in self.all(initialized_only=True):
            if conn.is_async:
                await conn.aclose()
            else:
                conn.close()

second matter in hand is how to design the async cache backend

i personally suggest that async backend should be in a different class, i hope by the end of this comment you'd agree too

so i would do something like this:

  1. a base class with common functionality
  2. two (less)base classes for sync and async functionality
  3. two redis backends for sync and async

async methods would have a prefixed to their name (even tho there are no sync equivelents)
for two reason:

  1. to be compatible with rest of django
  2. so if someone wants (for whatever reason), they can combine the two class and have a backend that supports both

tho if you allow me to be naughty a bit🫣, i'll define a __getattr__ that only works in the async backend and allows people to access methods without the a
i've planned a similar thing with django-valkey
so both await cache.aset() and await cache.set() work

some things to have in mind:

❕the following problems are not essential part of this ticket, an async cache backend can exists without these
even a 3rd-party can handle these issues, i'm only mentioning for refrence

  1. cache middlewares do not accept an async only cache backend

to solve this we either have to combine the two backends (as discussed above)
this one is a bit of a lie, with this way, middleware is still sync, it only calls sync cache methods
it just allows for the user to call async methods, while having the middleware as well

or have new middlewares that support async (i like this one, but i guess some people don't)

  1. if we decide to make an async middleware, make_middleware_decorator might not be very happy, since it expects sync methods, which don't make sense in an async middleware

i have made an async only option in a 3rd-party before

  1. again if we move for an async middleware, cache_page decorator won't work, needs another one
  1. these two caching utils expect a sync backend https://github.com/django/django/blob/main/django/utils/cache.py#L377 and https://github.com/django/django/blob/main/django/utils/cache.py#L399
  1. if cache server is used for sessions, the session middleware doesn't support async operations

the solutions mentionedfor cache middleware also applies here

i hope this can give some calrity on how this would work
as before, i can implement this, if this is a feature that people want, and if not, well at least this can be used as refrence for other people

with best of wishes♥️

Last edited 3 months ago by amirreza (previous) (diff)

in reply to:  18 comment:20 by Ahmed Ibrahim, 7 hours ago

Replying to amirreza:

hi
i have a feeling this is abandoned, but if not, i'd like to take over

the way i will do it is by using seprate classes for sync and async, using redis's native support for async, as i explained in my previous message

this way the issue i riased at #36047 will also be easy to fix (i think)
i'll tackle #36047 afterwards
the way i'll do is to check if the backend has an async aclose method, if so, call async_to_sync() on it
tho this one is a raw idea and not totaly sure about it

let me know if this is ok with you

dw I will do it, but I need a Django fellow to provide a certain answer on which implementation we should go with, since we can do it and it won't get accepted, I "think" (not sure) I emailed before in the mailing list, but got no response, I'm not sure if that's deja vu or something

Note: See TracTickets for help on using tickets.
Back to Top