Opened 3 years ago

Last modified 3 weeks ago

#33573 assigned New feature

Add native async support to redis cache backend

Reported by: Christopher Bailey Owned by: Ahmed Ibrahim
Component: Core (Cache system) Version: dev
Severity: Normal Keywords:
Cc: Andrew Chen Wang, Andrew Godwin, Carlton Gibson, Ülgen Sarıkavak, amirreza Triage Stage: Someday/Maybe
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Christopher Bailey)

The latest version of redis-py will support both sync and async clients so sync_to_async will no longer be necessary so it would be nice if the out of the box Redis cache backend supports both natively as well.

https://github.com/redis/redis-py/releases/tag/v4.2.0rc1

It may be a bit premature since 4.2.0 is still RC, but I wanted to get a ticket out there so it can be on someones radar.

Change History (17)

comment:1 by Mariusz Felisiak, 3 years ago

Cc: Andrew Chen Wang Andrew Godwin Carlton Gibson added
Triage Stage: UnreviewedAccepted

Tentatively accepted. Do you have an idea for implementation? As far as I'm aware this would require initializing two redis clients: async and non-async.

comment:2 by Mariusz Felisiak, 3 years ago

Triage Stage: AcceptedSomeday/Maybe

Let's wait for the final release first.

comment:3 by Andrew Chen Wang, 3 years ago

We are currently re-implementing the async client (one last time) in a way that allows for aioredis 1.3.1 performance. https://github.com/seandstewart/redis-py-sansio/blob/main/benchmark/conftest.py The RedisLab team will need to approve it first, so I would wait until our new client is either approved/denied https://github.com/redis/redis-py/issues/2039. This won't affect the non-async client as the non-async client will still default on the current redis-py interface with the option to also use the new API.

This isn't to say no one should try implementing the async client now; I have high confidence the new client will be merged in with Redis-py as we're gunning for the much-needed performance gain. Just don't be surprised if a few things shift around.

comment:4 by Christopher Bailey, 3 years ago

Description: modified (diff)

comment:5 by Christopher Bailey, 3 years ago

Description: modified (diff)

Do you have an idea for implementation? As far as I'm aware this would require initializing two redis clients: async and non-async.

I kind of started to implement it, but then realized it was likely going to be a bigger issue then I though. But there is already an abstract RedisClient class that is a wrapper around the base redis.Redis class. It maintains connection polls and handles all of the core commands. It has methods for get, set, add, touch, incr that just calls get_client and makes a new client using the existing connection pull any time it is called.

For Async we can just add something like get_async_client and follow the same pattern within the class. I believe the connection poll should be able to be reused between the sync/async clients, but I am not an expert on Redis.

Last edited 3 years ago by Christopher Bailey (previous) (diff)

comment:6 by Christopher Bailey, 3 years ago

4.2.0 is fully out now.

comment:7 by Andrew Chen Wang, 2 years ago

It doesn't seem like the Sans-IO approach will be adopted anytime soon as work and review has stalled.

I believe the connection poll should be able to be reused between the sync/async clients, but I am not an expert on Redis.

Sorry if I misunderstood, but the ConnectionPool classes for the sync and async versions are different, and the Connection classes are also different.

As far as I'm aware this would require initializing two redis clients: async and non-async.

Assuming you mean initialize redis.Redis and redis.asyncio.Redis in one class, I disagree; I think we should have two RedisCache classes, one for sync and another for async.

  1. Some users may want both a sync and async Redis client. However, the options presented to devs cater to only one of the variants. For instance, if you need to pass in a custom ConnectionPool or provide pool options for the default pools, those could be different for the different environments like an async enabled middleware uses the async Redis but a sync view uses the sync Redis. If a custom Connection class is provided to the pool, then we'd have a problem since the Connection class for the sync and async Redis are different.
  2. The actual network calls like get and set_many look like they will be copy-paste because of the numerous awaits required, so there won't be much if any re-use of code
  3. Not all clients have upgraded to at least Redis 4.2.0.
  4. It sets a paradigm for future client implementations. Regardless of whether packages like psycopg(3) implement both a sync and async variant or just one type like the memcached packages implementing just sync, there should be a separation of clients based on environment to set future expectations of clients not necessarily having a counterpart variant like an async package not having sync support.

Let me know your thoughts. Thanks!

comment:8 by Carlton Gibson, 2 years ago

Hey Andrew, thanks for the follow-up!

If I've understood you correctly, I think the best way forward would be to prototype the async backend in a external module, so we can see what it would look like and install it and give it a run. I think we'd need to have a some discussion as to whether/how we'd adopt that then into Django, and the messaging we'd put around that.

Make sense?

comment:9 by Andrew Chen Wang, 2 years ago

Makes sense. Thanks for the comment!

You can take a look at a practically broken but conceptual implementation here: https://github.com/Andrew-Chen-Wang/django-async-redis/pull/6

The repository itself is designed to be a django-redis replacement that I recently migrated from aioredis to redis-py; in regards to this convo, the places to look are django_async_redis.client.core.py and the test file tests/test_core.py

comment:10 by Ülgen Sarıkavak, 10 months ago

Cc: Ülgen Sarıkavak added

comment:11 by Ahmed Ibrahim, 3 months ago

Owner: changed from nobody to Ahmed Ibrahim
Status: newassigned

I can take this over, any ideas on what's the final solution? I will propose a solution soon as well after investigation

comment:12 by Andrew Chen Wang, 3 months ago

@Ahmed I'm forgetting context since it's been awhile, but my draft seems to have some ideas: https://github.com/Andrew-Chen-Wang/django-async-redis/pull/6. Specifically:

  1. A lot of the code seems to be copying the sync to async prefixed classes and methods
  2. There are some packages that use the default cache client. It needs to know whether the client is async/sync capable, so you may need to add some variable (similar to how middleware classes are async/sync capable via a variable). The reason I made two classes, one for sync and another for async, is so that the client caller can know whether to execute as sync or async. I believe work stalled on my end trying to determine how to resolve the what context the client caller is coming from.

comment:13 by Christopher Bailey, 3 months ago

I do not think have two separate clients is a good approach. It is actually a really bad one for apps that have a lot of caches. If you have a legacy WSGI app you want to migrate to ASGI and you have 20 caches (not as crazy as it sounds for larger apps with very high throughput). Now you have to duplicate all of them, one for sync and one for async, and have 40 caches?

If a cache is capable of both sync and async, it should raise an exception when used in the wrong context rather than forcing it to be a configuration/usage issue.

Django also automatically applies sync_to_async to caches as well. So, this will make it so the default sync cache will automatically wrap it to be async (inefficiently) if used in the wrong context and the async native cache will raise a NotImplemented error if used in the wrong context. That is really inconsistent between the two and could definitely lead to confusion.

Last edited 3 months ago by Christopher Bailey (previous) (diff)

comment:14 by Sarah Boyce, 3 weeks ago

Note that #36047 was raised while working with an async cache client

comment:15 by amirreza, 3 weeks ago

Cc: amirreza added

in reply to:  13 comment:16 by amirreza, 3 weeks ago

hi, i'd like to add that the above issue is not redis exclusive, and happens with any cache backend, if async is used

in reply to:  13 comment:17 by amirreza, 3 weeks ago

Replying to Christopher Bailey:

I do not think have two separate clients is a good approach. It is actually a really bad one for apps that have a lot of caches. If you have a legacy WSGI app you want to migrate to ASGI and you have 20 caches (not as crazy as it sounds for larger apps with very high throughput). Now you have to duplicate all of them, one for sync and one for async, and have 40 caches?

i'm not sure why we would keep the sync version when the async one is added, but merging both sync and async into one client is not easy, unless asgiref is used.
cache clients such as redis-py have different clients for sync and async, combining them into one class needs hard work to implement, and to maintain.
also, what if i want my client to be async/sync only?

for your issue i'd suggest a wrapper around sync and async clients, so both clients exists, and duplication won't happen.

Note: See TracTickets for help on using tickets.
Back to Top