Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#30941 closed Bug (fixed)

hasattr(request, '_cached_user') check no longer works

Reported by: Collin Anderson Owned by: nobody
Component: contrib.auth Version: dev
Severity: Release blocker Keywords:
Cc: cmawebsite@…, Sergey Fedoseev, Simon Charette Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Carlos_Mir_de_Souza)

Before https://github.com/django/django/commit/2f01079, it was possible to check to see whether the lazy user has been evaluated or not using hasattr(request, '_cached_user').

This was undocumented, but I think we should try to maintain backwards compatibility for that check, and document and test it for the future. (Or if nothing else, document and test how to do this the new way and add a release note about the change.)

This is really helpful in middleware to be able to check whether the user has been evaluated, so you can access the user if it has already been accessed, but can avoid fetching user if it hasn't already been fetched.

Change History (21)

comment:1 by Collin Anderson, 5 years ago

Description: modified (diff)

comment:2 by Collin Anderson, 5 years ago

Description: modified (diff)

comment:3 by Collin Anderson, 5 years ago

Description: modified (diff)

comment:4 by Simon Charette, 5 years ago

Cc: Sergey Fedoseev Simon Charette added

If we wanted to deprecate this behavior we could add a _cached_user descriptor to request that raises a deprecation warning on access and raises an AttributeError when instance.user._wrapped is empty.

FWIW I'd much prefer we define a blessed way of looking this up then maintaining compatibility with this implementation specific _cached_user hack.

Something like request.user.loaded by subclassing SimpleLazyObject or adding this feature directly LazyObject sounds more reasonable to me.

comment:5 by Nick Pope, 5 years ago

This is really helpful in middleware to be able to check whether the user has been evaluated, so you can access the user if it has already been accessed, but can avoid fetching user if it hasn't already been fetched.

I can't understand the use case described here. Surely you either need to access it in the middleware, or you don't. Maybe you could explain why you'd need to only get it if it has already been accessed...

FWIW I'd much prefer we define a blessed way of looking this up then maintaining compatibility with this implementation specific _cached_user hack.

Agreed. This always looked like a hack and it is an undocumented and private variable after all. That said, a way to determine whether LazyObject has been resolved would be good. The main challenge is conflicting attributes with the wrapped object, so I think .loaded is too problematic. I would suggest creating and documenting _resolved to go alongside _wrapped:

class LazyObject:

    ...

    @property
    def _resolved(self):
        return self._wrapped is not empty

    ...

I also note that LazyObject and SimpleLazyObject are vaguely referenced in the documentation, but mainly in release notes where it occasionally says things like "if you use ... in your own code". There is no proper public documentation so these are sort of internals that are sometimes used in projects. Do we really want to make this officially public API?

in reply to:  5 comment:6 by Sergey Fedoseev, 5 years ago

Replying to Nick Pope:

This is really helpful in middleware to be able to check whether the user has been evaluated, so you can access the user if it has already been accessed, but can avoid fetching user if it hasn't already been fetched.

I can't understand the use case described here. Surely you either need to access it in the middleware, or you don't. Maybe you could explain why you'd need to only get it if it has already been accessed...

+1

I also note that LazyObject and SimpleLazyObject are vaguely referenced in the documentation, but mainly in release notes where it occasionally says things like "if you use ... in your own code". There is no proper public documentation so these are sort of internals that are sometimes used in projects. Do we really want to make this officially public API?

Also do we really want to guarantee that request.user is actually LazyObject?

comment:7 by Collin Anderson, 5 years ago

I have some urls (like checkout pages) that access the user and some urls that don't (like product pages). I might want to use middleware to log what the user is doing, but I don't want to cause a user+session lookup on product pages where user wasn't accessed.

Similarly, there's a request.session.accessed, but I'm seeing now that even that is undocumented. But I'm realizing I could probably get away with just checking request.session.accessed because most pages that look at the session are also going to pull up the user.

comment:8 by Carlos_Mir_de_Souza, 5 years ago

Description: modified (diff)

comment:9 by Carlton Gibson, 5 years ago

Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

There is no proper public documentation so these are sort of internals that are sometimes used in projects. Do we really want to make this officially public API?

Grrr. I think it's more that sometimes.... 😬

The user has been cached on the request here for 13 years. 3c5782287efc197eccf556cf6915fc0d8b3e02bd. I think it we just move it we will see a large number of "You broke my site" reports. (So whatever we do we have to accept this as a Release Blocker on 3.1 no?)

The saving here is a function call (into the lambda to fetch the cached value). I don't think that's worth the disruption, even to deprecate.

Maybe adding an Official Getter™ is worthwhile. (If we had such, an eventual deprecation of _cached_user might be on, but even then...)

For better or worse there's a good amount of this kind of API that we say is private, but is used by lots of projects out there, and that we can't realistically change.

Also do we really want to guarantee that request.user is actually LazyObject?

Exactly this. Custom middleware's leveraging _cached_user but not setting a user that has _wrapped property... (and so on).

I think we should probably revert 2f010795e690550c8c6f56b3924c0f629cacb33b — but are the other options really feasible?

Version 0, edited 5 years ago by Carlton Gibson (next)

in reply to:  9 comment:10 by Sergey Fedoseev, 5 years ago

Replying to Carlton Gibson:

Grrr. I think it's more than sometimes.... 😬

I looked through the dozen of pages and I didn't found any case when it would broken by 2f010795e690550c8c6f56b3924c0f629cacb33b.

comment:11 by Carlton Gibson, 5 years ago

So, example from the first page.

If I'm then checking, as Collin suggested for _cached_user as a way of avoiding a lookup, that use will break. But so would a fallback to user._wrapped, because the custom lazy user in play doesn't work the way our lazy object does.

If we move the user cache from the request, we break a lot of expectations, but, more, we tie that cache into the internal implementation of LazyObject. (I took that to be the force of your objection to making request.user have to be a LazyObject — currently folks can (and do) set whatever they like... — is that not what you meant?)

comment:12 by Collin Anderson, 5 years ago

Actually, yeah, the more and more I look over the code search results. I might be one of the few that's actually broken by this. I couldn't find any examples of other projects breaking.

In most of these cases, they just copied and pasted Django's get_user() logic (_replacing_ django's logic), including Carlton's example. And I don't think their code would break from this change.

So I'd be in favor of closing and just seeing if anyone else runs into the same issue. It's my fault for using undocumented code. I can just check request.session.accessed instead or switch to checking request.user._wrapped. If there's a better way to do it, I'll just switch to that.

comment:13 by Carlton Gibson, 5 years ago

So I'd be in favor of closing and just seeing if anyone else runs into the same issue.

Ah yes, we could do that. :) It's undocumented so doesn't even need a release note in theory, though we might add one just to be polite.

TBH, for me, I'm still not sure avoiding the function invocation (to the fetch the cached value from the request) is worth the change of such a long-standing piece of code, so I'd still prefer to revert it.

Last edited 5 years ago by Carlton Gibson (previous) (diff)

comment:14 by Mariusz Felisiak, 5 years ago

Thanks y'all for opinions.

I agree with Carlton that we should revert this change, mainly because I'm pretty sure (~98%) that people are using this undocumented property and even if we add release notes ("Undocumented property .... is removed.") then we're not able to propose any alternative solution. We don't want to suggest using another undocumented property e.g. _wrapped.

comment:15 by Mariusz Felisiak, 5 years ago

Has patch: set

in reply to:  9 comment:16 by Nick Pope, 5 years ago

I'd not really been following along, so just caught up on the comments here.

If we revert, then I honestly think that this should be documented. Or at least have a comment in the code to say that this is going to be supported for third-parties and shouldn't be removed.


I've got to be honest and say if you're going to use code that is clearly marked as private - that is the underscore prefix convention in Python - you've got what's coming to you and you haven't got a leg to stand on. Yes, it doesn't stop me either, but I would accept that I'm in the wrong and will have to deal with the fallout. In addition, as highlighted above, many of the cases are copy-paste from Django's own code. That is just sloppy and if encouraged, or at least not discouraged, no changes would ever be possible because "someone might have copied it and used it" in a way that was never intended.

To quote the Python documentation:

“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.

I also don't really understand why hasattr() checks are needed here in other middlewares. My question wasn't answered. Surely you just need to access request.user if you need the user object and if you don't, you don't. If this is to check whether we've called this and thus assigned request.user prior to using that elsewhere, then it sounds like a middleware ordering problem. Or not subclassing a middleware properly.

Grrr. I think it's more than sometimes.... 😬

I find GitHub code search to be dreadful. It doesn't provide capability to filter out partial paths. Yes that says 53K+ results, but the vast majority seem to be vendored copies of Django and other projects that have copied bits of Django and then been vendored themselves which will, frankly, not be affected.

The user has been cached on the request here for 13 years.

Just because something is old, that doesn't mean it shouldn't necessarily change.

Exactly this. Custom middleware's leveraging _cached_user but not setting a user that has _wrapped property... (and so on).

I was merely suggesting an example to complement proposals in other comments, but posited whether we actually wanted to do this as we'd have to document LazyObject which is currently private. (That is another private thing that unfortunately bleeds across public interfaces, notably in lazy translations.)


Anyway, this is a relatively small change so it isn't something that is the end of the world and I won't get in the way of it if the majority want to revert. I just have a different view that progress shouldn't be stymied just because something has always been that way, is old (and thus venerated?), or because people broke convention and thus we need to capitulate and accept all "bad" behaviour. Yes, there is not breaking backward compatibility, but that is different to not breaking something where backward compatibility was never guaranteed.

in reply to:  14 comment:17 by Sergey Fedoseev, 5 years ago

Replying to felixxm:

we're not able to propose any alternative solution

We don't have to, since the laziness itself of request.user is not documented.

comment:18 by Sergey Fedoseev, 5 years ago

Needs tests: set

I'm opposed to reverting, but if we end with it, the tests are needed.

comment:19 by Simon Charette, 5 years ago

I think that there's enough points being raised in both camps to warrant a discussion on the developer mailing list before reverting this change.

Since 2f010795e690550c8c6f56b3924c0f629cacb33b is not part of any release branch we should take the time to discuss this change with a wider audience before taking a decision.

comment:20 by Carlton Gibson <carlton.gibson@…>, 5 years ago

Resolution: fixed
Status: newclosed

In 6e99585c:

Fixed #30941 -- Reverted "Simplified AuthenticationMiddleware a bit."

This reverts commit 2f010795e690550c8c6f56b3924c0f629cacb33b.

comment:21 by Carlton Gibson, 5 years ago

Ooops. Sorry I didn't see the further conversation here before merging this this afternoon. Happy to discuss on the mailing list, and re-revert if we decide to go that way.

Note: See TracTickets for help on using tickets.
Back to Top