Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#20393 closed Uncategorized (wontfix)

django.db.models.query.QuerySet.__repr__ should not have side-effects

Reported by: justin@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords: QuerySet, repr, side-effect
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In trying to track down some timeouts and strange queries in our apps we saw some strange queries being run by django. We were querying for a single record with a FOR UPDATE, getting an error, then django would run the same query 3 times but with a limit of 21 and then fetch all of the records, each time.

Eventually I tracked this down to our error-handling middleware calling repr() on a QuerySet, which then queried the database. __repr__ is supposed to be a string-representation of an object in its current state and should not gave side-effects like making database queries, especially in the case where the query uses FOR UPDATE or similar logic.

I suggest changing django.db.models.query.QuerySet.__repr__ to only touch the cached records and not use the local iterator, which can and will query the DB, potentially causing issues like this.

Change History (8)

comment:1 Changed 2 years ago by anonymous

  • Has patch set
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Patch in pull request

comment:2 Changed 2 years ago by claudep

Which pull request?

comment:3 Changed 2 years ago by lukeplant

The reason for the current behaviour is to help interactive coding and learning of Django - if you type an expression that returns a QuerySet, you get something helpful. See the tutorial for example:

https://docs.djangoproject.com/en/1.5/intro/tutorial01/

This would have to be considered when weighing up this ticket - a lot of Django tutorials (our own and others) would need significant modification, and the habits of people used to the current behaviour.

comment:4 Changed 2 years ago by anonymous

pull request: https://github.com/django/django/pull/1055

I realize that it may inconvenience simple usage, but functions like str and repr really should not be causing side effects in general.

In particular, debuggers and error reporting can cause strange side-effects and errors. In our application, we've implemented a mysql connection pool in the db backend. This works fine until an error happens and the error-catching middleware we're using (sentry/raven) starts accessing these query objects after the connection has been returned to the pool.

comment:5 Changed 2 years ago by lukeplant

  • Resolution set to wontfix
  • Status changed from new to closed

Well, you can argue that semantically these methods don't have side effects. They do 'SELECT', not 'UPDATE'. They do happen to use IO, and can therefore fail in all kinds of ways. "SELECT FOR UPDATE" is an unfortunate edge case.

Looking at these docs:

http://docs.python.org/2/reference/datamodel.html#object.__repr__

... you have a conflict between the goal of being information rich and with always being useful for debugging. In general, automatically seeing the results is information rich and is useful in debugging, but sometimes it causes further problems.

I have thought about it, and with this conflict of goals, I think we are going to have to err on the side of the current behaviour. It is possible to work around by not calling repr on QuerySets in your middleware, and there are too many people who will be upset (or have problems with their own debugging facilities) by changing this.

comment:6 Changed 2 years ago by lukeplant

I should have added - feel free to persuade us otherwise on the DevelopersMailingList

comment:7 Changed 2 years ago by patrys

What if we changed __repr__ to be safe (only print query params) and allowed a slice with Ellipsis to return a list of first N elements, possibly executing the query if not evaluated earlier?

Example:

>>> x = Foo.objects.filter(age__gte=5)
<QuerySet(Foo, …)>
>>> x[...]
[<Foo>, <Foo>, <Foo>, <Foo>, '11 more...']

comment:8 Changed 2 years ago by anonymous

That sounds like a great change to me.

Note: See TracTickets for help on using tickets.
Back to Top