Opened 23 months ago

Closed 23 months ago

Last modified 22 months ago

#20393 closed Uncategorized (wontfix)

django.db.models.query.QuerySet.__repr__ should not have side-effects

Reported by: justin@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords: QuerySet, repr, side-effect
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In trying to track down some timeouts and strange queries in our apps we saw some strange queries being run by django. We were querying for a single record with a FOR UPDATE, getting an error, then django would run the same query 3 times but with a limit of 21 and then fetch all of the records, each time.

Eventually I tracked this down to our error-handling middleware calling repr() on a QuerySet, which then queried the database. __repr__ is supposed to be a string-representation of an object in its current state and should not gave side-effects like making database queries, especially in the case where the query uses FOR UPDATE or similar logic.

I suggest changing django.db.models.query.QuerySet.__repr__ to only touch the cached records and not use the local iterator, which can and will query the DB, potentially causing issues like this.

Change History (8)

comment:1 Changed 23 months ago by anonymous

  • Has patch set
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Patch in pull request

comment:2 Changed 23 months ago by claudep

Which pull request?

comment:3 Changed 23 months ago by lukeplant

The reason for the current behaviour is to help interactive coding and learning of Django - if you type an expression that returns a QuerySet, you get something helpful. See the tutorial for example:

https://docs.djangoproject.com/en/1.5/intro/tutorial01/

This would have to be considered when weighing up this ticket - a lot of Django tutorials (our own and others) would need significant modification, and the habits of people used to the current behaviour.

comment:4 Changed 23 months ago by anonymous

pull request: https://github.com/django/django/pull/1055

I realize that it may inconvenience simple usage, but functions like str and repr really should not be causing side effects in general.

In particular, debuggers and error reporting can cause strange side-effects and errors. In our application, we've implemented a mysql connection pool in the db backend. This works fine until an error happens and the error-catching middleware we're using (sentry/raven) starts accessing these query objects after the connection has been returned to the pool.

comment:5 Changed 23 months ago by lukeplant

  • Resolution set to wontfix
  • Status changed from new to closed

Well, you can argue that semantically these methods don't have side effects. They do 'SELECT', not 'UPDATE'. They do happen to use IO, and can therefore fail in all kinds of ways. "SELECT FOR UPDATE" is an unfortunate edge case.

Looking at these docs:

http://docs.python.org/2/reference/datamodel.html#object.__repr__

... you have a conflict between the goal of being information rich and with always being useful for debugging. In general, automatically seeing the results is information rich and is useful in debugging, but sometimes it causes further problems.

I have thought about it, and with this conflict of goals, I think we are going to have to err on the side of the current behaviour. It is possible to work around by not calling repr on QuerySets in your middleware, and there are too many people who will be upset (or have problems with their own debugging facilities) by changing this.

comment:6 Changed 23 months ago by lukeplant

I should have added - feel free to persuade us otherwise on the DevelopersMailingList

comment:7 Changed 23 months ago by patrys

What if we changed __repr__ to be safe (only print query params) and allowed a slice with Ellipsis to return a list of first N elements, possibly executing the query if not evaluated earlier?

Example:

>>> x = Foo.objects.filter(age__gte=5)
<QuerySet(Foo, …)>
>>> x[...]
[<Foo>, <Foo>, <Foo>, <Foo>, '11 more...']

comment:8 Changed 22 months ago by anonymous

That sounds like a great change to me.

Note: See TracTickets for help on using tickets.
Back to Top