Opened 7 years ago

Closed 7 years ago

#15748 closed Bug (invalid)

QuerySet subscripting won't work as expected

Reported by: zhutao.iscas@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords: QuerySet
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:


when got a QuerySet by using filter and it's easy to use the following code to do the change and save operation:

qs = SomeModel.objects.filter(owner_id=123)
# suppose qs has 1 or many elements
last_login_time = qs[0].last_login_time
qs[0].last_login_time =    # I expect it can assign the new value, but it won't
assertEquals(qs[0].last_login_time, old_login_time)   # YES, it doesn't change
qs[0].save()   #So it won't update the old record

And after figuring this out, the following code will be used instead and it works:

qs = SomeModel.objects.filter(owner_id=123)
# suppose qs has 1 or many elements
obj = qs[0]
last_login_time = obj.last_login_time
obj.last_login_time =    # I expect it can assign the new value, but it will
assertNotEquals(obj.last_login_time, old_login_time)   # YES, it does change   #So it will update the old record as expected

And I have met some of my friends/colleagues use the first approach to do the record updating. And IMO, it's
natural and prone to use. (when you *type qs[0]* and *type obj* , they have the same type)

After reading the code(db.models.query), it can be figured out why.(when you subscript the QuerySet it will use the qs = self._clone() and assigning a value won't change at all)

Possible solutions:

  1. make the assigning work for the subscripting QuerySet
  2. announce the above first approach is wrong and let the users know it

Change History (1)

comment:1 Changed 7 years ago by Luke Plant

Resolution: invalid
Status: newclosed

The code posted is really not idiomatic - if you are expecting there to be one item, you should be using 'get' as documented. If you expecting more than one item and need to deal with them all you should be iterating over the queryset. Both of these ways will be efficient - we have optimized the API to make idiomatic usage efficient, and we cannot optimize for all possible ways of accessing querysets.

Regarding 1:

If you look at implementing this, you'll see why it is a bad idea. It would mean that if I start to index a queryset, the queryset has to cache the values returned. We then have to decide how to issue the SQL query:

  • either we issue the full query, with no limits, which could be grossly inefficient if we are only going to need one item
  • or we issue multiple queries, limiting to a single item, populating a *sparse* result cache. The problem with this is that is a grossly inefficient way to use a queryset if more than one item is going to be needed. Adding a sparse result cache also really complicates the behaviour of the queryset, and when combined with the normal result cache it makes the behaviour of the queryset with regards to efficiency and executing SQL queries much harder to understand.

We already have a result cache that kicks in if you start iterating (rather than indexing), but in that context it is sensible because it is sensible to anticipate that if the user is iterating over the results, they will probably want all of them eventually.

Regarding 2:

This behaviour could be guessed from the documented way that slicing/indexing querysets work (i.e. that they are lazy queries and not lists) and the fact that the ORM is not an identity mapper.

We can't enumerate every wrong way to use an API, and adding docs that mention this won't help - if it is an issue of what people expect to work, they won't scour docs to find out if it is going to work or not, they will just try it.

Our documentation does in fact contain all this in enough detail:

Note: See TracTickets for help on using tickets.
Back to Top