Opened 4 years ago

Closed 4 years ago

#23555 closed Bug (fixed)

QuerySet.first suppresses internal IndexError exceptions

Reported by: Artem Rizhov Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: QuerySet first IndexError
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Current implementation of this method relies on IndexError when checking the item existence. However this exception may be raised somewhere inside of own __getitem__() which does a lot of work. This way the potential errors are hidden. The method will return None in case of a bug that raises IndexError internally. The Zen of Python says "Errors should never pass silently."

https://github.com/django/django/blob/e9103402c0fa873aea58a6a11dba510cd308cb84/django/db/models/query.py#L515

Change History (8)

comment:1 Changed 4 years ago by Artem Rizhov

Has patch: set

A fix is available from branch ticket_23555 of my github repo - https://github.com/artemrizhov/django/tree/ticket_23555

Also created a pull request - https://github.com/django/django/pull/3274

comment:2 Changed 4 years ago by Artem Rizhov

Created second pull request with alternate solution for those who prefer to use try/catch - https://github.com/django/django/pull/3277

comment:3 Changed 4 years ago by Thomas Chaumeny

I don't think there is a problem with the current implementation. There is no reason why QuerySet.__getitem__() shouldn't raise an IndexError as it aims to provide a similar interface to a list.

Note that this is a common thing in Python to raise such an IndexError in similar situation, see those examples in cPython itself :
https://github.com/python/cpython/blob/c7688b44387d116522ff53c0927169db45969f0e/Lib/xml/dom/pulldom.py#L219
https://github.com/python/cpython/blob/c7688b44387d116522ff53c0927169db45969f0e/Lib/ipaddress.py#L666

comment:4 Changed 4 years ago by Artem Rizhov

Sorry for my English, my explanation is probably not too clear. But if you see the pull request and the test that I've added to the branch, you probably will understand the problem. There is also a lot of comments on the test case code.

Let me try to explain one more time.

You are right. There is no reason why __getitem__ shouldn't raise an IndexError. This is correct and normal behaviour. I'm talking about another case.

IndexError from QuerySet.__getitem__ is not always what you think, because it may be produced by nested calls, not by this method itself.

Let's remember that QuerySet is a lazy request to DB. So when you write qs[i], it makes actual query to the DB, retrieves a list of items and returns i-th element. IndexError maybe raised by some another method that is called from __getitem__, especially by __iter__ and any nested call. For example by the iterator implementation, or by database adaptor. For example if internal list of db connections is empty, but the adaptor tries to get one from the list. Such abnormal situation should not be treated as "no such element in the queryset". IndexError in nested calls may mean anything, but probably not what you mean. If any nested call produces IndexError, the caller (_getitem__, __iter__ and another nested methods) is responsible to handle this right way, because it is aware of the source of error. Your outer code can't handle such error just because it's not aware of the source.

Hope this is clear now. If you are in doubt, please review the test in the pull request.

So what is the problem?

You can use try/except if you access i-th element of a list. Because you are sure IndexError means "no such element" in this certain list. But you should not rely on try/except for IndexError when accessing i-th element of QuerySet. You should not. But the current implementation of first() method does this. It uses "standard" try/except in the case where it is not safe:

https://github.com/django/django/blob/e9103402c0fa873aea58a6a11dba510cd308cb84/django/db/models/query.py#L515

try:
    return qs[0]
except IndexError:
    return None

This would work if qs as plain list. But qs is QuerySet with all it's magic. It's __getitem__ is not simple and rely on successful execution of a lot of nested calls. If any of those calls produces unexpected exception (either IndexError or another one), this should not be treated as "empty query set", and should not be suppressed. The outer code should receive this exception as a singal that something goes wrong. But first() method suppresses such errors.

Hope you understand the problem now. If no then please review the pull request, especially my tests, and the comments on the PR.

Thanks for the attention.

comment:5 Changed 4 years ago by Aymeric Augustin

Triage Stage: UnreviewedAccepted

comment:6 Changed 4 years ago by Tim Graham

Triage Stage: AcceptedReady for checkin

See the PR for my reservation about adding a test for this (I perceive it to be mostly a code cleanup). But I'll let the committer make a decision.

comment:7 Changed 4 years ago by Carl Meyer

If a code cleanup has a specific purpose other than readability/style, it's worth capturing that purpose in a test if possible, to help prevent regression. Same reason we'd add a test for any other code change.

comment:8 Changed 4 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: newclosed

In ca61195827efdf6ec54e8280e4cfd9a34f07b8b4:

Fixed #23555 -- Avoided suppressing IndexError in QuerySet.first() and .last()

Note: See TracTickets for help on using tickets.
Back to Top