Opened 4 years ago

Closed 4 years ago

#19895 closed Bug (fixed)

Second iteration over an invalid queryset returns an empty list instead of an exception

Reported by: Grzegorz Nosek Owned by: Grzegorz Nosek
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: robert.coup@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

As a part of #17664 it was discovered that an invalid queryset only raises exceptions during the first iteration. When iterating over the queryset again, an empty list is returned, i.e. the following test case would fail:

    def test_invalid_qs_list(self):
        qs = Article.objects.order_by('invalid_column')
        self.assertRaises(FieldError, list, qs)
        self.assertRaises(FieldError, list, qs)

Attachments (3)

19895_1.diff (3.8 KB) - added by Grzegorz Nosek 4 years ago.
19895_2.diff (1.0 KB) - added by Grzegorz Nosek 4 years ago.
leak.tar.gz (2.9 KB) - added by Anssi Kääriäinen 4 years ago.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 4 years ago by Grzegorz Nosek

Status: newassigned

Changed 4 years ago by Grzegorz Nosek

Attachment: 19895_1.diff added

Changed 4 years ago by Grzegorz Nosek

Attachment: 19895_2.diff added

comment:2 Changed 4 years ago by Grzegorz Nosek

All the solutions I can come up with are apparently ugly. I'm attaching two versions of the patch for discussion (with tests stripped).

One solution is wrapping the iterator in another method, the other is putting the required try/catch in the iterator() method itself, which pushes the indentation to six levels deep maximum.

comment:3 Changed 4 years ago by Grzegorz Nosek

Has patch: set
Patch needs improvement: set

comment:4 Changed 4 years ago by Grzegorz Nosek

Patch needs improvement: unset

As suggested by jacobkm on IRC, here's the updated patch:

https://github.com/django/django/pull/800

comment:5 Changed 4 years ago by Jacob Kaplan-Moss <jacob@…>

Resolution: fixed
Status: assignedclosed

In d1e87eb3baf75b1b6a0ada46a9b77f7e347cdb60:

[1.5.x] Fixed #19895 -- Made second iteration over invalid queryset raise an exception too

When iteration over a queryset raised an exception, the result cache
remained initialized with an empty list, so subsequent iterations returned
an empty list instead of raising an exception

Backport of 2cd0edaa477b327024e4007c8eaf46646dcd0f21 from master.

comment:6 Changed 4 years ago by Claude Paroz

Patch needs improvement: set
Resolution: fixed
Severity: NormalRelease blocker
Status: closednew
Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

That commit is causing a serious ORM memory leak in one of my applications. It may be that my code is not the cleanest, but anyway, I consider this as a serious regression.

comment:7 Changed 4 years ago by Anssi Kääriäinen

Attached is a minimalistic test case that will show the memory leak. The case is simple - have enough objects that one ITERATOR_CHUNK_SIZE will not convert all the objects (that is, more than 100 objects in the queryset). Do bool(qs). This will result in memory leak when this ticket's patch is applied, but will not leak if this ticket's patch isn't applied.

The reason for the leak is a bug in Python itself. The gc.garbage docs say that:
"""
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. ...
"""

However, no __del__ method is defined anywhere, so there should not be any uncollectable objects. Also, pypy collects the garbage, so this is another thing pointing to a bug in Python.

I have tested this with Python 2.7.3 and Python 3.2.3, and both of those will leak. Pypy 1.8.0 collects the garbage correctly.

Steps to reproduce: unpack the attachment, run tester.py, see if gc.garbage has reference to _safe_iterator.

Even if this is a bug in Python this has to be fixed in Django itself. The memory leak can be bad. It seems just reverting the commit is the right fix.

Interestingly enough doing this change in Query.iterator() is enough to cause leak:

try:
    iterator() code here...
except Exception:
    raise

Changed 4 years ago by Anssi Kääriäinen

Attachment: leak.tar.gz added

comment:8 Changed 4 years ago by Anssi Kääriäinen

Here is a minimalistic case showing the bug in Python:

class MyObj(object):
    def __iter__(self):
        self._iter = iter(self.iterator())
        return iter(self._iter)

    def iterator(self):
        try:
            while True:
                yield 1
        except Exception:
            raise
i = next(iter(MyObj()))
import gc
gc.collect()
print(gc.garbage)

comment:9 Changed 4 years ago by Anssi Kääriäinen

I filed a bug to Python bug tracker. http://bugs.python.org/issue17468

Does anybody see any other solution than reverting the patch?

comment:10 Changed 4 years ago by Carl Meyer

I think we should roll back the patch. Your queryset-iteration simplification patch will fix this bug anyway, correct?

comment:11 Changed 4 years ago by Anssi Kääriäinen

The more complex version of the simplification patch has this same issue. It is likely possible to work around this issue in the patch.

As for 1.5 a roll back seems like the only option.

comment:12 Changed 4 years ago by Claude Paroz <claude@…>

Resolution: fixed
Status: newclosed

In b91067d9aa42e31d4375e00a703beaacdb30d608:

[1.5.x] Revert "Fixed #19895 -- Made second iteration over invalid queryset raise an exception too"

This reverts commit d1e87eb3baf75b1b6a0ada46a9b77f7e347cdb60.
This commit was the cause of a memory leak. See ticket for more details.
Thanks Anssi Kääriäinen for identifying the source of the bug.

comment:13 Changed 4 years ago by Claude Paroz

Resolution: fixed
Status: closednew

comment:14 Changed 4 years ago by Robert Coup

Cc: robert.coup@… added

comment:15 Changed 4 years ago by Anssi Kääriäinen

Severity: Release blockerNormal

This isn't a release blocker any more, the leak is fixed, the second iteration works in the same way as before.

comment:16 Changed 4 years ago by Claude Paroz

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.
Back to Top