Opened 2 years ago

Closed 22 months ago

#19895 closed Bug (fixed)

Second iteration over an invalid queryset returns an empty list instead of an exception

Reported by: gnosek
Owned by: gnosek
Component: Database layer (models, ORM)
Version: 1.4
Severity: Normal
Keywords:
Cc: robert.coup@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

As a part of #17664 it was discovered that an invalid queryset only raises exceptions during the first iteration. When iterating over the queryset again, an empty list is returned, i.e. the following test case would fail:

    def test_invalid_qs_list(self):
        qs = Article.objects.order_by('invalid_column')
        self.assertRaises(FieldError, list, qs)
        self.assertRaises(FieldError, list, qs)

Attachments (3)

19895_1.diff (3.8 KB) - added by gnosek 2 years ago.
19895_2.diff (1.0 KB) - added by gnosek 2 years ago.
leak.tar.gz (2.9 KB) - added by akaariai 2 years ago.

Change History (19)

comment:1 Changed 2 years ago by gnosek

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Status changed from new to assigned

comment:2 Changed 2 years ago by gnosek

All the solutions I can come up with are apparently ugly. I'm attaching two versions of the patch for discussion (with tests stripped).

One solution wraps the iterator in another method; the other puts the required try/except in the iterator() method itself, which pushes the deepest indentation to six levels.

comment:3 Changed 2 years ago by gnosek

  • Has patch set
  • Patch needs improvement set

comment:4 Changed 2 years ago by gnosek

  • Patch needs improvement unset

As suggested by jacobkm on IRC, here's the updated patch:

https://github.com/django/django/pull/800

comment:5 Changed 2 years ago by Jacob Kaplan-Moss <jacob@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In d1e87eb3baf75b1b6a0ada46a9b77f7e347cdb60:

[1.5.x] Fixed #19895 -- Made second iteration over invalid queryset raise an exception too

When iteration over a queryset raised an exception, the result cache
remained initialized with an empty list, so subsequent iterations returned
an empty list instead of raising an exception

Backport of 2cd0edaa477b327024e4007c8eaf46646dcd0f21 from master.
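
The caching behaviour described in this commit message can be illustrated without Django. The following is only a minimal sketch under assumptions: `ToyQuerySet`, `failing_fetch` and the `_fetch` attribute are hypothetical names invented for illustration and are not Django's actual implementation. It shows how marking the cache as an empty list before the fetch runs makes a failed first iteration look like a successful, empty one.

```python
class ToyQuerySet(object):
    """Hypothetical stand-in for a queryset; not Django's actual code."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable that returns rows or raises
        self._result_cache = None    # None means "not evaluated yet"

    def __iter__(self):
        if self._result_cache is None:
            # The cache is set to an empty list *before* fetching, so an
            # exception raised by the fetch leaves it looking evaluated.
            self._result_cache = []
            for row in self._fetch():
                self._result_cache.append(row)
        return iter(self._result_cache)


def failing_fetch():
    # Stands in for a query that fails, e.g. because of an invalid column.
    raise ValueError("invalid column")


qs = ToyQuerySet(failing_fetch)

try:
    list(qs)
    first = "no error"
except ValueError:
    first = "ValueError"

# The second iteration silently yields nothing instead of raising again.
second = list(qs)
```

In this toy version, resetting `_result_cache` to `None` when the fetch fails would make the second iteration raise as well. That is the general idea of the fix, not necessarily how the patch implements it.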

comment:6 Changed 2 years ago by claudep

  • Patch needs improvement set
  • Resolution fixed deleted
  • Severity changed from Normal to Release blocker
  • Status changed from closed to new
  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Bug

That commit is causing a serious ORM memory leak in one of my applications. My code may not be the cleanest, but I still consider this a serious regression.

comment:7 Changed 2 years ago by akaariai

Attached is a minimal test case that shows the memory leak. The case is simple: have enough objects that one ITERATOR_CHUNK_SIZE does not convert all of them (that is, more than 100 objects in the queryset), then do bool(qs). This leaks memory when this ticket's patch is applied, but does not leak when it isn't.

The reason for the leak is a bug in Python itself. The gc.garbage docs say that:
"""
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. ...
"""

However, no __del__ method is defined anywhere, so there should not be any uncollectable objects. Also, PyPy collects the garbage, which is another sign of a bug in Python.

I have tested this with Python 2.7.3 and Python 3.2.3, and both of those leak. PyPy 1.8.0 collects the garbage correctly.

Steps to reproduce: unpack the attachment, run tester.py, and check whether gc.garbage contains a reference to _safe_iterator.

Even if this is a bug in Python, it has to be fixed in Django itself. The memory leak can be bad. It seems that just reverting the commit is the right fix.

Interestingly enough, making this change in Query.iterator() is enough to cause the leak:

    try:
        pass  # placeholder: the existing iterator() body goes here
    except Exception:
        raise

comment:8 Changed 2 years ago by akaariai

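Here is a minimal case showing the bug in Python:

    import gc

    class MyObj(object):
        def __iter__(self):
            # Storing the generator on self creates a reference cycle:
            # self -> generator -> generator frame -> self.
            self._iter = iter(self.iterator())
            return iter(self._iter)

        def iterator(self):
            try:
                while True:
                    yield 1
            except Exception:
                raise

    # Advance the generator once so it is suspended inside the try block.
    i = next(iter(MyObj()))
    gc.collect()
    print(gc.garbage)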
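
A complementary way to check whether an interpreter frees this kind of cycle is to watch a weak reference instead of inspecting gc.garbage. This is only a sketch under assumptions: `CyclicIterable` and `cycle_is_collected` are hypothetical names that mirror the minimal case above, and it relies on CPython 3.4 or later, where PEP 442 made reference cycles containing objects with finalizers collectable. On the interpreters reported as leaking in this ticket, the function would be expected to return False.

```python
import gc
import weakref


class CyclicIterable(object):
    """Hypothetical class mirroring the minimal case in this comment."""

    def __iter__(self):
        # self -> generator -> generator frame -> self forms a cycle.
        self._iter = self.iterator()
        return self._iter

    def iterator(self):
        try:
            while True:
                yield 1
        except Exception:
            raise


def cycle_is_collected():
    obj = CyclicIterable()
    ref = weakref.ref(obj)
    # Suspend the generator inside its try block, as in the report.
    next(iter(obj))
    # Drop the only outside reference; only the cycle keeps obj alive now.
    del obj
    gc.collect()
    # The weak reference is dead only if the collector freed the cycle.
    return ref() is None


collected = cycle_is_collected()
```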

comment:9 Changed 2 years ago by akaariai

I filed a bug in the Python bug tracker: http://bugs.python.org/issue17468

Does anybody see any other solution than reverting the patch?

comment:10 Changed 2 years ago by carljm

I think we should roll back the patch. Your queryset-iteration simplification patch will fix this bug anyway, correct?

comment:11 Changed 2 years ago by akaariai

The more complex version of the simplification patch has this same issue. It is likely possible to work around this issue in the patch.

As for 1.5, a rollback seems like the only option.

comment:12 Changed 2 years ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In b91067d9aa42e31d4375e00a703beaacdb30d608:

[1.5.x] Revert "Fixed #19895 -- Made second iteration over invalid queryset raise an exception too"

This reverts commit d1e87eb3baf75b1b6a0ada46a9b77f7e347cdb60.
This commit was the cause of a memory leak. See ticket for more details.
Thanks Anssi Kääriäinen for identifying the source of the bug.

comment:13 Changed 2 years ago by claudep

  • Resolution fixed deleted
  • Status changed from closed to new

comment:14 Changed 2 years ago by rcoup

  • Cc robert.coup@… added

comment:15 Changed 22 months ago by akaariai

  • Severity changed from Release blocker to Normal

This isn't a release blocker any more: the leak is fixed, and the second iteration works the same way as before.

comment:16 Changed 22 months ago by claudep

  • Resolution set to fixed
  • Status changed from new to closed