Code

#19895 closed Bug (fixed)

Second iteration over an invalid queryset returns an empty list instead of an exception

Reported by: gnosek Owned by: gnosek
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: robert.coup@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

As a part of #17664 it was discovered that an invalid queryset only raises exceptions during the first iteration. When iterating over the queryset again, an empty list is returned, i.e. the following test case would fail:

    def test_invalid_qs_list(self):
        qs = Article.objects.order_by('invalid_column')
        self.assertRaises(FieldError, list, qs)
        self.assertRaises(FieldError, list, qs)

Attachments (3)

19895_1.diff (3.8 KB) - added by gnosek 17 months ago.
19895_2.diff (1.0 KB) - added by gnosek 17 months ago.
leak.tar.gz (2.9 KB) - added by akaariai 16 months ago.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 17 months ago by gnosek

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Status changed from new to assigned

Changed 17 months ago by gnosek

Changed 17 months ago by gnosek

comment:2 Changed 17 months ago by gnosek

All the solutions I can come up with are apparently ugly. I'm attaching two versions of the patch for discussion (with tests stripped).

One solution is wrapping the iterator in another method, the other is putting the required try/catch in the iterator() method itself, which pushes the indentation to six levels deep maximum.

comment:3 Changed 17 months ago by gnosek

  • Has patch set
  • Patch needs improvement set

comment:4 Changed 17 months ago by gnosek

  • Patch needs improvement unset

As suggested by jacobkm on IRC, here's the updated patch:

https://github.com/django/django/pull/800

comment:5 Changed 17 months ago by Jacob Kaplan-Moss <jacob@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In d1e87eb3baf75b1b6a0ada46a9b77f7e347cdb60:

[1.5.x] Fixed #19895 -- Made second iteration over invalid queryset raise an exception too

When iteration over a queryset raised an exception, the result cache
remained initialized with an empty list, so subsequent iterations returned
an empty list instead of raising an exception

Backport of 2cd0edaa477b327024e4007c8eaf46646dcd0f21 from master.

comment:6 Changed 16 months ago by claudep

  • Patch needs improvement set
  • Resolution fixed deleted
  • Severity changed from Normal to Release blocker
  • Status changed from closed to new
  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Bug

That commit is causing a serious ORM memory leak in one of my applications. It may be that my code is not the cleanest, but anyway, I consider this as a serious regression.

comment:7 Changed 16 months ago by akaariai

Attached is a minimalistic test case that will show the memory leak. The case is simple - have enough objects that one ITERATOR_CHUNK_SIZE will not convert all the objects (that is, more than 100 objects in the queryset). Do bool(qs). This will result in memory leak when this ticket's patch is applied, but will not leak if this ticket's patch isn't applied.

The reason for the leak is a bug in Python itself. The gc.garbage docs say that:
"""
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. ...
"""

However, no __del__ method is defined anywhere, so there should not be any uncollectable objects. Also, pypy collects the garbage, so this is another thing pointing to a bug in Python.

I have tested this with Python 2.7.3 and Python 3.2.3, and both of those will leak. Pypy 1.8.0 collects the garbage correctly.

Steps to reproduce: unpack the attachment, run tester.py, see if gc.garbage has reference to _safe_iterator.

Even if this is a bug in Python this has to be fixed in Django itself. The memory leak can be bad. It seems just reverting the commit is the right fix.

Interestingly enough doing this change in Query.iterator() is enough to cause leak:

try:
    iterator() code here...
except Exception:
    raise

Changed 16 months ago by akaariai

comment:8 Changed 16 months ago by akaariai

Here is a minimalistic case showing the bug in Python:

class MyObj(object):
    def __iter__(self):
        self._iter = iter(self.iterator())
        return iter(self._iter)

    def iterator(self):
        try:
            while True:
                yield 1
        except Exception:
            raise
i = next(iter(MyObj()))
import gc
gc.collect()
print(gc.garbage)

comment:9 Changed 16 months ago by akaariai

I filed a bug to Python bug tracker. http://bugs.python.org/issue17468

Does anybody see any other solution than reverting the patch?

comment:10 Changed 16 months ago by carljm

I think we should roll back the patch. Your queryset-iteration simplification patch will fix this bug anyway, correct?

comment:11 Changed 16 months ago by akaariai

The more complex version of the simplification patch has this same issue. It is likely possible to work around this issue in the patch.

As for 1.5 a roll back seems like the only option.

comment:12 Changed 16 months ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In b91067d9aa42e31d4375e00a703beaacdb30d608:

[1.5.x] Revert "Fixed #19895 -- Made second iteration over invalid queryset raise an exception too"

This reverts commit d1e87eb3baf75b1b6a0ada46a9b77f7e347cdb60.
This commit was the cause of a memory leak. See ticket for more details.
Thanks Anssi Kääriäinen for identifying the source of the bug.

comment:13 Changed 16 months ago by claudep

  • Resolution fixed deleted
  • Status changed from closed to new

comment:14 Changed 16 months ago by rcoup

  • Cc robert.coup@… added

comment:15 Changed 14 months ago by akaariai

  • Severity changed from Release blocker to Normal

This isn't a release blocker any more, the leak is fixed, the second iteration works in the same way as before.

comment:16 Changed 14 months ago by claudep

  • Resolution set to fixed
  • Status changed from new to closed

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.