#35279 closed Cleanup/optimization (invalid)
Memory Leak with `prefetch_related`
| Reported by: | Ken Tong | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 4.2 |
| Severity: | Normal | Keywords: | memory leak |
| Cc: | | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Memory Leak after calling queryset.prefetch_related() or prefetch_related_objects()
To reproduce:
import gc

from django.db import models
from django.db.models import prefetch_related_objects


class Foo(models.Model):
    id = models.AutoField(primary_key=True)


class Bar(models.Model):
    id = models.AutoField(primary_key=True)
    foo = models.ForeignKey(Foo, on_delete=models.CASCADE)


def prepare_data():
    if Foo.objects.exists():
        return
    foo = Foo()
    foo.save()
    bar = Bar(foo=foo)
    bar.save()


def test1():
    # no prefetch
    for foo in Foo.objects.all():
        for bar in foo.bar_set.all():
            print(foo.id, bar.id)


def test2():
    # queryset.prefetch_related()
    for foo in Foo.objects.prefetch_related("bar_set").all():
        for bar in foo.bar_set.all():
            print(foo.id, bar.id)


def test3():
    # prefetch_related_objects()
    foo_list = list(Foo.objects.all())
    prefetch_related_objects(foo_list, "bar_set")
    for foo in foo_list:
        for bar in foo.bar_set.all():
            print(foo.id, bar.id)


def run():
    prepare_data()
    # warm up
    test1()
    test2()
    test3()
    gc.collect()
    gc.set_debug(gc.DEBUG_LEAK)
    gc.collect()
    print(f"baseline - garbage count: {len(gc.garbage)}")
    test1()
    gc.collect()
    print(f"test1 - garbage count: {len(gc.garbage)}")
    test2()
    gc.collect()
    print(f"test2 - garbage count: {len(gc.garbage)}")
    test3()
    gc.collect()
    print(f"test3 - garbage count: {len(gc.garbage)}")
    gc.set_debug(0)


run()
Output
1 1
1 1
1 1
baseline - garbage count: 0
1 1
test1 - garbage count: 0 # no memory leak
1 1
test2 - garbage count: 23 # 23 objects leaked
1 1
test3 - garbage count: 46 # another 23 objects leaked
Change History (6)
comment:1 by , 20 months ago
comment:2 by , 20 months ago
| Component: | Uncategorized → Database layer (models, ORM) |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
| Type: | Bug → Cleanup/optimization |
Interesting, thanks for the report. Tentatively accepted for further investigation.
comment:3 by , 20 months ago
The following code snippet shows the same result:
import gc


class Parent:
    def __init__(self):
        self.cache = {}


class Child:
    def __init__(self, parent):
        self.parent = parent


def test():
    foo = Parent()
    bar = Child(parent=foo)
    foo.cache["bars"] = [bar]
    print(foo.cache, bar.parent)


test()
gc.collect()
print(len(gc.garbage))
gc.set_debug(gc.DEBUG_LEAK)
gc.collect()
print(len(gc.garbage))
test()
gc.collect()
print(len(gc.garbage))
This results in the following output:
{'bars': [<__main__.Child object at 0x6f520cdd90>]} <__main__.Parent object at 0x6f520cd6d0>
0
0
{'bars': [<__main__.Child object at 0x6f520b32d0>]} <__main__.Parent object at 0x6f520b1fd0>
gc: collectable <Parent 0x6f520b1fd0>
gc: collectable <Child 0x6f520b32d0>
gc: collectable <list 0x6f520b1600>
gc: collectable <dict 0x6f520b1e80>
4
With the gc.set_debug statement removed, gc.garbage is always empty, so it looks like a side effect of DEBUG_LEAK:
{'bars': [<__main__.Child object at 0x7535cf1d90>]} <__main__.Parent object at 0x7535cf1650>
0
0
{'bars': [<__main__.Child object at 0x7535cd7310>]} <__main__.Parent object at 0x7535cd5fd0>
0
As per the gc documentation:
To debug a leaking program call gc.set_debug(gc.DEBUG_LEAK). Notice that this includes gc.DEBUG_SAVEALL, causing garbage-collected objects to be saved in gc.garbage for inspection.
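The relationship between the two flags in that quote can be checked directly. The following is a minimal sketch, assuming CPython 3; `saved_after_cycle` is a hypothetical helper written for this illustration, not part of the gc module. It sets DEBUG_SAVEALL on its own (so nothing is logged to stderr) and counts what ends up in gc.garbage after a single self-referencing list is collected.

```python
import gc

# DEBUG_LEAK is a combination of flags that includes DEBUG_SAVEALL.
print(bool(gc.DEBUG_LEAK & gc.DEBUG_SAVEALL))


def saved_after_cycle(flags):
    """Create one reference cycle, collect it, and count entries in gc.garbage.

    Hypothetical helper for illustration; not part of the gc module.
    """
    gc.collect()             # flush any pre-existing cyclic garbage first
    gc.garbage.clear()
    gc.set_debug(flags)
    try:
        cycle = []
        cycle.append(cycle)  # a list that references itself
        del cycle            # unreachable now; only the cyclic GC can free it
        gc.collect()
        return len(gc.garbage)
    finally:
        # Restore the default state so later code is not affected.
        gc.set_debug(0)
        gc.garbage.clear()


print(saved_after_cycle(0))                      # default flags
print(saved_after_cycle(gc.DEBUG_SAVEALL) >= 1)  # SAVEALL only
```

With the default flags the collected list is simply freed; with DEBUG_SAVEALL it is appended to gc.garbage instead, which is the behaviour the reproduction script was counting.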
So, using DEBUG_LEAK causes collected objects to be kept in gc.garbage. I would therefore say that looking at gc.garbage in this case does not identify a memory leak. On the contrary, it shows objects that were garbage collected.
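One way to check whether such a cycle is really freed, without relying on gc.garbage, is to hold weak references to the objects and see whether they are cleared after a collection. This is only a sketch under assumptions: it reuses the Parent/Child classes from the snippet above, assumes CPython with the default debug flags, and `make_cycle` and `cycle_is_collected` are hypothetical names introduced here.

```python
import gc
import weakref


class Parent:
    def __init__(self):
        self.cache = {}


class Child:
    def __init__(self, parent):
        self.parent = parent


def make_cycle():
    """Build the Parent <-> Child cycle and return weak references to both."""
    foo = Parent()
    bar = Child(parent=foo)
    foo.cache["bars"] = [bar]
    # Weak references do not keep the objects alive.
    return weakref.ref(foo), weakref.ref(bar)


def cycle_is_collected():
    """Return True if both objects are gone after a full collection."""
    gc.set_debug(0)  # make sure DEBUG_SAVEALL is not keeping objects alive
    parent_ref, child_ref = make_cycle()
    gc.collect()
    # A dead weak reference returns None when called.
    return parent_ref() is None and child_ref() is None


print(cycle_is_collected())
```

If the function returns True, the cycle was reclaimed by the collector, which is consistent with the explanation that the objects listed in gc.garbage under DEBUG_LEAK were collectable rather than leaked.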
comment:4 by , 20 months ago
Thank you for your detailed explanation, Antoine. I confirm that the memory leak is a false alarm, and I am sorry about it.
comment:5 by , 20 months ago
| Resolution: | → invalid |
|---|---|
| Status: | new → closed |
Hi Team,
So far I have been adding the line below at the appropriate places in order to fix the memory leak in my projects. Hopefully there will be a fix and a documented way to properly clean up the cache.
foo._prefetched_objects_cache.pop("bar_set")
Thank you for your attention!