Opened 14 years ago

Closed 14 years ago

Last modified 14 years ago

#12374 closed (worksforme)

QuerySet .iterator() loads everything into memory anyway

Reported by: Nick Welch <mackstann@…>
Owned by: nobody
Component: Database layer (models, ORM)
Version: 1.1
Severity:
Keywords: orm, cache, iterator
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Iterating through the result of .iterator() still causes a huge spike in memory consumption. In contrast, loading only one record with [:1] does not.

Others have run into this problem:

http://stackoverflow.com/questions/1443279/django-iterate-over-a-query-set-without-cache

Notice his follow-up comment to the suggestion of using .iterator():

"Its still chewing through a ton of RAM when I use your call. :("

This has been my experience as well.
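
For reference, a minimal sketch of the two access patterns being compared (the model name Record is hypothetical, standing in for any large table):

    from myapp.models import Record  # hypothetical model

    # Slicing: a single LIMIT 1 query, tiny memory footprint.
    one = list(Record.objects.all()[:1])

    # iterator(): should stream rows without filling the QuerySet's
    # result cache, but the reporter sees a large memory spike anyway.
    for r in Record.objects.all().iterator():
        pass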

Change History (5)

comment:1 by Alex Gaynor, 14 years ago

Resolution: worksforme
Status: new → closed

The author there identifies the issue as being the MySQL query cache (in MySQLdb, I imagine). There's nothing we can do about that; when you use .iterator(), Django doesn't cache the data.
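
A plausible driver-level explanation, consistent with the "(in mysqldb I imagine)" remark: MySQLdb's default cursor buffers the entire result set on the client (mysql_store_result) before the first row is returned, so .iterator() cannot reduce the peak. A raw MySQLdb sketch outside Django, using the streaming SSCursor instead (connection details and table name are hypothetical):

    import MySQLdb
    import MySQLdb.cursors

    # SSCursor streams rows from the server (mysql_use_result); the
    # default cursor copies the whole result set into client memory first.
    conn = MySQLdb.connect(db='mydb', user='me',
                           cursorclass=MySQLdb.cursors.SSCursor)
    cur = conn.cursor()
    cur.execute("SELECT id, name FROM myapp_property")
    for row in cur:
        pass  # one row at a time, bounded memory
    cur.close()
    conn.close()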

comment:3 by walk_n_wind, 14 years ago

Resolution: worksforme
Status: closed → reopened

I'm having this problem with QuerySet.iterator() as well - we have a table with about 2 million records in it, and looping through it the following way causes the Python process to grow from about 50 MB to upwards of 2 GB of RAM:

    for p in Property.objects.all().iterator():
        print "hello"

I'm definitely confused, because there's very little out there saying iterator() didn't help with memory consumption, and a lot more saying it did. I'm running Django 1.1 with Python 2.6.4.

Does Django make use of the psycopg2 mentioned above?
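
On the psycopg2 question: psycopg2's default client-side cursor also fetches the whole result set on execute(). Its server-side ("named") cursors stream rows in batches instead. A raw psycopg2 sketch outside the ORM (connection string and table name are hypothetical):

    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=me")
    cur = conn.cursor(name='big_scan')  # named => server-side cursor
    cur.itersize = 2000                 # rows fetched per round trip
    cur.execute("SELECT id, title FROM myapp_article")
    for row in cur:
        pass  # streamed in batches, bounded memory
    cur.close()
    conn.close()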

comment:4 by Russell Keith-Magee, 14 years ago

Resolution: worksforme
Status: reopened → closed

The most likely cause of problems here is running with DEBUG=True; this causes the debug cursor to soak up memory, since every executed query is logged.

If you can validate that this problem still exists with DEBUG=False (and without the MySQL query cache problems), please reopen.
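
One quick way to check this: with DEBUG=True, Django appends every executed query to connection.queries, which grows without bound during a long loop. A sketch to verify and clear it from inside the looping process:

    from django.conf import settings
    from django.db import connection, reset_queries

    print settings.DEBUG, len(connection.queries)  # grows per query when DEBUG=True

    reset_queries()  # drop the accumulated debug log in a long-running job
    print len(connection.queries)  # back to 0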

comment:5 by zombie_ninja, 14 years ago

I have experienced a similar problem, and I have DEBUG=False set in settings. I have over 7 million rows in my PostgreSQL database.

My current setup is Debian lenny running Django 1.2 (from lenny-backports) and PostgreSQL 8.3.7.

Currently I have been using a pagination approach to iterate through all the records (see the sketch below), since iterator() uses up close to all of my system memory.
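
For reference, a minimal sketch of one such pagination pattern (batch size and field names are illustrative): it pages on the primary key rather than OFFSET, so each batch is a cheap indexed query and only one batch is resident in memory at a time.

    last_id = 0
    batch_size = 1000  # illustrative
    while True:
        batch = list(Article.objects.filter(pk__gt=last_id)
                                    .order_by('pk')[:batch_size])
        if not batch:
            break
        for article in batch:
            print article.id, article.title
        last_id = batch[-1].pk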

However, it was interesting to see that I could iterate through the dataset by doing the following (using only a fraction of my system memory):

    query_set = Article.objects.all()
    cache = query_set._fill_cache  # bound method; cache.im_self is the queryset itself
    i = 0
    while True:
        try:
            r = cache.im_self[i]  # indexes the queryset one row at a time
            print r.id, r.title, r.author
            i += 1
        except Exception, e:
            break

I would appreciate it if someone could give some insight into this.

Regards.
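
One possible reading of why the snippet above stays small: _fill_cache is a bound method, so cache.im_self is simply query_set itself, and indexing an unevaluated QuerySet clones it and issues a one-row LIMIT/OFFSET query each time, never filling the result cache. An equivalent, more direct sketch (one query per row, trading speed for flat memory):

    i = 0
    while True:
        try:
            r = Article.objects.all()[i]  # fresh clone, LIMIT 1 OFFSET i
        except IndexError:                # raised once the offset is past the end
            break
        print r.id, r.title, r.author
        i += 1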

