#18702 closed Cleanup/optimization (fixed)
Remove chunked reads from iter(qs)
Reported by: | Anssi Kääriäinen | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.4 |
Severity: | Normal | Keywords: | |
Cc: | charette.s@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The queryset iterator protocol does convert rows lazily to objects when iterated. This has two advantages:
- If one iterates just part of the queryset, there is no need to do model conversion for all objects.
- Again, if iterating just part of the qs, some backends allow you to fetch just part of the rows from the DB (oracle, for example).
However, there are some costs, too:
- Complexity in the
__iter__
-> _result_iter -> (_results_cache, _iter) -> _iterator implementation. - The lazy fetching costs around 5-10% performance in the case of "for obj in qs.all()" (1000 objs, 2 fields). For values_list('id') the cost is around 30%.
- The current implementation silently discards some exceptions when doing list(qs). This can be annoying especially when debugging django-core code.
My take is we are optimizing the wrong case currently. That is, the case where one wants to consume a queryset only partially, but can't use the .iterator() method. The case would be something like:
for obj in qs: if somecond: break # Now, another loop for the same queryset! for obj in qs: if someothercond: break
If there is no another loop, it is possible to use .iterator(). If one of the above loops consumes major portion of the qs, then there is no benefit in doing partial object conversion.
The question is if there are common patterns where the current implementation is worth the code complexity & performance loss for the common cases.
I will leave this as DDN, as this change is obviously something that needs to be considered carefully.
There is a branch implementing the removal of chunked reads at: https://github.com/akaariai/django/compare/non_chunked_reads
Change History (15)
comment:1 by , 12 years ago
Cc: | added |
---|
comment:2 by , 12 years ago
A rebased + squashed patch available from: https://github.com/akaariai/django/commit/ddc85150d0bb82231e3820807f7652d2a9578ab2
There were some changes to the queryset iteration lately (the introduction of _safe_iterator()). So, currently there are six nested iterators when iterating over a qs. This is getting a bit complex.
Performance difference between the patched version and master for fetching ten objects from the DB using "for obj in qs" is 5%-7%. For values_list('pk') it is around 15%. So, there is a performance difference but it isn't too big.
The silent discard of some exceptions should be fixed with the _safe_iterator() patch.
Still, the code clarity difference is quite big. When considering this patch the question is mainly if there are some use cases where the "convert only some results to Python" is a big performance win. I don't believe such cases are common enough to care about, but then again I know mostly my own usage patterns... To me it seems that in most cases where the change would matter the user is doing something that will cause bad performance anyways.
comment:3 by , 12 years ago
Triage Stage: | Design decision needed → Accepted |
---|
Wow, the code clarity difference is indeed big; and that performance difference isn't insignificant either. I'm having trouble coming up with realistic cases where saving memory on partial iteration matters and you can't just use .iterator()
, which is our documented solution for when you can't afford to materialize the whole queryset. So +1 from me on this patch.
comment:4 by , 12 years ago
I am going to commit the patch this weekend if nobody objects.
A very short summary: the patch will make "half-iterating" a queryset cost more (as all the rows are converted upfront), but on the other hand full iteration will be faster. Cases hit are "obj in largeqs" without needing the rest of the qs, or "for obj in qs: if somecond: break".
One of the original reasons for the patch, discard of some exceptions, is fixed by #19895 - but there is a comment in the ticket that says the fix is causing a serious memory leak. This is hard to verify as there aren't any details. Still, the comment is from core developer so it is very likely that the commit does cause memory leak under some conditions.
Rebased patch available from https://github.com/akaariai/django/tree/non_chunked_iter_squashed.
comment:5 by , 12 years ago
I think there are some cases you might have missed:
For example, if someone gets a large result set in a qs, and does if qs
. This will currently create just one instance. There could be cases where this is a reasonable thing to do e.g. if in some cases, but not all, you will go on to iterate over the whole queryset, this pattern will mean you only do one DB query.
I am also hoping that at some point we will get better support for doing chunked reads at the DB level in Postgres. However, I guess that when you need that, you also need QuerySet.iterator()
, so this change wouldn't affect that case much.
comment:6 by , 12 years ago
Yes, __bool__
seems to be another case hit. Minor correction is that you do still convert 100 objects currently, not just one.
It is true that this patch is dealing with a tradeoff. On one hand we have full resultset iteration performance and code clarity, on the other doing bool(qs), contains or iteration with break. If the resultset is small it doesn't matter what will be done (less than 100 objects and the results are converted in full anyways). If the resultset is large, the user who cares about performance should be doing something else anyways (unless using Oracle).
It turns out that it is somewhat easy to support bool() and contains in "one object at time" manner, but still do full conversion when accessing full resultset. This way the only regression case is iteration with break. But then, just use .iterator(). Also, those wanting to use server side cursors will likely want to use .iterator().
The code is somewhat more complex than in the "always fetch all objects" but not much.
1000 objects with pk + one field from db with .all() is 10% faster using the patched code. .iterator() speed isn't changed.
Patch available from https://github.com/akaariai/django/tree/non_chunked_iter_squashed.
Sidenote about chunked reads: Without better support from Postgres it will be impossible to automatically use chunked reads. If you don't use WITH HOLD, the cursor will be unusable outside transactions. Add in autocommit mode, and it means this is a total no-go (not that it was a good idea even before). If you use WITH HOLD, then the cursor must be closed explicitly. So, if you have "if obj in qs" but don't iterate the whole qs it means you just leaked a cursor (which is even worse than having idle-in-tx connections). So, a dedicated API is needed in any case.
comment:7 by , 12 years ago
My naive opinion is that anything that doesn't use .iterator()
should load the entire queryset on the first access. That's the least surprising behavior.
As explained by Anssi, loading results incrementally from the database cannot be done transparently.
That's an external, uninformed comment; feel free to ignore me ;)
follow-up: 9 comment:8 by , 12 years ago
I am not sure if I should commit the simple no-optimizations for __bool__
and __contains__
version, or the optimized and bit more complex version. The added complexity isn't much, but on the other hand I believe users benefiting from the optimizations are rare.
I am leaning towards committing the more complex optimized version, simply because there could be users who would get a performance regression otherwise. I am wondering if this is too cautious. This approach optimizes use cases that are rare, and in the use cases optimized, the user would often benefit a lot more by using different code patterns.
Opinions welcome...
comment:9 by , 12 years ago
comment:10 by , 11 years ago
I think we want to go with the "always convert all objects" approach. Special casing some uses makes the code complex. And, in most cases we will win only in the amount of converted objects, not in the amount of fetched rows.
So, I updated the original non_chunked_reads branch to apply to master. If somebody could review at least the release notes changes that would be good...
comment:12 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I have to say it's quite easier to read the implementation with this cleanup.