sql/query.py add_q refactoring
|Reported by:||Anssi Kääriäinen||Owned by:||nobody|
|Component:||Database layer (models, ORM)||Version:||1.3|
|Cc:||suor.web@…, anssi.kaariainen@…||Triage Stage:||Accepted|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||yes|
I have worked on refactoring sql/query.py add_q. There are three reasons for doing this:
- It was hard to see what was happening in add_q previously. I hope I have improved this situation.
- The responsibilities in the code were not well divided. add_filter for example did some work that add_q normally does, but only in case when the filter was to the having clause.
- The add_q resulted in inefficient where node structure.
It turned out that the django/utils/tree.py tree structure was hard to use. So that is also refactored. Finally where.as_sql got some refactoring.
The refactoring already resulted in some other cleanups around the code, for example sql/query.py combine() is now cleaner in its hadling of the where clause.
The first two changes are best explained by reading the patch. However I can give some examples of why no. 3 is important. First consider query M.objects.exclude(pk=1). Previously this would have generated the following where tree:
AND: AND: NOT AND: AND: pk=1
AND: NOT AND: pk=1
The query M.objects.filter(pk=1).filter(pk=2).filter(pk=3) would generate:
AND: AND: pk=1 AND: pk=2 AND: pk=3
AND: pk=1 pk=2 pk=3
Finally, the query User.objects.filter(~Q(pk=1)&~Q(Q(pk=3)|Q(pk=2))) did create:
AND: AND: AND: NOT AND: AND: pk=1 AND: NOT AND: OR: AND: pk=3 OR: pk=2
AND: NOT AND: pk=1 NOT AND: OR: pk=3 pk=2
Some performance numbers, two tests:
#Test 1 for i in range(0, 1000): qs = M.objects.filter(pk=1) # Test2 qs = M.objects.all() for i in range(0, 200): qs = qs.filter(pk=1)
So, about 35% faster and about 25x faster. The patch includes deepcopy removal for qs.clone(), and large part of the performance gain is from there.
All tests passed, though I had to fix one test.
There would be still further work, but I wanted to keep this at least somewhat restricted. Even more restricted is hard to do, as the tree and where node restructuring would mean refactoring the add_q anyways.
This ticket is related to #16759, but as this goes deeper in scope, I have created another ticket for this.