#33319 closed Bug (fixed)
Query.change_aliases raises an AssertionError
| Reported by: | Ömer Faruk Abacı | Owned by: | Ömer Faruk Abacı |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.2 |
| Severity: | Normal | Keywords: | |
| Cc: | Simon Charette | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
- Python Version:
3.9.2 - Django Version:
2.2.24,3.2.9(reproduced using two different versions)
Code to Reproduce
# models.py from django.db import models class Foo(models.Model): qux = models.ForeignKey("app.Qux", on_delete=models.CASCADE, related_name="foos") class Bar(models.Model): foo = models.ForeignKey("app.Foo", on_delete=models.CASCADE, related_name="bars") another_foo = models.ForeignKey("app.Foo", on_delete=models.CASCADE, related_name="other_bars") baz = models.ForeignKey("app.Baz", on_delete=models.CASCADE, related_name="bars") class Baz(models.Model): pass class Qux(models.Model): bazes = models.ManyToManyField("app.Baz", related_name="quxes")
# Failing tests from django.db.models import Q from bug.app.models import Foo, Qux qux = Qux.objects.create() qs1 = qux.foos.all() qs2 = Foo.objects.filter( Q(bars__baz__in=qux.bazes.all()) | Q(other_bars__baz__in=qux.bazes.all()) ) # Works fine. qs2 | qs1 # AssertionError # "/django/db/models/sql/query.py", line 854, in Query.change_aliases # change_map = {'T4': 'T5', 'T5': 'T6'} qs1 | qs2
Description
I have encountered this bug during working on a project, recreated the code to reproduce as simple as I can. I have also examined the reason behind this bug, as far as I understand the reason is that during an __or__ operation of two QuerySets, in Query.combine method of the variable combined, if rhs's Query currently have sequential aliases (e.g. T4 and T5) and related table_names also exist in lhs.table_map, calling Query.table_alias in Query.join will result in creation of aliases T5 for T4 and T6 for T5, thus change_map's keys intersect with change_map's values, so the AssertionError above is raised.
Expectation
- Could you please fix this bug? Maybe
alias_mapofrhscan be provided toQuery.joinandQuery.table_alias, and suffix (number) of the new alias might be incremented until it is not inrhs.alias_map, to prevent intersection betweenchange_map'skeysandvalues. - Assertion in the first line of
QuerySet.change_aliasesis not documented via a comment. As far as I understand, it is there because if keys and values intersects it means that an alias might be changed twice (e.g. firstT4->T5, and thenT5->T6) according to their order in thechange_map. IMHO there can be a comment about what it assures, or an explanation can be added to theAssertionError(like the assertions in theQuery.combinemethod). - It seems like
QuerySet'sORoperation is not commutative (they can create different queries, even though the results are the same), IMHO this can be explicitly declared on the documentation.
Change History (17)
comment:1 by , 4 years ago
| Component: | Uncategorized → Database layer (models, ORM) |
|---|---|
| Type: | Uncategorized → Bug |
comment:2 by , 4 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
follow-up: 5 comment:3 by , 4 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
follow-up: 7 comment:4 by , 4 years ago
| Cc: | added |
|---|
Vishal Pandey, if you're looking for pointers on how to resolve the issue I think I have a pretty good understanding of the problem.
The root cause is that both queries happen to share the same alias_prefix; the letter used generate aliases when the same table is referenced more than once in a query.
We already have a method that does all of the heavy lifting to prevent these collisions that the ORM uses when subqueries are involved which is Query.bump_prefix but it's not entirely applicable here. I think the best way forward here is to change the alias_prefix of the rhs and change its alias accordingly so it doesn't conflict before proceeding with the creation of the change_map.
comment:5 by , 4 years ago
Replying to Mariusz Felisiak:
Thanks for the report. Reproduced at b8c0b22f2f0f8ce664642332d6d872f300c662b4.
Thanks for your quick reply, but the commit you have mentioned doesn't seem related. By the way, can I open a PR for adding myself to AUTHORS?
Replying to Simon Charette:
Vishal Pandey, if you're looking for pointers on how to resolve the issue I think I have a pretty good understanding of the problem.
The root cause is that both queries happen to share the same
alias_prefix; the letter used generate aliases when the same table is referenced more than once in a query.
We already have a method that does all of the heavy lifting to prevent these collisions that the ORM uses when subqueries are involved which is
Query.bump_prefixbut it's not entirely applicable here. I think the best way forward here is to change thealias_prefixof therhsand change its alias accordingly so it doesn't conflict before proceeding with the creation of thechange_map.
I totally agree with you. My initial attempt was to use Query.bump_prefix but as you told it threw an error.
comment:6 by , 4 years ago
Thanks for your quick reply, but the commit you have mentioned doesn't seem related.
Yes, it's not related. I added this comment to show that it's still reproducible on the main branch.
By the way, can I open a PR for adding myself to AUTHORS?
You can add an entry in AUTHORS with a patch for this issue. A separate PR is not necessary.
comment:7 by , 4 years ago
Replying to Simon Charette:
Thank you for your helping hand.
I tried two approaches to solve this issue by tweaking table_alias method of Query class, and it works with the above-attested sample code, but unfortunately, both of them fail with few testcases.
By both the approaches, I am trying to resolve the conflicts between the change_map's keys and its values
1st approach:
if alias_list:
from random import choice
alias = '%s%d' % (choice(ascii_uppercase), len(self.alias_map) + 1)
alias_list.append(alias)
Here, I am randomly choosing an uppercase letter as the first character of alias, this returns different aliases, but a test_case is failing with this approach, with the following traceback.
FAIL: test_multiple_search_fields (admin_changelist.tests.ChangeListTests) [<object object at 0x7fc0aa886cf0>] (search_string='Tiny Desk Concert')
All rows containing each of the searched words are returned, where each
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
yield
File "/usr/lib/python3.8/unittest/case.py", line 582, in subTest
yield
File "/home/vishal/Desktop/open_source/django/tests/admin_changelist/tests.py", line 550, in test_multiple_search_fields
group_changelist = group_model_admin.get_changelist_instance(request)
File "/home/vishal/Desktop/open_source/django/django/contrib/admin/options.py", line 742, in get_changelist_instance
return ChangeList(
File "/home/vishal/Desktop/open_source/django/django/contrib/admin/views/main.py", line 100, in __init__
self.queryset = self.get_queryset(request)
File "/home/vishal/Desktop/open_source/django/django/contrib/admin/views/main.py", line 498, in get_queryset
qs = self.root_queryset.filter(Exists(qs))
File "/home/vishal/Desktop/open_source/django/django/db/models/query.py", line 977, in filter
return self._filter_or_exclude(False, args, kwargs)
File "/home/vishal/Desktop/open_source/django/django/db/models/query.py", line 995, in _filter_or_exclude
clone._filter_or_exclude_inplace(negate, args, kwargs)
File "/home/vishal/Desktop/open_source/django/django/db/models/query.py", line 1002, in _filter_or_exclude_inplace
self._query.add_q(Q(*args, **kwargs))
File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1374, in add_q
clause, _ = self._add_q(q_object, self.used_aliases)
File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1395, in _add_q
child_clause, needed_inner = self.build_filter(
File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1263, in build_filter
condition = filter_expr.resolve_expression(self, allow_joins=allow_joins)
File "/home/vishal/Desktop/open_source/django/django/db/models/expressions.py", line 248, in resolve_expression
c.set_source_expressions([
File "/home/vishal/Desktop/open_source/django/django/db/models/expressions.py", line 249, in <listcomp>
expr.resolve_expression(query, allow_joins, reuse, summarize)
File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1035, in resolve_expression
clone.bump_prefix(query)
File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 925, in bump_prefix
self.change_aliases({
File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 848, in change_aliases
assert set(change_map).isdisjoint(change_map.values())
AssertionError
In 2nd approach, I have rotated the alias_prefix in a circular manner from T(class variable alias_prefix has value T) to Z, then from A to Z, and so on, and it also works on the above-attested code but fails five other test cases.
if alias_list:
alias = '%s%d' % (self.alias_prefix, len(self.alias_map) + 1)
self.alias_prefix = chr(ord(self.alias_prefix)+1) if chr(ord(self.alias_prefix)+1) <= 'Z' else 'A'
alias_list.append(alias)
I feel something needs to be tweaked here only to solve the problem, but because of my limited knowledge of the underlying codebase, I am not being able to solve it.
I seek help from my fellow senior developers to solve this issue.
follow-up: 9 comment:8 by , 4 years ago
Ömer,
My initial attempt was to use Query.bump_prefix but as you told it threw an error.
The error you encountered is due to bump_prefix requiring some adjustments. The alias merging algorithm that Query.combine uses requires that both query share the same initial alias in order to work otherwise you'll get a KeyError. You'll want to find a way to have bump_prefix adjust all the aliases but the initial one so it can still be used as the merge starting point.
---
Vishal,
Here, I am randomly choosing an uppercase letter as the first character of alias, this returns different aliases, but a test_case is failing with this approach, with the following traceback.
Choosing a random initial alias will bring more bad than good as not relying on alias_prefix will cause undeterministic failures across the board. There are reasons why the prefix is used and bumped only on conflic; it makes reasoning about possible collisions easier.
In 2nd approach, I have rotated the alias_prefix in a circular manner from T(class variable alias_prefix has value T) to Z, then from A to Z, and so on, and it also works on the above-attested code but fails five other test cases.
This approach doesn't account for possible collisions with subquery aliases and will eventually collide once you've wrapped up around the 26 letter of the alphabet instead of using the Cartesian product approach.
---
All of the required logic for preventing alias_prefix collisions already lives in bump_prefix, it's only a matter of adjusting it in order to allow it to be used in the alias merging algorithm of combine.
comment:9 by , 4 years ago
Replying to Simon Charette:
Hello Simon,
I guess I have managed to fix this issue providing a parameter to bump_prefix to prevent changing aliases directly but only bumping the prefix accordingly. It seems to be working fine, passing all the tests including the regression tests I have added for this issue. I would appreciate your reviews for the PR.
Thanks for your effort too Vishal!
comment:10 by , 4 years ago
| Has patch: | set |
|---|
comment:11 by , 4 years ago
| Patch needs improvement: | set |
|---|
comment:12 by , 4 years ago
| Owner: | changed from to |
|---|
I am assigning this ticket to myself since there is an ongoing patch here.
comment:13 by , 4 years ago
| Patch needs improvement: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
Thanks for the report. Reproduced at b8c0b22f2f0f8ce664642332d6d872f300c662b4.