#33319 closed Bug (fixed)
Query.change_aliases raises an AssertionError
Reported by: | Ömer Faruk Abacı | Owned by: | Ömer Faruk Abacı |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
- Python Version:
3.9.2
- Django Version:
2.2.24
,3.2.9
(reproduced using two different versions)
Code to Reproduce
# models.py from django.db import models class Foo(models.Model): qux = models.ForeignKey("app.Qux", on_delete=models.CASCADE, related_name="foos") class Bar(models.Model): foo = models.ForeignKey("app.Foo", on_delete=models.CASCADE, related_name="bars") another_foo = models.ForeignKey("app.Foo", on_delete=models.CASCADE, related_name="other_bars") baz = models.ForeignKey("app.Baz", on_delete=models.CASCADE, related_name="bars") class Baz(models.Model): pass class Qux(models.Model): bazes = models.ManyToManyField("app.Baz", related_name="quxes")
# Failing tests from django.db.models import Q from bug.app.models import Foo, Qux qux = Qux.objects.create() qs1 = qux.foos.all() qs2 = Foo.objects.filter( Q(bars__baz__in=qux.bazes.all()) | Q(other_bars__baz__in=qux.bazes.all()) ) # Works fine. qs2 | qs1 # AssertionError # "/django/db/models/sql/query.py", line 854, in Query.change_aliases # change_map = {'T4': 'T5', 'T5': 'T6'} qs1 | qs2
Description
I have encountered this bug during working on a project, recreated the code to reproduce as simple as I can. I have also examined the reason behind this bug, as far as I understand the reason is that during an __or__
operation of two QuerySet
s, in Query.combine
method of the variable combined
, if rhs
's Query
currently have sequential aliases (e.g. T4
and T5
) and related table_name
s also exist in lhs.table_map
, calling Query.table_alias
in Query.join
will result in creation of aliases T5
for T4
and T6
for T5
, thus change_map
's keys
intersect with change_map
's values
, so the AssertionError
above is raised.
Expectation
- Could you please fix this bug? Maybe
alias_map
ofrhs
can be provided toQuery.join
andQuery.table_alias
, and suffix (number) of the new alias might be incremented until it is not inrhs.alias_map
, to prevent intersection betweenchange_map
'skeys
andvalues
. - Assertion in the first line of
QuerySet.change_aliases
is not documented via a comment. As far as I understand, it is there because if keys and values intersects it means that an alias might be changed twice (e.g. firstT4
->T5
, and thenT5
->T6
) according to their order in thechange_map
. IMHO there can be a comment about what it assures, or an explanation can be added to theAssertionError
(like the assertions in theQuery.combine
method). - It seems like
QuerySet
'sOR
operation is not commutative (they can create different queries, even though the results are the same), IMHO this can be explicitly declared on the documentation.
Change History (17)
comment:1 by , 3 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Type: | Uncategorized → Bug |
comment:2 by , 3 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
follow-up: 5 comment:3 by , 3 years ago
Triage Stage: | Unreviewed → Accepted |
---|
follow-up: 7 comment:4 by , 3 years ago
Cc: | added |
---|
Vishal Pandey, if you're looking for pointers on how to resolve the issue I think I have a pretty good understanding of the problem.
The root cause is that both queries happen to share the same alias_prefix
; the letter used generate aliases when the same table is referenced more than once in a query.
We already have a method that does all of the heavy lifting to prevent these collisions that the ORM uses when subqueries are involved which is Query.bump_prefix
but it's not entirely applicable here. I think the best way forward here is to change the alias_prefix
of the rhs
and change its alias accordingly so it doesn't conflict before proceeding with the creation of the change_map
.
comment:5 by , 3 years ago
Replying to Mariusz Felisiak:
Thanks for the report. Reproduced at b8c0b22f2f0f8ce664642332d6d872f300c662b4.
Thanks for your quick reply, but the commit you have mentioned doesn't seem related. By the way, can I open a PR for adding myself to AUTHORS?
Replying to Simon Charette:
Vishal Pandey, if you're looking for pointers on how to resolve the issue I think I have a pretty good understanding of the problem.
The root cause is that both queries happen to share the same
alias_prefix
; the letter used generate aliases when the same table is referenced more than once in a query.
We already have a method that does all of the heavy lifting to prevent these collisions that the ORM uses when subqueries are involved which is
Query.bump_prefix
but it's not entirely applicable here. I think the best way forward here is to change thealias_prefix
of therhs
and change its alias accordingly so it doesn't conflict before proceeding with the creation of thechange_map
.
I totally agree with you. My initial attempt was to use Query.bump_prefix
but as you told it threw an error.
comment:6 by , 3 years ago
Thanks for your quick reply, but the commit you have mentioned doesn't seem related.
Yes, it's not related. I added this comment to show that it's still reproducible on the main
branch.
By the way, can I open a PR for adding myself to AUTHORS?
You can add an entry in AUTHORS
with a patch for this issue. A separate PR is not necessary.
comment:7 by , 3 years ago
Replying to Simon Charette:
Thank you for your helping hand.
I tried two approaches to solve this issue by tweaking table_alias
method of Query class, and it works with the above-attested sample code, but unfortunately, both of them fail with few testcases.
By both the approaches, I am trying to resolve the conflicts between the change_map's keys and its values
1st approach:
if alias_list: from random import choice alias = '%s%d' % (choice(ascii_uppercase), len(self.alias_map) + 1) alias_list.append(alias)
Here, I am randomly choosing an uppercase letter as the first character of alias, this returns different aliases, but a test_case is failing with this approach, with the following traceback.
FAIL: test_multiple_search_fields (admin_changelist.tests.ChangeListTests) [<object object at 0x7fc0aa886cf0>] (search_string='Tiny Desk Concert') All rows containing each of the searched words are returned, where each ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.8/unittest/case.py", line 60, in testPartExecutor yield File "/usr/lib/python3.8/unittest/case.py", line 582, in subTest yield File "/home/vishal/Desktop/open_source/django/tests/admin_changelist/tests.py", line 550, in test_multiple_search_fields group_changelist = group_model_admin.get_changelist_instance(request) File "/home/vishal/Desktop/open_source/django/django/contrib/admin/options.py", line 742, in get_changelist_instance return ChangeList( File "/home/vishal/Desktop/open_source/django/django/contrib/admin/views/main.py", line 100, in __init__ self.queryset = self.get_queryset(request) File "/home/vishal/Desktop/open_source/django/django/contrib/admin/views/main.py", line 498, in get_queryset qs = self.root_queryset.filter(Exists(qs)) File "/home/vishal/Desktop/open_source/django/django/db/models/query.py", line 977, in filter return self._filter_or_exclude(False, args, kwargs) File "/home/vishal/Desktop/open_source/django/django/db/models/query.py", line 995, in _filter_or_exclude clone._filter_or_exclude_inplace(negate, args, kwargs) File "/home/vishal/Desktop/open_source/django/django/db/models/query.py", line 1002, in _filter_or_exclude_inplace self._query.add_q(Q(*args, **kwargs)) File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1374, in add_q clause, _ = self._add_q(q_object, self.used_aliases) File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1395, in _add_q child_clause, needed_inner = self.build_filter( File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1263, in build_filter condition = filter_expr.resolve_expression(self, allow_joins=allow_joins) File "/home/vishal/Desktop/open_source/django/django/db/models/expressions.py", line 248, in resolve_expression c.set_source_expressions([ File "/home/vishal/Desktop/open_source/django/django/db/models/expressions.py", line 249, in <listcomp> expr.resolve_expression(query, allow_joins, reuse, summarize) File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 1035, in resolve_expression clone.bump_prefix(query) File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 925, in bump_prefix self.change_aliases({ File "/home/vishal/Desktop/open_source/django/django/db/models/sql/query.py", line 848, in change_aliases assert set(change_map).isdisjoint(change_map.values()) AssertionError
In 2nd approach, I have rotated the alias_prefix
in a circular manner from T(class variable alias_prefix
has value T) to Z, then from A to Z, and so on, and it also works on the above-attested code but fails five other test cases.
if alias_list: alias = '%s%d' % (self.alias_prefix, len(self.alias_map) + 1) self.alias_prefix = chr(ord(self.alias_prefix)+1) if chr(ord(self.alias_prefix)+1) <= 'Z' else 'A' alias_list.append(alias)
I feel something needs to be tweaked here only to solve the problem, but because of my limited knowledge of the underlying codebase, I am not being able to solve it.
I seek help from my fellow senior developers to solve this issue.
follow-up: 9 comment:8 by , 3 years ago
Ömer,
My initial attempt was to use Query.bump_prefix but as you told it threw an error.
The error you encountered is due to bump_prefix
requiring some adjustments. The alias merging algorithm that Query.combine
uses requires that both query share the same initial alias in order to work otherwise you'll get a KeyError
. You'll want to find a way to have bump_prefix
adjust all the aliases but the initial one so it can still be used as the merge starting point.
---
Vishal,
Here, I am randomly choosing an uppercase letter as the first character of alias, this returns different aliases, but a test_case is failing with this approach, with the following traceback.
Choosing a random initial alias will bring more bad than good as not relying on alias_prefix
will cause undeterministic failures across the board. There are reasons why the prefix is used and bumped only on conflic; it makes reasoning about possible collisions easier.
In 2nd approach, I have rotated the alias_prefix in a circular manner from T(class variable alias_prefix has value T) to Z, then from A to Z, and so on, and it also works on the above-attested code but fails five other test cases.
This approach doesn't account for possible collisions with subquery aliases and will eventually collide once you've wrapped up around the 26 letter of the alphabet instead of using the Cartesian product approach.
---
All of the required logic for preventing alias_prefix
collisions already lives in bump_prefix
, it's only a matter of adjusting it in order to allow it to be used in the alias merging algorithm of combine
.
comment:9 by , 3 years ago
Replying to Simon Charette:
Hello Simon,
I guess I have managed to fix this issue providing a parameter to bump_prefix
to prevent changing aliases directly but only bumping the prefix accordingly. It seems to be working fine, passing all the tests including the regression tests I have added for this issue. I would appreciate your reviews for the PR.
Thanks for your effort too Vishal!
comment:10 by , 3 years ago
Has patch: | set |
---|
comment:11 by , 3 years ago
Patch needs improvement: | set |
---|
comment:12 by , 3 years ago
Owner: | changed from | to
---|
I am assigning this ticket to myself since there is an ongoing patch here.
comment:13 by , 3 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Thanks for the report. Reproduced at b8c0b22f2f0f8ce664642332d6d872f300c662b4.