Opened 16 months ago

Closed 16 months ago

Last modified 16 months ago

#34894 closed Cleanup/optimization (needsinfo)

Query.change_aliases() has several significant bugs

Reported by: Aqua Penguin Owned by: nobody
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords: Query change_aliases
Cc: Aqua Penguin, Simon Charette Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When using the Query.change_aliases(change_map) method (on an extended class), some changes are not done and errors occur:

  1. the insert order of Query.alias_map is modified, and it is used by the compiler to compile the FROM parts in the correct order, this can results in an inconsistant order of the JOIN clauses;
  1. even if an entry of Query.alias_map is not in the change_map, its alias_data may still have to be relabeled if alias_data.parent_alias is in the change_map, but it is not done and the generated nested joins that are modified by change_map have errors in the generated ON clauses;
  1. the cached property Query.base_table is not modified if it is in the change_map, it results in error when compiling the query, the old alias is searched in Query.alias_refcount, resulting in a KeyError.

*Solutions:*

  1. change the for loop to iterate on the Query.alias_map instead of the change_map, and rebuild it in the same order with updates;
  1. in addition to relabel the entries of Query.alias_map that are in the change_map, relabel for those whose the alias_data.parent_alias are in the Query.alias_map;
  1. change the cached property Query.base_table if it is in the change_map.

Attached files: patch with fixes and tests

Question: The bugs has been tested on 4.2 and main, I have to do a PR on all current versions or only on the main branch ?

Attachments (1)

patch_34894.diff (3.9 KB ) - added by Aqua Penguin 16 months ago.

Download all attachments as: .zip

Change History (5)

by Aqua Penguin, 16 months ago

Attachment: patch_34894.diff added

comment:1 by Aqua Penguin, 16 months ago

Cc: Aqua Penguin added

comment:2 by Aqua Penguin, 16 months ago

Here the attached patch.

  • django/db/models/sql/query.py

    diff --git a/django/db/models/sql/query.py b/django/db/models/sql/query.py
    index a7839ccb4d..95458a9102 100644
    a b class Query(BaseExpression):  
    988988        }
    989989
    990990        # 2. Rename the alias in the internal table/alias datastructures.
    991         for old_alias, new_alias in change_map.items():
    992             if old_alias not in self.alias_map:
     991        # - The order of insertion in alias_map has great importance for
     992        #   future iterations, so we need to rebuild it in the original order.
     993        # - alias_data must be relabeled if it contains a change to do in
     994        #   alias_data.parent_alias, even if its alias does not change.
     995        # - Cached property base_table need to change if necessary.
     996        base_table_cached = self.__dict__.get("base_table", None)
     997        old_alias_map, self.alias_map = self.alias_map, {}
     998        for old_alias, alias_data in old_alias_map.items():
     999            if old_alias not in change_map:
     1000                self.alias_map[old_alias] = (
     1001                    alias_data.relabeled_clone(change_map)
     1002                    if alias_data.parent_alias in change_map
     1003                    else alias_data
     1004                )
    9931005                continue
    994             alias_data = self.alias_map[old_alias].relabeled_clone(change_map)
    995             self.alias_map[new_alias] = alias_data
     1006            new_alias = change_map[old_alias]
     1007            if old_alias == base_table_cached:
     1008                self.__dict__["base_table"] = new_alias
     1009            self.alias_map[new_alias] = alias_data.relabeled_clone(change_map)
    9961010            self.alias_refcount[new_alias] = self.alias_refcount[old_alias]
    9971011            del self.alias_refcount[old_alias]
    998             del self.alias_map[old_alias]
    9991012
    10001013            table_aliases = self.table_map[alias_data.table_name]
    10011014            for pos, alias in enumerate(table_aliases):
  • tests/queries/test_query.py

    diff --git a/tests/queries/test_query.py b/tests/queries/test_query.py
    index 99d0e32427..f5a17c56e5 100644
    a b class TestQuery(SimpleTestCase):  
    8989        self.assertIsNone(lookup.lhs.lhs.alias)
    9090        self.assertEqual(lookup.lhs.lhs.target, Author._meta.get_field("name"))
    9191
     92    def test_change_aliases_alias_map_order(self):
     93        query = Query(Item)
     94        # force to create a join between Item and Author
     95        query.add_q(Q(creator__num__gt=2))
     96        # force to initialize all joins and caches
     97        sql = str(query)
     98        order = [
     99            "item_alias" if alias == Item._meta.db_table else alias
     100            for alias in list(query.alias_map)
     101        ]
     102        query.change_aliases({
     103            Item._meta.db_table: "item_alias",
     104        })
     105        self.assertEqual(list(query.alias_map), order)
     106
     107    def test_change_aliases_nested_joins(self):
     108        query = Query(Item)
     109        # force to create a join between Item and Author
     110        query.add_q(Q(creator__num__gt=2))
     111        # force to initialize all joins and caches
     112        sql = str(query)
     113        self.assertEqual(query.alias_map[Author._meta.db_table].parent_alias, Item._meta.db_table)
     114        query.change_aliases({
     115            Item._meta.db_table: "item_alias",
     116        })
     117        self.assertEqual(query.alias_map[Author._meta.db_table].parent_alias, "item_alias")
     118
     119    def test_change_aliases_base_table(self):
     120        query = Query(Item)
     121        # force to create a join between Item and Author
     122        query.add_q(Q(creator__num__gt=2))
     123        # force to initialize all joins and caches
     124        sql = str(query)
     125        self.assertEqual(query.base_table, Item._meta.db_table)
     126        query.change_aliases({
     127            Item._meta.db_table: "item_alias",
     128        })
     129        self.assertEqual(query.base_table, "item_alias")
     130
    92131    def test_negated_nullable(self):
    93132        query = Query(Item)
    94133        where = query.build_where(~Q(modified__lt=datetime(2017, 1, 1)))

comment:3 by Mariusz Felisiak, 16 months ago

Resolution: needsinfo
Status: newclosed
Type: BugCleanup/optimization

Query.change_aliases() is a private API and, as far as I'm aware, it works fine for internal use. Can you explain what are you trying to implement? Did you find any existing bug that could be fix by these changes? Can you prepare failing tests using querysets rather than testing internal tools?

Question: The bugs has been tested on 4.2 and main, I have to do a PR on all current versions or only on the main branch ?

Mergers will handle backports, if needed.

comment:4 by Simon Charette, 16 months ago

Cc: Simon Charette added

As Mariusz said it'd be great to learn what kind of significant bugs this addresses as this method is extensively used by the ORM and making changes to it, even if they appear right from an API perspective, can have unintended side effects.

Note: See TracTickets for help on using tickets.
Back to Top