Opened 3 years ago

Closed 2 years ago

Last modified 2 years ago

#33050 closed Bug (fixed)

QuerySet.count() on combined queries with select_related() crashes on MySQL.

Reported by: Sunkyue Owned by: Jordan Bae
Component: Database layer (models, ORM) Version: 3.2
Severity: Normal Keywords: orm, count, union, select_related mysql
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Sunkyue)

test case:

class ModelA(models.Model):
    value = models.IntegerField()

class ModelB(models.Model):
   model_a = models.ForeignKey("ModelA", on_delete=models.CASCADE)

->

ModelB.objects.select_related('model_a').union(ModelB.objects.select_related('model_a')).count()

results in error.

 File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\query.py", line 412, in count
    return self.query.get_count(using=self.db)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\sql\query.py", line 521, in get_count
    number = obj.get_aggregation(using, ['__count'])['__count']
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\sql\query.py", line 506, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\sql\compiler.py", line 1175, in execute_sql
    cursor.execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 98, in execute
    return super().execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\mysql\base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\MySQLdb\cursors.py", line 206, in execute
    res = self._query(query)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\MySQLdb\cursors.py", line 319, in _query
    db.query(q)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\MySQLdb\connections.py", line 259, in query
    _mysql.connection.query(self, query)
django.db.utils.OperationalError: (1060, "Duplicate column name 'id'")

reproduced on on 3.2.5/3.2.6

suggested fix:
on django.db.models.sql.query.py -> Query.get_aggregation, from

            inner_query.select_for_update = False
            inner_query.select_related = False
            inner_query.set_annotation_mask(self.annotation_select)

to

            inner_query.select_for_update = False
            inner_query.select_related = False
            for combined_query in inner_query.combined_queries:
                combined_query.select_related = False
            inner_query.set_annotation_mask(self.annotation_select)

can solve the problem I think.

my current monkey-patching code is,

from django.db.models.sql.query import Query

old_get_aggregation = Query.get_aggregation

def get_aggregation(self, using, added_aggregate_names):
    cloned_self = self.clone()
    for combined_query in cloned_self.combined_queries:
        combined_query.select_related = False
    return old_get_aggregation(cloned_self, using, added_aggregate_names)

Query.get_aggregation = get_aggregation

Change History (13)

comment:1 by Sunkyue, 3 years ago

Easy pickings: set

comment:2 by Sunkyue, 3 years ago

Description: modified (diff)

comment:3 by Sunkyue, 3 years ago

Description: modified (diff)

comment:4 by Sunkyue, 3 years ago

Description: modified (diff)

comment:5 by Sunkyue, 3 years ago

Description: modified (diff)

comment:6 by Mariusz Felisiak, 3 years ago

Easy pickings: unset
Keywords: mysql added
Summary: calling count on union of queries having select_related results in error.QuerySet.count() on combined queries with select_related() crashes on MySQL.
Triage Stage: UnreviewedAccepted

Thanks for the report. The proposed patch looks good. Would you like to prepare PR?

in reply to:  6 comment:7 by Sunkyue, 3 years ago

Replying to Mariusz Felisiak:

Thanks for the report. The proposed patch looks good. Would you like to prepare PR?

Thank you for your review.

I'd like to submit a PR, however I don't have enough time for that for a while. (I haven't submitted any PRs to open source project...so for me, it will took some time..)

If anyone who can submit PR before me, it would be welcomed.


In the mean time, I found another buggy case:

ModelB.objects.select_related('model_a').union(ModelB.objects.select_related('model_a')).union(ModelB.objects.select_related('model_a')).count()

produces,

OperationalError: (1222, 'The used SELECT statements have a different number of columns')

because it produces nested configuration like,

query.combined_queries = (query.combined_queries(query, query), query)

So,

            for combined_query in inner_query.combined_queries:
                combined_query.select_related = False

this fix should be improved by doing this in recursive way I think...

Last edited 3 years ago by Sunkyue (previous) (diff)

comment:8 by Jordan Bae, 3 years ago

Owner: changed from nobody to Jordan Bae
Status: newassigned

Could i look into and make a patch for this?

comment:9 by Jordan Bae, 3 years ago

As my investigation, this issue is from subquery's duplicated column. it's not related to aggregate function.
In my opinion, we had better to improve the subquery which includes join. I attached a example for this.

mysql> select * from (
-> select django_migrations.id, dj.id from django_migrations inner join django_migrations as dj on dj.id = django_migrations.id
-> ) subquery;
ERROR 1060 (42S21): Duplicate column name 'id' `

and current SQL is created by Subquery.

a=Subquery(ModelB.objects.select_related('model_a'))
print(a.query)
SELECT blog_modelb.id, blog_modelb.model_a_id, blog_modela.id, blog_modela.value FROM blog_modelb INNER JOIN
blog_modela ON (blog_modelb.model_a_id = blog_modela.id)

If there is any missing or I misunderstood. please know me.
I will try to fix for this.

comment:10 by Jordan Bae, 3 years ago

Originally, that queryset generates below query.

SELECT
	COUNT(*)
FROM ((
		SELECT
			`blog_modelb`.`id`,
			`blog_modelb`.`model_a_id`,
			`blog_modela`.`id`,
			`blog_modela`.`value`
		FROM
			`blog_modelb`
			INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))
UNION (
	SELECT
		`blog_modelb`.`id`,
		`blog_modelb`.`model_a_id`,
		`blog_modela`.`id`,
		`blog_modela`.`value`
	FROM
		`blog_modelb`
		INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))) subquery

@Sunkyue
I think you can use this during shot-term period.

>>> ModelB.objects.select_related('model_a').extra(select={'a': '`blog_modela`.`id`'}).values('id', 'model_a_id', 'a', 'model_a__value').union(ModelB.objects.select_related('model_a').extra(select={'a': '`blog_modela`.`id`'}).values('id', 'model_a_id', 'a', 'model_a__value')).count()
SELECT
	COUNT(*)
FROM ((
		SELECT
			(`blog_modela`.`id`) AS `a`,
			`blog_modelb`.`id`,
			`blog_modelb`.`model_a_id`,
			`blog_modela`.`value`
		FROM
			`blog_modelb`
			INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))
UNION (
	SELECT
		(`blog_modela`.`id`) AS `a`,
		`blog_modelb`.`id`,
		`blog_modelb`.`model_a_id`,
		`blog_modela`.`value`
	FROM
		`blog_modelb`
		INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))) subquery

0

your fix looks making an unexpected SQL.

@ Mariusz Felisiak if you have a good idea for this bug, please share.

Last edited 3 years ago by Jordan Bae (previous) (diff)

comment:11 by Simon Charette, 2 years ago

I haven't tried it out but I wouldn't be surprised if this was fixed by 70499b25c708557fb9ee2264686cd172f4b2354e (#34123) as it ensures that all combined queries have non-colliding aliases.

in reply to:  11 comment:12 by Mariusz Felisiak, 2 years ago

Resolution: fixed
Status: assignedclosed

Replying to Simon Charette:

I haven't tried it out but I wouldn't be surprised if this was fixed by 70499b25c708557fb9ee2264686cd172f4b2354e (#34123) as it ensures that all combined queries have non-colliding aliases.

Good catch! I submitted PR with tests.

Fixed in 70499b25c708557fb9ee2264686cd172f4b2354e.

comment:13 by GitHub <noreply@…>, 2 years ago

In a411b909:

Refs #33050 -- Added test for QuerySet.count() on combined queries with select_related().

Thanks Simon Charette for noticing this.

Fixed in 70499b25c708557fb9ee2264686cd172f4b2354e.

Note: See TracTickets for help on using tickets.
Back to Top