Opened 3 months ago

Last modified 3 months ago

#33050 assigned Bug

QuerySet.count() on combined queries with select_related() crashes on MySQL.

Reported by: Sunkyue Owned by: Jordan Bae
Component: Database layer (models, ORM) Version: 3.2
Severity: Normal Keywords: orm, count, union, select_related mysql
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Sunkyue)

test case:

class ModelA(models.Model):
    value = models.IntegerField()

class ModelB(models.Model):
   model_a = models.ForeignKey("ModelA", on_delete=models.CASCADE)

->

ModelB.objects.select_related('model_a').union(ModelB.objects.select_related('model_a')).count()

results in error.

 File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\query.py", line 412, in count
    return self.query.get_count(using=self.db)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\sql\query.py", line 521, in get_count
    number = obj.get_aggregation(using, ['__count'])['__count']
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\sql\query.py", line 506, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\models\sql\compiler.py", line 1175, in execute_sql
    cursor.execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 98, in execute
    return super().execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\django\db\backends\mysql\base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\MySQLdb\cursors.py", line 206, in execute
    res = self._query(query)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\MySQLdb\cursors.py", line 319, in _query
    db.query(q)
  File "c:\Users\user\Documents\workspace\realclass2-api\.venv\lib\site-packages\MySQLdb\connections.py", line 259, in query
    _mysql.connection.query(self, query)
django.db.utils.OperationalError: (1060, "Duplicate column name 'id'")

reproduced on on 3.2.5/3.2.6

suggested fix:
on django.db.models.sql.query.py -> Query.get_aggregation, from

            inner_query.select_for_update = False
            inner_query.select_related = False
            inner_query.set_annotation_mask(self.annotation_select)

to

            inner_query.select_for_update = False
            inner_query.select_related = False
            for combined_query in inner_query.combined_queries:
                combined_query.select_related = False
            inner_query.set_annotation_mask(self.annotation_select)

can solve the problem I think.

my current monkey-patching code is,

from django.db.models.sql.query import Query

old_get_aggregation = Query.get_aggregation

def get_aggregation(self, using, added_aggregate_names):
    cloned_self = self.clone()
    for combined_query in cloned_self.combined_queries:
        combined_query.select_related = False
    return old_get_aggregation(cloned_self, using, added_aggregate_names)

Query.get_aggregation = get_aggregation

Change History (10)

comment:1 Changed 3 months ago by Sunkyue

Easy pickings: set

comment:2 Changed 3 months ago by Sunkyue

Description: modified (diff)

comment:3 Changed 3 months ago by Sunkyue

Description: modified (diff)

comment:4 Changed 3 months ago by Sunkyue

Description: modified (diff)

comment:5 Changed 3 months ago by Sunkyue

Description: modified (diff)

comment:6 Changed 3 months ago by Mariusz Felisiak

Easy pickings: unset
Keywords: mysql added
Summary: calling count on union of queries having select_related results in error.QuerySet.count() on combined queries with select_related() crashes on MySQL.
Triage Stage: UnreviewedAccepted

Thanks for the report. The proposed patch looks good. Would you like to prepare PR?

comment:7 in reply to:  6 Changed 3 months ago by Sunkyue

Replying to Mariusz Felisiak:

Thanks for the report. The proposed patch looks good. Would you like to prepare PR?

Thank you for your review.

I'd like to submit a PR, however I don't have enough time for that for a while. (I haven't submitted any PRs to open source project...so for me, it will took some time..)

If anyone who can submit PR before me, it would be welcomed.


In the mean time, I found another buggy case:

ModelB.objects.select_related('model_a').union(ModelB.objects.select_related('model_a')).union(ModelB.objects.select_related('model_a')).count()

produces,

OperationalError: (1222, 'The used SELECT statements have a different number of columns')

because it produces nested configuration like,

query.combined_queries = (query.combined_queries(query, query), query)

So,

            for combined_query in inner_query.combined_queries:
                combined_query.select_related = False

this fix should be improved by doing this in recursive way I think...

Last edited 3 months ago by Sunkyue (previous) (diff)

comment:8 Changed 3 months ago by Jordan Bae

Owner: changed from nobody to Jordan Bae
Status: newassigned

Could i look into and make a patch for this?

comment:9 Changed 3 months ago by Jordan Bae

As my investigation, this issue is from subquery's duplicated column. it's not related to aggregate function.
In my opinion, we had better to improve the subquery which includes join. I attached a example for this.

mysql> select * from (
-> select django_migrations.id, dj.id from django_migrations inner join django_migrations as dj on dj.id = django_migrations.id
-> ) subquery;
ERROR 1060 (42S21): Duplicate column name 'id' `

and current SQL is created by Subquery.

a=Subquery(ModelB.objects.select_related('model_a'))
print(a.query)
SELECT blog_modelb.id, blog_modelb.model_a_id, blog_modela.id, blog_modela.value FROM blog_modelb INNER JOIN
blog_modela ON (blog_modelb.model_a_id = blog_modela.id)

If there is any missing or I misunderstood. please know me.
I will try to fix for this.

comment:10 Changed 3 months ago by Jordan Bae

Originally, that queryset generates below query.

SELECT
	COUNT(*)
FROM ((
		SELECT
			`blog_modelb`.`id`,
			`blog_modelb`.`model_a_id`,
			`blog_modela`.`id`,
			`blog_modela`.`value`
		FROM
			`blog_modelb`
			INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))
UNION (
	SELECT
		`blog_modelb`.`id`,
		`blog_modelb`.`model_a_id`,
		`blog_modela`.`id`,
		`blog_modela`.`value`
	FROM
		`blog_modelb`
		INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))) subquery

@Sunkyue
I think you can use this during shot-term period.

>>> ModelB.objects.select_related('model_a').extra(select={'a': '`blog_modela`.`id`'}).values('id', 'model_a_id', 'a', 'model_a__value').union(ModelB.objects.select_related('model_a').extra(select={'a': '`blog_modela`.`id`'}).values('id', 'model_a_id', 'a', 'model_a__value')).count()
SELECT
	COUNT(*)
FROM ((
		SELECT
			(`blog_modela`.`id`) AS `a`,
			`blog_modelb`.`id`,
			`blog_modelb`.`model_a_id`,
			`blog_modela`.`value`
		FROM
			`blog_modelb`
			INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))
UNION (
	SELECT
		(`blog_modela`.`id`) AS `a`,
		`blog_modelb`.`id`,
		`blog_modelb`.`model_a_id`,
		`blog_modela`.`value`
	FROM
		`blog_modelb`
		INNER JOIN `blog_modela` ON (`blog_modelb`.`model_a_id` = `blog_modela`.`id`))) subquery

0

your fix looks making an unexpected SQL.

@ Mariusz Felisiak if you have a good idea for this bug, please share.

Last edited 3 months ago by Jordan Bae (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top