﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
22438	Slow INNER JOIN in MySQL can be fixed in Django ORM, but should it?	frol	nobody	"There is the bug in MySQL INNER JOIN optimization, but it can be fixed in Django ORM by replacing INNER JOIN with STRAIGHT_JOIN. In short, ordered inner join selections take 3+ seconds, where if replace INNER JOIN with STRAIGHT_JOIN, we get result in 0.001.

Here are steps to reproduce:
1) We need a model with ForeignKey:
{{{
from django.conf import settings
from django.db import models

class Child(models.Model):
    name = models.CharField(""Name"", max_length=255)
    owner = models.ForeignKey(settings.AUTH_USER_MODEL)
}}}

2) Initialize data:
{{{
from random import choice
from django.contrib.auth import get_user_model
from qq.models import *

User = get_user_model()
users = User.objects.all()
Child.objects.bulk_create(Child(name='child_%d' % i, owner=choice(users)) for i in xrange(400000))
}}}

3) Query data with join (the bug appears only if order by is applied):
{{{
Child.objects.all().order_by('-id').select_related('owner')[:2]
}}}												
The resulting query would be:
{{{
 {u'sql': u'SELECT `qq_child`.`id`, `qq_child`.`name`, `qq_child`.`other_field`, `qq_child`.`owner_id`, 
`auth_user`.`id`, `auth_user`.`password`, `auth_user`.`last_login`, `auth_user`.`is_superuser`, 
`auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`, `auth_user`.`email`, 
`auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`date_joined` FROM `qq_child` 
INNER JOIN `auth_user` ON ( `qq_child`.`owner_id` = `auth_user`.`id` ) 
ORDER BY `qq_child`.`id` ASC 
LIMIT 2',
  u'time': u'4.608'},
}}}

(MySQL caches result until you update the table so the next time it would take 0.001, but not in production if your table updates frequently)

If I replace INNER JOIN with STRAIGHT_JOIN:
{{{
list(Child.objects.raw(
    Child.objects.all().order_by('-id').select_related('owner')[:2]\
        .query.sql_with_params()[0].replace('INNER JOIN', 'STRAIGHT_JOIN')
))
}}}

I get this result:
{{{
 {u'sql': u'SELECT `qq_child`.`id`, `qq_child`.`name`, `qq_child`.`other_field`, `qq_child`.`owner_id`, 
`auth_user`.`id`, `auth_user`.`password`, `auth_user`.`last_login`, `auth_user`.`is_superuser`, 
`auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`, `auth_user`.`email`, 
`auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`date_joined` FROM `qq_child` 
STRAIGHT_JOIN `auth_user` ON ( `qq_child`.`owner_id` = `auth_user`.`id` ) 
ORDER BY `qq_child`.`id` DESC 
LIMIT 2',
  u'time': u'0.001'}] 								
}}}

'''4.6 seconds vs 0.001. Really?'''

'''Solutions:'''
1) SQLAlchemy implemented .with_prefix() method for query: http://stackoverflow.com/a/16743228
2) Replace INNER JOIN for MySQL backend with STRAIGHT_JOIN, but it might be not a good idea. I will investigate this today.

'''Wrong attempts to fix this:'''
Since select_related is used in django admin change_list if you ask for related fields in columns, devs try to override it with .raw() - http://stackoverflow.com/a/15978732/1178806 (I answered in comments why it doesn't work)."	Uncategorized	closed	Database layer (models, ORM)	dev	Normal	wontfix	mysql, orm, slow query, join		Unreviewed	0	0	0	0	0	0
