﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
11088	Aggregation problem, table JOINed twice	loic@…		"I don't know if I'm doing something wrong but I've tried to compile all the information needed to track this one.

=== Models ===
video/models.py
 {{{
class Video(models.Model):
    name = models.TextField()
    dir = models.TextField()
    videoclass = models.ForeignKey('Videoclass')
    status = models.CharField(max_length=24)
    date = models.DateTimeField()
    duration = models.FloatField()
    width = models.IntegerField()
    height = models.IntegerField()
    displayname = models.TextField()
    published = models.IntegerField()
    views = models.IntegerField()
    ratesum = models.IntegerField()
    ratenum = models.IntegerField() 
}}}


statistics/models.py

{{{
class VisitorAction(models.Model):
    name = models.CharField(max_length = 200)

    def __unicode__(self):
        return self.name


class VisitorLog(models.Model):
    visitor = models.ForeignKey(""Visitor"")
    video = models.ForeignKey(Video)
    action = models.ForeignKey(VisitorAction)
    seek_video = models.FloatField()
    event_time = models.DateTimeField(default = datetime.datetime.now)
    domain = models.CharField(max_length = 128)
}}}


=== Datas ===
statistics_visitorlog contains 100k+ entries

video_video contains 400 entries

statistics_visitoraction contains 6 entries ('play', 'seek' ... etc)

=== The QS ===
{{{
>>> from video.models import *
>>> from django.db.models import Count
>>> Video.objects.annotate(play_log = Count('visitorlog')).filter(visitorlog__action__name = 'play').order_by('-play_log')[0:10]
}}}
[Kill the mysqld because it takes forever and have a nice backtrace]
{{{
>>> from django.db import connection
>>> connection.queries[-1]
{'time': '7.180', 'sql': u'SELECT `video_video`.`id`, `video_video`.`name`, `video_video`.`dir`, `video_video`.`videoclass_id`, `video_video`.`status`, `video_video`.`date`, `video_video`.`duration`, `video_video`.`width`, `video_video`.`height`, `video_video`.`displayname`, `video_video`.`published`, `video_video`.`views`, `video_video`.`ratesum`, `video_video`.`ratenum`, COUNT(`statistics_visitorlog`.`id`) AS `play_log` FROM `video_video` LEFT OUTER JOIN `statistics_visitorlog` ON (`video_video`.`id` = `statistics_visitorlog`.`video_id`) INNER JOIN `statistics_visitorlog` T4 ON (`video_video`.`id` = T4.`video_id`) INNER JOIN `statistics_visitoraction` ON (T4.`action_id` = `statistics_visitoraction`.`id`) WHERE `statistics_visitoraction`.`name` = play  GROUP BY `video_video`.`id` ORDER BY play_log DESC LIMIT 10'}
}}}

=== The sql request ===

{{{
SELECT   `video_video`.`id`           ,
            `video_video`.`name`         ,
            `video_video`.`dir`          ,
            `video_video`.`videoclass_id`,
            `video_video`.`status`       ,
            `video_video`.`date`         ,
            `video_video`.`duration`     ,
            `video_video`.`width`        ,
            `video_video`.`height`       ,
            `video_video`.`displayname`  ,
            `video_video`.`published`    ,
            `video_video`.`views`        ,
            `video_video`.`ratesum`      ,
            `video_video`.`ratenum`      ,
            COUNT(`statistics_visitorlog`.`id`) AS `play_log`
   FROM     `video_video`
            LEFT OUTER JOIN `statistics_visitorlog`
            ON
                     (
                              `video_video`.`id` = `statistics_visitorlog`.`video_id`
                     )
            INNER JOIN `statistics_visitorlog` T4
            ON
                     (
                              `video_video`.`id` = T4.`video_id`
                     )
            INNER JOIN `statistics_visitoraction`
            ON
                     (
                              T4.`action_id` = `statistics_visitoraction`.`id`
                     )
   WHERE    `statistics_visitoraction`.`name` = 'play'
   GROUP BY `video_video`.`id`
}}}

It takes forever and burns my cpu. However, if I remove the LEFT OUTER JOIN on statistics_visitorlog (which seems to be joined twice), and fix the select to call the proper JOINed table:

{{{
SELECT   `video_video`.`id`           ,
            `video_video`.`name`         ,
            `video_video`.`dir`          ,
            `video_video`.`videoclass_id`,
            `video_video`.`status`       ,
            `video_video`.`date`         ,
            `video_video`.`duration`     ,
            `video_video`.`width`        ,
            `video_video`.`height`       ,
            `video_video`.`displayname`  ,
            `video_video`.`published`    ,
            `video_video`.`views`        ,
            `video_video`.`ratesum`      ,
            `video_video`.`ratenum`      ,
            COUNT(T4.`id`) AS `play_log`
   FROM     `video_video`
            INNER JOIN `statistics_visitorlog` T4
            ON
                     (
                              `video_video`.`id` = T4.`video_id`
                     )
            INNER JOIN `statistics_visitoraction`
            ON
                     (
                              T4.`action_id` = `statistics_visitoraction`.`id`
                     )
   WHERE    `statistics_visitoraction`.`name` = 'play'
   GROUP BY `video_video`.`id`
}}}

this returns in 0.6 sec with all the datas needed to fill the objects.
"		closed	Database layer (models, ORM)	1.1-beta		invalid	aggregation, join		Unreviewed	0	0	0	0	0	0
