Opened 16 years ago

Closed 12 years ago

Last modified 9 years ago

#10870 closed Bug (fixed)

Aggregates with joins ignore extra filters provided by setup_joins

Reported by: fas Owned by: fas
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: orm, aggregation, join, contenttypes, filter
Cc: mbencun@…, jdunck@…, michaelg@…, Robin, Noah Kantrowitz, Carl Meyer, Chris Chambers, joeri@…, anssi.kaariainen@…, ris, Alan Justino da Silva, daevaorn@…, fhahn, Hervé Cauwelier Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

(In case the description that follows is not understandable, please ask me for a better explanation).

An example where extra filters in joins are used are reverse generic relations (http://docs.djangoproject.com/en/dev/ref/contrib/contenttypes/#reverse-generic-relations).

class TaggedItem(models.Model):

    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey('content_type', 'object_id')


class Bookmark(models.Model):
    url = models.URLField()
    tags = generic.GenericRelation(TaggedItem)

By querying Bookmark.objects.filter(tags=some_tag) the generic relation adds extra filters to ensure the correct content type. Those extra filters however are ignored when using aggregates, which leads to wrong results.

For example, the following query is not correct because it does not check the content_type.

Bookmark.objects.aggregate(Count('tags'))

What this aggregate actually calculates per bookmark is not the amount of tags for this bookmark but the amount of tags for any model referenced by TaggedItem with the same object id.

The error is isolated in db/models/sql/query.py in the add_aggregate method:

            field, source, opts, join_list, last, _ = self.setup_joins(
                field_list, opts, self.get_initial_alias(), False)

            # Process the join chain to see if it can be trimmed                                                                                             
            col, _, join_list = self.trim_joins(source, join_list, last, False)

The last dummy element _ is normally named 'extra' in other uses of setup_joins and contains the extra filters needed to make it work. Adding self.add_filter(*extra) after setup_joins fixes this problem.

Attachments (4)

aggregation-generic.diff (4.7 KB ) - added by Alex Gaynor 16 years ago.
I don't think the solution you've suggested works entirely as with this setup it generates incorrect results for me(it returns 9 instead of 3).
aggregation-generic.2.diff (5.0 KB ) - added by Alex Gaynor 16 years ago.
And elbow grease solves all problems.
aggregation-generic.3.diff (5.3 KB ) - added by Alex Gaynor 16 years ago.
Added a test to show it doesn't work with annotate, for some reason it only returns books that have been tagged, which doesn't match the behavior of any other annotation, it's an annotation not a filter, I assume this is the result of using the wrong join type
aggregation-generic.4.diff (5.2 KB ) - added by Alex Gaynor 16 years ago.

Download all attachments as: .zip

Change History (36)

comment:1 by Alex Gaynor, 16 years ago

Triage Stage: UnreviewedAccepted
Version: 1.0SVN

I really doubt this is with 1.0 ;)

by Alex Gaynor, 16 years ago

Attachment: aggregation-generic.diff added

I don't think the solution you've suggested works entirely as with this setup it generates incorrect results for me(it returns 9 instead of 3).

by Alex Gaynor, 16 years ago

Attachment: aggregation-generic.2.diff added

And elbow grease solves all problems.

by Alex Gaynor, 16 years ago

Attachment: aggregation-generic.3.diff added

Added a test to show it doesn't work with annotate, for some reason it only returns books that have been tagged, which doesn't match the behavior of any other annotation, it's an annotation not a filter, I assume this is the result of using the wrong join type

comment:2 by Malcolm Tredinnick, 16 years ago

This patch has a couple of problems, from looking at it quickly.

  1. There's a random comment on line 1436 that doesn't refer to any code. No code immediately follows it. It's also of the "set i to i+1" variety if it was intended to go before line 1448, so can be removed.
  2. The if-test on line 1448 can be removed. The extra parameter is always a list, possibly empty. So just iterate over it and if it's empty, nothing will be done.

by Alex Gaynor, 16 years ago

Attachment: aggregation-generic.4.diff added

comment:3 by Alex Gaynor, 16 years ago

1 and 2 are dealt with, still the strangle filtering issue, which I can't quite figure out, since a LEFT OUTER JOIN is being preformed(I've left the debugging statement in there).

comment:4 by fas, 16 years ago

Cc: mbencun@… added

comment:5 by fas, 16 years ago

Is this going in for 1.1? I think it should as it would otherwise leave a new feature with a quite serious bug.

comment:6 by Russell Keith-Magee, 16 years ago

milestone: 1.1

comment:7 by Alex Gaynor, 16 years ago

I don't know that solving this for 1.1 will be easy. The annotate stuff has reveiled a subtle bug that will require some fairly invasive changed to fix. The explanation is here: http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1148316,00.html

comment:8 by Alex Gaynor, 16 years ago

milestone: 1.11.2

Bumping from 1.1 because this will require significant changes to the Query class, because we need to add support for additional join conditions. Also, Jacob said to do this.

comment:9 by Jeremy Dunck, 15 years ago

Cc: jdunck@… added

comment:10 by Michael Grünewald, 15 years ago

Cc: michaelg@… added

comment:11 by Russell Keith-Magee, 15 years ago

milestone: 1.21.3

Not critical for 1.2.

comment:12 by Robin, 14 years ago

Cc: Robin added

comment:13 by Noah Kantrowitz, 14 years ago

Cc: Noah Kantrowitz added

comment:14 by Carl Meyer, 14 years ago

Cc: Carl Meyer added

comment:15 by Alex Gaynor, 14 years ago

So I've been thinking about this, and I'm wondering whether it perhaps makes sense to commit this patch, even with it excluding legitimate results. My reasoning: the current status is that aggregate() and annotate() both return bogus data, with the patch aggregate() returns correct data, and annotate() returns data that is valid but is *missing* results for items with no related data. This strikes me as a clear improvement, and even though the patch is obviously not completely correct, it is a first step to the more complicated completely correct implementation.

comment:16 by Chris Chambers, 14 years ago

Cc: Chris Chambers added

comment:17 by Chris Beaven, 14 years ago

Severity: Normal
Type: Bug

comment:18 by joeri@…, 13 years ago

Cc: joeri@… added
Easy pickings: unset
UI/UX: unset

comment:19 by Anssi Kääriäinen, 13 years ago

Cc: anssi.kaariainen@… added
Has patch: set
Needs tests: set
Patch needs improvement: set

I did some work to allow the contenttype lookup in left joins. This is useful in and of itself (you can do something like Book.objects.filter(notes__note__isnull=True), and it will correctly find all books without any notes (before it would not have found any books). The work allows also to generate correct results when annotating and aggregating.

There are some loose ends in the patch. First, the clone method of sql/query doesn't properly clone the extra_join_filters (the conditions need to be deep-copied if I am not mistaken). Second, there is some non-dryness in setup_joins. There is also general commenting and stuff like that that needs to be done. I don't know if the patch would break under more complex joins (the main problem is getting the extra_filter to the right join). So, more tests needed also.

Github branch available at: https://github.com/akaariai/django/tree/generic_relations_left_joins

comment:20 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

comment:21 by ris, 13 years ago

Cc: ris added

comment:22 by Alan Justino da Silva, 13 years ago

Cc: Alan Justino da Silva added

comment:23 by Alexander Koshelev, 13 years ago

Cc: daevaorn@… added

comment:24 by fhahn, 13 years ago

Cc: fhahn added

comment:25 by Hervé Cauwelier, 12 years ago

Cc: Hervé Cauwelier added

comment:26 by Anssi Kääriäinen, 12 years ago

Could somebody write a test case against master? I am pretty sure this one is already fixed in the ORM.

comment:27 by fhahn, 12 years ago

Needs tests: unset
Patch needs improvement: unset

It seems fixed in master.

I've adopted the test from the previous patch and opened a pull request: https://github.com/django/django/pull/722

comment:28 by Anssi Kääriäinen <akaariai@…>, 12 years ago

Resolution: fixed
Status: newclosed

In 3e71368423b41c9418117328216e66f95cbaab03:

Fixed #10870 -- Added aggreation + generic reverse relation test

comment:29 by Anssi Kääriäinen, 12 years ago

Component: ORM aggregationDatabase layer (models, ORM)

comment:30 by Tim Graham <timograham@…>, 9 years ago

In e19bd086:

Fixed #24019 -- Fixed inaccurate docs about GenericRelation not supporting aggregation.

This works at least as far back as Django 1.6 according to the test
added in refs #10870.

comment:31 by Tim Graham <timograham@…>, 9 years ago

In 78f98057:

[1.9.x] Fixed #24019 -- Fixed inaccurate docs about GenericRelation not supporting aggregation.

This works at least as far back as Django 1.6 according to the test
added in refs #10870.

Backport of e19bd086d608c981098130a49e406de91dcc3d26 from master

comment:32 by Tim Graham <timograham@…>, 9 years ago

In 5adeb41:

[1.8.x] Fixed #24019 -- Fixed inaccurate docs about GenericRelation not supporting aggregation.

This works at least as far back as Django 1.6 according to the test
added in refs #10870.

Backport of e19bd086d608c981098130a49e406de91dcc3d26 from master

Note: See TracTickets for help on using tickets.
Back to Top