Opened 14 years ago

Last modified 4 years ago

#14645 new Bug

Exclude query with multiple conditions for the same multi-value relation not correct

Reported by: Ben Buchwald <bb2@…> Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: exclude manytomany
Cc: rma, Chris, PhiR_42, benkraft, bugs@…, Can Sarıgöl Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Asif Saifuddin Auvi)

According to http://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships: "all the conditions in a single exclude() statement apply to a single instance (if those conditions are talking about the same multi-valued relation)." This works correctly for filter() but for exclude() it is operating the same as if it was 2 separate exclude() calls. Here's an example set of models:

class Song(models.Model):
   name = models.CharField(max_length=30)

class Release(models.Model):
   song = models.ForeignKey(Song)
   format = models.CharField(max_length=3,choices=(('cd',"CD"),('mp3',"MP3")))
   released = models.BooleanField()
   class Meta:
      unique_together = ['song','format']

If I want to ask for all of the songs that have a CD release that has already gone out this filter works:

Song.objects.filter(release_set__format='cd',release_set__released=1)

But if I want to find all the songs that don't have a CD release that has gone out (either it has one that hasn't been released yet, or it doesn't have a release record at all), this exclude statement is not working:

Song.objects.exclude(release_set__format='cd',release_set__released=1)

It produces this SQL:

('SELECT U0.`id` FROM `songs_song` U0 
  WHERE NOT (U0.`id` IN (SELECT U1.`id` FROM `songs_release` U1 
                         WHERE U1.`format` = %s) AND 
             U0.`id` IN (SELECT U1.`id` FROM `songs_release` U1 
                         WHERE U1.`released` = %s ))',
('cd', True))

Instead of what I would expect (and give me the right results):

('SELECT U0.`id` FROM `songs_song` U0 
  WHERE NOT (U0.`id` IN (SELECT U1.`id` FROM `songs_release` U1 
                         WHERE U1.`format` = %s AND 
                               U1.`released` = %s ))',
('cd', True))

Song.objects.filter(~Q(release_set__format='cd',release_set__released=1)) produces the same result, but Song.objects.exclude(Q(release_set__format='cd',release_set__released=1)) produces the even more wrong:

('SELECT U0.`id` FROM `songs_song` U0 INNER JOIN `songs_release` U1 ON (U0.`id` = U1.`song_id`)
  WHERE NOT ((U1.`format` = %s  AND U1.`released` = %s ))',
('cd', True))

Change History (25)

comment:1 by Russell Keith-Magee, 14 years ago

Triage Stage: UnreviewedAccepted

comment:2 by rma, 14 years ago

Cc: rma added
Version: 1.11.2

Can confirm in Django 1.2.4 (Mac OS X 10.6.5) and Django 1.2.3 (Debian Squeeze). Multiple conditions in a single exclude() call result in each condition being evaluated to a distinct instance, rather than all conditions being applied to a single instance.

comment:3 by James Addison, 14 years ago

Severity: Normal
Type: Bug

comment:4 by Aymeric Augustin, 13 years ago

Easy pickings: unset
UI/UX: unset

#17315 was a duplicate. It contains a thorough demonstration of the problem.

comment:5 by anonymous, 13 years ago

I can confirm this Bug on Django 1.3.1 on Linux.

comment:6 by anonymous, 13 years ago

Version: 1.21.3

comment:7 by Chris, 12 years ago

Version: 1.3master

Still present in master.

You can get SQL which returns what you would expect using this:

Song.objects.exclude(id__in=Release.objects.filter(format='cd',released=1).values('song'))

This gives the query you would expect originally:

SELECT "songs_song"."id", "songs_song"."name" FROM "songs_song"
WHERE NOT (
    "songs_song"."id" IN (
        SELECT U0."song_id"
        FROM "songs_release" U0
        WHERE (U0."released" = True  AND U0."format" = cd)
    )
)

comment:8 by Chris, 11 years ago

Please note that this behaviour contradicts the documentation.

According to the documentation (from https://docs.djangoproject.com/en/1.5/topics/db/queries/#spanning-multi-valued-relationships):

"All of this behavior also applies to exclude(): all the conditions in a single exclude() statement apply to a single instance (if those conditions are talking about the same multi-valued relation). Conditions in subsequent filter() or exclude() calls that refer to the same relation may end up filtering on different linked objects."

But exclude actually filters each condition on a different linked object.

comment:9 by Chris, 11 years ago

Cc: Chris added

comment:10 by Anssi Kääriäinen, 11 years ago

This one will be hard to fix correctly. Consider case:

Song.objects.exclude((Q(release_set__format='cd') | Q(pk=1)) & (Q(release_set__format='lp') | Q(pk=2)))

As far as I understand the only sensible way to write this is to have a query:

SELECT * FROM "song" WHERE NOT EXISTS (
    SELECT 1
      FROM "release_set"
     WHERE ((format = 'cd' OR "song"."pk" = 1)
            AND
            (format = 'lp' OR "song"."pk" = 2)
           AND "song"."pk" = "release_set"."song_id"

that is, the whole condition must be pushed down into EXISTS query. The condition should be pushed down from the lowest connector that contains all the references to the same subquery.

comment:11 by PhiR_42, 11 years ago

Cc: PhiR_42 added

comment:12 by benkraft, 10 years ago

Cc: benkraft added

comment:13 by David Seddon, 10 years ago

I've submitted a pull request for some documentation to draw attention to this behaviour. Let me know if you'd like it adjusted.

https://github.com/django/django/pull/3898

comment:14 by ris, 10 years ago

Cc: bugs@… added

comment:15 by David Seddon, 10 years ago

Just following up on the pull request, is this something that would be useful to be added to the documentation?

https://github.com/django/django/pull/3898

comment:16 by David Seddon, 10 years ago

I've been looking at correcting the documentation and have just realised that one of the statements in the original bug report is wrong: "This works correctly for filter() but for exclude() it is operating the same as if it was 2 separate exclude() calls."

Actually, the behaviour is a bit different:

Song.objects.exclude(release_set__format='cd',release_set__released=1) would exclude Songs that have both CD releases and releases that have gone out.

Song.objects.exclude(release_set__format='cd').exclude(release_set__released=1), on the other hand, would exclude Songs that have either CD releases or releases that have gone out.

I've made a gist that sums this up in the form of tests: https://gist.github.com/seddonym/84407891a11389419c14

comment:17 by Anssi Kääriäinen, 10 years ago

I think it is time to start working on this. I think we have two major bugs in Django ORM:

  1. This one.
  2. Aggregation over multiple different multivalued relations, or aggregation after filtering over m2m relation produces wrong results.

In addition, we have the problem that filtering over multivalued relation doesn't use a subquery, instead you have to use distinct.

These all are mixed together. Unfortunately this isn't easy to fix. But that doesn't mean we shouldn't try.

comment:18 by Tim Graham <timograham@…>, 10 years ago

In 6770b7ec:

Refs #14645 -- Documented bug with exclude() and multi-value relations

comment:19 by Tim Graham <timograham@…>, 10 years ago

In b46643a4:

[1.7.x] Refs #14645 -- Documented bug with exclude() and multi-value relations

Backport of 6770b7ecd208a0746f181e54202fb829460c6490 from master

comment:20 by Tim Graham <timograham@…>, 10 years ago

In 744d9a10:

[1.8.x] Refs #14645 -- Documented bug with exclude() and multi-value relations

Backport of 6770b7ecd208a0746f181e54202fb829460c6490 from master

comment:21 by Tim Graham <timograham@…>, 10 years ago

In 48f5adf:

[1.6.x] Refs #14645 -- Documented bug with exclude() and multi-value relations

Backport of 6770b7ecd208a0746f181e54202fb829460c6490 from master

comment:22 by Can Sarıgöl, 5 years ago

Cc: Can Sarıgöl added
Has patch: set

I tried to fix this issue with PR as much as the sample that's in the description. Does the approach and solution place make sense?

comment:23 by Asif Saifuddin Auvi, 5 years ago

Description: modified (diff)

comment:24 by Mariusz Felisiak, 4 years ago

Patch needs improvement: set

comment:25 by Mariusz Felisiak, 4 years ago

#31922 was closed as a duplicate.

Note: See TracTickets for help on using tickets.
Back to Top