Code

Opened 3 years ago

Last modified 3 months ago

#14645 new Bug

Exclude query with multiple conditions for the same multi-value relation not correct

Reported by: Ben Buchwald <bb2@…> Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: exclude manytomany
Cc: rma, chriskrusz, PhiR_42 Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

According to http://docs.djangoproject.com/en/1.2/topics/db/queries/#spanning-multi-valued-relationships: "all the conditions in a single exclude() statement apply to a single instance (if those conditions are talking about the same multi-valued relation)." This works correctly for filter() but for exclude() it is operating the same as if it was 2 separate exclude() calls. Here's an example set of models:

class Song(models.Model):
   name = models.CharField(max_length=30)

class Release(models.Model):
   song = models.ForeignKey(Song)
   format = models.CharField(max_length=3,choices=(('cd',"CD"),('mp3',"MP3")))
   released = models.BooleanField()
   class Meta:
      unique_together = ['song','format']

If I want to ask for all of the songs that have a CD release that has already gone out this filter works:

Song.objects.filter(release_set__format='cd',release_set__released=1)

But if I want to find all the songs that don't have a CD release that has gone out (either it has one that hasn't been released yet, or it doesn't have a release record at all), this exclude statement is not working:

Song.objects.exclude(release_set__format='cd',release_set__released=1)

It produces this SQL:

('SELECT U0.`id` FROM `songs_song` U0 
  WHERE NOT (U0.`id` IN (SELECT U1.`id` FROM `songs_release` U1 
                         WHERE U1.`format` = %s) AND 
             U0.`id` IN (SELECT U1.`id` FROM `songs_release` U1 
                         WHERE U1.`released` = %s ))',
('cd', True))

Instead of what I would expect (and give me the right results):

('SELECT U0.`id` FROM `songs_song` U0 
  WHERE NOT (U0.`id` IN (SELECT U1.`id` FROM `songs_release` U1 
                         WHERE U1.`format` = %s AND 
                               U1.`released` = %s ))',
('cd', True))

Song.objects.filter(~Q(release_set__format='cd',release_set__released=1)) produces the same result, but Song.objects.exclude(Q(release_set__format='cd',release_set__released=1)) produces the even more wrong:

('SELECT U0.`id` FROM `songs_song` U0 INNER JOIN `songs_release` U1 ON (U0.`id` = U1.`song_id`)
  WHERE NOT ((U1.`format` = %s  AND U1.`released` = %s ))',
('cd', True))

Attachments (0)

Change History (11)

comment:1 Changed 3 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 3 years ago by rma

  • Cc rma added
  • Version changed from 1.1 to 1.2

Can confirm in Django 1.2.4 (Mac OS X 10.6.5) and Django 1.2.3 (Debian Squeeze). Multiple conditions in a single exclude() call result in each condition being evaluated to a distinct instance, rather than all conditions being applied to a single instance.

comment:3 Changed 3 years ago by jaddison

  • Severity set to Normal
  • Type set to Bug

comment:4 Changed 2 years ago by aaugustin

  • Easy pickings unset
  • UI/UX unset

#17315 was a duplicate. It contains a thorough demonstration of the problem.

comment:5 Changed 2 years ago by anonymous

I can confirm this Bug on Django 1.3.1 on Linux.

comment:6 Changed 2 years ago by anonymous

  • Version changed from 1.2 to 1.3

comment:7 Changed 12 months ago by chriskrusz

  • Version changed from 1.3 to master

Still present in master.

You can get SQL which returns what you would expect using this:

Song.objects.exclude(id__in=Release.objects.filter(format='cd',released=1).values('song'))

This gives the query you would expect originally:

SELECT "songs_song"."id", "songs_song"."name" FROM "songs_song"
WHERE NOT (
    "songs_song"."id" IN (
        SELECT U0."song_id"
        FROM "songs_release" U0
        WHERE (U0."released" = True  AND U0."format" = cd)
    )
)

comment:8 Changed 12 months ago by chriskrusz

Please note that this behaviour contradicts the documentation.

According to the documentation (from https://docs.djangoproject.com/en/1.5/topics/db/queries/#spanning-multi-valued-relationships):

"All of this behavior also applies to exclude(): all the conditions in a single exclude() statement apply to a single instance (if those conditions are talking about the same multi-valued relation). Conditions in subsequent filter() or exclude() calls that refer to the same relation may end up filtering on different linked objects."

But exclude actually filters each condition on a different linked object.

comment:9 Changed 12 months ago by chriskrusz

  • Cc chriskrusz added

comment:10 Changed 8 months ago by akaariai

This one will be hard to fix correctly. Consider case:

Song.objects.exclude((Q(release_set__format='cd') | Q(pk=1)) & (Q(release_set__format='lp') | Q(pk=2)))

As far as I understand the only sensible way to write this is to have a query:

SELECT * FROM "song" WHERE NOT EXISTS (
    SELECT 1
      FROM "release_set"
     WHERE ((format = 'cd' OR "song"."pk" = 1)
            AND
            (format = 'lp' OR "song"."pk" = 2)
           AND "song"."pk" = "release_set"."song_id"

that is, the whole condition must be pushed down into EXISTS query. The condition should be pushed down from the lowest connector that contains all the references to the same subquery.

comment:11 Changed 3 months ago by PhiR_42

  • Cc PhiR_42 added

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as new
The owner will be changed from nobody to anonymous. Next status will be 'assigned'
as The resolution will be set. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.