Code

Opened 3 years ago

Closed 16 months ago

#16112 closed Bug (fixed)

Excluding some value in a related model excludes objects without that model

Reported by: adehnert Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords:
Cc: adehnert Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have models that look something like:

class Group(models.Model):
    name = models.CharField(max_length=100)
    activity_category = models.ForeignKey('ActivityCategory', null=True, blank=True, )
class ActivityCategory(models.Model):
    name = models.CharField(max_length=50)

I'm trying to find each Group that doesn't have some ActivityCategory:

qobj = Q(activity_category__name='Dorm')
queryset = groups.models.Group.objects.filter(~qobj)

Intuitively, I would expect this to find both groups with no ActivityCategory and groups where the ActivityCategory exists and it isn't "Dorm". Reading https://docs.djangoproject.com/en/dev/topics/db/queries/#lookups-that-span-relationships seems to support this interpretation --- "If you are filtering across multiple relationships and one of the intermediate models doesn't have a value that meets the filter condition, Django will treat it as if there is an empty (all values are NULL), but valid, object there." Unfortunately, the ORM generates a query using INNER JOIN instead of LEFT OUTER JOIN, so my Group with a null ActivityCategory will not be returned.

I can work around this by explicitly indicating that null is possible, but it's annoying and unintuitive:

qobj = Q(activity_category__isnull = True) | ~(Q(activity_category__name='Dorm') | Q(activity_category__name='FSILG'))
queryset = groups.models.Group.objects.filter(qobj)

Attachments (1)

models.py (1.4 KB) - added by adehnert 3 years ago.
models.py to reproduce

Download all attachments as: .zip

Change History (8)

comment:1 Changed 3 years ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Design decision needed

The docs basically say that the ORM won't crash'n'burn when you're filtering on foo__bar=42 just because one of your objects doesn't have a related Foo object. See the paragraph that follows your quote. It doesn't really say anything about "negative queries".

That said, I agree with your analysis. Assuming a given object doesn't have an associated Foo object, the value of its foo__bar is something akin to NaN. Since NaN != <whatever> evaluates to True, this analogy supports your interpretation.

Fixing this problem would be backwards incompatible: it would change the result of "negative queries". IMO, we can't say it's an obvious bug, rather an under-defined behavior. So I'll mark it as DDN.

comment:2 Changed 3 years ago by aaugustin

Addendum: this might be a duplicate of #14876 — I'm not sure.

comment:3 Changed 3 years ago by adehnert

Mmm.

If the outcome of the design decision is to stick with the status quo (which would make me sad, but such is life), it'd be nice to more clearly document that behavior. (Admittedly, I'm not sure that it'll actually help very many people.)

comment:4 Changed 3 years ago by ethlinn

  • Resolution set to worksforme
  • Status changed from new to closed
  • UI/UX unset

Can't reproduce it anymore with a current trunk version. The query contains no INNER JOIN, but LEFT OUTER join.

Changed 3 years ago by adehnert

models.py to reproduce

comment:5 Changed 3 years ago by adehnert

  • Resolution worksforme deleted
  • Status changed from closed to reopened

Hm, okay, so apparently I over-simplified the my steps to reproduce. It looks like the bug is actually in how combining Q objects is handled, somehow?

In particular, it appears that the LEFT OUTER JOIN occurs if (a) your qobj is a single clause, and (b) you don't try to NOT any Q wrapping the qobj. If you either do an OR (e.g., Q(activity_category__name='Dorm') | Q(activity_category__name='Dorm'), which should have the same result set) or NOT a wrapping Q (for example, by doing a .exclude instead of a .filter), things will break.

The following demonstrates behavior with various different Q objects (see the attached models.py for what they are exactly):

>>> import groups.models; groups.models.test()
or'd/exclude [<Group: foo (cat=1)>]
or'd/filter [<Group: foo (cat=1)>]
or'd/filterout [<Group: foo (cat=1)>]
or'd/filterin [<Group: foo (cat=1)>]
single/exclude [<Group: foo (cat=1)>]
single/filter [<Group: foo (cat=1)>, <Group: Bar (cat=None)>]
single/filterout [<Group: foo (cat=1)>]
single/filterin [<Group: foo (cat=1)>, <Group: Bar (cat=None)>]

SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT ("groups_activitycategory"."name" = Dorm )
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" LEFT OUTER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  AND NOT ("groups_activitycategory"."id" IS NULL)))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT ("groups_activitycategory"."name" = Dorm )
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" LEFT OUTER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  AND NOT ("groups_activitycategory"."id" IS NULL)))

Given that ORing together identical Q objects will change the queryset produced by filtering on it, this seems a little more clearly a bug (rather than a wart) than it previously appeared.

comment:6 Changed 16 months ago by aaugustin

  • Status changed from reopened to new

comment:7 Changed 16 months ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

(I was wrong in my first comment -- in SQL NULL != <whatever> evaluates to False.)

Anyway, I just tried your example, and Django now consistently uses LEFT OUTER JOIN. This was fixed, maybe at the same time as #14876.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.