Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#32554 closed New feature (wontfix)

Add Q.empty(), Q.TRUE, Q.FALSE, Q.any(), and Q.all()

Reported by: jonathan-golorry Owned by: jonathan-golorry
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: Q objects, any, all
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by jonathan-golorry)

Q.empty() checks for nested empty Q objects such as Q(Q()). See #32549 for rationale.

Q.FALSE and Q.TRUE are classattribute aliases for Q(pk__in=[]) and ~Q(pk__in=[]).

Q.any() and Q.all() mimic python's builtin any and all, defaulting to Q.FALSE and Q.TRUE for empty iterators (or iterators containing only empty Q objects).

Patch: https://github.com/django/django/pull/14159

See this forum thread discussing why these are an improvement over many ad-hoc implementations people are using: https://forum.djangoproject.com/t/improving-q-objects-with-true-false-and-none/851

Change History (5)

comment:1 by jonathan-golorry, 3 years ago

Description: modified (diff)
Has patch: set
Needs documentation: set
Owner: changed from nobody to jonathan-golorry
Status: newassigned

comment:2 by jonathan-golorry, 3 years ago

Description: modified (diff)

comment:3 by jonathan-golorry, 3 years ago

Summary: Add Q.TRUE, Q.FALSE, Q.any() and Q.all()Add Q.empty(), Q.TRUE, Q.FALSE, Q.any(), and Q.all()

comment:4 by Mariusz Felisiak, 3 years ago

Resolution: wontfix
Status: assignedclosed

Thanks for this ticket, however I have mixed-feelings. Django API is already massive. IMO new hooks will make Q expressions more confusing and are not necessary per se. Q.any() and Q.all() can be replaced with reduce() (that's how we do this in Django itself, see an example). I don't see why the new API should be preferred over an example with reduce() in docs. It has already been pointed out that we need a good example of building Q() dynamically.

About Q.TRUE and Q.FALSE, these constants are really tricky and niche. I don't believe that folks will not how to use them properly (I don't see immediate and wide use). Moreover, I'm not sure why Django should recommend the pk__in=[] lookup (and not e.g. pk=-1), everything depends on the context. It is so niche that folks should choose on their own.

I strongly believe that we need a good example of building Q() dynamically with the existing API, rather than adding yet another method to do the same.

comment:5 by jonathan-golorry, 3 years ago

Replying to Mariusz Felisiak:

Thanks for this ticket, however I have mixed-feelings. Django API is already massive. IMO new hooks will make Q expressions more confusing and are not necessary per se. Q.any() and Q.all() can be replaced with reduce() (that's how we do this in Django itself, see an example). I don't see why the new API should be preferred over an example with reduce() in docs. It has already been pointed out that we need a good example of building Q() dynamically.

reduce isn't great. A common mistake I see in accepted answers on stack overflow is that people forget to address empty iterators (potentially leaking data due to empty Q() objects).
https://stackoverflow.com/questions/13076822/django-dynamically-filtering-with-q-objects
https://stackoverflow.com/questions/852414/how-to-dynamically-compose-an-or-query-filter-in-django

reduce turns that into a TypeError, which is better than leaking data, but handling the case properly is still a pain. It's hard to explain why some initializers work as defaults and others don't.

reduce(operator.and_, iterator, initializer=~Q(pk__in=[])  # "all", defaulting to everything
reduce(operator.or_, iterator, initializer=Q(pk__in=[])    # "any", defaulting to nothing
reduce(operator.and_, iterator, initializer=Q(pk__in=[])   # always returns nothing
reduce(operator.or_, iterator, initializer=~Q(pk__in=[])   # always returns everything

That's ignoring the absolute foot-gun that is initializer=Q(). You need to add logically meaningless expressions to make the queries safe.

reduce(operator.and_, iterator, initializer=Q()) | Q(pk__in=[])  # "all", defaulting to nothing
reduce(operator.or_, iterator, initializer=Q()) & ~Q(pk__in=[])  # "or", defaulting to everything

Using manual checks to avoid needing initializer isn't great because of edge cases around bool(Q(Q())) and iterator=[Q()].

About Q.TRUE and Q.FALSE, these constants are really tricky and niche. I don't believe that folks will not how to use them properly (I don't see immediate and wide use). Moreover, I'm not sure why Django should recommend the pk__in=[] lookup (and not e.g. pk=-1), everything depends on the context. It is so niche that folks should choose on their own.

The main argument for having Q.TRUE and Q.FALSE is to make the above code examples more readable. The logic of ANDing with TRUE is already weird enough that it's easy to miss a ~.

The query optimizer can handle pk__in=[], not the others.

Article.objects.filter(Q(pk__in=[]))  # doesn't hit DB
Article.objects.filter(Q(pk=None))    # hits DB and returns nothing
Article.objects.none()                # doesn't hit DB
Article.objects.filter(Q())           # returns everything

I strongly believe that we need a good example of building Q() dynamically with the existing API, rather than adding yet another method to do the same.

I think empty Q() objects are a significant security risk. The code required to handle them properly is unintuitive and difficult to read. If we just documented a good example, it would need to sufficiently explain all these risks.

Note: See TracTickets for help on using tickets.
Back to Top