Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#32549 closed Cleanup/optimization (wontfix)

Add `Q.empty()` to check for nested empty Q objects like `Q(Q())`

Reported by: jonathan-golorry Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: Q objects, nested, empty
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by jonathan-golorry)

Empty Q objects (Q() or ~Q(), etc) evaluate to False, but Q(Q()) evaluates to True. This interferes with the shortcut on logical operations involving empty Q objects and can lead to bugs when trying to detect empty operations.

def q_any(iterable):
    q = Q()
    for element in iterable:
        q |= element
    if q:
        return q
    return Q(pk__in=[])
Item.objects.filter(q_any())  # no items
Item.objects.filter(q_any([Q()]))  # no items
Item.objects.filter(q_any([Q(Q())]))  # all items

Patch https://github.com/django/django/pull/14127 removes empty Q objects from args during Q object initialization.

Patch https://github.com/django/django/pull/14130 adds .empty() for detecting nested empty Q objects and uses that instead of boolean evaluation for shortcutting logical operations between empty Q objects. So Q(x=1) | Q(Q()) will now shortcut to Q(x=1) the same way Q(x=1) | Q() does. No simplifying is done during Q object initialization.

This requires https://code.djangoproject.com/ticket/32548 in order to prevent a regression in logical operations between query expressions and empty Q objects.

Change History (7)

comment:1 by jonathan-golorry, 3 years ago

Description: modified (diff)
Has patch: set

comment:2 by Mariusz Felisiak, 3 years ago

Resolution: wontfix
Status: newclosed

Thanks for this proposition, however I don't think we should implicitly remove anything from args, even an empty Q() instances. This would be backward incompatible, inconsistent with the Node behavior, and can be handled in your code.

comment:3 by jonathan-golorry, 3 years ago

Description: modified (diff)
Resolution: wontfix
Status: closednew
Summary: Make `Q(Q(), ...)` equivalent to `Q(...)`Add `Q.empty()` to check for nested empty Q objects like `Q(Q())`

Here is an alternate approach that adds a way to check for nested empty Q objects instead of making them impossible to create.

in reply to:  3 comment:4 by Mariusz Felisiak, 3 years ago

Resolution: wontfix
Status: newclosed

Replying to jonathan-golorry:

Here is an alternate approach that adds a way to check for nested empty Q objects instead of making them impossible to create.

It looks that you need Q.empty() as a hook for #32554 (to which I'm also not convinced, but we can discuss this in #32554). I don't think that we need this as a standalone feature. Moreover it's still backward incompatible, because it changes the current behavior when combining a nested Q() objects. It's also inconsistent with the current API and quite misleading, e.g.

>>> from django.db.models import Q
>>> q = Q(Q())
>>> q.empty()
True
>>> len(q)
1
>>> bool(q)
True

I understand that you need this for building generic nested queries but it's not needed for most users. You can always use your own subclass of Q and even release it as a third-party package.

Please follow triaging guidelines with regards to wontfix tickets.

comment:5 by jonathan-golorry, 3 years ago

We already shortcut logical operations between empty Q objects, so I view this more as an extension of that shortcutting (for legibility, not performance). I don't think it's fair to say it's backwards incompatible, since the simplified Q objects will be logically equivalent to the old ones. The expected outcome of Q object combinations is logical consistency, not a specific tree structure (as evidenced by len((Q(x=1, y=2) | Q(x=1, y=2)).children == 1).

The only place I can think of where this would affect the current Django internals is in simplifying the output of Exists(...) | Q() (and similar), which ends up running Q(Exists(...)) | Q(Q()).

My original example shows why it's desirable to have q.empty() behave differently from not bool(q). Empty Q objects are a potential security risk, but the standard way to check for them doesn't catch nested empty Q objects. #32554 doesn't actually build nested queries, it just uses this to avoid bugs with nested query inputs.

My apologies for reopening without a mailing list consensus, though I feel "If there is a way they can improve the ticket to reopen it, let them know" indicates that I may reopen if I address the only concern you expressed in your closing comment. If the reason for closing is that legibility of Q objects isn't it's own concern, I can roll empty into #32554 without the logic shortcut.

Last edited 3 years ago by jonathan-golorry (previous) (diff)

comment:6 by Ian Foote, 3 years ago

I'm persuaded that this is a problem. Accidentally leaking data because we treat Q(Q()) differently to Q() feels like a bug to me, so fixing it shouldn't be blocked by backwards compatibility arguments.

in reply to:  6 comment:7 by jonathan-golorry, 3 years ago

Replying to Ian Foote:

I'm persuaded that this is a problem. Accidentally leaking data because we treat Q(Q()) differently to Q() feels like a bug to me, so fixing it shouldn't be blocked by backwards compatibility arguments.

#32554 does a lot more to prevent accidentally leaking data. If that's the only concern, I think it's better to roll Q.empty() into that patch. This ticket is primarily about whether or not to shortcut logical operations between nested empty Q objects.

Note: See TracTickets for help on using tickets.
Back to Top