Opened 4 years ago

Closed 3 years ago

Last modified 11 months ago

#32143 closed Cleanup/optimization (fixed)

Use EXISTS to exclude multi-valued relationships

Reported by: Simon Charette Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The current logic that is invoked when excluding a multi-valued relationship (sql.Query.split_exclude) pushes down the filtering criteria in a parent.id NOT IN (SELECT child.parent_id ...) subquery.

These kind of operations can be really hard for some query planners to optimize and also tricky to get right due to how the IN operator treats NULL values which is something we've known for a while.

The NOT EXISTS function should be used instead of NOT IN.

Change History (5)

comment:1 by Mariusz Felisiak, 4 years ago

Triage Stage: UnreviewedAccepted

comment:2 by Mariusz Felisiak, 3 years ago

Owner: changed from nobody to Simon Charette
Status: newassigned
Triage Stage: AcceptedReady for checkin

comment:3 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In 8593e162:

Fixed #32143 -- Used EXISTS to exclude multi-valued relationships.

As mentioned in the pre-existing split_exclude() docstring EXISTS is
easier to optimize for query planers and circumvents the IN (NULL)
handling issue.

comment:4 by Mariusz Felisiak <felisiak.mariusz@…>, 11 months ago

In cfc9c94:

Refs #32143 -- Adjusted a comment about subquery usage in Query.split_exclude().

comment:5 by Mariusz Felisiak <felisiak.mariusz@…>, 11 months ago

In 1c4f5f3:

Refs #32143 -- Removed superflous constraints on excluded query.

The outer query reference is not necessary when alias can be reused and
can even be harmful by confusing query planers.

Refs #34597.

Note: See TracTickets for help on using tickets.
Back to Top