Opened 3 weeks ago

Closed 13 days ago

#36875 closed Cleanup/optimization (wontfix)

Avoid unnecessary Coalesce in Concat/ConcatPair

Reported by: David Owned by: Rusheel Chandra Reddy
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: concat, coalese
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by David)

Currently ConcatPair (which is used by Concat) when invoking the .coalesce method it will force the COALESCE on every argument provided to the expression.

https://github.com/django/django/blob/3851601b2e080df34fb9227fe5d2fd43af604263/django/db/models/functions/text.py#L112-L122

This does not take into account:

  • if any involved expression have nullable output (which I understand can be hard to check)
  • if any involved expression is a fixed value (which should easier to detect)

The case in which I am more interested is the latter, because it is completely useless to invoke COALESCE on a fixed value (unless it was provided NULL from the beginning, which should be easy to check).

I came along this while writing a query to detect custom permissions on the database:

from django.contrib.auth.models import Permission
from django.db.models import Value, F
from django.db.models.functions import Concat

custom_permissions = Permission.objects.exclude(
    codename__regex=Concat(
        Value("^(add|change|delete|view)_"),
        F("content_type__model"),
        Value("$"),
    )
)

I was expecting a WHERE clause like:

"auth_permission"."codename"::text ~ ('^(add|change|delete|view)_' || "django_content_type"."model" || '$')

And I was stunned when I saw the following (which is hard to read):

"auth_permission"."codename"::text ~ (COALESCE('^(add|change|delete|view)_', '') || COALESCE((COALESCE("django_content_type"."model", '') ||  COALESCE('$', '')), ''))

My proposal is to change how ConcatPair.coalesce behaves to detect if COALESCE can be skipped (ie: fixed value, already enforced coalesce on previous expression, etc) before putting that function in the output.
This could improve readability for generated queries and may reduce the overhad of building the Coalesce expression.

Change History (8)

comment:1 by David, 3 weeks ago

Description: modified (diff)

comment:2 by Rusheel Chandra Reddy, 3 weeks ago

Owner: set to Rusheel Chandra Reddy
Status: newassigned

comment:3 by Simon Charette, 3 weeks ago

Before making any changes here we should consider the context from #25517, #29582, #30385, #34955.

if any involved expression have nullable output (which I understand can be hard to check)

This is particularly relevant as the ORM doesn't do a good job at differentiaing nullable expression, we can't use expr.output_field.null as we don't update this flag transitively when performing outer joins. That's the main reason why we systematically Coalesce each expression.

Implementing a reliable Expression.is_nullable property is not an easy task so we'd need a better rationale than the provided generated SQL readability concerns to accept this ticket IMO.

Last edited 3 weeks ago by Simon Charette (previous) (diff)

comment:4 by David, 3 weeks ago

Hi Simon, I understand that there are other things which relies on COALESCE to work, expecially in cross-database context where things can break quite easy.

My proposal is to start by addressing the case in which it is involved a Value, where it is simple to detect if it is/not NULL, thus avoiding the Coalesce(Value("a"), Value("")) path.

Yet I realize that this may be a breaking change for people who relies con Concat in their index/constraints and could be affected, however I belive that it can be an improvement.

comment:5 by Rusheel Chandra Reddy, 3 weeks ago

Has patch: set
Resolution: fixed
Status: assignedclosed

the link for the submitted pr is as follows:
https://github.com/django/django/pull/20567

comment:6 by Jacob Walls, 3 weeks ago

Resolution: fixed
Status: closednew
Version: 5.2dev

We close tickets when merging fixes. Also, submitting a PR is premature at this point as the triage is tending against acceptance so far.

comment:7 by Rusheel Chandra Reddy, 3 weeks ago

Thank you for the clarification regarding the workflow; I will leave the ticket status open in the future.Regarding the acceptance: I understood this to be a valid optimization/issue .Could you clarify the concerns regarding acceptance? I am happy to close the PR if we decide this change isn't beneficial for the project, but I'd love to understand the reasoning first.

comment:8 by Jacob Walls, 13 days ago

Resolution: wontfix
Status: newclosed

I'm concerned the additional readability isn't worth the review cycles it would entail. For instance, the provided patch introduces drift against the implementation of SearchVector.as_sql.

Note: See TracTickets for help on using tickets.
Back to Top