Opened 5 hours ago

Closed 19 minutes ago

#36875 closed Cleanup/optimization (fixed)

Avoid unnecessary Coalesce in Concat/ConcatPair

Reported by: David Owned by: Rusheel Chandra Reddy
Component: Database layer (models, ORM) Version: 5.2
Severity: Normal Keywords: concat, coalese
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by David)

Currently ConcatPair (which is used by Concat) when invoking the .coalesce method it will force the COALESCE on every argument provided to the expression.

https://github.com/django/django/blob/3851601b2e080df34fb9227fe5d2fd43af604263/django/db/models/functions/text.py#L112-L122

This does not take into account:

  • if any involved expression have nullable output (which I understand can be hard to check)
  • if any involved expression is a fixed value (which should easier to detect)

The case in which I am more interested is the latter, because it is completely useless to invoke COALESCE on a fixed value (unless it was provided NULL from the beginning, which should be easy to check).

I came along this while writing a query to detect custom permissions on the database:

from django.contrib.auth.models import Permission
from django.db.models import Value, F
from django.db.models.functions import Concat

custom_permissions = Permission.objects.exclude(
    codename__regex=Concat(
        Value("^(add|change|delete|view)_"),
        F("content_type__model"),
        Value("$"),
    )
)

I was expecting a WHERE clause like:

"auth_permission"."codename"::text ~ ('^(add|change|delete|view)_' || "django_content_type"."model" || '$')

And I was stunned when I saw the following (which is hard to read):

"auth_permission"."codename"::text ~ (COALESCE('^(add|change|delete|view)_', '') || COALESCE((COALESCE("django_content_type"."model", '') ||  COALESCE('$', '')), ''))

My proposal is to change how ConcatPair.coalesce behaves to detect if COALESCE can be skipped (ie: fixed value, already enforced coalesce on previous expression, etc) before putting that function in the output.
This could improve readability for generated queries and may reduce the overhad of building the Coalesce expression.

Change History (5)

comment:1 by David, 5 hours ago

Description: modified (diff)

comment:2 by Rusheel Chandra Reddy, 5 hours ago

Owner: set to Rusheel Chandra Reddy
Status: newassigned

comment:3 by Simon Charette, 5 hours ago

Before making any changes here we should consider the context from #25517, #29582, #30385, #34955.

if any involved expression have nullable output (which I understand can be hard to check)

This is particularly relevant as the ORM doesn't do a good job at differentiaing nullable expression, we can't use expr.output_field.null as we don't update this flag transitively when performing outer joins. That's the main reason why we systematically Coalesce each expression.

Implementing a reliable Expression.is_nullable property is not an easy task so we'd need a better rationale than the provided generated SQL readability concerns to accept this ticket IMO.

Last edited 5 hours ago by Simon Charette (previous) (diff)

comment:4 by David, 4 hours ago

Hi Simon, I understand that there are other things which relies on COALESCE to work, expecially in cross-database context where things can break quite easy.

My proposal is to start by addressing the case in which it is involved a Value, where it is simple to detect if it is/not NULL, thus avoiding the Coalesce(Value("a"), Value("")) path.

Yet I realize that this may be a breaking change for people who relies con Concat in their index/constraints and could be affected, however I belive that it can be an improvement.

comment:5 by Rusheel Chandra Reddy, 19 minutes ago

Has patch: set
Resolution: fixed
Status: assignedclosed

the link for the submitted pr is as follows:
https://github.com/django/django/pull/20567

Note: See TracTickets for help on using tickets.
Back to Top