Opened 108 minutes ago

Last modified 41 minutes ago

#36875 assigned Cleanup/optimization

Avoid unnecessary Coalesce in Concat/ConcatPair

Reported by: David Owned by: Rusheel Chandra Reddy
Component: Database layer (models, ORM) Version: 5.2
Severity: Normal Keywords: concat, coalese
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by David)

Currently ConcatPair (which is used by Concat) when invoking the .coalesce method it will force the COALESCE on every argument provided to the expression.

https://github.com/django/django/blob/3851601b2e080df34fb9227fe5d2fd43af604263/django/db/models/functions/text.py#L112-L122

This does not take into account:

  • if any involved expression have nullable output (which I understand can be hard to check)
  • if any involved expression is a fixed value (which should easier to detect)

The case in which I am more interested is the latter, because it is completely useless to invoke COALESCE on a fixed value (unless it was provided NULL from the beginning, which should be easy to check).

I came along this while writing a query to detect custom permissions on the database:

from django.contrib.auth.models import Permission
from django.db.models import Value, F
from django.db.models.functions import Concat

custom_permissions = Permission.objects.exclude(
    codename__regex=Concat(
        Value("^(add|change|delete|view)_"),
        F("content_type__model"),
        Value("$"),
    )
)

I was expecting a WHERE clause like:

"auth_permission"."codename"::text ~ ('^(add|change|delete|view)_' || "django_content_type"."model" || '$')

And I was stunned when I saw the following (which is hard to read):

"auth_permission"."codename"::text ~ (COALESCE('^(add|change|delete|view)_', '') || COALESCE((COALESCE("django_content_type"."model", '') ||  COALESCE('$', '')), ''))

My proposal is to change how ConcatPair.coalesce behaves to detect if COALESCE can be skipped (ie: fixed value, already enforced coalesce on previous expression, etc) before putting that function in the output.
This could improve readability for generated queries and may reduce the overhad of building the Coalesce expression.

Change History (4)

comment:1 by David, 108 minutes ago

Description: modified (diff)

comment:2 by Rusheel Chandra Reddy, 104 minutes ago

Owner: set to Rusheel Chandra Reddy
Status: newassigned

comment:3 by Simon Charette, 59 minutes ago

Before making any changes here we should consider the context from #25517, #29582, #30385, #34955.

if any involved expression have nullable output (which I understand can be hard to check)

This is particularly relevant as the ORM doesn't do a good job at differentiaing nullable expression, we can't use expr.output_field.null as we don't update this flag transitively when performing outer joins. That's the main reason why we systematically Coalesce each expression.

Implementing a reliable Expression.is_nullable property is not an easy task so we'd need a better rationale than the provided generated SQL readability concerns to accept this ticket IMO.

Last edited 54 minutes ago by Simon Charette (previous) (diff)

comment:4 by David, 41 minutes ago

Hi Simon, I understand that there are other things which relies on COALESCE to work, expecially in cross-database context where things can break quite easy.

My proposal is to start by addressing the case in which it is involved a Value, where it is simple to detect if it is/not NULL, thus avoiding the Coalesce(Value("a"), Value("")) path.

Yet I realize that this may be a breaking change for people who relies con Concat in their index/constraints and could be affected, however I belive that it can be an improvement.

Note: See TracTickets for help on using tickets.
Back to Top