Opened 7 years ago

Closed 5 years ago

Last modified 5 years ago

#28643 closed New feature (fixed)

Complete the ORM Function Library

Reported by: Matthew Pava Owned by: Nick Pope
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: josh.smeaton@…, Mariusz Felisiak, Shai Berger, Adam Johnson, Thomas Lagae Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Mariusz Felisiak)

I was surprised to learn that we didn't have a StrIndex function until version 2, and yet we had Substr since at least version 1.8. I wonder how users were using Substr without also finding a use for StrIndex this whole time. Anyway, since we seem to be adding these functions one at a time, why don't we work on trying to get the built-ins implemented in one sweep instead?

We may even want to split the documentation page (https://docs.djangoproject.com/en/dev/ref/models/database-functions/) into further categories with String functions and Numeric functions.

This is just a sample checklist, with corresponding attributes to which backend has them available.

Comparison
DjangoSQLDescriptionPostgreSQLOracleMySQLSQLitePR
NullIfNULLIFReturns NULL if the first argument equals the second.‎✔‎✔‎✔‎✔9543
Math
DjangoSQLDescriptionPostgreSQLOracleMySQLSQLitePR
AbsABSReturns the absolute value.‎✔‎✔‎✔‎✔9622
ACosACOSReturns the arccosine.‎✔‎✔‎✔19622
ASinASINReturns the arcsine.‎✔‎✔‎✔19622
ATanATANReturns the arctangent.‎✔‎✔‎✔19622
ATan2ATAN2Returns the arctangent of the two variables passed to it.‎✔‎✔‎✔19622
CeilCEILINGReturns the smallest integer value that is not less than a numeric expression‎✔CEIL‎✔19622
CosCOSReturns the cosine expressed in radians.‎✔‎✔‎✔19622
CotCOTReturns the cotangent.‎✔‎✔5‎✔19622
DegreesDEGREESReturns a numeric expression converted from radians to degrees.‎✔‎✔‎✔19622
ExpEXPReturns the base of the natural logarithm (e) raised to the power of a numeric expression.‎✔‎✔‎✔19622
FloorFLOORReturns the largest integer value that is not greater than a numeric expression.‎✔‎✔‎✔19622
LnLNReturns the natural logarithm of a numeric expression.‎✔‎✔‎✔19622
LogLOG(B, X)Returns the logarithm of a numeric expression.‎✔‎✔‎✔19622
ModMODReturns the remainder of one expression by diving by another expression.‎✔‎✔‎✔19622
PiPIReturns the value of π‎✔4‎✔19622
PowerPOWERReturns the value of one expression raised to the power of another expression‎✔‎✔‎✔19622
RadiansRADIANSReturns the value of an expression converted from degrees to radians.‎✔‎✔‎✔19622
RoundROUNDReturns a numeric expression rounded to an integer.
Can be used to round an expression to a number of decimal points
‎✔‎✔‎✔‎✔9622
SinSINReturns the sine given in radians.‎✔‎✔‎✔19622
SqrtSQRTReturns the square root.‎✔‎✔‎✔19622
TanTANReturns the tangent expressed in radians.‎✔‎✔‎✔19622
Text
DjangoSQLDescriptionPostgreSQLOracleMySQLSQLitePR
OrdASCIIReturns numeric value of left-most character. (Equivalent to ord() in Python.)‎✔‎✔‎✔UNICODE9583
ChrCHRCharacter with the given code‎✔‎✔CHARCHAR9583
LeftLEFTReturns the leftmost number of characters as specified‎✔2, 6‎✔29583
LPadLPADReturns the string argument, left-padded with the specified string‎✔‎✔‎✔19798
LTrimLTRIMRemoves leading spaces‎✔‎✔‎✔‎✔9220
MD5MD5Calculates the MD5 hash of string, returning the result in hexadecimal‎✔8111004
RepeatREPEATRepeats a string the specified number of times‎✔3‎✔19808
ReplaceREPLACEReplaces occurrences of a specified string‎✔‎✔6‎✔‎✔9539
ReverseREVERSEReverse the characters in a string‎✔6, 7‎✔110827
RightRIGHTReturns the specified rightmost number of characters‎✔2‎✔29583
RPadRPADAppends string the specified number of times‎✔‎✔619798
RTrimRTRIMRemoves trailing spaces‎✔‎✔‎✔‎✔9220
TrimTRIMRemoves leading and trailing spaces‎✔‎✔‎✔‎✔9220
  • 1 Function can be easily supported on SQLite with a user defined function.
  • 2 Can be emulated by using SUBSTR.
  • 3 Can be emulated by using LENGTH and RPAD.
  • 4 Can be emulated by directly substituting constant math.pi.
  • 5 Can be emulated by using 1 / TAN(X).
  • 6 Doesn't work properly with multibyte characters sets on Oracle.
  • 7 The REVERSE function is undocumented on Oracle.
  • 8 Can be supported by using STANDARD_HASH on Oracle.

Change History (63)

comment:1 by Josh Smeaton, 7 years ago

I had hoped that the 3rd party community would take care of providing "Function Packs" for each of the backends. But functions are fairly easy to create in your own project if you need them so I guess few people have bothered to group them into a distributable package.

I would prefer that the django ecosystem had a standard group of function expressions that can be used across backends, rather than a bunch of similar but not quite the same implementations. Then 3rd party libraries can depend on the known good versions and everyone is better off. Since the community hasn't seemed to provide such a library (that I'm aware of), I think it's fine for Django to do so.

Splitting the module up into string/numeric/date types sounds fine. Let's take inspiration from postgres and other db vendor docs so navigating them is familiar. Let's begin with functions that have standard support across our 4 backends, then look into what's left over. If we can approximate support by combining other functions we can do that. Otherwise if a particular function only has support for one or two backends we can consider ignoring it or implementing in a contrib module.

comment:2 by Josh Smeaton, 7 years ago

Cc: josh.smeaton@… added
Triage Stage: UnreviewedAccepted

comment:3 by Mariusz Felisiak, 7 years ago

Cc: Mariusz Felisiak added

comment:4 by JunyiJ, 6 years ago

Owner: changed from nobody to JunyiJ
Status: newassigned

comment:5 by Josh Smeaton, 6 years ago

For whoever wants to begin here, I think a good plan of attack will be a PR per function or per a small set of functions. Then the commit message would be something like "Refs #28643 -- Added X, Y, Z functions".

That way we can get through smaller pieces at a time, without such a large burden being placed on any one individual. We could also get some newer contributors to help in this way. Each function on its own should be a relatively easy thing to implement. A higher level task of re-organising the layout into functions/string functions/number may be helpful to get the ticket rolling.

comment:6 by Matthew Pava, 6 years ago

I did find this third-party utility that handles specific PostgreSQL functions. I wonder if there are other backends that they could be ported to.
https://github.com/hypertrack/django-pg-utils

And I also wonder how all of this connects with specific PostgreSQL aggregate functions already builtin to Django:
https://docs.djangoproject.com/en/1.11/ref/contrib/postgres/aggregates/

in reply to:  6 comment:7 by Mads Jensen, 6 years ago

Replying to Matthew Pava:

I did find this third-party utility that handles specific PostgreSQL functions. I wonder if there are other backends that they could be ported to.
https://github.com/hypertrack/django-pg-utils

From a quick look, these things are already supported by Django.

comment:8 by Mariusz Felisiak, 6 years ago

Description: modified (diff)

comment:9 by Mariusz Felisiak, 6 years ago

I prepared patch to reorganize database functions docs and code (PR). I think we should organize code and doc as follows (bolded functions doesn't exist):

  • docs/ref/models/database-functions.txt section Comparison and conversion functions (django/db/models/functions/comp_conv.py):
    • Cast, Coalesce, Greatest, Least;
  • docs/ref/models/database-functions.txt section Date Functions (django/db/models/functions/datetime.py):
    • Extract, ExtractDay, ExtractHour, ExtractMinute, ExtractMonth, ExtractQuarter, ExtractSecond, ExtractWeek, ExtractWeekDay, ExtractYear, Now, Trunc, TruncDate, TruncDay, TruncHour, TruncMinute, TruncMonth, TruncQuarter, TruncSecond, TruncTime, TruncYear;';
  • docs/ref/models/database-functions.txt section Math Functions (django/db/models/functions/math.py):
    • Abs, Acos, Asin, Atan, Atan2, Ceil, Cos, Cot, Exp, Floor, Log, Mod, Power, Round, Sin, Sqrt, Tan;
  • docs/ref/models/database-functions.txt section Text Functions (django/db/models/functions/text.py):
    • Ascii, Chr, Concat, ConcatPair, Length, Lpad, Lower, Ltrim, Replace, Rpad, Rtrim, StrIndex, Substr, Trim, Upper;
  • docs/ref/models/database-functions.txt section Window Functions (django/db/models/functions/window.py):
    • CumeDist, DenseRank, FirstValue, Lag, LastValue, Lead, NthValue, Ntile, PercentRank, Rank, RowNumber.

Headers and functions in docs and code should be organized alphabetically. Thanks Tim Graham for suggestions.

I'm not convince that following functions should be implemented because there not supported on all databases:

  • LEFT(), MD5(), REPEAT(), REVERSE(), RIGHT(), DEGREES(), PI(), RADIANS().
Version 2, edited 6 years ago by Mariusz Felisiak (previous) (next) (diff)

comment:10 by Tim Graham, 6 years ago

The "Miscellaneous" group is sort of bothering me.

Cast, Coalesce, Greatest, Least sound like candidates for a section called something like "Comparison and conversion functions"

Ascii and Chr sound like they could be "text" related.

Now sounds like a candidate for "Date Functions".

comment:11 by Mariusz Felisiak, 6 years ago

Agreed. I updated above comment and both PRs.

comment:12 by Matthew Pava, 6 years ago

In regards to Left() and Right(), they are available in PostgreSQL and MySQL, but not in SQLite or Oracle. However, you can emulate their functionality using the corresponding Substr function in the database backend. I would hate to see us leave those out for that reason alone. At the same time, developers could use the Substr across all databases instead of using Left and Right.

I was also wondering about the use of Trim. Python doesn't use that term; instead, it uses strip. I wonder if that really matters. Some developers may look for a strip functionality in Django ORM not realizing that the databases use the term Trim. Perhaps that could be clarified in the documentation.

in reply to:  12 comment:13 by Shai Berger, 6 years ago

Cc: Shai Berger added

Replying to Matthew Pava:

I was also wondering about the use of Trim. Python doesn't use that term; instead, it uses strip. I wonder if that really matters. Some developers may look for a strip functionality in Django ORM not realizing that the databases use the term Trim. Perhaps that could be clarified in the documentation.

Unless there is some other use for the term strip in the databases, I'd consider "documenting" this by giving an alias:

class Trim(Transform):
    # ...

Strip = Trim  # Give Trim a more Pythonic name

comment:14 by GitHub <noreply@…>, 6 years ago

In ad8036d7:

Refs #28643 -- Reorganized database functions docs.

Thanks Tim Graham for the review.

comment:15 by Mariusz Felisiak <felisiak.mariusz@…>, 6 years ago

In 8b42a18b:

[2.0.x] Refs #28643 -- Reorganized database functions docs.

Thanks Tim Graham for the review.

Backport of ad8036d715d4447b95d485332511b4edb1a40c0e from master

comment:16 by GitHub <noreply@…>, 6 years ago

In 4f27e475:

Refs #28643 -- Reorganized database functions.

Thanks Tim Graham for the review.

comment:17 by Mariusz Felisiak, 6 years ago

PR - Ltrim, Rtrim, and Trim with aliases Lstrip, Rstrip, and Strip, respectively.

Last edited 6 years ago by Mariusz Felisiak (previous) (diff)

comment:18 by Matthew Pava, 6 years ago

And, of course, there is a Python lstrip and rstrip as well.

Lstrip = Ltrim  # Give Ltrim a more Pythonic name
Rstrip = Rtrim  # Give Rtrim a more Pythonic name

in reply to:  18 comment:19 by Mariusz Felisiak, 6 years ago

Replying to Matthew Pava:

And, of course, there is a Python lstrip and rstrip as well.

Lstrip = Ltrim  # Give Ltrim a more Pythonic name
Rstrip = Rtrim  # Give Rtrim a more Pythonic name

I added these aliases.

comment:20 by Mariusz Felisiak, 6 years ago

Has patch: set

comment:21 by Adam Johnson, 6 years ago

Cc: Adam Johnson added

comment:22 by Matthew Pava, 6 years ago

Description: modified (diff)

I formatted the list into a table so we could see better what we can do to address this ticket. It looks like SQLite is missing much functionality that the other backends support, especially in the math department.

I also suggest implementing Left and Right, but for the unsupported backends to use the corresponding Substr function.

comment:24 by Nick Pope, 6 years ago

Description: modified (diff)

comment:25 by Mads Jensen, 6 years ago

Description: modified (diff)

comment:26 by Mads Jensen, 6 years ago

Description: modified (diff)

Added link to NullIf PR.

comment:27 by Thomas Lagae, 6 years ago

Cc: Thomas Lagae added
Description: modified (diff)

Add link to Abs PR.

I am a first time contributor so any feedback is much appreciated.

comment:28 by Matthew Pava, 6 years ago

Description: modified (diff)

comment:29 by Thomas Lagae, 6 years ago

When passing a non-numeric value into the Abs function, the behaviour differs between databases. I think this will also be the case for other math functions. Should we catch non-numeric values and throw an error or just document that this behaviour is undefined and differs between databases?

comment:30 by Matthew Pava, 6 years ago

I added a link to the Ord, Chr, Left, and Right PR.

I just started thinking more about the Left and Right aspects, and it occurred to me that a more pythonic solution would be the implementation of slicing instead of having these separate database functions. This would also change the implementation of Substr since that would no longer be necessary either.

comment:31 by Tim Graham <timograham@…>, 6 years ago

In 65728550:

Refs #28643 -- Added Replace database function.

comment:32 by Nick Pope, 6 years ago

Description: modified (diff)

comment:33 by Tim Graham <timograham@…>, 6 years ago

In f82de6bf:

Refs #28643 -- Added Ord, Chr, Left, and Right database functions.

comment:34 by GitHub <noreply@…>, 6 years ago

In 9421aee:

Refs #28643 -- Added LTrim, RTrim, and Trim database functions.

Thanks Tim Graham and Mads Jensen for reviews.

comment:35 by Mariusz Felisiak, 6 years ago

Description: modified (diff)

comment:36 by GitHub <noreply@…>, 6 years ago

In cede5111:

Refs #28643 -- Added LPad and RPad database functions.

Thanks Tim Graham for the review.

comment:37 by Mariusz Felisiak, 6 years ago

Description: modified (diff)

comment:38 by Nick Pope, 6 years ago

Description: modified (diff)

comment:39 by Mariusz Felisiak, 6 years ago

Description: modified (diff)

comment:40 by GitHub <noreply@…>, 6 years ago

In 55cc269:

Refs #28643 -- Added Repeat database function.

Thanks Tim Graham and Nick Pope for reviews.

comment:41 by Mariusz Felisiak, 6 years ago

Description: modified (diff)

comment:42 by Tim Graham, 6 years ago

Patch needs improvement: set

comment:43 by Tim Graham <timograham@…>, 6 years ago

In a0b19a0:

Refs #28643 -- Added math database functions.

Thanks Nick Pope for much review.

comment:44 by Nick Pope, 6 years ago

Description: modified (diff)

comment:45 by Tim Graham <timograham@…>, 5 years ago

In c3bbf1fd:

Refs #28643 -- Skipped ATan2() workaround on SpatiaLite 5+.

comment:46 by Nick Pope, 5 years ago

Owner: changed from JunyiJ to Nick Pope
Patch needs improvement: unset

I have added a new PR that cleans up some of the compatibility mixins added as part of the math function work so that they can be reused.
It also makes use of these mixins for aggregates defined in django.db.models.aggregates.

comment:47 by Nick Pope, 5 years ago

Description: modified (diff)

Added a PR implementing Reverse().
Note that REVERSE exists for Oracle but is undocumented and works on bytes, not multi-byte strings. (Support table updated in ticket description.)

comment:48 by Nick Pope, 5 years ago

Description: modified (diff)

comment:49 by Tim Graham <timograham@…>, 5 years ago

In 4b9d7221:

Refs #28643 -- Added NullIf database function.

Thanks Nick Pope, Mariusz Felisiak, and Tim Graham for reviews.

comment:50 by Tim Graham, 5 years ago

Description: modified (diff)

comment:51 by Tim Graham <timograham@…>, 5 years ago

In abf8e390:

Refs #28643 -- Added Reverse database function.

Thanks Mariusz Felisiak for Oracle advice and review.

comment:52 by Tim Graham <timograham@…>, 5 years ago

In 3d5e0f83:

Refs #28643 -- Moved db function mixins to a separate module.

comment:53 by Tim Graham <timograham@…>, 5 years ago

In c690afb8:

Refs #28643 -- Changed Avg() to use NumericOutputFieldMixin.

Keeps precision instead of forcing DecimalField to FloatField.

comment:54 by Tim Graham <timograham@…>, 5 years ago

In e85afa59:

Refs #28643 -- Changed StdDev() to use NumericOutputFieldMixin.

Keeps precision instead of forcing DecimalField to FloatField.

comment:55 by Tim Graham <timograham@…>, 5 years ago

In 6d4efa8e:

Refs #28643 -- Changed Variance() to use NumericOutputFieldMixin.

Keeps precision instead of forcing DecimalField to FloatField.

comment:56 by Tim Graham <timograham@…>, 5 years ago

In 846624ed:

Refs #28643 -- Extracted DurationField logic for Avg() and Sum() into mixin.

Also addresses Sum() not handling the filter option correctly.

comment:57 by Tim Graham, 5 years ago

Has patch: unset

comment:58 by Mariusz Felisiak, 5 years ago

Description: modified (diff)

comment:59 by Mariusz Felisiak, 5 years ago

Has patch: set

PR implementing MD5().

Last edited 5 years ago by Mariusz Felisiak (previous) (diff)

comment:60 by Nick Pope, 5 years ago

PR implementing SHA1(), SHA224(), SHA256(), SHA384() and SHA512(). Moved to #30240.

Last edited 5 years ago by Nick Pope (previous) (diff)

comment:61 by Mariusz Felisiak, 5 years ago

Description: modified (diff)

comment:62 by GitHub <noreply@…>, 5 years ago

In 9ff18c08:

Refs #28643 -- Added MD5 database function.

Thanks Tim Graham, Nick Pope and Simon Charette for reviews.

comment:63 by Mariusz Felisiak, 5 years ago

Description: modified (diff)
Resolution: fixed
Status: assignedclosed

Finally 🎉 Thanks all

Note: See TracTickets for help on using tickets.
Back to Top