#29222 closed Bug (wontfix)
Substr on NULL values returns incorrect results with pattern lookups.
| Reported by: | Mariusz Felisiak | Owned by: | Siddharth Panditrao |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | Oracle |
| Cc: | Mariusz Felisiak | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
feb683c4c2c5ecfb61e4cb490c3e357450c0c0e8 revealed an unrelated issue on Oracle. SUBSTR(NULL, x, y) returns NULL on Oracle which can be concatenate with other strings, hence if we use it with pattern lookups then all rows match an query, e.g.
__startswith=Substr(sth, x, y)->LIKE SUBSTR(sth, x, y) || '%'->LIKE '%',__endswith=Substr(sth, x, y)->LIKE '%' || SUBSTR(sth, x, y)->LIKE '%',__contains=Substr(sth, x, y)->LIKE '%' || SUBSTR(sth, x, y) || '%'->LIKE '%%',
which is unexpected.
"Although Oracle treats zero-length character strings as nulls, concatenating a zero-length character string with another operand always results in the other operand, so null can result only from the concatenation of two null strings. However, this may not continue to be true in future versions of Oracle Database. To concatenate an expression that might be null, use the NVL function to explicitly convert the expression to a zero-length string." (see https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Concatenation-Operator.html#GUID-08C10738-706B-4290-B7CD-C279EBC90F7E).
Change History (10)
comment:1 by , 8 years ago
| Cc: | added |
|---|
comment:2 by , 8 years ago
| Description: | modified (diff) |
|---|
comment:3 by , 8 years ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:4 by , 8 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:5 by , 4 years ago
| Description: | modified (diff) |
|---|---|
| Owner: | removed |
| Status: | assigned → new |
comment:6 by , 5 days ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
follow-up: 8 comment:7 by , 4 days ago
I left a note to this effect on the forum but I'm afraid there isn't much we can do here given __startswith and related lookups have historically accepted an empty strings as their right-side-hand to match all rows.
Because '' and NULL are the same thing on Oracle we have to choose; we can't return all rows in the first case and no rows in the second like on all other backends.
Given these lookups are meant to treat with text fields left-hand-sides in which we document that storing `NULL` is an anti-pattern
Avoid using
nullon string-based fields such asCharFieldandTextField. The Django convention is to use an empty string, notNULL, as the “no data” state for string-based fields.
and things have been working this way on Oracle since their inception I'd be more inclined to wont-fix this ticket and accept it as a unfortunate side effect of interprets_empty_strings_as_nulls.
follow-up: 9 comment:8 by , 4 days ago
| Resolution: | → wontfix |
|---|---|
| Status: | assigned → closed |
| Triage Stage: | Accepted → Unreviewed |
Replying to Simon Charette:
I left a note to this effect on the forum but I'm afraid there isn't much we can do here given
__startswithand related lookups have historically accepted an empty strings as their right-side-hand to match all rows.
Because
''andNULLare the same thing on Oracle we have to choose; we can't return all rows in the first case and no rows in the second like on all other backends.
Given these lookups are meant to treat with text fields left-hand-sides in which we document that storing `NULL` is an anti-pattern
Avoid using
nullon string-based fields such asCharFieldandTextField. The Django convention is to use an empty string, notNULL, as the “no data” state for string-based fields.
and things have been working this way on Oracle since their inception I'd be more inclined to wont-fix this ticket and accept it as a unfortunate side effect of
interprets_empty_strings_as_nullsthan change things for an arguably worst default.
Yes, exactly, that's why I didn't fix this for so long.
comment:9 by , 4 days ago
Replying to Mariusz Felisiak:
Replying to Simon Charette:
I left a note to this effect on the forum but I'm afraid there isn't much we can do here given
__startswithand related lookups have historically accepted an empty strings as their right-side-hand to match all rows.
Because
''andNULLare the same thing on Oracle we have to choose; we can't return all rows in the first case and no rows in the second like on all other backends.
Given these lookups are meant to treat with text fields left-hand-sides in which we document that storing `NULL` is an anti-pattern
Avoid using
nullon string-based fields such asCharFieldandTextField. The Django convention is to use an empty string, notNULL, as the “no data” state for string-based fields.
and things have been working this way on Oracle since their inception I'd be more inclined to wont-fix this ticket and accept it as a unfortunate side effect of
interprets_empty_strings_as_nullsthan change things for an arguably worst default.
Yes, exactly, that's why I didn't fix this for so long.
Thanks for explaining that! Makes sense now why fixing this would break the existing empty-string behavior on Oracle. I agree this should be marked as wontfix given the trade-offs involved. Should I update the ticket status myself, or will a maintainer handle that? And would it be helpful if I added a test or docs clarifying this Oracle-specific behavior?
This was my first attempt at contributing to Django, so I appreciate you taking the time to walk through the reasoning.
comment:10 by , 4 days ago
Hello Siddhahrt,
Should I update the ticket status myself, or will a maintainer handle that?
Mariusz has already taken care of that in comment:8 so you should be all set.
And would it be helpful if I added a test or docs clarifying this Oracle-specific behavior?
I think we're already covered with this documentation. The fact it states
When fetching from the database, it is assumed that a
NULLvalue in one of these fields really means the empty string, and the data is silently converted to reflect this assumption.
strengthen my believe we're making the right call here as we consider that NULL means empty string by continuing to match all rows.
This was my first attempt at contributing to Django, so I appreciate you taking the time to walk through the reasoning.
Happy to help and thanks for the triage and giving it a shot.
Triaging issues and spending time understanding them is often more valuable than landing code itself as it strengthen our common understanding of the problem and allow us to build a consistent experience accross the diverse interfaces the ORM provides.
Hi! I'd like to work on this issue. This would be my first contribution to Django, so please let me know if I'm on the right track.
What I've Found
When using Substr with pattern lookups like
__startswithon Oracle, if the Substr returns NULL, it ends up matching every row instead of none.The problem is Oracle-specific:
NULL || '%'becomes just'%'in Oracle (universal match), but stays NULL in other databases (no match).My Plan
I'm thinking of wrapping the expression with
NVL({}, CHR(0))in the Oracle backend's_pattern_ops. This would convert NULL to a null byte character instead of letting it vanish during concatenation.Something like:
Current: "startswith": "{} || '%%'" Fixed: "startswith": "NVL({}, CHR(0)) || '%%'"Testing
I've put together a test case on a branch that reproduces the bug - it passes on Oracle (showing the bug) and fails on other databases (confirming it's Oracle-only).
Does this approach make sense? Happy to submit a PR if this looks good, just want to make sure I'm heading in the right direction before diving deeper.