Opened 9 years ago

Last modified 5 years ago

#25946 assigned Bug

Negated clauses' "isnull" added term does not take field transforms into account

Reported by: ris Owned by: Can Sarıgöl
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: transform negated null
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

In https://github.com/django/django/blob/32ef48aa562e6aaee9983f5d0f1c60f02fd555fb/django/db/models/sql/query.py#L1209:

            if (lookup_type != 'isnull' and (
                    self.is_nullable(targets[0]) or
                    self.alias_map[join_list[-1]].join_type == LOUTER)):
                # The condition added here will be SQL like this:
                # NOT (col IS NOT NULL), where the first NOT is added in
                # upper layers of code. The reason for addition is that if col
                # is null, then col != someval will result in SQL "unknown"
                # which isn't the same as in Python. The Python None handling
                # is wanted, and it can be gotten by
                # (col IS NULL OR col != someval)
                #   <=>
                # NOT (col IS NOT NULL AND col = someval).
                lookup_class = targets[0].get_lookup('isnull')
                clause.add(lookup_class(targets[0].get_col(alias, sources[0]), False), AND)

The problem with this is that it always performs the isnull on the base field, ignoring any transforms that may apply.

Consider a simple custom transform on a field, base_field, which transforms it into some_function ( base_field ). Performing a negated exact lookup on this will generate SQL along the lines of:

NOT (some_function ( "some_table"."base_field" ) = 1 AND "some_table"."base_field" IS NOT NULL)

This will not produce the correct results in cases where some_function doesn't pass through nulls as nulls or generates nulls of its own (though it's true that many functions null behaviour is to mirror the source expression null-ness exactly, which is probably why this hasn't been noticed before).

The suggested "correct" SQL is probably

NOT (some_function ( "some_table"."base_field" ) = 1 AND some_function ( "some_table"."base_field" ) IS NOT NULL)

Though I note that this is itself not problem free if we consider cases where some_function may have side effects or may not be STABLE or IMMUTABLE (as postgres would put it), i.e. won't necessarily return the same result when called twice given the same input. I would consider use of COALESCE() instead to remedy this but this may have portability concerns or may be more opaque to the query planner leading to slower queries.

Noticed on 1.8, master's equivalent method doesn't appear to have changed much since then.

Change History (3)

comment:1 by Anssi Kääriäinen, 9 years ago

Triage Stage: UnreviewedAccepted

Yes, we need a fix here. At least we need to mark if transforms/expressions are nullable, and then do the IS NOT NULL check against the transformed value.

I don't think we can go for the coalesce solution, it is likely a lot slower than using the current two-conditions-ANDed approach.

comment:2 by Can Sarıgöl, 5 years ago

Has patch: set
Owner: changed from nobody to Can Sarıgöl
Status: newassigned

comment:3 by Mariusz Felisiak, 5 years ago

Patch needs improvement: set
Note: See TracTickets for help on using tickets.
Back to Top