Opened 3 years ago

Last modified 3 years ago

#23797 new Bug

SQL generated for negated F() expressions is incorrect

Reported by: Suriya Subramanian Owned by: nobody
Component: Database layer (models, ORM) Version: 1.7
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider the following model definition.

from django.db import models

class Rectangle(models.Model):
    length = models.IntegerField(null=True)
    width = models.IntegerField(null=True)

We make the following queries: Q1) Rectangles that are squares. Q2a) Rectangles that are not squares (length != width). Q2b) Rectangles that are not squares (width != length). Queries Q2a and Q2b semantically mean the same. However, the ORM generates different SQL for these two queries.

import django
django.setup()

from django.db.models import F
from testapp.models import Rectangle

print '--- Q1: Get the rectangles that are squares'
print Rectangle.objects.filter(length=F('width')).values('pk').query

print '--- Q2a: Get the rectangles that are not squares'
print Rectangle.objects.exclude(length=F('width')).values('pk').query

print '--- Q2b: Get the rectangles that are not squares'
print Rectangle.objects.exclude(width=F('length')).values('pk').query

The generated SQL is

--- Q1: Get the rectangles that are squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE "testapp_rectangle"."length" = ("testapp_rectangle"."width")
--- Q2a: Get the rectangles that are not squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE NOT ("testapp_rectangle"."length" = ("testapp_rectangle"."width") AND "testapp_rectangle"."length" IS NOT NULL)
--- Q2b: Get the rectangles that are not squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE NOT ("testapp_rectangle"."width" = ("testapp_rectangle"."length") AND "testapp_rectangle"."width" IS NOT NULL)

Note the asymmetry between Q2a and Q2b. These queries will return different results.

Reddit user /u/charettes set up this useful SQL fiddle with the above mentioned schema and test values: http://sqlfiddle.com/#!12/c8283/4 Here's my reddit post on this issue: http://www.reddit.com/r/django/comments/2lxjcc/unintuitive_behavior_of_f_expression_with_exclude/

Change History (8)

comment:1 Changed 3 years ago by Anssi Kääriäinen

Triage Stage: UnreviewedAccepted

Unfortunately this will be messy to fix. It is true that .exclude() should produce the complement of .filter(), and that doesn't happen here.

One possible solution is to use WHERE ("testapp_rectangle"."length" = ("testapp_rectangle"."width")) IS NOT true. This should match both null and false values (the .filter() query could be written as WHERE ("testapp_rectangle"."length" = ("testapp_rectangle"."width")) IS true. I have no idea how well IS NOT true is optimized, and if it is available for all backends.

Another possibility is to also filter F() NULL values, but as said that will be messy.

comment:2 Changed 3 years ago by Anssi Kääriäinen

After a quick test, it seems the WHERE ("testapp_rectangle"."length" = ("testapp_rectangle"."width")) IS NOT true approach will not work that well. At least PostgreSQL wont use indexes for a query select * from foobar where (id > 1 and id < 20) is true, but will use the index for select * from foobar where (id > 1 and id < 20). This tells me PostgreSQL's optimizer will not handle the is true / is not true conditions well.

If somebody is able to provide a somewhat clean solution to this, I am all for fixing this.

comment:3 Changed 3 years ago by Suriya Subramanian

Is it the case that filter() and exclude() should produce the complement of each other? If that is not the case, the obvious queries would be WHERE (length = width) and WHERE NOT (length = width).

If the results should be complimentary, couldn't we use WHERE (length = width) for the filter and WHERE (length = width)) IS NOT true for the exclude. The exclude would be slower, but at least the results would be consistent.

comment:4 Changed 3 years ago by Simon Charette

@akaariai what about adding another IS NOT NULL clause here based on value we should end up with NOT ((length = width) AND (length IS NULL) AND (width IS NULL))?

If the query planer use the index when the generated constraint is NOT ((length = width) AND (length IS NULL)) then it should also be able to use it in this case.

comment:5 Changed 3 years ago by Simon Charette

Cc: Simon Charette added

comment:6 Changed 3 years ago by Anssi Kääriäinen

Yes, .exclude() should be the complement of .filter(). As said in comment:2 the WHERE (length = width)) IS NOT true way is just way too slow at least on PostgreSQL (always a sequential scan according to my tests). Going for the NOT ((length = width) AND (length IS NULL) AND (width IS NULL)) should likely work.

I'd like to do this in the Lookup class. We can't do this directly in as_sql() as we don't have enough context there. Maybe a new method get_extra_restriction(self, is_negated) could work. The method returns an expression (a Lookup instance or WhereNode instance) that will be ANDed to the lookup's main condition. I'll try to find the time to write a small proof of concept.

comment:7 Changed 3 years ago by Anssi Kääriäinen

I have to take back what I said earlier. Unfortunately we have backwards compatibility requirements for the pre-Lookup classes way of writing custom lookups. So, we actually don't necessarily have a lookup class at the point where we want to add the (width IS NULL) condition. So, for now we have to do this directly in build_filter() method.

comment:8 Changed 3 years ago by Shai Berger

Just for the record, the (x=y) is true syntax is not supported on Oracle (or Mysql, IIRC). It's a database, it doesn't really handle logic.

Note: See TracTickets for help on using tickets.
Back to Top