Opened 2 months ago

Last modified 7 hours ago

#31639 assigned Bug

F objects do not error when referencing transforms.

Reported by: Baptiste Mispelon Owned by: Srinivas Reddy Thatiparthy
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Baptiste Mispelon Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider two models Client and Bill defined like so:

from django.db import models


class Client(models.Model):
    name = models.CharField(max_length=50)


class Bill(models.Model):
    client = models.ForeignKey('Client', on_delete=models.CASCADE, related_name='bills')
    issued_on = models.DateField()

I was trying to get a Bill queryset where each bill is annotated with the year of the previous bill (same client) and came up with the following code which gave me incorrect results:

from django.db.models import F, Max, Q
from django.db.models.functions import ExtractYear


filter_Q = Q(client__bills__issued_on__year__lt=F('issued_on__year'))
annotation = Max(ExtractYear('client__bills__issued_on'), filter=filter_Q)

# Returns wrong annotations (as if the filter=... was not there)
Bill.objects.annotate(previous_year=annotation)

However if I use ExtractYear instead of the __year lookup then I get the correct results:

filter_Q = Q(client__bills__issued_on__year__lt=ExtractYear('issued_on'))

I would assume that F('issued_on__year') should be strictly equivalent to ExtractYear('issued_on') but somehow it's not the case here. I believe that's a bug.

Here's a small testcase that summarizes the issue:

from django.db.models import F, Max, Q
from django.db.models.functions import ExtractYear
from django.test import TestCase

from .models import Bill, Client


class ReproductionTestCase(TestCase):
    @classmethod
    def setUpTestData(cls):
        c = Client.objects.create(name="Baptiste")
        c.bills.create(issued_on='2020-01-01')
        c.bills.create(issued_on='2019-01-01')
        c.bills.create(issued_on='2019-07-01')
        c.bills.create(issued_on='2018-01-01')


    def test_extractyear(self):
        filter_Q = Q(client__bills__issued_on__year__lt=ExtractYear('issued_on'))
        annotation = Max(ExtractYear('client__bills__issued_on'), filter=filter_Q)

        queryset = Bill.objects.annotate(previous_year=annotation).order_by('issued_on')
        expected = [None, 2018, 2018, 2019]
        self.assertQuerysetEqual(queryset, expected, transform=lambda bill: bill.previous_year)

    def test_f_object_and_lookup(self):
        filter_Q = Q(client__bills__issued_on__year__lt=F('issued_on__year'))
        annotation = Max(ExtractYear('client__bills__issued_on'), filter=filter_Q)

        queryset = Bill.objects.annotate(previous_year=annotation).order_by('issued_on')
        expected = [None, 2018, 2018, 2019]
        self.assertQuerysetEqual(queryset, expected, transform=lambda bill: bill.previous_year)

Not sure if it's related but while debugging this I came across #29396. Before that change, evaluating the queryset raised an IndexError (both when using ExtractYear and with the __year lookup).

Change History (6)

comment:1 Changed 2 months ago by Simon Charette

Summary: Incorrect result when using __year lookup in aggregation filterF objects do not error when referencing transforms.
Triage Stage: UnreviewedAccepted

I think the underlying issue here is that F objects simply don't support transforms but silently drops them when resolving to Col.

For example, given the models you've provided

Bill.objects.filter(issued_on__year=F('issued_on__year'))

Results in the following on SQLite

SELECT "db_functions_bill"."id", "db_functions_bill"."client_id", "db_functions_bill"."issued_on" FROM "db_functions_bill" WHERE django_date_extract('year', "db_functions_bill"."issued_on") = "db_functions_bill"."issued_on"

Notice how the right hand side of the filter __year transform doesn't result in wrapping what comes after the =.

It's not an issue with filter's compilation, .annotate exhibit the same behaviour

Bill.objects.annotate(year=F('issued_on__year'))
SELECT "db_functions_bill"."id", "db_functions_bill"."client_id", "db_functions_bill"."issued_on", "db_functions_bill"."issued_on" AS "year" FROM "db_functions_bill"

In the end, references to transforms via F is not supported and whether or not it should be discussed on the mailing list. Accepting the ticket on the basis that we should at least error out in this case.

comment:2 Changed 2 months ago by Baptiste Mispelon

I personally don't have strong feelings on whether F objects should accept transforms or not.

Raising an error instead of silently dropping them seems like a good fix to me.

I wonder if this relates to #25534 somehow.

comment:3 Changed 2 months ago by felixxm

Yes it's related. Supporting transforms/lookups in the F() expression will automatically fix #25534 (see BaseExpression._parse_expressions()).

comment:4 Changed 5 weeks ago by Srinivas Reddy Thatiparthy

Owner: changed from nobody to Srinivas Reddy Thatiparthy
Status: newassigned

comment:5 Changed 3 days ago by Carlton Gibson

#31860 was a duplicate using __date.

comment:6 Changed 7 hours ago by Tim Park

Hey guys - Here is an outline of a brute force solution for raising a ValueError if a lookup is passed into F().

Thought Process

  1. We init the F() object with a parameter name
  2. If a lookup value is found in name, raise ValueError
  3. Otherwise, continue instantiating the F() object as normal

So the next step is figuring out how to check if a lookup value is contained in name.

  1. Lookup expressions follow the format of <fieldname>__<lookup>
  2. First, I wanted to raise the error if name contained two underscores __ but I realized we use double underscores for referencing foreign keys as shown in our example of client__bills__issued_on.
  3. Per docs linked below, lookups are ALWAYS evaluated last. So if we split the name string on character __, the last portion of that split would be a potential lookup expression. Check if that potential lookup expression exists in Django's actual lookup expressions. I could follow this process to check for transforms within name

Conclusion
So this technically works but it seems hacky because it won't take into account situations where users define their own custom lookups.

Let me know if I am missing something obvious here. Happy to take hints/direction to help explore the right places.

Thanks!

References
Lookup Positioning Docs https://docs.djangoproject.com/en/3.0/howto/custom-lookups/#how-django-determines-the-lookups-and-transforms-which-are-used

List of All Django Field Lookups https://docs.djangoproject.com/en/3.0/ref/models/querysets/#field-lookups

Code for F() https://github.com/django/django/blob/58a336a674a658c1cda6707fe1cacb56aaed3008/django/db/models/expressions.py#L560

class F(Combinable):

    def __init__(self, name):
        lookups = {'exact', 'iexact', 'contains', 'icontains', 'in', 'gt',
                   'gte', 'lt', 'lte', 'startswith', 'istartswith', 'endswith',
                   'iendswith', 'range', 'date', 'year', 'iso_year', 'month', 'day',
                   'week', 'week_day', 'quarter', 'time', 'hour', 'minute', 'second',
                   'isnull', 'regex', 'iregex'}

        if name.split("__")[-1] in lookups:
            raise ValueError("F() objects do not support lookups")

        self.name = name
Note: See TracTickets for help on using tickets.
Back to Top