Opened 2 years ago

Closed 2 years ago

Last modified 17 months ago

#33397 closed Bug (fixed)

Arithmetic operations on DateField/DateTimeField/DurationField expressions are buggy.

Reported by: Luke Plant Owned by: Luke Plant
Component: Database layer (models, ORM) Version: 4.0
Severity: Normal Keywords:
Cc: Simon Charette, Egor R Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

There are two main sets of bugs here:

  1. Nonsensical operations involving DateField or DateTimeField expressions (such as adding two dates) do not raise the expected FieldError exceptions. They usually raise exceptions later that vary depending on the database backend.
  1. Well-defined operations, which work in SQL, such as ’date + duration’, require using ExpressionWrapper(output_field=…) when this could be inferred.

Although we could technically split this into two bugs, I’m filing as one since the two parts are closely related and fixing part 2 (which is the real reason I’m
here) will require some changes that impinge on part 1.

Part 1

Test case

# tests/experiments/tests.py

class FTimeDeltaTests(TestCase):
    def test_nonsensical_date_operations(self):
        queryset = Experiment.objects.annotate(nonsense=F('name') + F('assigned'))
        with self.assertRaises(FieldError):
            list(queryset)

        queryset = Experiment.objects.annotate(nonsense=F('assigned') + F('completed'))
        with self.assertRaises(FieldError):
            list(queryset)

The first part works as expected (it makes no sense to add a string to a date, and a FieldError is raised), but the second doesn’t.

Expected behaviour: FieldError should be raised

Actual behaviour:

With Postgres:

  File "/home/luke/devel/django/main/django/db/models/query.py", line 51, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/home/luke/devel/django/main/django/db/models/sql/compiler.py", line 1211, in execute_sql
    cursor.execute(sql, params)
  File "/home/luke/devel/django/main/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/luke/devel/django/main/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/luke/devel/django/main/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
  File "/home/luke/devel/django/main/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/home/luke/devel/django/main/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UndefinedFunction: operator does not exist: date + date
LINE 1: ...t"."scalar", ("expressions_ExPeRiMeNt"."assigned" + "express...
                                                             ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

With SQLite:

  File "/home/luke/devel/django/main/django/db/models/query.py", line 68, in __iter__
    for row in compiler.results_iter(results):
  File "/home/luke/devel/django/main/django/db/models/sql/compiler.py", line 1158, in apply_converters
    value = converter(value, expression, connection)
  File "/home/luke/devel/django/main/django/db/backends/sqlite3/operations.py", line 305, in convert_datefield_value
    value = parse_date(value)
  File "/home/luke/devel/django/main/django/utils/dateparse.py", line 76, in parse_date
    return datetime.date.fromisoformat(value)
TypeError: fromisoformat: argument must be str

I have not tested on other databases.

Additional notes

There is a related bug in some context. For example, in contrast to the above test case Experiment.objects.filter(name=F('name') + F('assigned')) does not raise FieldError, despite the attempt to add a date to a string. Instead you get backend dependent results - SQLite silently does some type coercion, Postgres fails with UndefinedFunction. Tackling this may need to be done separately - there are different code paths involved when using QuerySet.annotate() compared to QuerySet.filter()

Part 2

Test case:

# tests/experiments/tests.py

class FTimeDeltaTests(TestCase):
    def test_datetime_and_duration_field_addition_without_output_field(self):
        test_set = Experiment.objects.annotate(estimated_end=F('start') + F('estimated_time'))
        self.assertEqual(
            [e.estimated_end for e in test_set],
            [e.start + e.estimated_time for e in test_set]
        )

Expected behaviour: Django should infer the output type, like it does for other expressions such as integer field addition.

Actual behaviour: Django raises django.core.exceptions.FieldError: Expression contains mixed types: DateTimeField, DurationField. You must set output_field.

Additional motivation

If we have this code:

Experiment.objects.filter(end__gt=F('start') + F('estimated_time'))

we should be able to refactor to:

Experiment.objects.alias(estimated_end=F('start') + F('estimated_time')).filter(end__gt=F('estimated_end'))

But the latter fails with FieldError. The former succeeds because in that context the ORM doesn't need to do any type inference.

Notes

  • Above tests have been run against main branch.
  • There are a bunch of other cases, like "date multiplied by date" etc. that don't work as expected.

Change History (13)

comment:1 by Luke Plant, 2 years ago

For anyone else looking at this:

The underlying behaviour of BaseExpression._resolve_output_field is an enormous can of worms, especially once you dig into database differences, tickets #26650, #31506 and preserving compatibility with existing behaviour.

comment:2 by Mariusz Felisiak, 2 years ago

Cc: Simon Charette added

comment:3 by Egor R, 2 years ago

Cc: Egor R added

comment:4 by Carlton Gibson, 2 years ago

Triage Stage: UnreviewedAccepted

Hi Luke — thanks for the ticket. Sounds good. I'm going to Accept, given that Simon's initial review is generally positive.

comment:5 by Jacob Walls, 2 years ago

Has patch: set

comment:6 by Mariusz Felisiak, 2 years ago

#33464 was a duplicate for DecimalField and MOD.

comment:7 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

In 04ad0f26:

Refs #33397 -- Added extra tests for resolving an output_field of CombinedExpression.

comment:8 by Mariusz Felisiak, 2 years ago

Triage Stage: AcceptedReady for checkin

comment:9 by Mariusz Felisiak, 2 years ago

Note that currently output_field has limited scope in what it affects:

  • it does not cause a database level CAST to be done (see #26650)
  • it controls conversion to Python data types, but that doesn't necessarily include validation, so some invalid values for the field may be returned.

Correct behavior for NULL was difficult to decide, due to different database handling of NULL. The current approach is based on lowest common denominator behavior i.e. if one of the supported databases is raising an error (rather than return NULL) for val <op> NULL, then Django raises FieldError.

See also Luke's description for more details.

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

In 1efea118:

Refs #33397 -- Added register_combinable_fields().

comment:11 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

Resolution: fixed
Status: assignedclosed

In 40b8a617:

Fixed #33397 -- Corrected resolving output_field for DateField/DateTimeField/TimeField/DurationFields.

This includes refactoring of CombinedExpression._resolve_output_field()
so it no longer uses the behavior inherited from Expression of guessing
same output type if argument types match, and instead we explicitly
define the output type of all supported operations.

This also makes nonsensical operations involving dates
(e.g. date + date) raise a FieldError, and adds support for
automatically inferring output_field for cases such as:

  • date - date
  • date + duration
  • date - duration
  • time + duration
  • time - time

comment:12 by GitHub <noreply@…>, 17 months ago

In e8dcef15:

Refs #33397, Refs #34160 -- Added release note for resolving output_field changes.

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 17 months ago

In 58156f4e:

[4.1.x] Refs #33397, Refs #34160 -- Added release note for resolving output_field changes.

Backport of e8dcef155c1848ef49e54f787a7d20faf3bf9296 from main

Note: See TracTickets for help on using tickets.
Back to Top