Opened 2 years ago

Last modified 3 months ago

#25937 assigned Bug

Failure when using expressions.DateTime on NULL values and aggregating

Reported by: Thomas Recouvreux Owned by: Cheryl Yang
Component: Database layer (models, ORM) Version: 1.9
Severity: Normal Keywords: aggregate timezone
Cc: cherie@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Simon Charette)

Running query like qs.annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(Count('pk')) fails with <repr(<gill.contrib.sales.models.TransactionRowQuerySet at 0x10cc99320>) failed: ValueError: Database returned an invalid value in QuerySet.datetimes(). Are time zone definitions for your database and pytz installed?>. I noted the failure on MySQL and PostgreSQL, Django 1.8.X and 1.9.X. I did not try on other db backends or other Django versions.

To reproduce the behaviour:

  1. create a model:
    class Potato(models.Model):
        ..
        removed = models.DateTimeField(null=True, blank=True)
    
  1. Insert some of them
    Potato.objects.create(removed=timezone.now())
    Potato.objects.create(removed=None)
    ..
    
  1. Fire the request
    # Failure
    Potato.objects.annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(c=Count('pk'))
    
    # Success
    Potato.objects.filter(removed__isnull=False).annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(c=Count('pk'))
    

Note: I printed the raw sql generated by Django and fired it directly in a db shell without any error, so the problem seems to come from the code Django uses to convert the result from the backend to python code, especially the line with NULL coming from the aggregation:

+---------------------+-----------+
| time                | pk__count |
+---------------------+-----------+
| NULL                |        17 |
| 2015-01-01 00:00:00 |         7 |
..

Change History (10)

comment:1 Changed 2 years ago by Simon Charette

Description: modified (diff)

comment:2 Changed 2 years ago by Simon Charette

Keywords: timezone added; orm removed
Triage Stage: UnreviewedAccepted

It looks like the heuristics to determine whether or not time zone definition are set up correctly conflicts with the use of DateTime expressions on nullable fields.

I wonder if convert_value has enough context to figure out if its lookup is NULLable. If it's the case could adjust the heuristics to only thow a ValueError if its lookup is not nullable and None is returned.

comment:3 Changed 16 months ago by Cheryl Yang

Cc: cherie@… added

Could you please provide the import required for DateTime or a project/test case that we can use to reproduce the error?

Thank you @charettes for the answer. I will look into this further.

Last edited 16 months ago by Cheryl Yang (previous) (diff)

comment:4 Changed 16 months ago by Simon Charette

@cheryllium, the DateTime object referred to here as been refactored in 2a4af0ea43512370764303d35bc5309f8abce666.

You should be able to reproduce using django.db.models.functions.TruncMonth('removed', tzinfo=timezone.UTC()) instead of DateTime.

Last edited 16 months ago by Simon Charette (previous) (diff)

comment:5 Changed 16 months ago by Cheryl Yang

Owner: changed from nobody to Cheryl Yang
Status: newassigned

Started working on tests and fixes here, using the refactored code. Currently working with SQLite.

comment:6 Changed 16 months ago by Cheryl Yang

Right now the remaining issue is with this query and on SQLite only:

with_null_fields = DTModel.objects.annotate(
    extracted=TruncTime('start_time', tzinfo=timezone.UTC())
).values('extracted').annotate(c=Count('pk'))

It generates the following SQL:

SELECT django_datetime_cast_time("db_functions_dtmodel"."start_time", UTC) AS "extracted", COUNT("db_functions_dtmodel"."id") AS "c" FROM "db_functions_dtmodel" GROUP BY django_datetime_cast_time("db_functions_dtmodel"."start_time", UTC)

If you run this test (using my branch linked in the above pull request), you will see an error that only happens with SQLite as the db:

PYTHONPATH=...:$PYTHONPATH ./runtests.py db_functions.test_datetime.DateFunctionWithTimeZoneTests.test_trunc_time_func

This is because the generated query ends up calling django_datetime_cast_time with a time as its first argument (start_time contains only the time data, not a date).

This argument is passed from django_datetime_cast_time to _sqlite_datetime_parse to typecast_timestamp (defined in django/db/backends/utils.py). This last function parses it to a datetime.time object, which is returned.

After the datetime.time is returned, the caller proceeds to pass it to timezone.localtime, which then throws an exception as it expects a datetime.datetime object, not a datetime.time.

I am not sure what the correct behavior should be in this situation. I suppose it doesn't make a lot of sense to adjust a time alone according to a time zone. Should we ignore the timezone in this case?

I hope I have articulated this clearly. Please ping if you have any questions I might be able to answer.

I'll take a look at how this test is handled with the other databases (as the test passes with those), but any guidance would be appreciated as it can be dizzying to dig through these functions :)

comment:7 Changed 16 months ago by Simon Charette

Thanks for your detailed comment.

Using TruncTime on TimeField is something I overlooked when fixing #26348.

Making TruncTime.as_sql a noop when self.lhs.output_field is an instance of TimeField should do.

comment:8 Changed 16 months ago by Simon Charette

Has patch: set

comment:9 Changed 15 months ago by Tim Graham

Patch needs improvement: set

Comments for improvement on the PR.

comment:10 Changed 3 months ago by Aaron McMillin

Here's a possible workaround for the failing case:

Potato.objects.annotate(
    time=Coalesce(
        DateTime('removed', 'month', timezone.UTC()),
        Value(datetime.min),
    ).values('time').annotate(c=Count('pk'))

This replaces the NULL times with an easy to spot sentinel. if you were already using datetime.min, you'll have to come up with something else.

Note: See TracTickets for help on using tickets.
Back to Top