Context Navigation

← Previous Ticket
Next Ticket →

#25937 new Bug

Failure when using expressions.DateTime on NULL values and aggregating

Reported by:	Thomas Recouvreux	Owned by:
Component:	Database layer (models, ORM)	Version:	1.9
Severity:	Normal	Keywords:	aggregate timezone
Cc:	cherie@…, Sarah Boyce	Triage Stage:	Accepted
Has patch:	yes	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	yes
Easy pickings:	no	UI/UX:	no

Description (last modified by Simon Charette) ¶

Running query like qs.annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(Count('pk')) fails with <repr(<gill.contrib.sales.models.TransactionRowQuerySet at 0x10cc99320>) failed: ValueError: Database returned an invalid value in QuerySet.datetimes(). Are time zone definitions for your database and pytz installed?>. I noted the failure on MySQL and PostgreSQL, Django 1.8.X and 1.9.X. I did not try on other db backends or other Django versions.

To reproduce the behaviour:

create a model:

class Potato(models.Model):
    ..
    removed = models.DateTimeField(null=True, blank=True)

Insert some of them

Potato.objects.create(removed=timezone.now())
Potato.objects.create(removed=None)
..

Fire the request

# Failure
Potato.objects.annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(c=Count('pk'))

# Success
Potato.objects.filter(removed__isnull=False).annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(c=Count('pk'))

Note: I printed the raw sql generated by Django and fired it directly in a db shell without any error, so the problem seems to come from the code Django uses to convert the result from the backend to python code, especially the line with NULL coming from the aggregation:

+---------------------+-----------+
| time                | pk__count |
+---------------------+-----------+
| NULL                |        17 |
| 2015-01-01 00:00:00 |         7 |
..

According to the ticket's flags, the next step(s) to move this issue forward are:

To improve the patch as described in the pull request review comments or on this ticket, then uncheck "Patch needs improvement".
If creating a new pull request, include a link to the pull request in the ticket comment when making that update. The usual format is: [https://github.com/django/django/pull/#### PR].

Change History (12)

comment:1 by Simon Charette, 9 years ago

Description:	modified (diff)

comment:2 by Simon Charette, 9 years ago

Keywords:	timezone added; orm removed
Triage Stage:	Unreviewed → Accepted

It looks like the heuristics to determine whether or not time zone definition are set up correctly conflicts with the use of DateTime expressions on nullable fields.

I wonder if convert_value has enough context to figure out if its lookup is NULLable. If it's the case could adjust the heuristics to only thow a ValueError if its lookup is not nullable and None is returned.

comment:3 by Cheryl Yang, 9 years ago

Cc:	cherie@… added

Could you please provide the import required for DateTime or a project/test case that we can use to reproduce the error?

Thank you @charettes for the answer. I will look into this further.

Last edited 9 years ago by Cheryl Yang (previous) (diff)

comment:4 by Simon Charette, 9 years ago

@cheryllium, the DateTime object referred to here as been refactored in 2a4af0ea43512370764303d35bc5309f8abce666.

You should be able to reproduce using django.db.models.functions.TruncMonth('removed', tzinfo=timezone.UTC()) instead of DateTime.

Last edited 9 years ago by Simon Charette (previous) (diff)

comment:5 by Cheryl Yang, 9 years ago

Owner:	changed from nobody to Cheryl Yang
Status:	new → assigned

Started working on tests and fixes here, using the refactored code. Currently working with SQLite.

comment:6 by Cheryl Yang, 9 years ago

Right now the remaining issue is with this query and on SQLite only:

with_null_fields = DTModel.objects.annotate(
    extracted=TruncTime('start_time', tzinfo=timezone.UTC())
).values('extracted').annotate(c=Count('pk'))

It generates the following SQL:

SELECT django_datetime_cast_time("db_functions_dtmodel"."start_time", UTC) AS "extracted", COUNT("db_functions_dtmodel"."id") AS "c" FROM "db_functions_dtmodel" GROUP BY django_datetime_cast_time("db_functions_dtmodel"."start_time", UTC)

If you run this test (using my branch linked in the above pull request), you will see an error that only happens with SQLite as the db:

PYTHONPATH=...:$PYTHONPATH ./runtests.py db_functions.test_datetime.DateFunctionWithTimeZoneTests.test_trunc_time_func

This is because the generated query ends up calling django_datetime_cast_time with a time as its first argument (start_time contains only the time data, not a date).

This argument is passed from django_datetime_cast_time to _sqlite_datetime_parse to typecast_timestamp (defined in django/db/backends/utils.py). This last function parses it to a datetime.time object, which is returned.

After the datetime.time is returned, the caller proceeds to pass it to timezone.localtime, which then throws an exception as it expects a datetime.datetime object, not a datetime.time.

I am not sure what the correct behavior should be in this situation. I suppose it doesn't make a lot of sense to adjust a time alone according to a time zone. Should we ignore the timezone in this case?

I hope I have articulated this clearly. Please ping if you have any questions I might be able to answer.

I'll take a look at how this test is handled with the other databases (as the test passes with those), but any guidance would be appreciated as it can be dizzying to dig through these functions :)

comment:7 by Simon Charette, 9 years ago

Thanks for your detailed comment.

Using TruncTime on TimeField is something I overlooked when fixing #26348.

Making TruncTime.as_sql a noop when self.lhs.output_field is an instance of TimeField should do.

comment:8 by Simon Charette, 9 years ago

Has patch:	set

comment:9 by Tim Graham, 9 years ago

Patch needs improvement:	set

Comments for improvement on the PR.

comment:10 by Aaron McMillin, 8 years ago

Here's a possible workaround for the failing case:

Potato.objects.annotate(
    time=Coalesce(
        DateTime('removed', 'month', timezone.UTC()),
        Value(datetime.min),
    ).values('time').annotate(c=Count('pk'))

This replaces the NULL times with an easy to spot sentinel. if you were already using datetime.min, you'll have to come up with something else.

comment:11 by Mariusz Felisiak, 3 years ago

Owner:	Cheryl Yang removed
Status:	assigned → new

comment:12 by Sarah Boyce, 23 months ago

Cc:	Sarah Boyce added

Note: See TracTickets for help on using tickets.

Download in other formats:

Issues