Opened 9 years ago
Last modified 19 months ago
#25937 new Bug
Failure when using expressions.DateTime on NULL values and aggregating
Reported by: | Thomas Recouvreux | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.9 |
Severity: | Normal | Keywords: | aggregate timezone |
Cc: | cherie@…, Sarah Boyce | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Running query like qs.annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(Count('pk'))
fails with <repr(<gill.contrib.sales.models.TransactionRowQuerySet at 0x10cc99320>) failed: ValueError: Database returned an invalid value in QuerySet.datetimes(). Are time zone definitions for your database and pytz installed?>
. I noted the failure on MySQL and PostgreSQL, Django 1.8.X and 1.9.X. I did not try on other db backends or other Django versions.
To reproduce the behaviour:
- create a model:
class Potato(models.Model): .. removed = models.DateTimeField(null=True, blank=True)
- Insert some of them
Potato.objects.create(removed=timezone.now()) Potato.objects.create(removed=None) ..
- Fire the request
# Failure Potato.objects.annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(c=Count('pk')) # Success Potato.objects.filter(removed__isnull=False).annotate(time=DateTime('removed', 'month', timezone.UTC())).values('time').annotate(c=Count('pk'))
Note: I printed the raw sql generated by Django and fired it directly in a db shell without any error, so the problem seems to come from the code Django uses to convert the result from the backend to python code, especially the line with NULL coming from the aggregation:
+---------------------+-----------+ | time | pk__count | +---------------------+-----------+ | NULL | 17 | | 2015-01-01 00:00:00 | 7 | ..
Change History (12)
comment:1 by , 9 years ago
Description: | modified (diff) |
---|
comment:2 by , 9 years ago
Keywords: | timezone added; orm removed |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:3 by , 8 years ago
Cc: | added |
---|
Could you please provide the import
required for DateTime
or a project/test case that we can use to reproduce the error?
comment:4 by , 8 years ago
@cheryllium, the DateTime
object referred to here as been refactored in 2a4af0ea43512370764303d35bc5309f8abce666.
You should be able to reproduce using django.db.models.functions.TruncMonth('removed', tzinfo=timezone.UTC())
instead of DateTime
.
comment:5 by , 8 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Started working on tests and fixes here, using the refactored code. Currently working with SQLite.
comment:6 by , 8 years ago
Right now the remaining issue is with this query and on SQLite only:
with_null_fields = DTModel.objects.annotate( extracted=TruncTime('start_time', tzinfo=timezone.UTC()) ).values('extracted').annotate(c=Count('pk'))
It generates the following SQL:
SELECT django_datetime_cast_time("db_functions_dtmodel"."start_time", UTC) AS "extracted", COUNT("db_functions_dtmodel"."id") AS "c" FROM "db_functions_dtmodel" GROUP BY django_datetime_cast_time("db_functions_dtmodel"."start_time", UTC)
If you run this test (using my branch linked in the above pull request), you will see an error that only happens with SQLite as the db:
PYTHONPATH=...:$PYTHONPATH ./runtests.py db_functions.test_datetime.DateFunctionWithTimeZoneTests.test_trunc_time_func
This is because the generated query ends up calling django_datetime_cast_time
with a time as its first argument (start_time
contains only the time data, not a date).
This argument is passed from django_datetime_cast_time
to _sqlite_datetime_parse
to typecast_timestamp
(defined in django/db/backends/utils.py). This last function parses it to a datetime.time
object, which is returned.
After the datetime.time
is returned, the caller proceeds to pass it to timezone.localtime
, which then throws an exception as it expects a datetime.datetime
object, not a datetime.time
.
I am not sure what the correct behavior should be in this situation. I suppose it doesn't make a lot of sense to adjust a time alone according to a time zone. Should we ignore the timezone in this case?
I hope I have articulated this clearly. Please ping if you have any questions I might be able to answer.
I'll take a look at how this test is handled with the other databases (as the test passes with those), but any guidance would be appreciated as it can be dizzying to dig through these functions :)
comment:7 by , 8 years ago
Thanks for your detailed comment.
Using TruncTime
on TimeField
is something I overlooked when fixing #26348.
Making TruncTime.as_sql
a noop when self.lhs.output_field
is an instance of TimeField
should do.
comment:8 by , 8 years ago
Has patch: | set |
---|
comment:10 by , 7 years ago
Here's a possible workaround for the failing case:
Potato.objects.annotate( time=Coalesce( DateTime('removed', 'month', timezone.UTC()), Value(datetime.min), ).values('time').annotate(c=Count('pk'))
This replaces the NULL times with an easy to spot sentinel. if you were already using datetime.min
, you'll have to come up with something else.
comment:11 by , 3 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:12 by , 19 months ago
Cc: | added |
---|
It looks like the heuristics to determine whether or not time zone definition are set up correctly conflicts with the use of
DateTime
expressions on nullable fields.I wonder if
convert_value
has enough context to figure out if its lookup isNULL
able. If it's the case could adjust the heuristics to only thow aValueError
if its lookup is not nullable andNone
is returned.