Opened 6 years ago
Last modified 8 months ago
#29884 new Bug
QuerySet.filter() with TruncBase functions not working as expected when USE_TZ= True
Reported by: | slide333333 | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.1 |
Severity: | Normal | Keywords: | |
Cc: | Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Tested PostgreSQL only
USE_TZ=True
Consider the following model:
class TimeStampModel(models.Model): timestamp = models.DateTimeField()
Create some data:
TimeStampModel.objects.bulk_create([ TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 5, 13, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 7, 4, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 8, 56, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 13, 49, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 15, 33, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 18, 29, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 19, 12, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 21, 37, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 21, 9, tzinfo=pytz.utc)), TimeStampModel(timestamp=datetime.datetime(2018, 10, 24, 23, 23, tzinfo=pytz.utc)), ])
The following code shoud definitely return the data. But it returns an empty queryset, because the SQL generated is not correct.
>>> from django.db.models.functions import * >>> from django.utils import timezone >>> import datetime, pytz >>> TimeStampModel.objects.annotate( ... day=TruncDay('timestamp', tzinfo=pytz.timezone('Europe/Berlin'))).filter( ... day=timezone.make_aware(datetime.datetime(2018, 10, 24), pytz.timezone('Europe/Berlin')) ... ) DEBUG: (0.000) SELECT "truncbase_timestampmodel"."id", "truncbase_timestampmodel"."timestamp", DATE_TRUNC('day', "truncbase_timestampmodel"."timestamp" AT TIME ZONE 'Europe/Berlin') AS "day" FROM "truncbase_timestampmodel" WHERE DATE_TRUNC('day', "truncbase_timestampmodel"."timestamp" AT TIME ZONE 'Europe/Berlin') = '2018-10-24T00:00:00+02:00'::timestamptz LIMIT 21; args=('datetime.datetime(2018, 10, 24, 0, 0, tzinfo=<DstTzInfo 'Europe/Berlin' CEST+2:00:00 DST>)') <QuerySet []>
The SQL should be (note the additional AT TIME ZONE 'Europe/Berlin'
):
SELECT "truncbase_timestampmodel"."id", "truncbase_timestampmodel"."timestamp", DATE_TRUNC('day', "truncbase_timestampmodel"."timestamp" AT TIME ZONE 'Europe/Berlin') AT TIME ZONE 'Europe/Berlin' AS "day" FROM "truncbase_timestampmodel" WHERE DATE_TRUNC('day', "truncbase_timestampmodel"."timestamp" AT TIME ZONE 'Europe/Berlin') AT TIME ZONE 'Europe/Berlin' = '2018-10-24T00:00:00+02:00'::timestamptz LIMIT 21;
https://www.postgresql.org/docs/9.2/static/functions-datetime.html#FUNCTIONS-DATETIME-ZONECONVERT
This fix will also make sure that the returned value from the database driver is an aware dateime. Currently the returned value is native and manually made aware in django.db.models.functions.datetime.TruncBase.convert_value
!
I'm not sure how the problem relates to databases without timezone support. But for those with time zone support like PostgreSQL this should work as expected. For Postgres the fix should be pretty easy: django.db.backends.postgresql.operations.DatabaseOperations.datetime_trunc_sql
has to be patched and django.db.models.functions.datetime.TruncBase.convert_value
should be patched. The latter will also affect other database engines.
Change History (13)
comment:1 by , 6 years ago
Description: | modified (diff) |
---|
comment:2 by , 6 years ago
Description: | modified (diff) |
---|
comment:3 by , 6 years ago
Description: | modified (diff) |
---|
comment:4 by , 6 years ago
follow-ups: 6 8 comment:5 by , 6 years ago
Summary: | Queryset.filter with TruncBase functions not working as expected when USE_TZ= True → QuerySet.filter() with TruncBase functions not working as expected when USE_TZ= True |
---|---|
Type: | Uncategorized → Bug |
Can you provide tests for tests/db_functions/datetime/test_extract_trunc.py
? It would be easier to evaluate the ticket with a patch.
comment:6 by , 6 years ago
Replying to Tim Graham:
Can you provide tests for
tests/db_functions/datetime/test_extract_trunc.py
? It would be easier to evaluate the ticket with a patch. tim
I can, but I am wondering about the use of timezone within the database. I know that Django uses Oracle's DATE type, which is not timezone aware. The same has got to be true for sqlite3. Really, the timezone in use for Oracle DATE types is the timezone for the process running Oracle. Also, evaluating the submission, it is clear that TruncDay still returns a datetime.datetime, which I would expect.
I will go directly to that once I understand which types Django uses for each of these. I'm on the track to do this sometime tonight.
comment:7 by , 6 years ago
Needs tests: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
Only for PostgreSQL does Django use a timezone aware datetime:
Oracle - data type DATE is not timezone aware.
SQLite 3 - data type DATETIME is advisory and not a datetime at all, but is not timezone aware
MySQL - data type DATETIME(6), stored in UTC but not timezone aware
PostgreSQL - data type TIMESTAMP with TIMEZONE
query generated by TimeStampModel.objects.annotate(day=TruncDay('timestamp', tzinfo=pytz.timezone('Europe/Berlin'))) is:
MySQL
SELECT `app_timestampmodel`.`id`, `app_timestampmodel`.`timestamp`, CAST(DATE_FORMAT(CONVERT_TZ(`app_timestampmodel`.`timestamp`, 'UTC', 'Europe/Berlin'), '%Y-%m-%d 00:00:00') AS DATETIME) AS `day` FROM `app_timestampmodel`
PostgreSQL
SELECT "app_timestampmodel"."id", "app_timestampmodel"."timestamp", DATE_TRUNC('day', "app_timestampmodel"."timestamp" AT TIME ZONE 'Europe/Berlin') AS "day" FROM "app_timestampmodel"
This makes the intent of Django clear:
- Always return a datetime type
- Do truncation after converting the datetime from database native (usually UTC) to the given timezone
I've looked at the SQL produced for the filter query as well, and the DateTrunc doesn't work.
comment:8 by , 6 years ago
Replying to Tim Graham:
Can you provide tests for
tests/db_functions/datetime/test_extract_trunc.py
? It would be easier to evaluate the ticket with a patch.
After looking at this more deeply, a solution is proposed for PostgreSQL, and the problem also appears in multiple backends. TruncBase
delegates to the backend to do this, calling connection.ops.datetime_trunc_sql
for target fields that are datetime fields. I don't see any specific backend tests for datetime truncation, but I see that seetest/backends/base/test_operations.py
might be a good place. I do not see how backend specific tests are invoked, but I see some with skips.
follow-up: 10 comment:9 by , 6 years ago
I don't expect this would require backend-specific tests. The queries results should be the same across all databases, shouldn't they? I haven't looking into this issue to say whether or not there's a bug here.
comment:10 by , 6 years ago
Needs tests: | unset |
---|---|
Owner: | removed |
Status: | assigned → new |
Replying to Tim Graham:
I don't expect this would require backend-specific tests. The queries results should be the same across all databases, shouldn't they? I haven't looking into this issue to say whether or not there's a bug here.
This would be fixed in each backend separately, because the method to truncate a datetime is delegated to each backend separately. Nevertheless, since the unit tests are run against a specific backend, by default sqlite3, the test can be in tests/db_functions/datetime/test_extract_trunc.py
.
Pull request https://github.com/django/django/pull/10611 provides a failing test for the filtering case. It isn't ready, but anyone else can see the diff.
I've run this against sqlite3 and also against postgresql like this:
./runtests.py --settings postgresql_settings db_functions.datetime.test_extract_trunc
I'm out of this one - I would want someone with more experience in the project to decide whether the single ticket should be fixed for all backends or just for PostgreSQL.
One thing I notice is that make_aware is surprisingly not the same thing as adding the tzinfo to the datetime in the first place. The following raises:
from datetime import datetime import pytz from django.utils.timezone import make_aware euberlin = pytz.timezone('Europe/Berlin')s a dt1 = datetime(2018, 10, 24, tzinfo=euberlin) dt2 = make_aware(datetime(2018, 10, 24)) assert dt1 == dt2
follow-up: 12 comment:11 by , 6 years ago
One thing I notice is that make_aware is surprisingly not the same thing as adding the tzinfo to the datetime in the first place.
Except for the TIME_ZONE
setting omission that Aymeric mentioned on the mailing list there's a big difference between datetime(2018, 10, 24, tzinfo=tz)
and tz.localize(datetime(2018, 10, 24))
which is what make_aware
does. The first will assign the timezone Europe/Berlin
timezone as it was first defined and the latter will assign the timezone as it is defined on the specified datetime.
In other words direct tzinfo
, assignment will create highly likely bogus datetime objects (unless a timezone definition such as UTC is used) while localize
will make sure the right timezone is used. In your case datetime(2018, 10, 24, tzinfo=euberlin)
creates a datetime for 2018-10-24 with the timezone definition of Europe/Berlin
in 1901.
comment:12 by , 6 years ago
Replying to Simon Charette:
Except for the
TIME_ZONE
setting omission that Aymeric mentioned on the mailing list there's a big difference betweendatetime(2018, 10, 24, tzinfo=tz)
andtz.localize(datetime(2018, 10, 24))
which is whatmake_aware
does. The first will assign the timezoneEurope/Berlin
timezone as it was first defined and the latter will assign the timezone as it is defined on the specified datetime.
In other words direct
tzinfo
, assignment will create highly likely bogus datetime objects (unless a timezone definition such as UTC is used) whilelocalize
will make sure the right timezone is used. In your casedatetime(2018, 10, 24, tzinfo=euberlin)
creates a datetime for 2018-10-24 with the timezone definition ofEurope/Berlin
in 1901.
Thanks for the explanation. Got it.
comment:13 by , 8 months ago
Cc: | added |
---|---|
Description: | modified (diff) |
I've started the work to triage, and find out which backends may be affected. Work is in https://github.com/danizen/django-review-29984.
Will test sqlite3, postgresql, mysql, and oracle for good measure.