Opened 5 years ago

Closed 5 years ago

#30575 closed Bug (wontfix)

Union of TruncBase annotations with different tzinfo apply `convert_value` of last tzinfo.

Reported by: Jurgis Pralgauskis Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: timezone, TruncBase, Union, PostgreSQL
Cc: Jurgis Pralgauskis Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Jurgis Pralgauskis)

class Message(models.Model):
    timestamp = models.DateTimeField(auto_now=True) # PostgreSQL, USE_TZ=True
    msg = models.CharField(max_length=254)
    timezone = models.CharField(max_length=64, default="UTC", help_text="pytz name")
    def test_demo_bug(self):
        from ..models import Message
        import pytz
        from django.db.models.functions import Trunc, TruncSecond
        import mock
        from django.utils import timezone

        with patch.object(timezone, 'now', return_value=pytz.utc.localize(datetime(2017, 1, 1, 0, 0))):

            # we have info
            Message.objects.create( msg = "bar", timezone = 'UTC')
            Message.objects.create( msg = "foo", timezone = 'America/Los_Angeles')


            # we want to get it with "localized" timestamps:
            qs = Message.objects.all()

            partitions = []  # partition by timezones
            for tzname in ['UTC', 'America/Los_Angeles']:
                tz = pytz.timezone(tzname)
                _qs = qs.filter(timezone=tzname)  # was but in test hardcoded "UTC" instead of `tzname`
                _qs = _qs.annotate(trunc_local_time=TruncSecond('timestamp', tzinfo=tz)) # could be Trunc_anything_
                partitions.append(_qs)

            qs1, qs2 = partitions

            result = list(qs1.union(qs2))

            for x in result:
                tz = pytz.timezone(x.timezone)
                print(x.msg)
                x.timestamp_trunc = x.timestamp.replace(microsecond=0)
                x.expected = x.timestamp_trunc.astimezone(tz)  # this is the working way  (but I wanted to get localization in DB layer)
                assert x.trunc_local_time == x.expected,  "Error (msg: '{x.msg}'): {x.trunc_local_time} != {x.expected}".format(x=x)

Problem - different TZ offsets

Failure
Traceback (most recent call last):
  File "/home/jurgis/dev/new/tableair/sync_tableair-cloud/ta/api/bookables/tests/test_endpoints.py", line 689, in test_demo_bug
    assert x.trunc_local_time == x.expected,  "Error (msg: '{x.msg}'): {x.trunc_local_time} != {x.expected}".format(x=x)
AssertionError: Error (msg: 'foo'): 2016-12-31 16:00:00+00:00 != 2016-12-31 16:00:00-08:00

Change History (9)

comment:1 by Jurgis Pralgauskis, 5 years ago

related idea, using CASE/WHEN, gives localized (but naive) datetime (because it just don't apply BaseTrunc.convert_value
https://mjec.blog/2016/06/27/djangos-queryset-union-isnt-quite-what-it-seems/

comment:2 by Jurgis Pralgauskis, 5 years ago

workarounds are:

  • to take initial query and later adapt
    tz = pytz.timezone(x.timezone)
    x.timestamp.astimezone(tz)
    
  • or just make separate queries per timezone and chain (kind of union) them in python

comment:3 by Jurgis Pralgauskis, 5 years ago

Cc: Jurgis Pralgauskis added
Type: UncategorizedBug

comment:4 by Jurgis Pralgauskis, 5 years ago

Description: modified (diff)

comment:5 by Mariusz Felisiak, 5 years ago

Resolution: worksforme
Status: newclosed
Summary: Union of TruncBase annotations with different tzinfo apply `convert_value` of last tzinfo (PostgreSQL)Union of TruncBase annotations with different tzinfo apply `convert_value` of last tzinfo.
Version: 1.11master

Thanks for the report, however ticket description is not correct. Django uses tzinfo properly in your use case:

SELECT
    ...
    DATE_TRUNC('second', "ticket_30575_message"."timestamp" AT TIME ZONE 'UTC') AS "trunc_local_time",
    UTC AS "tzname"
FROM "ticket_30575_message" WHERE "ticket_30575_message"."timezone" = UTC
UNION
SELECT
    ...
    DATE_TRUNC('second', "ticket_30575_message"."timestamp" AT TIME ZONE 'America/Los_Angeles') AS "trunc_local_time",
    America/Los_Angeles AS "tzname"
FROM "ticket_30575_message" WHERE "ticket_30575_message"."timezone" = UTC)

i.e. UTC in a qs1 and America/Los_Angeles in qs2. You can add tzname to you queryset (e.g. tzname=Value(tzname, output_field=CharField())) to check that everything works properly:

>>> for x in result:
>>>     print(x.timestamp, x.trunc_local_time, x.tzname)
2019-06-18 10:07:04.111245+00:00 2019-06-18 03:07:04+00:00 America/Los_Angeles 
2019-06-18 10:07:04.111245+00:00 2019-06-18 10:07:04+00:00 UTC 

Closing per TicketClosingReasons/UseSupportChannels.

comment:6 by Jurgis Pralgauskis, 5 years ago

Description: modified (diff)

comment:7 by Jurgis Pralgauskis, 5 years ago

Description: modified (diff)
Resolution: worksforme
Status: closednew

Sorry, was a bug in my test, and failure didn't show up (probalby copied wrong revision here)..

The SQL is generated OK, but imo, the problem is that TruncBase#convert_value has the lines

                value = value.replace(tzinfo=None)   
                value = timezone.make_aware(value, self.tzinfo)  

they apply (replace) the same tzinfo to all (union'ed) rows (that come from different timezones)

Last edited 5 years ago by Jurgis Pralgauskis (previous) (diff)

comment:8 by Mariusz Felisiak, 5 years ago

Thanks for an extra clarification, I'm not sure if it is feasible with the current implementation of combined queries, because on Python's side we may not be able to recognize which row is from which query. I will confirm this later.

comment:9 by Mariusz Felisiak, 5 years ago

Resolution: wontfix
Status: newclosed

Thanks again for detailed reported, unfortunately you found an edge case that is not fixable. Django gets naive datetimes from the database and makes them aware with tzinfo argument from e.g. TruncSecond function, however with combined queries we use (if required) functions from the first query because we are not be able to recognize which row is from which query.

Note: See TracTickets for help on using tickets.
Back to Top