Opened 9 months ago

Last modified 9 months ago

#27921 new Bug

Documentation of make_aware() with is_dst is misleading

Reported by: Kevin Christopher Henry Owned by: nobody
Component: Documentation Version: master
Severity: Normal Keywords:
Cc: Josh Smeaton Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

A recent StackOverflow question prompted me to look into what make_aware() does with non-existent datetimes.

The documentation says:

Setting is_dst to True or False will avoid the exception by moving the hour backwards or forwards by 1 respectively. For example, is_dst=True would change a nonexistent time of 2:30 to 1:30 and is_dst=False would change the time to 3:30.

It appears that pytz in fact keeps the wall-clock time of 2:30 the same, and instead shifts the timezone to make the datetime valid.

I'm not sure if it's better to describe what pytz does more precisely, or just to leave out that degree of detail (since pytz's documentation is itself not very precise on this point).

Change History (8)

comment:1 Changed 9 months ago by Tim Graham

Cc: Josh Smeaton added

Josh, you authored that text in 143255c8bbefe03ff755d7859f029f0bc7a552aa. Does the analysis in this ticket appear correct?

comment:2 Changed 9 months ago by Aymeric Augustin

Triage Stage: UnreviewedAccepted

The sentence you quoted doesn't totally make sense to me, I support rewriting it.

AFAIK (but I didn't double check), is_dst allows you to disambiguate ambiguous datetimes: you can pick the first or the second 2:30am when it happens twice.

However it doesn't help with non existing local time, when 2:30 never happens. (If it does that's super surprising and warrants even more docs).

comment:3 Changed 9 months ago by Kevin Christopher Henry

pytz does actually use is_dst to avoid the NonExistentTimeError (see the linked question for some console tests). Though I agree, Aymeric, that that usage seems less intuitive than in the AmbiguousTimeError case.

I wonder if this calls for less documentation rather than more. The problem is that we're relying on a third-party library who's behavior is not specified. The pytz docs imply that using is_dst will avoid the error, but they don't specify whether that's done by shifting the wall-clock time (as currently documented by Django) or by shifting the timezone offset (as appears to actually be the case).

Would it be sufficient to say:

The pytz.NonExistentTimeError exception is raised if you try to make value aware during a DST transition such that the time never occurred (when entering into DST). Setting is_dst to True or False will avoid the exception by moving the time backwards or forwards by 1 hour respectively.

That's really the most you can say based on pytz's documentation.

comment:4 Changed 9 months ago by Josh Smeaton

I don't think we should be too concerned about the wording of the pytz documentation. We should aim to provide our own docs that makes the most sense for our users. In that regard, perhaps it is worth clarifying the documentation in some way. I'm not exactly sure what that should be though. When I implemented this change, I was also confused about how pytz was representing time during ambiguous and non-existent periods, so I'm not surprised that users are also somewhat confused.

Just quickly, the currently implemented behaviour is correct. Either side of the transition round trips through the database and via UTC to the correct moment in time. But it doesn't sound like that bit is in question. I'd also readily admit that the use case for is_dst=True for non existent times is probably very little - I can't think of one.

Even in the ambiguous case, pytz represents the time as a shift in the timezone.

In [65]: CET = pytz.timezone("Europe/Paris")

In [66]: ambiguous = datetime.datetime(2015, 10, 25, 2, 30)

In [67]: std = timezone.make_aware(ambiguous, timezone=CET, is_dst=False)
    ...: dst = timezone.make_aware(ambiguous, timezone=CET, is_dst=True)
    ...:

In [68]: std
Out[68]: datetime.datetime(2015, 10, 25, 2, 30, tzinfo=<DstTzInfo 'Europe/Paris' CET+1:00:00 STD>)

In [69]: dst
Out[69]: datetime.datetime(2015, 10, 25, 2, 30, tzinfo=<DstTzInfo 'Europe/Paris' CEST+2:00:00 DST>)

The representation is an adjustment in the timezone, but the logical result is a shift 1 hour backwards or 1 hour forwards in the *moment* of time. Neither Ambiguous or NonExistent modify the hour or minute accessible on the datetime instance. But if you save the result and fetch it, or round trip it through UTC, then the hour on the returned datetime object *does* change.

Is it important to tell users that the time change is *represented* by a timezone shift? Conceptually, adjusting the hour is an easier concept to grasp even if the timezone offset is actually the thing changing in the current representation.

To further complicate things:

# Equivalences

In [32]: tz = pytz.timezone('America/Sao_Paulo')
    ...: dt = datetime.datetime(2017, 10, 15, 0)
    ...: dt_plus_one = datetime.datetime(2017, 10, 15, 1)
    ...: dt_minus_one = datetime.datetime(2017, 10, 14, 23)
    ...:

In [33]: std = timezone.make_aware(dt, timezone=tz, is_dst=False)

In [34]: dst = timezone.make_aware(dt, timezone=tz, is_dst=True)

In [35]: after = timezone.make_aware(dt_plus_one, timezone=tz)

In [36]: before = timezone.make_aware(dt_minus_one, timezone=tz)

In [52]: dt_plus_one_aware = timezone.make_aware(dt_plus_one, timezone=tz)

In [53]: dt_plus_one_aware
Out[53]: datetime.datetime(2017, 10, 15, 1, 0, tzinfo=<DstTzInfo 'America/Sao_Paulo' BRST-1 day, 22:00:00 DST>)

In [54]: aware_minus = dt_plus_one_aware - timedelta(hours=1)
Out[54]: datetime.datetime(2017, 10, 15, 0, 0, tzinfo=<DstTzInfo 'America/Sao_Paulo' BRST-1 day, 22:00:00 DST>)

In [60]: dst
Out[60]: datetime.datetime(2017, 10, 15, 0, 0, tzinfo=<DstTzInfo 'America/Sao_Paulo' BRST-1 day, 22:00:00 DST>)

In [61]: before
Out[61]: datetime.datetime(2017, 10, 14, 23, 0, tzinfo=<DstTzInfo 'America/Sao_Paulo' BRT-1 day, 21:00:00 STD>)

In [62]: aware_minus
Out[62]: datetime.datetime(2017, 10, 15, 0, 0, tzinfo=<DstTzInfo 'America/Sao_Paulo' BRST-1 day, 22:00:00 DST>)

In [63]: dst == before == aware_minus
Out[63]: True

comment:5 Changed 9 months ago by Josh Smeaton

Kevin, your proposed wording is less ambiguous (and more correct), but I think having actual examples are useful in conceptualising the time shift because in the case of NonExistent times, it's not really obvious. I'm not sure how to do this without confusing at least someone - it's a confusing topic.

comment:6 Changed 9 months ago by Aymeric Augustin

Sorry for the less than useful first comment :-( What about the following clarification:

For example, is_dst=True would change a nonexistent time of 2:30 to 1:30 pre-transition and is_dst=False would change the time to 3:30 post-transition.

We could also add the following sentence to prevent people from wasting their time trying to understand this calculation.

This change is arbitrary. It doesn't represent a meaningful conversion. It merely modifies the value in order to avoid an exception.

comment:7 Changed 9 months ago by Josh Smeaton

I'm not too keen on the second note you've got there. Here's a trivial but understandable example where you would want make_aware to just fix the issue.

start = datetime.datetime(2017, 10, 14, 23, 0)  # pretend this is just `datetime.now()
finish_at = start + timedelta(hours=1)

# whoops, we're in local time, but we use USE_TZ so we should make it aware before saving to database
start = timezone.make_aware(start, timezone=tz)
finish_at = timezone.make_aware(finish_at, timezone=tz)  # boom

You could certainly make the argument that this is a bug in the program behaviour. You should be working with timezones throughout - not just when saving to the database. Also, you'd have to be aware of the is_dst flag in the first place. Chances are if you're aware of it, then you're also just working with timezones to begin with. If you provide the wrong argument to is_dst, then finish_at == start, and you'd want to flip the flag for non-existent, and ambiguous times. I think I've just listed about 3 examples of why this is a bad idea, almost proving Aymerics point :)

comment:8 Changed 9 months ago by Kevin Christopher Henry

We can definitely all agree that this is a hard subject to document concisely and accurately.

I think the current version is just too concise to be acceptably accurate. It gives the impression that the wall clock time changes, and by not specifying the time zone of the 1:30 and 3:30 risks leaving the impression that they differ by two hours instead of one. Here's an attempt at avoiding that while keeping the example (at the cost of being a bit more verbose):

The pytz.NonExistentTimeError exception is raised if you try to make value aware during a DST transition such that the time never occurred. For example, if the 2:00 hour is skipped during a DST transition, trying to make 2:30 aware in that time zone will lead to an exception. To avoid that you can use is_dst to specify how make_aware should interpret such a non-existing time. If True then the above time would be interpreted as 2:30 DST time (equivalent to 1:30 local time). Conversely, if False the time would be interpreted as 2:30 standard time (equivalent to 3:30 local time).

(Coincidentally, we're passing through a block of NonExistent time here in America/New_York as I write this...)

Note: See TracTickets for help on using tickets.
Back to Top