Opened 9 years ago

Closed 9 years ago

#24955 closed New feature (wontfix)

Allow BC and > 10000 years dates in Django ORM & forms

Reported by: Bertrand Bordage
Owned by: nobody
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Currently, the minimum date we can use in Django is 0001-01-01.
However, some databases allow BC dates, which are extremely useful for historical projects.
Similarly, some databases allow dates greater than 9999-12-31, while Django limits the year to a four-digit number.
Of course, this assumes that all dates are defined using the proleptic Gregorian calendar.

Here’s the DB support:

  • PostgreSQL uses the format 0001-01-01 BC. Dates can go from 4714-11-24 BC to 5874897-12-31. That minimum date follows the suggestion of ISO 8601 for the minimum date of the proleptic Gregorian calendar.
  • SQLite uses the format -0001-01-01. Dates can go from -9999-01-01 to 9999-12-31 if you use the date function before inserting data. However, since SQLite is a scam and enforces absolutely no constraint, you can store "-9999999-13-32" as well as "SQLite is a traitor, don’t use it to store important data" in a date column.
  • MySQL has no BC support. According to the docs, dates can go from 1000-01-01 to 9999-12-31, but the docs also state that “although earlier values might work, there is no guarantee”. In practice, we can use dates from 0001-01-01 to 1000-01-01, but we shouldn’t. Another insanity: you can use the 0001-01-01 BC format. As usual, MySQL will react in the stupidest way possible: it ignores whatever you write after the date. You can even write "0001-01-01MySQL is rubbish, don’t use it" and it’ll happily answer 0001-01-01.
  • Oracle seems to support BC and B.C. mentions in its TO_DATE function, as well as a S0001 format for years before Christ.

In order to resolve this issue, we need to decide how to write BC dates in the ORM and in the admin. I’m in favor of a +/- notation before the year, more natural to me. It’s also suggested by ISO 8601. And it means we don’t have to localize that 'BC' mention. I’m aware that this +/- notation is not ideal because of the other minuses used as separators, but that seems like the best choice anyway.
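As a rough illustration of the +/- notation discussed above, here is a minimal sketch of a parser that accepts an optional sign and a year of four or more digits. The name `parse_signed_date` and the regular expression are hypothetical, not part of Django. The sketch does not validate month or day ranges, and it leaves open whether a year written as -0001 means 1 BC or 2 BC.

```python
import re

# Hypothetical pattern: optional sign, at least four year digits, then -MM-DD.
# The sign is unambiguous because it can only appear at the very start.
SIGNED_DATE_RE = re.compile(
    r"(?P<sign>[+-]?)(?P<year>\d{4,})-(?P<month>\d{2})-(?P<day>\d{2})"
)


def parse_signed_date(text):
    """Return (year, month, day); the year is negative after a leading minus."""
    match = SIGNED_DATE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a signed ISO-like date: {text!r}")
    year = int(match["year"])
    if match["sign"] == "-":
        year = -year
    return year, int(match["month"]), int(match["day"])
```

Because the sign is only allowed in the first position, the minus signs used as separators do not make the format ambiguous for a parser, even if they are less readable for humans.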

The first component to change in order to resolve this issue is of course the ORM. Changing the forms will then be easy.


Sources:

Change History (8)

comment:1 by Bertrand Bordage, 9 years ago

Summary: Allow BC dates and > 10000 years in Django ORM & forms → Allow BC > 10000 years dates and in Django ORM & forms

comment:2 by Bertrand Bordage, 9 years ago

Summary: Allow BC > 10000 years dates and in Django ORM & forms → Allow BC and > 10000 years dates in Django ORM & forms

comment:3 by Aymeric Augustin, 9 years ago

Python's datetime.date and datetime.datetime are bounded by datetime.MINYEAR and datetime.MAXYEAR.
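These bounds can be checked directly with the standard library. The helper `is_representable` below is a hypothetical name used only for this sketch.

```python
import datetime

# The documented bounds of datetime.date and datetime.datetime.
print(datetime.MINYEAR, datetime.MAXYEAR)


def is_representable(year):
    """Return True if datetime.date can hold January 1 of the given year."""
    try:
        datetime.date(year, 1, 1)
    except (ValueError, OverflowError):
        return False
    return True
```

Under these bounds, `is_representable(0)`, `is_representable(-44)` and `is_representable(10000)` all return `False`, so neither BC years nor five-digit years can be mapped to a standard date object.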

What kind of Python object should values outside of this range be mapped to? What are the backwards-compatibility considerations?

comment:4 by Bertrand Bordage, 9 years ago

That’s indeed a problem.
I guess the right solution is to also modify CPython to allow such dates, and only activate this feature if Python >= 3.5. Or write a compatibility tool in Django for Python < 3.5 that uses datetime if the year is between 1 and 9999, and an extremely similar data type for out-of-bounds values.

Currently, CPython stores dates using 4 bytes. PostgreSQL also uses 4 bytes, but it’s not limited to the 1-9999 range, as I mentioned in the issue description.
CPython uses an unoptimized way of storing the date: it uses a whole byte to store the month and another one to store the day, while 9 bits would be enough to store both, which would leave 23 bits for the year. This unoptimized storage leaves 16 bits for the year, but that’s enough to store the range from -4714 to 60822. Not as much as the optimized PostgreSQL storage, but way better than the current 1-9999 range.
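The arithmetic in this comment can be checked with a short sketch. The range from -4714 to 60822 adds up to exactly 2**16 values only if year 0 is skipped; that reading is an assumption of this sketch about how the figure was derived, and `count_years` is a hypothetical helper.

```python
# Bits needed for the largest month (12) and the largest day (31).
month_bits = (12).bit_length()
day_bits = (31).bit_length()

# Bits left for the year if month and day were packed into a 32-bit value.
packed_year_bits = 32 - (month_bits + day_bits)

# Number of distinct values in a two-byte year field.
year_values = 2 ** 16


def count_years(first, last):
    """Count calendar years from first to last inclusive, skipping year 0."""
    total = last - first + 1
    if first < 0 < last:
        total -= 1  # assumption: 1 BC is directly followed by AD 1
    return total
```

With these definitions, `month_bits + day_bits` is 9, `packed_year_bits` is 23, and `count_years(-4714, 60822)` equals `year_values`.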

So that may not be as hard as it seems. The easy part will be to modify MINYEAR and MAXYEAR. The harder part will be to modify the parsers and formatters to allow a leading minus and years with more than four digits.

comment:5 by Bertrand Bordage, 9 years ago

Or rather for CPython 3.6, since the 3.5 feature freeze happened recently, if I remember correctly.

comment:6 by Shai Berger, 9 years ago

Curb your enthusiasm: The real problem with adapting Python to support BC dates is binary backwards compatibility. You can be pretty sure there are tons of modules which rely, for example, on the invariant that year is positive.

There is also the slight issue that there is no year 0 -- the year 1BC is followed by 1AD (e.g. with Postgres, the expression (date '01-01-0001') - (date '01-01-0001 BC') gives 366 days). So something nontrivial needs to happen there for things to work -- e.g. BC years could be represented by a +1 number (1BC represented as 0, 2BC as -1 etc), but that would necessitate special handling everywhere.
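The "+1" representation mentioned here (1 BC as 0, 2 BC as -1) is commonly called astronomical year numbering. A minimal sketch, with hypothetical function names, shows the mapping and why the PostgreSQL difference above comes out as 366 days: under the proleptic Gregorian leap-year rule, astronomical year 0 (1 BC) is a leap year.

```python
def bc_to_astronomical(bc_year):
    """Map a BC year (1 for 1 BC, 2 for 2 BC, ...) to astronomical numbering."""
    if bc_year < 1:
        raise ValueError("BC years start at 1")
    return 1 - bc_year


def is_leap_astronomical(year):
    """Proleptic Gregorian leap-year rule applied to an astronomical year."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_year(year):
    """Number of days in the given astronomical year."""
    return 366 if is_leap_astronomical(year) else 365
```

Here `days_in_year(bc_to_astronomical(1))` gives 366. The special handling Shai mentions shows up as soon as such a value has to be displayed, since 0 must be rendered as 1 BC.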

See http://stackoverflow.com/questions/15857797/bc-dates-in-python for some alternative libraries for date representation which go beyond Python stdlib's capabilities; I think the best you can do here is pick one of these and build Django fields (models, forms) around it as a 3rd-party package.

I think that, unless some very surprising solution for the problems mentioned above is suggested, we should wontfix the part about BC dates. For the far future, I'd wait for Python to accept the suggested change before Django does.

comment:7 by Bertrand Bordage, 9 years ago

Dependencies are not a problem; it’s their job to follow what’s new in each major CPython version, otherwise nothing could ever be changed in Python. Recently, CPython changed the boolean evaluation of time(0, 0, 0) (see issue 13936), and it didn’t take years to do that. Similarly, in Django we don’t stick to bad choices just because a hundred libraries rely on them.

About the year 0, that’s right, it’s mainly a matter of representation. The real problem is to decide whether we return 0 or -1 for year -1. Returning 0 is more logical if we want to do arithmetic with years, but it’s more misleading. The best solution IMO is to return -1 for year -1 and 1 for year 1. This means arithmetic with extracted years will be wrong, but arithmetic with dates will not. I think this is acceptable: few people will use BC dates, mostly for historical purposes, and we usually don’t have precise BC dates anyway (at least because of the mess of multiple calendars, which can lead to several years of error). So if someone does arithmetic on extracted years (which isn’t recommended), the result will be off, but it’s an acceptable approximation.
For info, PostgreSQL returns -1 when we extract the year of 0001-01-01 BC.
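To make the trade-off concrete, here is a small sketch of the error introduced by subtracting extracted years when 1 BC is labelled -1 and AD 1 is labelled 1. Both function names are hypothetical.

```python
def naive_year_gap(start, end):
    """Plain subtraction of signed year labels, as a user might write it."""
    return end - start


def year_gap(start, end):
    """Years between two signed year labels in a scheme without year 0."""
    if start == 0 or end == 0:
        raise ValueError("there is no year 0 in this labelling")
    gap = end - start
    if start < 0 < end:
        gap -= 1  # crossing from BC to AD skips the missing year 0
    elif end < 0 < start:
        gap += 1  # same correction in the other direction
    return gap
```

From 1 BC to AD 1, `naive_year_gap(-1, 1)` gives 2 while `year_gap(-1, 1)` gives 1, so the naive result is off by exactly one year whenever the interval crosses the BC/AD boundary.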

The solution of using Astropy, as suggested in the Stack Overflow answers, is total overkill.

comment:8 by Shai Berger, 9 years ago

Resolution: wontfix
Status: new → closed

Since I had something to do with the boolean-evaluation-of-midnight change, I can tell you that it met with very strong opposition, and in fact it did take years (the bug was pointed out in 2012, and Python 3.5 which fixes it is still not out). You do not pull rugs from under your users for no good reason; at least, Python and Django try hard not to.

I suspect the problem of year 0 and the choice of representation of BC years would be enough to take down the suggestion -- one of the less-often-quoted, but more valuable IMO, pieces of advice in the Zen of Python is "in the face of ambiguity, resist the temptation to guess". But anyway, there's not much use having that discussion here; if you want to change Python, take it to the Python bug tracker or the python-ideas mailing list. If you can convince them, all the power to you, and Django will add support for the feature.

However, until they do, there's no room for this in Django. Feel free to reopen when Python support is available, and feel free to implement wider date fields as 3rd-party packages until then.
