Opened 15 years ago
Last modified 11 years ago
#13339 new Bug
Date(Time)Field.to_python() fails to parse localized month names
Reported by: | Ulrich Petri | Owned by: | nobody |
---|---|---|---|
Component: | Forms | Version: | 1.1 |
Severity: | Normal | Keywords: | i18n l10n |
Cc: | Triage Stage: | Someday/Maybe | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Date(Time)Field.to_python() uses time.strptime to try and parse the input values.
If the input string contains a localized month name (e.g. "Dezember" for December in German), validation fails. This is probably because strptime relies on the current locale to parse month names.
Test showing this behaviour is attached.
(N.B.: I'm not really clear on why the reverse operation in formats.localize_input() works correctly)
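A minimal sketch of the failure outside Django, assuming the process runs with the "C" locale for LC_TIME (set explicitly below so the result does not depend on the environment). The helper name `parses` is made up for illustration and is not part of Django or the attached test.

```python
import locale
import time

# Assumption: pin LC_TIME to the "C" locale so month names are English.
# Note that this changes the locale for the whole process.
locale.setlocale(locale.LC_TIME, "C")


def parses(value, fmt):
    """Return True if time.strptime() accepts value for the given format."""
    try:
        time.strptime(value, fmt)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(parses("December", "%B"))  # English month name
    print(parses("Dezember", "%B"))  # German month name
```

Under these assumptions the English name is accepted and the German one is rejected, because strptime only knows the month names of the process locale.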
Attachments (1)
Change History (22)
by , 15 years ago
Attachment: | ticket_13339_l10n_month_tests.diff added |
---|
comment:1 by , 15 years ago
milestone: | 1.2 |
---|---|
Resolution: | → wontfix |
Status: | new → closed |
comment:2 by , 15 years ago
milestone: | → 1.2 |
---|---|
Resolution: | wontfix |
Status: | closed → reopened |
I guess there's a misunderstanding here.
What I'm talking about *is* forms.fields.DateField
follow-up: 4 comment:3 by , 15 years ago
milestone: | 1.2 |
---|---|
Resolution: | → invalid |
Status: | reopened → closed |
Well, then there is a *big* misunderstanding - because forms.DateField doesn't have a to_python() method.
comment:4 by , 15 years ago
Replying to russellm:
Well, then there is a *big* misunderstanding - because forms.DateField doesn't have a to_python() method.
??? Yes, it does: http://code.djangoproject.com/browser/django/trunk/django/forms/fields.py#L320
comment:5 by , 15 years ago
milestone: | → 1.2 |
---|---|
Resolution: | invalid |
Status: | closed → reopened |
For my next trick, I will attempt to put my entire lower leg into my mouth.
My apologies - there is a to_python() method in trunk; I was checking against v1.1 source code.
comment:6 by , 15 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Ok - This is closely related to #12986, but it appears to be a separate problem.
comment:7 by , 15 years ago
Ok - to fill in some blanks that have been discussed on IRC:
The issue is that strptime() uses the locale to perform translations. The locale operates across threads (so it isn't threadsafe), and it is expensive to set and unset. Therefore, Python doesn't provide a way to parse dates that is sensitive to a locale of choice - it assumes that an application will require a single locale, not that different threads in a single application will need different locales.
strptime() doesn't provide any hooks to control the locale, and the implementation uses all sorts of global variables; the only three options are:
1. Reimplement strptime() so that it is locale sensitive
2. Provide a translation layer to convert a user-specified date into a language parseable by strptime()
3. Call this a known limitation of the L10N implementation (i.e., that you can't parse dates with %B or any of the text-based date format specifiers).
1 will be a lot of effort, especially at this late stage in development.
2 is very messy, and is also quite complex (since you need to provide LANGUAGE_CODE->Locale translations, not LANGUAGE_CODE->EN translations)
This leaves 3 as the only viable option at this point. Other suggestions welcome, but barring a better suggestion, we'll just have to document the limitation and call this a known issue.
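To make option 2 concrete, here is a rough sketch of what a translation layer could look like. It assumes the process locale uses English month names; the table `GERMAN_MONTHS` and the function `translate_months` are hypothetical names, not anything that exists in Django. A real implementation would need one table per language and would have to target whatever locale the process actually runs in, which is the complexity mentioned above.

```python
import re
import time

# Hypothetical lookup table: German month names -> names understood by
# strptime() in a process whose locale uses English month names.
GERMAN_MONTHS = {
    "januar": "January", "februar": "February", "märz": "March",
    "april": "April", "mai": "May", "juni": "June", "juli": "July",
    "august": "August", "september": "September", "oktober": "October",
    "november": "November", "dezember": "December",
}


def translate_months(value, table):
    """Replace whole-word localized month names using the given table."""
    # Longest names first so a short name never shadows a longer one.
    names = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: table[match.group(0).lower()], value)


if __name__ == "__main__":
    translated = translate_months("24. Dezember 2010", GERMAN_MONTHS)
    print(time.strptime(translated, "%d. %B %Y"))
```

Matching on whole words avoids rewriting substrings of other tokens, but it does not know which part of the input corresponds to %B, so any token that happens to equal a month name would still be replaced.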
comment:8 by , 15 years ago
I ran into this problem dealing with PayPal's date format ("%H:%M:%S %b %d, %Y PDT" => 12:34:32 Apr 12, 2010 PDT). That's actually a stupid format, but it's PayPal, so... Anyway, I work with a Spanish locale, so it crashed. My workaround was really dirty, but it's the only way I got it to work (for Django 1.1):
In django/forms/fields.py, in the DateTimeField clean method, line 387:
    for format in self.input_formats:
        tmp_value = value
        try:
            try:
                return datetime.datetime(*time.strptime(tmp_value, format)[:6])
            except:
                if '%b' in format:
                    # translate here your months :(
                    tmp_value = tmp_value.replace('Apr', 'Abr').replace('Aug', 'Ago').replace('Dec', 'Dic').replace('Jan', 'Ene')
                    return datetime.datetime(*time.strptime(tmp_value, format)[:6])
        except ValueError:
            continue
comment:9 by , 15 years ago
Just want to point out that bcurtu's workaround has a potential problem. If your locale is such that the set of month names and the set of time zone names intersect, you'll replace the time zone as well as the month. That would not be good.
comment:10 by , 15 years ago
I don't know exactly what the approach is here, but I assume that the same locale is used across all threads.
As russellm says, strptime uses some global variables as a cache. But actually, there is a dirty hack you can use if you don't want to set the locale:
<dirty alert>
    >>> import time
    >>> import _strptime
    # English here
    >>> time.strptime('December', '%B')
    time.struct_time(tm_year=1900, tm_mon=12, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=335, tm_isdst=-1)
    >>> _strptime._TimeRE_cache.locale_time.f_month = ['', 'enero', 'febrero', 'marzo', 'abril', 'mayo', 'junio', 'julio', 'agosto', 'septiembre', 'octubre', 'noviembre', 'diciembre']
    >>> _strptime._TimeRE_cache.locale_time.a_month = ['', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
    >>> _strptime._TimeRE_cache = _strptime.TimeRE(_strptime._TimeRE_cache.locale_time)
    >>> _strptime._regex_cache = {}
    # Spanish here
    >>> time.strptime('Diciembre', '%B')
    time.struct_time(tm_year=1900, tm_mon=12, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=335, tm_isdst=-1)
</dirty alert>
comment:11 by , 15 years ago
Alex Gaynor asked me to comment here, as I am the original author of strptime() in Python. Basically the comments about strptime() are accurate; it assumes a single locale for the entire process (which can be changed, but that isn't useful in a threaded situation) just as Python does thanks to using C-level locale functions.
Allowing strptime() to accept a specific locale to parse against could possibly get fixed in a future version of Python if strptime() on the datetime module accepted a locale argument. Then it could (in a non-thread-safe way) change the locale, calculate everything it needs, cache it, switch back to the proper locale, and then do the processing. The problem is it wouldn't be thread-safe, but since locale stuff isn't thread-safe already, maybe that is not such a big deal. But at best that would be a Python 3.2 or later feature.
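The "switch the locale, parse, switch back" idea can be sketched in user code roughly as follows. This is only an illustration, not how Python or Django does it: `temporary_time_locale` and `strptime_in_locale` are hypothetical helpers, and the lock only serializes callers of these helpers. Any other thread that formats or parses dates while the locale is switched would still see the temporary locale, which is exactly the thread-safety problem described above.

```python
import locale
import threading
import time
from contextlib import contextmanager

# Serializes callers of the helpers below; it does NOT protect other code
# that touches the process-wide locale.
_locale_lock = threading.Lock()


@contextmanager
def temporary_time_locale(name):
    """Temporarily set LC_TIME for the whole process, then restore it."""
    with _locale_lock:
        previous = locale.setlocale(locale.LC_TIME)  # query current setting
        locale.setlocale(locale.LC_TIME, name)
        try:
            yield
        finally:
            locale.setlocale(locale.LC_TIME, previous)


def strptime_in_locale(value, fmt, locale_name):
    """Parse value with time.strptime() while locale_name is active."""
    with temporary_time_locale(locale_name):
        return time.strptime(value, fmt)


if __name__ == "__main__":
    # "C" is always available; a name such as "de_DE.UTF-8" only works
    # if that locale is installed on the system.
    print(strptime_in_locale("December 24 2010", "%B %d %Y", "C"))
```

locale.setlocale() raises locale.Error for a locale that is not installed, so a caller would also need a fallback for that case.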
comment:12 by , 15 years ago
Cc: | added |
---|
comment:13 by , 15 years ago
comment:14 by , 15 years ago
milestone: | 1.2 |
---|
Moving off the 1.2 milestone; a fix to allow %B/%b (as well as %a/%A and %p for that matter) will require a whole lot more work, but isn't critical for 1.2.
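If the limitation is simply documented, a project can at least check its own input formats for the affected specifiers. The sketch below is a hypothetical helper (not part of Django) that lists the text-based directives named in this ticket; treating exactly %a, %A, %b, %B and %p as the affected set is an assumption taken from the comment above.

```python
import re

# Directives that strptime() matches against locale-specific text,
# as named in this ticket (an assumption, not an exhaustive list).
LOCALE_TEXT_DIRECTIVES = frozenset("aAbBp")

# Matches "%" followed by any single character, so "%%" is consumed as one
# unit and the character after it is not mistaken for a directive.
_DIRECTIVE_RE = re.compile(r"%(.)")


def locale_dependent_directives(fmt):
    """Return the text-based directives used in a strptime format string."""
    return [
        match.group(0)
        for match in _DIRECTIVE_RE.finditer(fmt)
        if match.group(1) in LOCALE_TEXT_DIRECTIVES
    ]


if __name__ == "__main__":
    for fmt in ("%Y-%m-%d", "%H:%M:%S %b %d, %Y", "%A %d %B %Y %I:%M %p"):
        print(fmt, locale_dependent_directives(fmt))
```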
comment:15 by , 15 years ago
Related issue: #13437, which also requires a reimplementation of strptime to handle L10N edge cases.
comment:16 by , 15 years ago
Cc: | removed |
---|
comment:17 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:20 by , 12 years ago
Status: | reopened → new |
---|
comment:21 by , 11 years ago
Triage Stage: | Accepted → Someday/Maybe |
---|
It isn't clear to me that Date(Time)Field *should* be using localization in to_python(). L10N belongs in the forms framework because that is the user-facing interface that should be used to gather data. The to_python() method accepts strings for historical purposes, and to support serialization; neither of these uses supports the idea of using localization.