Opened 14 years ago

Last modified 10 years ago

#13339 new Bug

Date(Time)Field.to_python() fails to parse localized month names

Reported by: Ulrich Petri
Owned by: nobody
Component: Forms
Version: 1.1
Severity: Normal
Keywords: i18n l10n
Cc:
Triage Stage: Someday/Maybe
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Date(Time)Field.to_python() uses time.strptime to try and parse the input values.

If the input string contains a localized month name (e.g. "Dezember" for December in German) validation fails. This probably is due to strptime relying on the current locale to parse month names.
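A minimal illustration of the failure, assuming the process runs with the default "C" LC_TIME locale (English month names), which is what a Python process uses unless something calls locale.setlocale(). The helper name `parses` is made up for this sketch:

```python
import time


def parses(value, fmt):
    """Return True if time.strptime() accepts value for fmt."""
    try:
        time.strptime(value, fmt)
    except ValueError:
        return False
    return True


# Assumption: LC_TIME is still the default "C" locale at this point.
print(parses("24 December 2010", "%d %B %Y"))  # English month name
print(parses("24 Dezember 2010", "%d %B %Y"))  # German month name
```

Under that assumption the English input is accepted and the German one is rejected, regardless of which language is active for the current request.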

Test showing this behaviour is attached.

(N.B.: I'm not really clear on why the reverse operation in formats.localize_input() works correctly)

Attachments (1)

ticket_13339_l10n_month_tests.diff (2.1 KB) - added by Ulrich Petri 14 years ago.


Change History (22)

by Ulrich Petri, 14 years ago

comment:1 by Russell Keith-Magee, 14 years ago

milestone: 1.2
Resolution: wontfix
Status: new → closed

It isn't clear to me that Date(Time)Field *should* be using localization in to_python(). L10N belongs in the forms framework because that is the user-facing interface that should be used to gather data. The to_python() method accepts strings for historical purposes, and to support serialization; neither of these uses support the idea of using localization.

comment:2 by Ulrich Petri, 14 years ago

milestone: 1.2
Resolution: wontfix
Status: closed → reopened

I guess there's a misunderstanding here.

What I'm talking about *is* forms.fields.DateField

comment:3 by Russell Keith-Magee, 14 years ago

milestone: 1.2
Resolution: invalid
Status: reopened → closed

Well, then there is a *big* misunderstanding - because forms.DateField doesn't have a to_python() method.

comment:4 by Karen Tracey, 14 years ago (in reply to comment:3)

Replying to russellm:

Well, then there is a *big* misunderstanding - because forms.DateField doesn't have a to_python() method.

??? Yes, it does: http://code.djangoproject.com/browser/django/trunk/django/forms/fields.py#L320

comment:5 by Russell Keith-Magee, 14 years ago

milestone: 1.2
Resolution: invalid
Status: closed → reopened

For my next trick, I will attempt to put my entire lower leg into my mouth.

My apologies - there is a to_python() method in trunk; I was checking against v1.1 source code.

comment:6 by Russell Keith-Magee, 14 years ago

Triage Stage: Unreviewed → Accepted

Ok - This is closely related to #12986, but it appears to be a separate problem.

comment:7 by Russell Keith-Magee, 14 years ago

Ok - to fill in some blanks that have been discussed on IRC:

The issue is that strptime() uses the locale to perform translations. The locale operates across threads (so it isn't threadsafe), and it is expensive to set and unset. Therefore, Python doesn't provide a way to parse dates that is sensitive to a locale of choice - it assumes that an application will require a single locale, not that different threads in a single application will need different locales.

strptime() doesn't provide any hooks to control the locale, and the implementation relies on all sorts of global variables; the only options are:

  1. Reimplement strptime with locale sensitivity
  2. Provide a translation layer to convert a user-specified date into a language parseable by strptime()
  3. Call this a known limitation of the L10N implementation (i.e., that you can't parse dates with %B or any of the text-based date format specifiers).

1 will be a lot of effort, especially at this late stage in development.

2 is very messy, and is also quite complex (since you need to provide LANGUAGE_CODE->Locale translations, not LANGUAGE_CODE->EN translations)

This leaves 3 as the only viable option at this point. Other suggestions welcome, but barring a better suggestion, we'll just have to document the limitation and call this a known issue.
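For reference, a rough sketch of what option 2 could look like. It is an illustration under stated assumptions, not a proposed patch: `GERMAN_MONTHS`, `to_process_locale` and `parse_localized` are hypothetical names, and the table covers a single language only. The target names come from calendar.month_name, which follows the process's current LC_TIME locale, so the input is translated into whatever strptime() expects rather than into hard-coded English:

```python
import calendar
import re
import time

# Hypothetical lookup table: lower-cased German month name -> month number.
# A real translation layer would need one table per supported LANGUAGE_CODE.
GERMAN_MONTHS = {
    "januar": 1, "februar": 2, "märz": 3, "april": 4, "mai": 5, "juni": 6,
    "juli": 7, "august": 8, "september": 9, "oktober": 10, "november": 11,
    "dezember": 12,
}


def to_process_locale(value, months=GERMAN_MONTHS):
    """Rewrite localized month names into the names the process locale uses."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in months) + r")\b",
        re.IGNORECASE,
    )
    # calendar.month_name reflects the current LC_TIME locale, i.e. the
    # names that strptime() will accept for %B right now.
    return pattern.sub(
        lambda m: calendar.month_name[months[m.group(1).lower()]], value
    )


def parse_localized(value, fmt):
    """Translate month names, then hand the string to time.strptime()."""
    return time.strptime(to_process_locale(value), fmt)
```

Even this small version shows why the approach gets messy: abbreviated names (%b), weekday names (%a/%A) and AM/PM markers (%p) would each need their own per-language tables.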

comment:8 by bcurtu, 14 years ago

I ran into this problem when dealing with PayPal's date format ("%H:%M:%S %b %d, %Y PDT" => 12:34:32 Apr 12, 2010 PDT). That's actually a stupid format, but it's PayPal, so... Anyway, I work with the Spanish locale, so it crashed. My workaround was really dirty, but it's the only way I got it to work (for Django 1.1):

In django/forms/fields.py, in the DateTimeField clean method, line 387:

        for format in self.input_formats:
            tmp_value = value
            try:
                try:
                    return datetime.datetime(*time.strptime(tmp_value, format)[:6])
                except ValueError:
                    # Parsing failed: swap the English abbreviations for the
                    # Spanish ones and retry. Translate your months here :(
                    if '%b' in format:
                        tmp_value = (tmp_value.replace('Apr', 'Abr').replace('Aug', 'Ago')
                                     .replace('Dec', 'Dic').replace('Jan', 'Ene'))
                    return datetime.datetime(*time.strptime(tmp_value, format)[:6])
            except ValueError:
                # Neither attempt matched this format; try the next one.
                continue

comment:9 by Roy Smith, 14 years ago

Just want to point out that bcurtu's workaround has a potential problem. If your locale is such that the set of month names and the set of time zone names intersect, you'll replace the time zone as well as the month. That would not be good.
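One way to narrow the replacement, sketched under the assumption that the value and the format split into the same number of whitespace-separated tokens: translate only the token that lines up with %b. The names `EN_TO_ES_ABBR` and `translate_month_token` are made up for this example, and the table is only correct if its values match what strptime() expects for %b in the active locale:

```python
# Hypothetical table: English abbreviations as sent by the remote service ->
# Spanish abbreviations. Only the names that differ are listed.
EN_TO_ES_ABBR = {"Jan": "Ene", "Apr": "Abr", "Aug": "Ago", "Dec": "Dic"}


def translate_month_token(value, fmt, table=EN_TO_ES_ABBR):
    """Translate only the token that lines up with %b in the format."""
    value_tokens = value.split()
    fmt_tokens = fmt.split()
    if len(value_tokens) != len(fmt_tokens):
        return value  # layout differs from the format; leave input untouched
    translated = [
        table.get(token, token) if directive == "%b" else token
        for token, directive in zip(value_tokens, fmt_tokens)
    ]
    # Runs of whitespace collapse to single spaces, which strptime() tolerates.
    return " ".join(translated)
```

With the PayPal format above, only the second token is touched, so a trailing zone name that happens to look like a month abbreviation is left alone. Formats where %b is glued to other text (for example "%d-%b-%Y") are not handled by this sketch.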

comment:10 by Martín Conte Mac Donell, 14 years ago

I don't know exactly what the approach is here, but I assume that the same locale is used across all threads.

As russellm says, strptime uses some global variables as a cache. But actually, there is a dirty hack that you can use if you don't want to set the locale:

<dirty alert>

>>> import time
>>> import _strptime

# English here
>>> time.strptime('December', '%B')
time.struct_time(tm_year=1900, tm_mon=12, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=335, tm_isdst=-1)

>>> _strptime._TimeRE_cache.locale_time.f_month = ['', 'enero', 'febrero', 'marzo', 'abril', 'mayo', 'junio', 'julio', 'agosto', 'septiembre', 'octubre', 'noviembre', 'diciembre']
>>> _strptime._TimeRE_cache.locale_time.a_month = ['', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
>>> _strptime._TimeRE_cache = _strptime.TimeRE(_strptime._TimeRE_cache.locale_time)
>>> _strptime._regex_cache = {}

# Spanish here

>>> time.strptime('Diciembre', '%B')
time.struct_time(tm_year=1900, tm_mon=12, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=335, tm_isdst=-1)

</dirty alert>

comment:11 by Brett Cannon, 14 years ago

As the original author of strptime() in Python, Alex Gaynor asked me to comment here. Basically the comments about strptime() are accurate; it assumes a single locale for the entire process (which can be changed, but that isn't useful in a threaded situation) just as Python does thanks to using C-level locale functions.

Allowing strptime() to accept a specific locale to parse against could possibly get fixed in a future version of Python if strptime() on the datetime module accepted a locale argument. Then it could (in a non-thread-safe way) change the locale, calculate everything it needs, cache it, switch back to the proper locale, and then do the processing. The problem is it wouldn't be thread-safe, but since locale stuff isn't thread-safe already, maybe that is not such a big deal. But at best that would be a Python 3.2 or later feature.
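The swap-and-restore idea can be approximated in user code today. The following is only a sketch, not an existing Python API: `temporary_time_locale` and `strptime_in_locale` are hypothetical helpers, and the lock serializes callers of these helpers only. Any other thread that formats or parses dates while the locale is swapped still sees the temporary locale, which is exactly the thread-safety caveat described above:

```python
import locale
import threading
import time
from contextlib import contextmanager

# Serializes callers of this helper; it does NOT protect unrelated threads.
_locale_lock = threading.Lock()


@contextmanager
def temporary_time_locale(name):
    """Switch LC_TIME to `name` for the duration of the block, then restore."""
    with _locale_lock:
        saved = locale.setlocale(locale.LC_TIME)  # query the current setting
        try:
            # Raises locale.Error if `name` is not installed on the system.
            locale.setlocale(locale.LC_TIME, name)
            yield
        finally:
            locale.setlocale(locale.LC_TIME, saved)


def strptime_in_locale(value, fmt, locale_name):
    """Parse `value` with time.strptime() under the given LC_TIME locale."""
    with temporary_time_locale(locale_name):
        return time.strptime(value, fmt)
```

For example, strptime_in_locale("Dezember", "%B", "de_DE.UTF-8") would parse the German name on a system where that locale is installed. Locale names are platform-specific, so the name used here is an assumption.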

comment:12 by Collin Anderson, 14 years ago

Cc: Collin Anderson added

comment:13 by Russell Keith-Magee, 14 years ago

(In [13039]) Refs #13339 -- Disable %b/%B-based locale datetime input formats, and document that they are problematic.

comment:14 by Russell Keith-Magee, 14 years ago

milestone: 1.2

Moving off the 1.2 milestone; a fix to allow %B/%b (as well as %a/%A and %p, for that matter) will require a whole lot more work, but isn't critical for 1.2.

comment:15 by Russell Keith-Magee, 14 years ago

Related issue: #13437, which also requires a reimplementation of strptime to handle L10N edge cases.

comment:16 by Collin Anderson, 14 years ago

Cc: Collin Anderson removed

comment:17 by Julien Phalip, 13 years ago

Severity: Normal
Type: Bug

comment:18 by Aymeric Augustin, 12 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:19 by Aymeric Augustin, 12 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:20 by Aymeric Augustin, 11 years ago

Status: reopened → new

comment:21 by Claude Paroz, 10 years ago

Triage Stage: Accepted → Someday/Maybe