Opened 6 years ago

Last modified 3 years ago

#13339 new Bug

Date(Time)Field.to_python() fails to parse localized month names

Reported by: Ulrich Petri
Owned by: nobody
Component: Forms
Version: 1.1
Severity: Normal
Keywords: i18n l10n
Cc:
Triage Stage: Someday/Maybe
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Date(Time)Field.to_python() uses time.strptime to try and parse the input values.

If the input string contains a localized month name (e.g. "Dezember" for December in German) validation fails. This probably is due to strptime relying on the current locale to parse month names.
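A minimal sketch of the failure, assuming the process is still running under Python's default "C" LC_TIME locale (i.e. nothing has called locale.setlocale()). The helper name `parses` is made up for illustration and is not part of Django:

```python
import time


def parses(value, fmt):
    """Return True if time.strptime() accepts value for the given format."""
    try:
        time.strptime(value, fmt)
        return True
    except ValueError:
        # strptime() raises ValueError when the input does not match the format.
        return False


# %B is matched against the month names of the *process* locale,
# so under the default C locale only the English name is recognised.
english_ok = parses("24 December 2010", "%d %B %Y")
german_ok = parses("24 Dezember 2010", "%d %B %Y")
```

Under that assumption, `english_ok` should be true and `german_ok` false, regardless of which language Django has activated for the request.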

Test showing this behaviour is attached.

(N.B.: I'm not really clear on why the reverse operation in formats.localize_input() works correctly)

Attachments (1)

ticket_13339_l10n_month_tests.diff (2.1 KB) - added by Ulrich Petri 6 years ago.


Change History (22)

Changed 6 years ago by Ulrich Petri

comment:1 Changed 6 years ago by Russell Keith-Magee

milestone: 1.2
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Resolution: wontfix
Status: new → closed

It isn't clear to me that Date(Time)Field *should* be using localization in to_python(). L10N belongs in the forms framework because that is the user-facing interface that should be used to gather data. The to_python() method accepts strings for historical purposes, and to support serialization; neither of these uses supports the idea of using localization.

comment:2 Changed 6 years ago by Ulrich Petri

milestone: 1.2
Resolution: wontfix
Status: closed → reopened

I guess there's a misunderstanding here.

What I'm talking about *is* forms.fields.DateField

comment:3 Changed 6 years ago by Russell Keith-Magee

milestone: 1.2
Resolution: invalid
Status: reopened → closed

Well, then there is a *big* misunderstanding - because forms.DateField doesn't have a to_python() method.

comment:4 in reply to:  3 Changed 6 years ago by Karen Tracey

Replying to russellm:

Well, then there is a *big* misunderstanding - because forms.DateField doesn't have a to_python() method.

??? Yes, it does: http://code.djangoproject.com/browser/django/trunk/django/forms/fields.py#L320

comment:5 Changed 6 years ago by Russell Keith-Magee

milestone: 1.2
Resolution: invalid
Status: closed → reopened

For my next trick, I will attempt to put my entire lower leg into my mouth.

My apologies - there is a to_python() method in trunk; I was checking against v1.1 source code.

comment:6 Changed 6 years ago by Russell Keith-Magee

Triage Stage: Unreviewed → Accepted

Ok - This is closely related to #12986, but it appears to be a separate problem.

comment:7 Changed 6 years ago by Russell Keith-Magee

Ok - to fill in some blanks that have been discussed on IRC:

The issue is that strptime() uses the locale to perform translations. The locale operates across threads (so it isn't threadsafe), and it is expensive to set and unset. Therefore, Python doesn't provide a way to parse dates that is sensitive to a locale of choice - it assumes that an application will require a single locale, not that different threads in a single application will need different locales.

strptime() doesn't provide any hooks to control locale, and the implementation relies on all sorts of global variables; the only options are:

  1. Reimplement strptime with locale sensitivity
  2. Provide a translation layer to convert a user-specified date into a language parseable by strptime()
  3. Call this a known limitation of the L10N implementation (i.e., that you can't parse dates with %B or any of the text-based date format specifiers).

1 will be a lot of effort, especially at this late stage in development.

2 is very messy, and is also quite complex (since you need to provide LANGUAGE_CODE->Locale translations, not LANGUAGE_CODE->EN translations)

This leaves 3 as the only viable option at this point. Other suggestions welcome, but barring a better suggestion, we'll just have to document the limitation and call this a known issue.
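For illustration only, option 2 could look roughly like the sketch below. It assumes the process locale is the default C/English one, so the translation target is English. The `GERMAN_TO_ENGLISH` table and the `parse_localized` helper are hypothetical names, not existing Django code; a real implementation would need one table per language and, as noted above, a mapping to the actual process locale rather than to English.

```python
import re
import time

# Hypothetical lookup table: German month names -> the names strptime()
# expects when the process runs under the C/English locale.
GERMAN_TO_ENGLISH = {
    "januar": "January", "februar": "February", "märz": "March",
    "april": "April", "mai": "May", "juni": "June",
    "juli": "July", "august": "August", "september": "September",
    "oktober": "October", "november": "November", "dezember": "December",
}

# Match whole words only, longest names first, ignoring case.
_MONTH_RE = re.compile(
    r"\b(" + "|".join(
        re.escape(name)
        for name in sorted(GERMAN_TO_ENGLISH, key=len, reverse=True)
    ) + r")\b",
    re.IGNORECASE,
)


def parse_localized(value, fmt):
    """Translate German month names to English, then hand off to strptime()."""
    translated = _MONTH_RE.sub(
        lambda match: GERMAN_TO_ENGLISH[match.group(1).lower()], value
    )
    return time.strptime(translated, fmt)


parsed = parse_localized("24. Dezember 2010", "%d. %B %Y")
```

The sketch replaces every matching word in the input, not just the one in the %B position, so it shares the messiness described for option 2.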

comment:8 Changed 6 years ago by bcurtu

I ran into this problem with PayPal's date format ("%H:%M:%S %b %d, %Y PDT" => 12:34:32 Apr 12, 2010 PDT). That's actually a stupid format, but it's PayPal, so... Anyway, I work with the Spanish locale, so it crashed. My workaround was really dirty, but it's the only way I got it to work (for Django 1.1):

In django/forms/fields.py, in the DateTimeField clean method, line 387:

        for format in self.input_formats:
            tmp_value = value
            try:
                try:
                    return datetime.datetime(*time.strptime(tmp_value, format)[:6])
                except ValueError:
                    # Retry with the English month abbreviations swapped for
                    # the ones the Spanish locale expects (add your own months here).
                    if '%b' in format:
                        tmp_value = tmp_value.replace('Apr', 'Abr').replace('Aug', 'Ago').replace('Dec', 'Dic').replace('Jan', 'Ene')
                    return datetime.datetime(*time.strptime(tmp_value, format)[:6])
            except ValueError:
                continue

comment:9 Changed 6 years ago by Roy Smith

Just want to point out that bcurtu's workaround has a potential problem. If your locale is such that the set of month names and the set of time zone names intersect, you'll replace the time zone as well as the month. That would not be good.

comment:10 Changed 6 years ago by Martín Conte Mac Donell

I don't know exactly what the approach is here, but I assume that the same locale is used across all threads.

As russellm says, strptime uses some global variables as a cache. But actually, there is a dirty hack that you can do if you don't want to set the locale:

<dirty alert>

>>> import time
>>> import _strptime

# English here
>>> time.strptime('December', '%B')
time.struct_time(tm_year=1900, tm_mon=12, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=335, tm_isdst=-1)

>>> _strptime._TimeRE_cache.locale_time.f_month = ['', 'enero', 'febrero', 'marzo', 'abril', 'mayo', 'junio', 'julio', 'agosto', 'septiembre', 'octubre', 'noviembre', 'diciembre']
>>> _strptime._TimeRE_cache.locale_time.a_month = ['', 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
>>> _strptime._TimeRE_cache = _strptime.TimeRE(_strptime._TimeRE_cache.locale_time)
>>> _strptime._regex_cache = {}

# Spanish here

>>> time.strptime('Diciembre', '%B')
time.struct_time(tm_year=1900, tm_mon=12, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=335, tm_isdst=-1)

</dirty alert>

comment:11 Changed 6 years ago by Brett C.

As the original author of strptime() in Python, Alex Gaynor asked me to comment here. Basically the comments about strptime() are accurate; it assumes a single locale for the entire process (which can be changed, but that isn't useful in a threaded situation) just as Python does thanks to using C-level locale functions.

Allowing strptime() to accept a specific locale to parse against could possibly get fixed in a future version of Python if strptime() on the datetime module accepted a locale argument. Then it could (in a non-thread-safe way) change the locale, calculate everything it needs, cache it, switch back to the proper locale, and then do the processing. The problem is it wouldn't be thread-safe, but since locale stuff isn't already thread-safe then maybe that is not such a big deal. But at best that would be a Python 3.2 or later feature.
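The "change the locale, parse, switch back" idea can be approximated in user code today. The sketch below is only an illustration of that idea, not the proposed datetime API; `temporary_lc_time` and `strptime_in_locale` are hypothetical names. The lock only serialises callers of this helper: any other thread that formats or parses dates while the locale is switched will still see the temporary setting, which is exactly the thread-safety problem described above.

```python
import contextlib
import locale
import threading
import time

# Serialises callers of this helper; it cannot protect unrelated threads.
_LOCALE_LOCK = threading.Lock()


@contextlib.contextmanager
def temporary_lc_time(name):
    """Switch LC_TIME to `name` for the duration of the with-block."""
    with _LOCALE_LOCK:
        saved = locale.setlocale(locale.LC_TIME)  # query the current setting
        try:
            locale.setlocale(locale.LC_TIME, name)  # locale.Error if unavailable
            yield
        finally:
            locale.setlocale(locale.LC_TIME, saved)  # always switch back


def strptime_in_locale(value, fmt, locale_name):
    """Parse `value` with time.strptime() while LC_TIME is `locale_name`."""
    with temporary_lc_time(locale_name):
        return time.strptime(value, fmt)


# "C" is always available; a name such as "de_DE.UTF-8" only works if that
# locale is installed on the machine.
result = strptime_in_locale("December", "%B", "C")
```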

comment:12 Changed 6 years ago by Collin Anderson

Cc: Collin Anderson added

comment:13 Changed 6 years ago by Russell Keith-Magee

(In [13039]) Refs #13339 -- Disable %b/%B-based locale datetime input formats, and document that they are problematic.
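As a rough illustration of what restricting input to numeric formats means in practice, the sketch below tries a list of formats that contain no %b/%B and therefore do not depend on the process locale. The format list and the `parse_numeric_date` helper are made up for this example and are not the formats changed in [13039].

```python
import datetime

# Hypothetical numeric-only input formats (no %b/%B, so no locale dependency).
NUMERIC_INPUT_FORMATS = ["%Y-%m-%d", "%d.%m.%Y", "%d.%m.%y"]


def parse_numeric_date(value):
    """Return a date for the first numeric format that matches `value`."""
    for fmt in NUMERIC_INPUT_FORMATS:
        try:
            return datetime.datetime.strptime(value, fmt).date()
        except ValueError:
            continue  # try the next format
    raise ValueError("no numeric input format matched %r" % (value,))


christmas_eve = parse_numeric_date("24.12.2010")
```

Input such as "24. Dezember 2010" is rejected by this helper with a ValueError, which mirrors the documented limitation.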

comment:14 Changed 6 years ago by Russell Keith-Magee

milestone: 1.2

Moving off the 1.2 milestone; a fix to allow %B/%b (as well as %a/%A and %p for that matter) will require a whole lot more work, but isn't critical for 1.2

comment:15 Changed 6 years ago by Russell Keith-Magee

Related issue: #13437, which also requires a reimplementation of strptime to handle L10N edge cases.

comment:16 Changed 6 years ago by Collin Anderson

Cc: Collin Anderson removed

comment:17 Changed 5 years ago by Julien Phalip

Severity: Normal
Type: Bug

comment:18 Changed 5 years ago by Aymeric Augustin

UI/UX: unset

Change UI/UX from NULL to False.

comment:19 Changed 5 years ago by Aymeric Augustin

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:20 Changed 4 years ago by Aymeric Augustin

Status: reopened → new

comment:21 Changed 3 years ago by Claude Paroz

Triage Stage: Accepted → Someday/Maybe