Opened 19 years ago
Closed 18 years ago
#3630 closed (invalid)
Newforms clean_data causes encoding issue with database
| Reported by: | Baptiste | Owned by: | Adrian Holovaty |
|---|---|---|---|
| Component: | Forms | Version: | dev |
| Severity: | | Keywords: | accents |
| Cc: | mir@… | Triage Stage: | Unreviewed |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
"form" is a newform and "Article" is a model.
new_article = Article(text=form.clean_data['article'])
If the field article contains accents, they are replaced by "?" in the database (same thing if they are displayed with a view/template).
Encoding of the page is utf8.
It would be interesting to check whether this is a newforms issue by trying to create an Article with accents in the admin of the newforms-admin branch. I couldn't get it to work (the admin said I didn't have the rights), so I could not test it myself.
Change History (7)
comment:1 by , 19 years ago
comment:2 by , 19 years ago
Oops, I have forgotten to say that the "\xc3\xa9" was transformed to "\xe9" with the decode.
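The change from "\xc3\xa9" to "\xe9" is what a UTF-8 decode is expected to produce: the two bytes 0xC3 0xA9 are the UTF-8 encoding of the single code point U+00E9 (é). A minimal sketch follows, written for Python 3 as an assumption (the ticket itself uses Python 2, where `str` holds bytes and `unicode` holds text); the variable names are illustrative only.

```python
# Python 3 sketch: bytes plays the role of Python 2's str,
# and str plays the role of Python 2's unicode.
raw = b"Esp\xc3\xa9rance"      # UTF-8 bytes, as they appear when printing the POST data

text = raw.decode("utf-8")     # decode the bytes into a text string

# The two bytes 0xC3 0xA9 collapse into the single code point U+00E9.
assert text == "Esp\u00e9rance"
assert len(raw) == 10 and len(text) == 9

# Encoding back to UTF-8 restores the original bytes, so nothing was lost.
assert text.encode("utf-8") == raw
```

So the decode itself does not damage the accent; the value only changes from a byte string to a text string.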
comment:3 by , 19 years ago
If I convert the fields of my database to UTF-8 and change the encoding of the data:
    for i in data:
        data[i] = data[i].decode("latin1", "replace")
The form created from this data is okay and I can do:
    new = Article(
        name_user=form.clean_data['author_name'],
        ...
    )
Accents are recorded correctly in the database, but they are no longer displayed correctly, so I need to keep a reference to the old data and display a form that uses it.
Not sexy...
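One possible reason for the display problem, offered as an assumption and not as a confirmed diagnosis: if the incoming bytes are UTF-8, decoding them as latin1 maps every byte to its own character, so a single "é" becomes two characters. The sketch below is in Python 3 (again an assumption, since the ticket uses Python 2) and only illustrates that general effect.

```python
raw = b"\xc3\xa9"                    # UTF-8 encoding of "é"

as_utf8 = raw.decode("utf-8")        # one character: U+00E9
as_latin1 = raw.decode("latin-1")    # two characters: U+00C3 and U+00A9

assert as_utf8 == "\u00e9"
assert as_latin1 == "\u00c3\u00a9"   # displays as "Ã©"

# Storing the latin-1 result in a UTF-8 database writes four bytes, not two.
assert as_latin1.encode("utf-8") == b"\xc3\x83\xc2\xa9"
```

Because latin1 can decode any byte, this never raises an error, which can hide a wrong choice of codec.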
comment:4 by , 19 years ago
For me it works this way:
    for i in data:
        if isinstance(data[i], unicode):
            data[i] = data[i].encode(settings.DEFAULT_CHARSET)
Although it is ugly.
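The same workaround can be wrapped in a small helper. This is only a sketch under stated assumptions: it is written for Python 3 (where `str` corresponds to Python 2's `unicode`), `encode_text_values` is a hypothetical name, and the `DEFAULT_CHARSET` constant stands in for `settings.DEFAULT_CHARSET` so the example runs without Django.

```python
DEFAULT_CHARSET = "utf-8"  # hypothetical stand-in for settings.DEFAULT_CHARSET


def encode_text_values(data, charset=DEFAULT_CHARSET):
    """Return a copy of data with every text value encoded to bytes."""
    return {
        key: value.encode(charset) if isinstance(value, str) else value
        for key, value in data.items()
    }


# Illustrative input shaped like a cleaned-data dictionary.
cleaned = {"author_name": "Baptiste", "article": "Esp\u00e9rance", "rating": 5}
encoded = encode_text_values(cleaned)

assert encoded["article"] == b"Esp\xc3\xa9rance"  # text values become bytes
assert encoded["rating"] == 5                      # non-text values are untouched
assert cleaned["article"] == "Esp\u00e9rance"      # the input dict is not modified
```

Returning a new dictionary instead of mutating in place avoids changing the form's own data while iterating over it.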
comment:5 by , 19 years ago
| Cc: | added |
|---|
comment:6 by , 19 years ago
| Summary: | Newforms and accents don't like each others → Newforms clean_data causes encoding issue with database |
|---|
comment:7 by , 18 years ago
| Resolution: | → invalid |
|---|---|
| Status: | new → closed |
This should be fixed since the unicode merge. Can someone reopen this if it is NOT fixed?
Some information:
If I "print" the POST data, accents are encoded like this:
\xc3\xa9
(é)
I tried to convert them to UTF-8:
    for i in data:
        data[i] = data[i].decode("utf8", "replace")
    print data
But now the string recorded in the db stops at the first accent. "Espérance" is recorded as "Esp", and the server gives me this message: