Opened 17 years ago

Closed 17 years ago

#3630 closed (invalid)

Newforms clean_data causes encoding issue with database

Reported by: Baptiste
Owned by: Adrian Holovaty
Component: Forms
Version: dev
Severity:
Keywords: accents
Cc: mir@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

"form" is a newform and "Article" is a model.

new_article = Article(text=form.clean_data['article'])

If the "article" field contains accented characters, they are replaced by "?" in the database (and likewise when they are displayed through a view/template).
The page encoding is UTF-8.
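
One way a "?" can appear in place of an accent is an encode step that uses the "replace" error handler with a charset that cannot represent the character. The ticket does not confirm that this is what happens here; the following is only a minimal sketch of the mechanism, written in Python 3 syntax (an assumption, since the ticket itself runs on Python 2.4). The variable names are hypothetical.

```python
# Illustration only: shows how "replace" turns an unencodable character into "?".
text = "Esp\u00e9rance"  # "Espérance", with é as the single code point U+00E9

# ASCII cannot represent é, so the "replace" handler substitutes "?".
as_ascii = text.encode("ascii", "replace")

# UTF-8 can represent it, using the two bytes 0xC3 0xA9.
as_utf8 = text.encode("utf-8")

print(as_ascii)
print(as_utf8)
```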

It would be interesting to check whether this is a newforms issue by creating an Article containing accents in the admin of the newforms-admin branch. I couldn't test this myself because I couldn't get it to work (the admin said I didn't have the rights).

Change History (7)

comment:1 by Baptiste, 17 years ago

Some more information:
If I "print" the POST data, accents are encoded like this:
\xc3\xa9
(é)
I tried to decode them as UTF-8:

    for i in data:
        data[i] = data[i].decode("utf8", "replace")
    print data

But now the string recorded in the db stops at the first accent: "Espérance" is recorded as "Esp", and the server gives me this message:

/usr/lib/python2.4/site-packages/django/db/backends/mysql/base.py:42: Warning: Data truncated for column 'username' at row 1
  return self.cursor.execute(sql, params)

comment:2 by Baptiste, 17 years ago

Oops, I forgot to mention that the decode transformed "\xc3\xa9" into "\xe9".
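
For readers following along: inside a unicode string, "\xe9" is the single code point U+00E9 (é), so the decode itself appears to have produced the right character. A possible explanation for the truncation, not confirmed anywhere in this ticket, is that the character later reached MySQL as the lone Latin-1 byte 0xE9, which is not valid UTF-8. The sketch below uses Python 3 syntax (an assumption; the ticket is on Python 2.4), and is_valid_utf8 is a hypothetical helper.

```python
def is_valid_utf8(raw_bytes):
    """Return True if raw_bytes can be decoded as UTF-8 (hypothetical helper)."""
    try:
        raw_bytes.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = b"\xc3\xa9"              # the two bytes seen in the POST data
decoded = raw.decode("utf-8")  # one character: U+00E9 (é)

# Encoding that character as Latin-1 yields the single byte 0xE9,
# which on its own is not a valid UTF-8 sequence.
latin1_bytes = decoded.encode("latin-1")

print(len(decoded), hex(ord(decoded)))
print(is_valid_utf8(raw), is_valid_utf8(latin1_bytes))
```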

comment:3 by anonymous, 17 years ago

If I convert the fields of my database to UTF-8 and change the decoding of the data:

    for i in data:
        data[i] = data[i].decode("latin1", "replace")

The form created with this data is okay and I can do:

    new = Article(
        name_user=form.clean_data['author_name'],
        ...
    )

Accents are then recorded correctly in the database, but they are no longer displayed correctly, so I need to keep a reference to the old data and display a form that uses it.
Not sexy...
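
The display problem described above is consistent with UTF-8 bytes being decoded as Latin-1: each two-byte accent becomes two separate characters. Whether that is the exact cause here is an assumption. A minimal sketch in Python 3 syntax (the ticket uses Python 2.4), with hypothetical variable names:

```python
# UTF-8 bytes as they would arrive from a UTF-8 encoded page.
utf8_bytes = "Esp\u00e9rance".encode("utf-8")

# Decoding them as Latin-1 never fails, but it maps 0xC3 and 0xA9
# to two separate characters instead of one é.
misdecoded = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the original text.
recovered = misdecoded.encode("latin-1").decode("utf-8")

print(misdecoded)
print(recovered)
```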

comment:4 by Nuno Mariz, 17 years ago

For me, it works this way:

for i in data:
    if isinstance(data[i], unicode):
        data[i] = data[i].encode(settings.DEFAULT_CHARSET)

Although it is ugly.
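
The same workaround can be sketched as a small function that leaves non-text values untouched. This is an illustration in Python 3 syntax, not code from Django: encode_text_values is a hypothetical helper, and the charset parameter stands in for settings.DEFAULT_CHARSET so that the example runs without a Django project.

```python
def encode_text_values(data, charset="utf-8"):
    """Return a copy of data with every text value encoded to bytes.

    Hypothetical helper; charset plays the role of settings.DEFAULT_CHARSET.
    """
    return {
        key: value.encode(charset) if isinstance(value, str) else value
        for key, value in data.items()
    }


form_data = {"author_name": "Esp\u00e9rance", "age": 3}
encoded = encode_text_values(form_data)
print(encoded)
```

Returning a new dictionary instead of modifying the input in place keeps the original values available, which avoids having to hold a separate reference to the old data.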

comment:5 by anonymous, 17 years ago

Cc: mir@… added

comment:6 by Adrian Holovaty, 17 years ago

Summary: Newforms and accents don't like each others → Newforms clean_data causes encoding issue with database

comment:7 by Simon G. <dev@…>, 17 years ago

Resolution: invalid
Status: new → closed

This should be fixed now that the unicode merge has landed. Can someone reopen this if it is NOT fixed?
