Opened 12 years ago

Closed 11 years ago

#3630 closed (invalid)

Newforms clean_data causes encoding issue with database

Reported by: Baptiste
Owned by: Adrian Holovaty
Component: Forms
Version: master
Severity:
Keywords: accents
Cc: mir@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no


"form" is a newform and "Article" is a model.

new_article = Article(text=form.clean_data['article'])

If the article field contains accents, they are replaced by "?" in the database (and likewise when they are displayed through a view/template).
The page encoding is UTF-8.

It would be interesting to check whether this is a newforms issue by trying to create an Article with accents in the admin of the newforms-admin branch. I couldn't get it to work (the admin said I didn't have the rights), so I couldn't test it myself.
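
One common way accented characters end up as "?" is a lossy conversion to a character set that cannot represent them. The Python 3 sketch below only illustrates that mechanism; it is an assumption about the cause, not a diagnosis of this ticket, and the sample string is made up.

```python
# Hypothetical field value containing an accent (not taken from the ticket).
text = "Esp\u00e9rance"

# Encoding to a charset that lacks the accented letter, with errors="replace",
# substitutes "?" for it.
lossy = text.encode("ascii", "replace")

# Encoding to UTF-8 keeps the character as the two bytes 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

print(lossy)
print(utf8_bytes)
```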

Change History (7)

comment:1 Changed 12 years ago by Baptiste

Some information:
If I print the POST data, accents are encoded like this:
I tried to convert them to UTF-8:

		for i in data :
			data[i] = data[i].decode("utf8","replace")
		print data

But now the string recorded in the db stops at the first accent. "Espérance" is recorded as "Esp", and the server gives me this message:

/usr/lib/python2.4/site-packages/django/db/backends/mysql/ Warning: Data truncated for column 'username' at row 1
  return self.cursor.execute(sql, params)

comment:2 Changed 12 years ago by Baptiste

Oops, I forgot to say that the "\xc3\xa9" was transformed to "\xe9" by the decode.
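
The transformation described above can be reproduced in a few lines. This is a Python 3 sketch (the ticket itself uses Python 2.4), and the link to the truncation warning is an assumption: if the decoded text later reaches a UTF-8 column as the single byte 0xE9, the byte sequence stops being valid UTF-8 at the accent.

```python
raw = b"Esp\xc3\xa9rance"             # UTF-8 bytes, as in the POST data
text = raw.decode("utf8", "replace")  # "\xc3\xa9" becomes the code point U+00E9

# If the text is then written out as Latin-1, the accent is the single byte 0xE9.
latin1_bytes = text.encode("latin1")

# Find the longest prefix of those bytes that is still valid UTF-8.
try:
    latin1_bytes.decode("utf8")
    valid_prefix = latin1_bytes
except UnicodeDecodeError as exc:
    valid_prefix = latin1_bytes[:exc.start]

print(repr(text), latin1_bytes, valid_prefix)
```

Under that assumption, the valid prefix matches the "Esp" that ended up in the database.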

comment:3 Changed 12 years ago by anonymous

If I convert the fields of my database to UTF-8 and change the encoding of the data:

		for i in data :
			data[i] = data[i].decode("latin1","replace")

The form created with this data is okay and I can do:

		   	new = Article(

Accents are then recorded correctly in the database, but they are no longer displayed correctly, so I need to keep a reference to the original data and display a form built from it.
Not sexy...
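
This behaviour is consistent with UTF-8 bytes being decoded as Latin-1. The Python 3 sketch below assumes the POST data arrives as UTF-8 bytes and that the database connection writes text as Latin-1; neither is confirmed in the ticket.

```python
raw = b"\xc3\xa9"                      # the accented letter encoded as UTF-8

# Decoding as Latin-1 turns each byte into its own character (mojibake),
# which is what a template would then display.
wrong = raw.decode("latin1")

# Writing that string back out as Latin-1 restores the original UTF-8 bytes,
# so a UTF-8 column would still receive correct data.
round_trip = wrong.encode("latin1")

# Decoding with the right charset gives the single intended character.
right = raw.decode("utf8")

print(repr(wrong), round_trip, repr(right))
```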

comment:4 Changed 12 years ago by Nuno Mariz

For me it works this way:

for i in data :
	if isinstance(data[i], unicode):
		data[i] = data[i].encode(settings.DEFAULT_CHARSET)

Although it is ugly.
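
A Python 3 rendering of the same idea, as a sketch only: `encode_values` is a hypothetical helper, and the charset is passed in explicitly rather than read from `settings.DEFAULT_CHARSET` so that the example runs without Django.

```python
def encode_values(data, charset="utf-8"):
    """Return a copy of data with every text value encoded to bytes."""
    return {
        key: value.encode(charset) if isinstance(value, str) else value
        for key, value in data.items()
    }


# Hypothetical cleaned form data; non-text values are left untouched.
cleaned = {"article": "Esp\u00e9rance", "count": 3}
encoded = encode_values(cleaned)
print(encoded)
```

Returning a new dictionary instead of mutating `data` in place avoids changing the form's own cleaned data as a side effect.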

comment:5 Changed 12 years ago by anonymous

Cc: mir@… added

comment:6 Changed 12 years ago by Adrian Holovaty

Summary: Newforms and accents don't like each others → Newforms clean_data causes encoding issue with database

comment:7 Changed 11 years ago by Simon G. <dev@…>

Resolution: invalid
Status: new → closed

This should be fixed since the unicode merge. Can someone reopen this if it is NOT fixed?
