Opened 7 years ago

Closed 7 years ago

#3630 closed (invalid)

Newforms clean_data causes encoding issue with database

Reported by: Baptiste
Owned by: adrian
Component: Forms
Version: master
Severity:
Keywords: accents
Cc: mir@…
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

"form" is a newforms form and "Article" is a model.

new_article = Article(text=form.clean_data['article'])

If the "article" field contains accented characters, they are replaced by "?" in the database (and likewise when they are displayed through a view/template).
The page encoding is UTF-8.
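
For readers unfamiliar with the symptom, here is a minimal sketch of one common way accented characters turn into "?": encoding text into a charset that cannot represent them, with the "replace" error handler. This is written for Python 3 (the ticket itself predates it) and is only an illustration; the ticket does not confirm that this is the actual cause.

```python
# Illustration only: the sample word comes from the ticket, the cause is an assumption.
text = "Esp\u00e9rance"  # "Espérance"

# A charset without "é" plus errors="replace" silently substitutes "?".
as_ascii = text.encode("ascii", "replace")

# UTF-8 and Latin-1 can both represent "é", but with different bytes.
as_utf8 = text.encode("utf-8")      # two bytes for "é": 0xC3 0xA9
as_latin1 = text.encode("latin-1")  # one byte for "é": 0xE9

print(as_ascii, as_utf8, as_latin1)
```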

It would be interesting to check whether this is a newforms issue by creating an Article with accents in the admin of the newforms-admin branch. I couldn't get it to work (the admin said I didn't have the rights), so I could not test it myself.

Attachments (0)

Change History (7)

comment:1 Changed 7 years ago by Baptiste

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Some information:
If I print the POST data, accents are encoded like this:
\xc3\xa9
(é)
I tried to decode them as UTF-8:

    # Python 2: decode each raw POST byte string as UTF-8 into a unicode object
    for key in data:
        data[key] = data[key].decode("utf8", "replace")
    print data

But now the string stored in the database stops at the first accent: "Espérance" is stored as "Esp", and the server gives me this message:

/usr/lib/python2.4/site-packages/django/db/backends/mysql/base.py:42: Warning: Data truncated for column 'username' at row 1
  return self.cursor.execute(sql, params)

comment:2 Changed 7 years ago by Baptiste

Oops, I forgot to say that the decode turned "\xc3\xa9" into "\xe9".
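
The change from "\xc3\xa9" to "\xe9" is what a UTF-8 decode is expected to produce: two UTF-8 bytes become the single code point U+00E9. The sketch below, in Python 3 rather than the ticket's Python 2.4, shows this. It also shows that the Latin-1 byte 0xE9 is not valid UTF-8, which is one possible (unconfirmed) explanation for a string being cut at "Esp" if such bytes reach a UTF-8 column.

```python
# Illustration only; the truncation explanation is an assumption, not confirmed by the ticket.
raw = b"Esp\xc3\xa9rance"              # UTF-8 bytes as printed from the POST data
text = raw.decode("utf-8", "replace")  # text with the single code point U+00E9

# The same text as Latin-1 bytes: "é" is the single byte 0xE9.
latin1_bytes = text.encode("latin-1")

# Interpreting those Latin-1 bytes as UTF-8 fails at the accent.
try:
    latin1_bytes.decode("utf-8")
    valid_prefix = latin1_bytes
except UnicodeDecodeError as exc:
    # exc.start is the index of the first offending byte.
    valid_prefix = latin1_bytes[:exc.start]

print(repr(text), latin1_bytes, valid_prefix)
```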

comment:3 Changed 7 years ago by anonymous

If I convert the fields of my database to UTF-8 and change the encoding of the data:

    # Python 2: decode each raw POST byte string as Latin-1 instead of UTF-8
    for key in data:
        data[key] = data[key].decode("latin1", "replace")

The form created from this data is fine and I can do:

    new = Article(
        name_user=form.clean_data['author_name'],
        ...
    )

Accents are stored correctly in the database, but they are no longer displayed correctly, so I need to keep a reference to the old data and display a form that uses it.
Not sexy...

comment:4 Changed 7 years ago by Nuno Mariz

It works for me this way:

    for key in data:
        if isinstance(data[key], unicode):
            data[key] = data[key].encode(settings.DEFAULT_CHARSET)

Although it is ugly.
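
For comparison, a rough Python 3 rendering of the same workaround might look like the sketch below. The helper name encode_text_values is hypothetical, and the DEFAULT_CHARSET constant is only a stand-in for settings.DEFAULT_CHARSET so that the snippet runs without a Django project.

```python
# Stand-in for django.conf.settings.DEFAULT_CHARSET (assumed to be UTF-8 here).
DEFAULT_CHARSET = "utf-8"


def encode_text_values(data, charset=DEFAULT_CHARSET):
    """Return a copy of data with every text value encoded to bytes.

    Hypothetical helper: non-text values are passed through unchanged.
    """
    return {
        key: value.encode(charset) if isinstance(value, str) else value
        for key, value in data.items()
    }


encoded = encode_text_values({"author_name": "Esp\u00e9rance", "age": 30})
print(encoded)
```

Building a new dict instead of assigning into the original while iterating avoids surprises when the input mapping is read-only.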

comment:5 Changed 7 years ago by anonymous

  • Cc mir@… added

comment:6 Changed 7 years ago by adrian

  • Summary changed from Newforms and accents don't like each others to Newforms clean_data causes encoding issue with database

comment:7 Changed 7 years ago by Simon G. <dev@…>

  • Resolution set to invalid
  • Status changed from new to closed

This should be fixed since the unicode merge. Can someone reopen this if it is NOT fixed?
