Opened 17 years ago

Closed 17 years ago

#2810 closed defect (duplicate)

[patch] mysql encoding broken after upgrade from <4.1 to 5.0

Reported by: dummy@… Owned by: Adrian Holovaty
Component: contrib.admin Version:
Severity: normal Keywords:
Cc: farcepest@… Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hi,

I had a Django database on MySQL 4.0.21 with tables encoded in latin-1. After upgrading to MySQL 5.0.x, Django thinks that MySQL uses utf-8 tables, but the encoding hasn't changed.

I think that a configurable database encoding would fix this. The default should be utf-8, since this won't break the current behavior of Django.

Regards,
Dirk

Attachments (5)

mysql-encoding.diff (1.4 KB ) - added by dummy@… 17 years ago.
names-utf8-browser-utf8.png (37.4 KB ) - added by dummy@… 17 years ago.
hardcopy NAMES utf8, browser encoding utf8
names-utf8-browser-iso88591.png (39.4 KB ) - added by dummy@… 17 years ago.
SET NAMES utf8, browser encoding iso-8859-1
names-latin1-browser-utf8.png (35.1 KB ) - added by dummy@… 17 years ago.
SET NAMES latin1, browser encoding utf8
names-latin1-browser-iso88591.png (37.5 KB ) - added by dummy@… 17 years ago.
SET NAMES latin1, browser encoding iso-8859-1

Change History (12)

by dummy@…, 17 years ago

Attachment: mysql-encoding.diff added

comment:1 by Andy Dustman <farcepest@…>, 17 years ago

Cc: farcepest@… added

SET NAMES only changes the character set the client uses to talk to the server; it doesn't affect the character set of existing databases, tables, or columns, and the server transcodes into the correct character set. Since you are upgrading from 4.0 to 5.0, you may have to check your existing schema and make sure they are really using latin-1.

Are you getting an error?

by dummy@…, 17 years ago

Attachment: names-utf8-browser-utf8.png added

hardcopy NAMES utf8, browser encoding utf8

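by dummy@…, 17 years ago

Attachment: names-utf8-browser-iso88591.png added

SET NAMES utf8, browser encoding iso-8859-1

by dummy@…, 17 years ago

Attachment: names-latin1-browser-utf8.png added

SET NAMES latin1, browser encoding utf8

by dummy@…, 17 years ago

Attachment: names-latin1-browser-iso88591.png added

SET NAMES latin1, browser encoding iso-8859-1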

comment:2 by dummy@…, 17 years ago

I made some hardcopies (screenshots) to show the different behavior of 'SET NAMES utf8/latin1' combined with browser encoding 'utf-8/iso-8859-1'.

The normal encoding for django pages in the browser is 'utf-8'.
The MySQL tables were created with the encoding latin-1/iso-8859-1.

Since all output is fine with the combination 'SET NAMES latin1' and browser encoding 'utf-8', I based my suggested patch on it.

There are no errors, only incorrectly encoded characters.
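
For illustration only (this is not part of the attached patch), the kind of garbling seen in the screenshots can be reproduced without a database. The sketch below uses Python 3 and a made-up sample string; it simply shows what happens when bytes written in one encoding are read as the other.

```python
# Sample text containing a non-ASCII character (illustrative, not from the ticket).
text = "f\u00fcr"  # "für"

latin1_bytes = text.encode("latin-1")  # one byte for the umlaut: 0xFC
utf8_bytes = text.encode("utf-8")      # two bytes for the umlaut: 0xC3 0xBC

# UTF-8 bytes read as latin-1: every byte becomes its own character.
utf8_read_as_latin1 = utf8_bytes.decode("latin-1")

# latin-1 bytes read as UTF-8: 0xFC is not a valid UTF-8 sequence, so it is
# replaced by U+FFFD when errors="replace" is used.
latin1_read_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(repr(utf8_read_as_latin1))
print(repr(latin1_read_as_utf8))
```

No exception is raised in either direction here, which matches the observation that there are no errors, only wrong characters.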

comment:3 by Andy Dustman <farcepest@…>, 17 years ago

Can you try my patch on #2635? I have previously been suspicious of using SET NAMES to change the character set (it really doesn't work right with the MySQLdb internals) and this may be a case that demonstrates it. The patched version uses an API call to set the character set in both directions, and I think from re-reading the docs today that SET NAMES probably only sets the character set from client to server and not the reverse direction, whereas db.set_character_set() should do both. Note that you will need MySQLdb-1.2.1 or newer (1.2.2b1) for this to work.

comment:4 by dummy@…, 17 years ago

I tried your patch for django and mysql5 today. It has the same problem as it has with 'SET NAMES utf8'.

If I change two lines of your code, my problem is solved in the same way as I did it with the patch above: 'use_unicode': False, 'charset': 'latin1',

I would suggest a configurable DATABASE_ENCODING for the mysql backend.
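
A minimal sketch of what such a setting could look like, assuming a hypothetical DATABASE_ENCODING key that falls back to utf-8 (the current behavior). The helper name mysql_connect_kwargs and the plain-dict settings are illustrative assumptions, not Django's actual backend code; 'charset' and 'use_unicode' are the two MySQLdb connection keywords mentioned above.

```python
def mysql_connect_kwargs(settings):
    """Build keyword arguments for MySQLdb.connect() from a settings dict.

    DATABASE_ENCODING is a hypothetical setting proposed in this ticket;
    when it is absent, 'utf8' is used so existing setups behave as before.
    """
    encoding = settings.get("DATABASE_ENCODING", "utf8")
    return {
        "db": settings["DATABASE_NAME"],
        "user": settings.get("DATABASE_USER", ""),
        "charset": encoding,   # connection character set, e.g. 'latin1'
        "use_unicode": False,  # as in the two changed lines quoted above
    }


# Example: a legacy latin-1 database.
kwargs = mysql_connect_kwargs(
    {"DATABASE_NAME": "legacy", "DATABASE_ENCODING": "latin1"}
)
print(kwargs)
# With MySQLdb 1.2.1 or newer installed, these would be passed on as:
#   connection = MySQLdb.connect(**kwargs)
```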

comment:5 by lakin@…, 17 years ago

I'm using a legacy database (not my choice) that is MySQL 4.1. It has the encoding set to latin1 by default for the databases, tables, and server. Currently the svn code will not work with it, because it uses SET NAMES 'utf8', which sets character_set_client, character_set_results and character_set_connection to 'utf8' [1]. The problem is that the server is using latin1, which causes collation errors, because character_set_connection = 'utf8' also sets collation_connection to the default collation for 'utf8':

OperationalError at /
(1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like'")

If I change mysql/base.py to use SET CHARACTER SET 'utf8', it works, because that sets collation_connection to the value of collation_database, which is correct [1]. It still sets character_set_client and character_set_results to 'utf8'.


[1] - http://dev.mysql.com/doc/refman/4.1/en/charset-connection.html
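
To make the difference between the two statements concrete, here is a small sketch that lists the session-variable assignments each one is equivalent to, following the MySQL manual page cited in [1]. The function name equivalent_statements is made up for illustration; nothing here talks to a server.

```python
def equivalent_statements(statement, charset):
    """Return the assignments that SET NAMES / SET CHARACTER SET expand to."""
    if statement == "SET NAMES":
        # collation_connection is implicitly set to the default collation
        # of `charset` (utf8_general_ci for utf8 in the error shown above).
        return [
            "SET character_set_client = %s" % charset,
            "SET character_set_results = %s" % charset,
            "SET character_set_connection = %s" % charset,
        ]
    if statement == "SET CHARACTER SET":
        # The connection collation follows the database instead, which also
        # implicitly sets character_set_connection to the database's charset.
        return [
            "SET character_set_client = %s" % charset,
            "SET character_set_results = %s" % charset,
            "SET collation_connection = @@collation_database",
        ]
    raise ValueError("unknown statement: %r" % statement)


for name in ("SET NAMES", "SET CHARACTER SET"):
    print(name, "->", equivalent_statements(name, "utf8"))
```

With a latin1 database, only the first variant leaves the connection collation at a utf8 collation, which is where the "Illegal mix of collations" error comes from.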

comment:6 by lakin@…, 17 years ago

As an update: I've looked a bit further at this problem, and I'm no longer certain that my suggested change is appropriate. See #2896.

comment:7 by Adrian Holovaty, 17 years ago

Resolution: duplicate
Status: new → closed

Duplicate of #952.
