Opened 3 years ago

Closed 2 years ago

Last modified 2 years ago

#19194 closed Bug (duplicate)

dumpdata cannot support utf-8 encoding in mysql5.5

Reported by: tony.li@…
Owned by: nobody
Component: Core (Management commands)
Version: 1.4
Severity: Normal
Keywords: dumpdata
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

I used dumpdata to generate test data for unit testing.
When I load the .json / .xml fixture files from test.py,
the console always shows a warning about an incorrect string value:

Warning: Incorrect string value: '\xE7\x94\xB7\xE8\xA3\x9D' for column 'DIS_NAME' at row 1

When I check the value with assertEqual, it contains only '?' characters, so I think the database did not store the string correctly in the first place.

PS: the 'DIS_NAME' value is in Chinese.
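The bytes quoted in the warning are valid UTF-8, which suggests the fixture itself is fine and the problem is on the storage side. A minimal sketch (Python 3) that decodes them and shows how a lossy conversion produces '?' characters. The latin-1 target here is only an assumption standing in for a non-UTF-8 column charset; the ticket does not say which charset the test database actually used.

```python
# Bytes copied from the MySQL warning in this ticket.
raw = b"\xE7\x94\xB7\xE8\xA3\x9D"

# They decode cleanly as UTF-8 into two Chinese characters.
text = raw.decode("utf-8")
print(text)        # 男裝
print(len(text))   # 2

# Assumption for illustration: a charset that cannot represent these
# characters (latin-1 here). Each unencodable character becomes '?'.
lossy = text.encode("latin-1", errors="replace")
print(lossy)       # b'??'
```

If the test database column cannot hold these characters, a similar replacement would explain why the test only sees '?'.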

So I tried decoding and re-encoding as shown in http://stackoverflow.com/questions/2137501/django-dumpdata-utf-8-unicode,
which gave me:
DeserializationError: Invalid control character at: line 1 column 4793 (char 4793)

The production and dev servers run with no problem; it only happens when loading fixtures in unit tests.

Attachments (3)

init_data.json (60.5 KB) - added by tony.li@… 3 years ago.
fixture by dumpdata --format=json
init_data.xml (115.3 KB) - added by tony.li@… 3 years ago.
fixture by dumpdata --format=xml
dump.xml (6.2 KB) - added by tony.li@… 3 years ago.
mysql dump of the table

Change History (7)

comment:1 Changed 3 years ago by claudep

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Would it be possible for you to provide us with some sample data? You talk about dumpdata, but then explain loading issues. Currently, the ticket does not contain enough information for us to debug your problem.

Changed 3 years ago by tony.li@…

fixture by dumpdata --format=json

Changed 3 years ago by tony.li@…

fixture by dumpdata --format=xml

Changed 3 years ago by tony.li@…

mysql dump of the table

comment:2 Changed 3 years ago by tony.li@…

When I try:

from django.test import TestCase

class LoginTest(TestCase):
    # Fixture generated with dumpdata --format=json
    fixtures = ['/path/to/fixtures/init_data.json']

    ...

./manage.py test myNew
Creating test database for alias 'default'...
Problem installing fixture '/path/to/fixtures/initial_data.json': Traceback (most recent call last):

...
...

Warning: Incorrect string value: '\xE7\x94\xB7\xE8\xA3\x9D' for column 'DIS_NAME' at row 1

comment:3 Changed 2 years ago by claudep

  • Resolution set to duplicate
  • Status changed from new to closed

This is probably a duplicate of #18392. There seems to be a possible workaround (read comments 12 and 13 in that ticket).

comment:4 Changed 2 years ago by tony.li@…

Thanks for the info.

Problem solved; this was actually my mistake.

It was caused by a missing
'TEST_CHARSET': 'utf8mb4', # 'utf8' will also do
inside settings.py, i.e.:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql', # Add 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        ...
        'STORAGE_ENGINE': 'INNODB',
        'OPTIONS': {'charset': 'utf8mb4'},
        'TEST_CHARSET': 'utf8mb4',
    }
}

I wonder why TEST_CHARSET isn't determined by the connector and has to be specified by the user.
Also, what is the default value if TEST_CHARSET has not been set?
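As far as I can tell from the settings documentation, TEST_CHARSET defaults to None, in which case no character set is passed when the test database is created and the MySQL server default applies. (From Django 1.7 on, the same option lives under the 'TEST' dictionary as 'CHARSET'.) To catch this kind of mismatch early, a small check along these lines might help. This is only a sketch: charset_mismatches is a hypothetical helper, not part of Django, and it only looks at the 1.4-style TEST_CHARSET key.

```python
def charset_mismatches(databases):
    """Return aliases whose test charset differs from the connection charset.

    Hypothetical helper, not a Django API. It reads the Django 1.4-style
    'TEST_CHARSET' key and the 'charset' entry in 'OPTIONS'.
    """
    mismatched = []
    for alias, cfg in databases.items():
        conn_charset = cfg.get("OPTIONS", {}).get("charset")
        test_charset = cfg.get("TEST_CHARSET")  # None when not set
        if conn_charset is not None and conn_charset != test_charset:
            mismatched.append(alias)
    return mismatched


# Settings shaped like the ones in this comment (other keys omitted).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "OPTIONS": {"charset": "utf8mb4"},
        "TEST_CHARSET": "utf8mb4",
    }
}

print(charset_mismatches(DATABASES))  # []
```

With 'TEST_CHARSET' removed from the dictionary above, the helper would report 'default', which is the situation that triggered the warning in this ticket.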
