#4227 closed (fixed)
dumpdata/loaddata serializer ignores encoding settings
Reported by: | | Owned by: | Adrian Holovaty |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | | Keywords: | unicode-branch |
Cc: | | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
When dumping data from a UTF-8-encoded database, get_string_value() in core/serializers/base.py casts CharField/TextField values with str(), which raises a decode error when it encounters non-ASCII characters. The settings.DEFAULT_CHARSET value and the 'encoding' option are ignored.
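The failure mode described above can be illustrated with a minimal sketch. It is written for Python 3, so it only emulates the implicit ASCII conversion that Python 2 performs; it is not the code from base.py. The helper names naive_string_value and charset_aware_string_value are hypothetical, and DEFAULT_CHARSET is a local stand-in for settings.DEFAULT_CHARSET.

```python
# Python 3 sketch of the failure mode; the ticket itself concerns Python 2.4.
# DEFAULT_CHARSET is a local stand-in for settings.DEFAULT_CHARSET (assumption).
DEFAULT_CHARSET = "utf-8"

# UTF-8 bytes for the three non-ASCII characters used in the report.
raw_value = b"\xc3\xa4\xc3\xb6\xc3\xbc"


def naive_string_value(value):
    """Hypothetical helper: mimics an implicit ASCII conversion of a byte string."""
    return value.decode("ascii")


def charset_aware_string_value(value, encoding=DEFAULT_CHARSET):
    """Hypothetical helper: decodes byte strings with the configured charset."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


try:
    naive_string_value(raw_value)
    error_message = None
except UnicodeDecodeError as exc:
    # Same kind of error as the one dumpdata reports.
    error_message = str(exc)

decoded = charset_aware_string_value(raw_value)
print(error_message)
print(decoded)
```

The point of the sketch is only that the conversion has to know the charset; how the unicode branch actually resolves this is not shown here.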
Setup to reproduce:
PostgreSQL 8.1.8
Python 2.4.4
Django SVN revision 5152
create new project and app for testing (names used in this example are 'myproject' and 'myapp')
cat myapp/models.py:
    from django.db import models

    class Thingy(models.Model):
        my_column = models.CharField(maxlength=255)
python manage.py --plain shell
    >>> from myproject.myapp import models
    >>> thing = models.Thingy(my_column='äöü')  # non-ASCII string entered as literal here, locale is set to UTF-8
    >>> thing.save()
    >>> thing.my_column
    '\xc3\xa4\xc3\xb6\xc3\xbc'  # UTF-8 string
python manage.py dumpdata --format=xml myapp
Unable to serialize database: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
Using the JSON format to dump works, but then fails with a similar encoding error on 'loaddata'.
A partial fix is attached as a patch (with it, XML serialization now works for me).
Attachments (1)
Change History (8)
by , 18 years ago
Attachment: | core_serializers.diff added |
---|
comment:2 by , 18 years ago
Has patch: | unset |
---|---|
Summary: | dumpdata/loaddata serializer ignores encoding settings → [unicode] dumpdata/loaddata serializer ignores encoding settings |
Triage Stage: | Unreviewed → Accepted |
This is being fixed in a slightly different way in the unicode branch. We won't apply this patch to trunk, since all those problems are being fixed on the branch in a unified fashion. However, I'll leave the ticket open until the branch is merged to give us a double-check that we fix all reported problems.
comment:3 by , 18 years ago
comment:4 by , 18 years ago
Keywords: | unicode-branch added |
---|---|
Summary: | [unicode] dumpdata/loaddata serializer ignores encoding settings → dumpdata/loaddata serializer ignores encoding settings |
This was fixed in the unicode branch in [5248]. I'll close this ticket when the branch is merged back into trunk.
comment:5 by , 18 years ago
This is basically irrelevant now, but FWIW this patch worked perfectly on rev 4992. It let me successfully export data from MySQL in XML format, whereas previously only JSON worked (but on the other end, where I'm loading the data into a Postgres database, JSON was giving invalid data types on boolean fields).
Thank you, Caspar!
comment:6 by , 18 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
output of svn diff in trunk/django/core/serializers/