#4227 closed (fixed)
dumpdata/loaddata serializer ignores encoding settings
| Reported by: | | Owned by: | Adrian Holovaty |
|---|---|---|---|
| Component: | Core (Other) | Version: | dev |
| Severity: | | Keywords: | unicode-branch |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
When dumping data from a UTF-8-encoded database, get_string_value() in core/serializers/base.py casts CharField/TextField values with str(), which results in a decode error when it encounters non-ASCII characters. Both the settings.DEFAULT_CHARSET value and the 'encoding' option are ignored.
Setup to reproduce:
PostgreSQL 8.1.8
Python 2.4.4
Django SVN revision 5152
create new project and app for testing (names used in this example are 'myproject' and 'myapp')
cat myapp/models.py :
    from django.db import models

    class Thingy(models.Model):
        my_column = models.CharField(maxlength=255)
python manage.py --plain shell
    >>> from myproject.myapp import models
    >>> thing = models.Thingy(my_column='äöü')  # non-ASCII string entered as literal here, locale is set to UTF-8
    >>> thing.save()
    >>> thing.my_column
    '\xc3\xa4\xc3\xb6\xc3\xbc'  # UTF-8 string
python manage.py dumpdata --format=xml myapp
Unable to serialize database: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
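The codec error in that message can be reproduced in isolation. The following is a minimal sketch in Python 3 syntax, not the Python 2.4 code path from the ticket: it only shows that the UTF-8 bytes from the shell session cannot be interpreted as ASCII, whereas decoding them with the encoding they were stored in works. The variable names are illustrative.

```python
# UTF-8 bytes for the three-character test string, as returned by
# thing.my_column in the shell session.
raw = b'\xc3\xa4\xc3\xb6\xc3\xbc'

# Interpreting the bytes as ASCII fails on the first byte >= 0x80.
try:
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding with the encoding the data was actually stored in succeeds.
text = raw.decode('utf-8')
print(ascii(text))
```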
Using the JSON format to dump works, but then fails with a similar encoding error on 'loaddata'.
A partial fix is attached as patch (this now works for me with XML-serialization).
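The attached patch is not reproduced here. As a rough illustration of what honoring an encoding setting means, the sketch below uses two hypothetical helpers, `to_text()` and `to_bytes()`. They are not Django's get_string_value() and not the change that was later made on the unicode branch. The code is written in Python 3 syntax, and `DEFAULT_CHARSET` is a plain module-level stand-in for settings.DEFAULT_CHARSET.

```python
# Stand-in for settings.DEFAULT_CHARSET (assumption, not the Django setting).
DEFAULT_CHARSET = 'utf-8'


def to_text(value, encoding=None):
    """Return value as text, decoding byte strings with an explicit encoding.

    Hypothetical helper for illustration only.
    """
    if encoding is None:
        encoding = DEFAULT_CHARSET
    if isinstance(value, bytes):
        # Never rely on an implicit ASCII conversion for stored byte strings.
        return value.decode(encoding)
    return str(value)


def to_bytes(text, encoding=None):
    """Encode text for serializer output, again with an explicit encoding."""
    if encoding is None:
        encoding = DEFAULT_CHARSET
    return text.encode(encoding)
```

With helpers like these, a caller-supplied 'encoding' option would take precedence and the project-wide default would apply otherwise, which is the behavior the description reports as missing.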
Attachments (1)
Change History (8)
by , 19 years ago
| Attachment: | core_serializers.diff added |
|---|
comment:2 by , 19 years ago
| Has patch: | unset |
|---|---|
| Summary: | dumpdata/loaddata serializer ignores encoding settings → [unicode] dumpdata/loaddata serializer ignores encoding settings |
| Triage Stage: | Unreviewed → Accepted |
This is being fixed in a slightly different way in the unicode branch. We won't apply this patch to trunk, since all those problems are being fixed on the branch in a unified fashion. However, I'll leave the ticket open until the branch is merged to give us a double-check that we fix all reported problems.
comment:3 by , 18 years ago
comment:4 by , 18 years ago
| Keywords: | unicode-branch added |
|---|---|
| Summary: | [unicode] dumpdata/loaddata serializer ignores encoding settings → dumpdata/loaddata serializer ignores encoding settings |
This was fixed in the unicode branch in [5248]. I'll close this ticket when the branch is merged back into trunk.
comment:5 by , 18 years ago
As much as it's basically irrelevant now, FWIW this patch worked perfectly on rev 4992. It let me successfully export data from MySQL in XML format, whereas previously only JSON worked (and on the other end, where I'm loading the data into a Postgres database, JSON was giving invalid data types on boolean fields).
Thank you, Caspar!
comment:6 by , 18 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
output of svn diff in trunk/django/core/serializers/