Opened 8 years ago
Closed 6 years ago
#29249 closed New feature (fixed)
Make serializers consistently unicode by default.
| Reported by: | hakib | Owned by: | Hasan Ramezani |
|---|---|---|---|
| Component: | Core (Management commands) | Version: | dev |
| Severity: | Normal | Keywords: | dumpdata, unicode |
| Cc: | | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
It is currently not (easily) possible to use the dumpdata management command on models with unicode data.
The JSON serializer used by dumpdata does not accept the ensure_ascii argument of json.dumps.
Since ensure_ascii=True is the default, I suggest adding a --dont-ensure-ascii flag to the dumpdata management command so that it is easier to use dumpdata with unicode.
./manage.py dumpdata app.model --dont-ensure-ascii
I'm not sure what the implications are for other serializers such as YAML, XML, etc.
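For illustration, here is a minimal sketch of the json.dumps behavior described above, using only the standard library. The record and its values are made up for the example and are not taken from any real fixture.

```python
import json

# Hypothetical fixture-like record containing non-ASCII text (illustrative only).
record = {"model": "app.model", "fields": {"name": "café"}}

# ensure_ascii=True is the default, so non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(record)

# With ensure_ascii=False the original characters are kept in the output.
readable = json.dumps(record, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode back to the same data; only readability differs.
assert json.loads(escaped) == json.loads(readable) == record
```

Either form loads back to identical data, so the difference is about how readable the dumped fixture is, not about what it contains.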
Change History (10)
comment:1 by , 8 years ago
| Component: | Utilities → Core (Management commands) |
|---|---|
| Summary: | Add option to dumpdata with unicode data → Add option to dumpdata with unicode JSON |
comment:2 by , 8 years ago
The JSON serializer has an ensure_ascii attribute and the YAML serializer has an allow_unicode attribute. I already submitted a [PR](https://github.com/django/django/pull/9818#issuecomment-375694612) implementing the flag in both serializers.
I haven't looked at the XML serializer yet, but I'm sure it will be possible there as well.
As someone who works with unicode text as the primary language for most apps (as I'm sure a lot of other developers do), it's a very useful feature to be able to dump fixtures directly from a local db in a readable format.
comment:3 by , 8 years ago
| Has patch: | set |
|---|---|
| Patch needs improvement: | set |
| Triage Stage: | Unreviewed → Accepted |
Okay. My main concern is that an option called --allow_unicode may suggest to readers that all serializers prohibit unicode by default. That may not be true. Your patch also needs documentation.
comment:4 by , 8 years ago
| Summary: | Add option to dumpdata with unicode JSON → Add option to dumpdata to allow unicode JSON or YAML |
|---|
comment:5 by , 6 years ago
| Owner: | changed from to |
|---|---|
| Patch needs improvement: | unset |
| Status: | new → assigned |
comment:6 by , 6 years ago
| Has patch: | unset |
|---|---|
| Summary: | Add option to dumpdata to allow unicode JSON or YAML → Make serializers consistently unicode by default. |
| Version: | 2.0 → master |
The current behavior is inconsistent. The XML serializer uses Unicode by default, whereas the YAML and JSON serializers force ASCII. I think we should make this behavior consistent instead of adding a new serializer-specific option, i.e. pass allow_unicode=True to yaml.dump() and ensure_ascii=False to json.dump().
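A rough sketch of the inconsistency, under stated assumptions: the standard library's XMLGenerator stands in for the XML serializer here (Django's own serializers are not invoked), and the field name and value are made up. The yaml.dump(allow_unicode=True) case is analogous but is left out because PyYAML is a third-party package.

```python
import io
import json
from xml.sax.saxutils import XMLGenerator

value = "café"  # sample non-ASCII value (illustrative only)

# XML written to a text stream keeps the characters as they are.
xml_out = io.StringIO()
gen = XMLGenerator(xml_out, encoding="utf-8")
gen.startDocument()
gen.startElement("field", {"name": "title"})
gen.characters(value)
gen.endElement("field")
gen.endDocument()

# json.dump() escapes non-ASCII characters by default ...
json_default = io.StringIO()
json.dump({"title": value}, json_default)

# ... unless ensure_ascii=False is passed, as proposed above.
json_unicode = io.StringIO()
json.dump({"title": value}, json_unicode, ensure_ascii=False)

print(xml_out.getvalue())
print(json_default.getvalue())
print(json_unicode.getvalue())
```

With ensure_ascii=False the JSON output matches the XML output in keeping the text readable, which is the consistency this comment argues for.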
comment:8 by , 6 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
I believe this would only apply to the JSON serializer, and I'm not sure about adding a dumpdata option that's specific to a particular serializer. I think your best solution is to subclass the JSON serializer, register it as a custom format, and then use that format in dumpdata.