Opened 15 years ago

Closed 12 years ago

#12321 closed Cleanup/optimization (fixed)

CharField default is a str and not unicode

Reported by: bjourne
Owned by: nobody
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

In [7]: from django.db.models import CharField

In [8]: CharField().get_default()
Out[8]: ''

So the default value for CharField, and for most other model fields, is a str and not a unicode object. That seems inconsistent with Django's "unicode everywhere" policy. If it is intentional and not an oversight, then the reason a str is returned rather than unicode should be documented somewhere. I can't find it discussed anywhere, so I assume it is a bug.
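The distinction the report relies on can be sketched as follows. This is only an illustration: `describe` is a hypothetical helper, not a Django API. Under Python 2 a bare `''` literal is a byte string (`str`) while `u''` is `unicode`; under Python 3 a bare `''` is already text.

```python
# Hypothetical helper for illustration only; not part of Django.
text_type = type(u"")    # unicode on Python 2, str on Python 3
bytes_type = type(b"")   # str on Python 2, bytes on Python 3


def describe(value):
    """Return 'text' or 'bytes' for string values, else the type name."""
    if isinstance(value, text_type):
        return "text"
    if isinstance(value, bytes_type):
        return "bytes"
    return type(value).__name__
```

On Python 2, `describe('')` would report `bytes`, which is the behaviour this ticket describes for the field default.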

Attachments (1)

unicode_charfield.diff (2.2 KB ) - added by kgrandis 15 years ago.
patch and tests


Change History (12)

comment:1 by Russell Keith-Magee, 15 years ago

milestone: 1.2
Triage Stage: Unreviewed → Accepted

comment:2 by kgrandis, 15 years ago

Owner: changed from nobody to kgrandis
Status: new → assigned

by kgrandis, 15 years ago

Attachment: unicode_charfield.diff added

patch and tests

comment:3 by kgrandis, 15 years ago

Has patch: set

The CharField to_python() had two exit points: one returned unicode and the other could return either a str or unicode. Additionally, the Field-level default was defined as a str.

The attached patch standardizes the method's output to always return unicode and sets the Field-level default to unicode as well. I've also included a test to ensure the output type is unicode, along with a series of basic CharField tests that seemed to be missing.
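The idea of giving every exit point the same return type can be sketched roughly as below. This is not the code from the attached patch: `to_text` is a hypothetical stand-in, and both the UTF-8 encoding and the choice to pass `None` through unchanged are assumptions made for the example.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical stand-in for a to_python() that only ever returns text.

    Assumptions: byte strings are UTF-8 encoded, and None is passed through.
    """
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    if value is None or isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        # Byte strings are decoded instead of being returned as-is.
        return value.decode(encoding)
    # Anything else (numbers, other objects) is converted to text.
    return text_type(value)
```

With this shape, a byte string input no longer leaks through unchanged, which is the inconsistency described above.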

comment:4 by Russell Keith-Magee, 15 years ago

milestone: 1.2 → 1.3

Not critical for 1.2

comment:5 by Claude Paroz, 14 years ago

Patch needs improvement: set

The fact that CharField to_python doesn't force a bytestring to unicode is explained by some edge cases with MySQL where the collation for the field is set to utf8_bin. I wouldn't touch this part of the code.

See http://docs.djangoproject.com/en/1.2/ref/databases/#collation-settings

comment:6 by kgrandis, 14 years ago (in reply to comment:5)

Owner: changed from kgrandis to nobody
Status: assigned → new

Replying to claudep:

> The fact that CharField to_python doesn't force a bytestring to unicode is explained by some edge cases with MySQL where the collation for the field is set to utf8_bin. I wouldn't touch this part of the code.
>
> See http://docs.djangoproject.com/en/1.2/ref/databases/#collation-settings

It would be helpful if you could include a test that demonstrates this failure. The included patch passed with MySQL when it was created.

comment:7 by Julien Phalip, 14 years ago

Severity: Normal
Type: Cleanup/optimization

comment:8 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

comment:11 by Aymeric Augustin, 13 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:12 by Aymeric Augustin, 13 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:13 by Claude Paroz, 12 years ago

Resolution: fixed
Status: new → closed

After generalization of unicode_literals in Django (#18269), I think this is not a problem any more.
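A small sketch of why `unicode_literals` makes the issue disappear: with that future flag active, a bare `''` literal compiles to a text string. The helper `literal_type` is hypothetical and uses `compile()` with the flag only to compare both modes side by side; in real modules the flag is enabled with `from __future__ import unicode_literals` at the top of the file. On Python 3 the flag has no effect because literals are already text.

```python
import __future__


def literal_type(source, unicode_literals=False):
    """Return the type of the literal in `source`, compiled with or without the flag."""
    flags = __future__.unicode_literals.compiler_flag if unicode_literals else 0
    # dont_inherit=True so flags active in this module do not leak into the compile.
    code = compile(source, "<sketch>", "eval", flags, True)
    return type(eval(code))


text_type = type(u"")  # unicode on Python 2, str on Python 3
with_flag = literal_type("''", unicode_literals=True)
without_flag = literal_type("''")
```

On Python 2, `without_flag` would be the byte string type while `with_flag` is `unicode`, so a default written as `''` in a module using the flag is already text.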
