Opened 7 years ago

Closed 4 years ago

#12321 closed Cleanup/optimization (fixed)

CharField default is a str and not unicode

Reported by: bjourne
Owned by: nobody
Component: Database layer (models, ORM)
Version: master
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

In [7]: from django.db.models import CharField

In [8]: CharField().get_default()
Out[8]: ''

So the default value for CharField, and for most other model fields, is a str and not a unicode object. That seems inconsistent with Django's "unicode everywhere" policy. If it is actually supposed to be that way and the default return value is not an oversight, then the reason a str is returned rather than unicode should be documented somewhere. I can't find it discussed anywhere, so I assume it is a bug.

Attachments (1)

unicode_charfield.diff (2.2 KB) - added by kgrandis 7 years ago.
patch and tests


Change History (12)

comment:1 Changed 7 years ago by Russell Keith-Magee

milestone: 1.2
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: Unreviewed → Accepted

comment:2 Changed 7 years ago by kgrandis

Owner: changed from nobody to kgrandis
Status: new → assigned

Changed 7 years ago by kgrandis

Attachment: unicode_charfield.diff added

patch and tests

comment:3 Changed 7 years ago by kgrandis

Has patch: set

The CharField to_python() had two logical exit nodes; one was returning unicode and the other could return either a string or unicode. Additionally, the Field-level default was defined as a str.

The attached patch standardizes the method's output to always return unicode and sets the Field-level default to a unicode string. I've also included a test that checks the output type is unicode, as well as a series of basic CharField tests that seemed to be missing.
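As a rough illustration of the "single return type" idea described in this comment, here is a minimal sketch. It is not the attached patch and not Django's code: `to_text` is a hypothetical stand-in for a CharField-like to_python(), the UTF-8 default and the None pass-through are assumptions, and it is written for Python 3, where `bytes` plays the role of Python 2's str and `str` the role of unicode.

```python
# Hypothetical helper, for illustration only; not taken from Django or the patch.
def to_text(value, encoding="utf-8"):
    """Return `value` as a text string regardless of the type it arrived as."""
    if value is None:
        # Assumption: None is passed through unchanged.
        return value
    if isinstance(value, bytes):
        # Byte strings are decoded, so this exit never returns bytes.
        return value.decode(encoding)
    if isinstance(value, str):
        # Already text: return as-is.
        return value
    # Any other object (int, Decimal, ...) is converted to its text form.
    return str(value)


# Every non-None exit yields the same text type.
for sample in (b"abc", "abc", 42):
    assert isinstance(to_text(sample), str)
```

Note that unconditionally decoding byte strings is exactly the behaviour questioned in the next comment, so this should be read as a sketch of the proposal rather than of the final outcome.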

comment:4 Changed 7 years ago by Russell Keith-Magee

milestone: 1.2 → 1.3

Not critical for 1.2

comment:5 Changed 6 years ago by Claude Paroz

Patch needs improvement: set

The fact that CharField to_python doesn't force a bytestring to unicode is explained by some edge cases with MySQL where the collation for the field is set to utf8_bin. I wouldn't touch this part of the code.

See http://docs.djangoproject.com/en/1.2/ref/databases/#collation-settings

comment:6 in reply to:  5 Changed 6 years ago by kgrandis

Owner: changed from kgrandis to nobody
Status: assigned → new

Replying to claudep:

> The fact that CharField to_python doesn't force a bytestring to unicode is explained by some edge cases with MySQL where the collation for the field is set to utf8_bin. I wouldn't touch this part of the code.
>
> See http://docs.djangoproject.com/en/1.2/ref/databases/#collation-settings

It would be helpful if you could include a test that demonstrates this failure. The included patch passed with MySQL when it was created.

comment:7 Changed 5 years ago by Julien Phalip

Severity: Normal
Type: Cleanup/optimization

comment:8 Changed 5 years ago by Jacob

milestone: 1.3

Milestone 1.3 deleted

comment:11 Changed 5 years ago by Aymeric Augustin

UI/UX: unset

Change UI/UX from NULL to False.

comment:12 Changed 5 years ago by Aymeric Augustin

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:13 Changed 4 years ago by Claude Paroz

Resolution: fixed
Status: new → closed

After generalization of unicode_literals in Django (#18269), I think this is not a problem any more.
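For readers unfamiliar with the mechanism mentioned here, a small sketch of what `unicode_literals` does in general (this is plain Python, not code from Django; the variable name `default` is only illustrative). With the import in place, an unprefixed `''` literal in that module is a unicode object on Python 2; on Python 3 the import is accepted and has no effect, since literals are already text.

```python
from __future__ import unicode_literals  # must be the first statement in the module

# Under this import, an unprefixed literal is text (unicode) on Python 2 as well.
default = ''

# type(u'') is `unicode` on Python 2 and `str` on Python 3.
assert isinstance(default, type(u''))
print(type(default).__name__)
```

A module written this way returns a unicode empty string wherever it previously returned `''` as a str, which is presumably why the original report no longer applies.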
