Opened 6 years ago

Closed 3 years ago

#12321 closed Cleanup/optimization (fixed)

CharField default is a str and not unicode

Reported by: bjourne Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

In [7]: from django.db.models import CharField

In [8]: CharField().get_default()
Out[8]: ''

So the default values for CharField and most other model fields are byte strings (str), not unicode objects. This seems inconsistent with Django's "unicode everywhere" policy. If it is intended, and the default return value is not an oversight, then the reason strings are returned instead of unicode objects should be documented somewhere. I can't find it discussed anywhere, so I assume it is a bug.

Attachments (1)

unicode_charfield.diff (2.2 KB) - added by kgrandis 5 years ago.
patch and tests


Change History (12)

comment:1 Changed 5 years ago by russellm

  • milestone set to 1.2
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 5 years ago by kgrandis

  • Owner changed from nobody to kgrandis
  • Status changed from new to assigned

Changed 5 years ago by kgrandis

patch and tests

comment:3 Changed 5 years ago by kgrandis

  • Has patch set

The CharField to_python() had two logical exit points: one returned unicode and the other could return either a str or unicode. Additionally, the Field-level default was defined as a str.

The attached patch standardizes the method's output to only return unicode and sets the Field-level default to unicode as well. I've also included a test to ensure the output type is unicode, as well as a series of basic CharField tests that seemed to be missing.
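
For illustration only, the idea of a to_python() that always returns text can be sketched as below. This is not the code from the attached patch: the function is a hypothetical stand-in rather than Django's actual CharField.to_python(), it is written for Python 3 (where str is the text type that Python 2 called unicode), and the UTF-8 decoding of byte strings is an assumption.

```python
def to_python(value):
    """Hypothetical stand-in for a CharField-style to_python().

    Every non-None exit returns text (str on Python 3), so callers never
    have to handle a mix of byte strings and text.
    """
    if value is None:
        # None is passed through unchanged so NULL values survive.
        return value
    if isinstance(value, bytes):
        # Assumption: byte strings are UTF-8 encoded.
        return value.decode("utf-8")
    # Text is returned as-is; other objects are converted to text.
    return str(value)


# Assumption: a text default, mirroring the "Field-level default" above.
DEFAULT = ""
```

With this shape, to_python(b"abc") and to_python("abc") give the same text value, and a non-string such as 42 comes back as "42".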

comment:4 Changed 5 years ago by russellm

  • milestone changed from 1.2 to 1.3

Not critical for 1.2

comment:5 follow-up: Changed 5 years ago by claudep

  • Patch needs improvement set

The fact that CharField to_python doesn't force a bytestring to unicode is explained by some edge cases with MySQL where the collation for the field is set to utf8_bin. I wouldn't touch this part of the code.

See http://docs.djangoproject.com/en/1.2/ref/databases/#collation-settings

comment:6 in reply to: ↑ 5 Changed 5 years ago by kgrandis

  • Owner changed from kgrandis to nobody
  • Status changed from assigned to new

Replying to claudep:

The fact that CharField to_python doesn't force a bytestring to unicode is explained by some edge cases with MySQL where the collation for the field is set to utf8_bin. I wouldn't touch this part of the code.

See http://docs.djangoproject.com/en/1.2/ref/databases/#collation-settings

It would be helpful if you could include a test that demonstrates this failure. The included patch passed with MySQL when it was created.

comment:7 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Cleanup/optimization

comment:8 Changed 4 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:11 Changed 3 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:12 Changed 3 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:13 Changed 3 years ago by claudep

  • Resolution set to fixed
  • Status changed from new to closed

After the generalization of unicode_literals in Django (#18269), I think this is no longer a problem.
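
As a rough sketch of why unicode_literals makes the original report moot: with the import in place, a bare '' literal in a module is a unicode object on Python 2, so a default written that way is no longer a str. The names text_type and default below are illustrative only and are not taken from Django's source. On Python 3 the import has no effect, because string literals are already text.

```python
# Must come before any other statement in the module; no-op on Python 3.
from __future__ import unicode_literals

# unicode on Python 2, str on Python 3 (illustrative helper name).
text_type = type(u"")

# A bare literal, written the way a field default might be.
default = ""

# With unicode_literals active, the bare literal is a text object.
assert isinstance(default, text_type)
```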
