Opened 10 years ago

Closed 10 years ago

#2765 closed defect (fixed)

MySQL with utf8_bin collation for case-sensitive comparison goodness requires MySQLdb 1.2.1

Reported by: dave AT avaragado.org Owned by: Adrian Holovaty
Component: Core (Other) Version: master
Severity: normal Keywords: mysql mysqldb utf8 case
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

By default, MySQL 4.1 (at least) uses a case-insensitive collation, utf8_general_ci. This means that checks for equality are, annoyingly, case-insensitive. You can set the collation for a database in the CREATE DATABASE command, or you can set it per-server with an option in the MySQL my.cnf file. You can also set it per-column if you want.

Sadly, using a collation like utf8_bin, which gives case-sensitive comparisons, causes unexpected results through the ORM.

To reproduce:

  • Edit your my.cnf file to include the line
    collation-server = utf8_bin
    
  • Restart mysqld

Run "python manage.py test" such that it tests this models.py:

from django.db import models

class Dummy(models.Model):
    """
    >>> d = Dummy(ok='ok', notok='not ok')
    >>> d.save()

    >>> d = Dummy.objects.all()[0]
    >>> d.ok, d.notok
    ('ok', 'not ok')
    """
    ok = models.CharField(maxlength=100)
    notok = models.TextField()

You'll get something like this (irrelevant info removed):

Failed example:
    d.ok, d.notok
Expected:
    ('ok', 'not ok')
Got:
    ('ok', array('c', 'not ok'))

Without the collation-server line in my.cnf, the tests pass. Even without that line the tests fail if the CREATE DATABASE line specifies utf8_bin as the collation (and I imagine other binary collations), but the test framework uses the server defaults for those.

My own scrabbling around discovered that MySQLdb (version 1.2.0, at least) includes a suspicious class in init.py:

def Binary(x):
    from array import array
    return array('c', x)

... but that's as far as my limited knowledge goes - I have no idea how it gets used.

Change History (6)

comment:1 Changed 10 years ago by django@…

Resolution: invalid
Status: newclosed

I had this problem, too.
Upgrade to the last version of mysqldb and the issue should be gone. Its not a django bug.

comment:2 Changed 10 years ago by dave AT avaragado.org

Component: Core frameworkDocumentation
Resolution: invalid
Status: closedreopened
Summary: MySQL with utf8_bin collation for case-sensitive comparison goodness causes TextFields to be returned as arraysMySQL with utf8_bin collation for case-sensitive comparison goodness requires MySQLdb 1.2.1

Sigh. Is there any documentation that says the minimum supported version of MySQLdb is 1.2.1?
(Or at least, it is if you want to use binary collations.) I can't find any.

If not, then there should be. It might stop others spending hours trying to debug this same problem.
Reopening this as a documentation ticket, changing summary.

comment:3 Changed 10 years ago by Simon G. <dev@…>

Component: DocumentationCore framework
Keywords: mysql mysqldb utf8 case added
Triage Stage: UnreviewedDesign decision needed

Marked as requiring a design decision - do we need to work around this or can we just document that users should MySQLdb > 1.2.1 ?

comment:4 Changed 10 years ago by Michael Radziej <mir@…>

There are other issues with python-mysqldb-1.2.0: threading does not work (#3279). I propose to discourage use of this version.

comment:5 Changed 10 years ago by Marc Fargas <telenieko@…>

+1 to require at least 1.2.1 (1.2.1p2 as per #3279)

comment:6 Changed 10 years ago by Michael Radziej <mir@…>

Resolution: fixed
Status: reopenedclosed

This ticket has become obsolete since we now use 1.2.1 (unless you use mysql_old, but it's more or less deprecated).

Note: See TracTickets for help on using tickets.
Back to Top