Opened 18 years ago

Closed 17 years ago

#2765 closed defect (fixed)

MySQL with utf8_bin collation for case-sensitive comparison goodness requires MySQLdb 1.2.1

Reported by: dave AT avaragado.org Owned by: Adrian Holovaty
Component: Core (Other) Version: dev
Severity: normal Keywords: mysql mysqldb utf8 case
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

By default, MySQL 4.1 (at least) uses a case-insensitive collation, utf8_general_ci. This means that checks for equality are, annoyingly, case-insensitive. You can set the collation for a database in the CREATE DATABASE command, or you can set it per-server with an option in the MySQL my.cnf file. You can also set it per-column if you want.

Sadly, using a collation like utf8_bin, which gives case-sensitive comparisons, causes unexpected results through the ORM.

To reproduce:

  • Edit your my.cnf file to include the line
    collation-server = utf8_bin
    
  • Restart mysqld

Run "python manage.py test" such that it tests this models.py:

from django.db import models

class Dummy(models.Model):
    """
    >>> d = Dummy(ok='ok', notok='not ok')
    >>> d.save()

    >>> d = Dummy.objects.all()[0]
    >>> d.ok, d.notok
    ('ok', 'not ok')
    """
    ok = models.CharField(maxlength=100)
    notok = models.TextField()

You'll get something like this (irrelevant info removed):

Failed example:
    d.ok, d.notok
Expected:
    ('ok', 'not ok')
Got:
    ('ok', array('c', 'not ok'))

Without the collation-server line in my.cnf, the tests pass. Even without that line the tests fail if the CREATE DATABASE line specifies utf8_bin as the collation (and I imagine other binary collations), but the test framework uses the server defaults for those.

My own scrabbling around discovered that MySQLdb (version 1.2.0, at least) includes a suspicious class in init.py:

def Binary(x):
    from array import array
    return array('c', x)

... but that's as far as my limited knowledge goes - I have no idea how it gets used.

Change History (6)

comment:1 by django@…, 18 years ago

Resolution: invalid
Status: newclosed

I had this problem, too.
Upgrade to the last version of mysqldb and the issue should be gone. Its not a django bug.

comment:2 by dave AT avaragado.org, 18 years ago

Component: Core frameworkDocumentation
Resolution: invalid
Status: closedreopened
Summary: MySQL with utf8_bin collation for case-sensitive comparison goodness causes TextFields to be returned as arraysMySQL with utf8_bin collation for case-sensitive comparison goodness requires MySQLdb 1.2.1

Sigh. Is there any documentation that says the minimum supported version of MySQLdb is 1.2.1?
(Or at least, it is if you want to use binary collations.) I can't find any.

If not, then there should be. It might stop others spending hours trying to debug this same problem.
Reopening this as a documentation ticket, changing summary.

comment:3 by Simon G. <dev@…>, 17 years ago

Component: DocumentationCore framework
Keywords: mysql mysqldb utf8 case added
Triage Stage: UnreviewedDesign decision needed

Marked as requiring a design decision - do we need to work around this or can we just document that users should MySQLdb > 1.2.1 ?

comment:4 by Michael Radziej <mir@…>, 17 years ago

There are other issues with python-mysqldb-1.2.0: threading does not work (#3279). I propose to discourage use of this version.

comment:5 by Marc Fargas <telenieko@…>, 17 years ago

+1 to require at least 1.2.1 (1.2.1p2 as per #3279)

comment:6 by Michael Radziej <mir@…>, 17 years ago

Resolution: fixed
Status: reopenedclosed

This ticket has become obsolete since we now use 1.2.1 (unless you use mysql_old, but it's more or less deprecated).

Note: See TracTickets for help on using tickets.
Back to Top