Code

Opened 8 years ago

Closed 7 years ago

#2765 closed defect (fixed)

MySQL with utf8_bin collation for case-sensitive comparison goodness requires MySQLdb 1.2.1

Reported by: dave AT avaragado.org Owned by: adrian
Component: Core (Other) Version: master
Severity: normal Keywords: mysql mysqldb utf8 case
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

By default, MySQL 4.1 (at least) uses a case-insensitive collation, utf8_general_ci. This means that checks for equality are, annoyingly, case-insensitive. You can set the collation for a database in the CREATE DATABASE command, or you can set it per-server with an option in the MySQL my.cnf file. You can also set it per-column if you want.

Sadly, using a collation like utf8_bin, which gives case-sensitive comparisons, causes unexpected results through the ORM.

To reproduce:

  • Edit your my.cnf file to include the line
    collation-server = utf8_bin
    
  • Restart mysqld

Run "python manage.py test" such that it tests this models.py:

from django.db import models

class Dummy(models.Model):
    """
    >>> d = Dummy(ok='ok', notok='not ok')
    >>> d.save()

    >>> d = Dummy.objects.all()[0]
    >>> d.ok, d.notok
    ('ok', 'not ok')
    """
    ok = models.CharField(maxlength=100)
    notok = models.TextField()

You'll get something like this (irrelevant info removed):

Failed example:
    d.ok, d.notok
Expected:
    ('ok', 'not ok')
Got:
    ('ok', array('c', 'not ok'))

Without the collation-server line in my.cnf, the tests pass. Even without that line the tests fail if the CREATE DATABASE line specifies utf8_bin as the collation (and I imagine other binary collations), but the test framework uses the server defaults for those.

My own scrabbling around discovered that MySQLdb (version 1.2.0, at least) includes a suspicious class in init.py:

def Binary(x):
    from array import array
    return array('c', x)

... but that's as far as my limited knowledge goes - I have no idea how it gets used.

Attachments (0)

Change History (6)

comment:1 Changed 8 years ago by django@…

  • Resolution set to invalid
  • Status changed from new to closed

I had this problem, too.
Upgrade to the last version of mysqldb and the issue should be gone. Its not a django bug.

comment:2 Changed 8 years ago by dave AT avaragado.org

  • Component changed from Core framework to Documentation
  • Resolution invalid deleted
  • Status changed from closed to reopened
  • Summary changed from MySQL with utf8_bin collation for case-sensitive comparison goodness causes TextFields to be returned as arrays to MySQL with utf8_bin collation for case-sensitive comparison goodness requires MySQLdb 1.2.1

Sigh. Is there any documentation that says the minimum supported version of MySQLdb is 1.2.1?
(Or at least, it is if you want to use binary collations.) I can't find any.

If not, then there should be. It might stop others spending hours trying to debug this same problem.
Reopening this as a documentation ticket, changing summary.

comment:3 Changed 7 years ago by Simon G. <dev@…>

  • Component changed from Documentation to Core framework
  • Keywords mysql mysqldb utf8 case added
  • Triage Stage changed from Unreviewed to Design decision needed

Marked as requiring a design decision - do we need to work around this or can we just document that users should MySQLdb > 1.2.1 ?

comment:4 Changed 7 years ago by Michael Radziej <mir@…>

There are other issues with python-mysqldb-1.2.0: threading does not work (#3279). I propose to discourage use of this version.

comment:5 Changed 7 years ago by Marc Fargas <telenieko@…>

+1 to require at least 1.2.1 (1.2.1p2 as per #3279)

comment:6 Changed 7 years ago by Michael Radziej <mir@…>

  • Resolution set to fixed
  • Status changed from reopened to closed

This ticket has become obsolete since we now use 1.2.1 (unless you use mysql_old, but it's more or less deprecated).

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.