Opened 8 years ago

Closed 3 years ago

Last modified 3 years ago

#5725 closed Bug (fixed)

Inspectdb makes too long CharFields

Reported by: anonymous Owned by: nobody
Component: Core (Management commands) Version: master
Severity: Normal Keywords: introspection mysql inspectdb
Cc: rokclimb15@…, django@… Triage Stage: Ready for checkin
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Using mysql5.0 and python2.4, the maxlength of a CharField is three times as big as the varchar column's definition says in the table.

Attachments (1)

mysql_varchar_length.diff (2.6 KB) - added by ferdonline 3 years ago.
Patch fixing the incorrectly detected varchar field length

Download all attachments as: .zip

Change History (16)

comment:1 Changed 7 years ago by Simon G <dev@…>

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

Huh. Confirmed on Python 2.4, Mysql 5.0.45, @6851

Models.py says this:

from django.db import models

# Create your models here.
class Fudge(models.Model):
    snork = models.CharField(max_length=10, blank=True)

The table is created in MySQL like so:

mysql> describe t5725_Fudge;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment | 
| snork | varchar(10) | NO   |     |         |                | 
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.03 sec)

& inspectdb gives this -

class T5725Fudge(models.Model):
    id = models.IntegerField(primary_key=True)
    snork = models.CharField(max_length=30)
    class Meta:
        db_table = u't5725_fudge'

comment:2 Changed 7 years ago by brockweaver@…

  • Triage Stage changed from Accepted to Design decision needed

Somewhere near line 174 of django/db/backends/mysql/base.py, the charset is hardcoded to 'utf8':

    def _cursor(self, settings):
        if not self._valid_connection():
            kwargs = {
                'conv': django_conversions,
                'charset': 'utf8',  # this bad boy, right here
                'use_unicode': True,
            }

If your collation is not set to this in MySql, it will report the wrong size. In my case, my table is configured to be 'latin1'. Changing the charset to 'latin1' in base.py caused inspectdb to report the correct length. However, that's obviously not a general solution. It would be best to make this caller-configurable (or better yet detected and altered when pulling the description off of the cursor object).

Honestly though, this hardcoded default is a very safe idea. Look at the problems this guy had when converting from a latin1 table to a utf8 table:

http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html

comment:3 Changed 7 years ago by brockweaver@…

Sorry, I meant charset above where I said collation. My bad.

comment:4 Changed 7 years ago by mtredinnick

  • Triage Stage changed from Design decision needed to Accepted

I think this is a case of "we take patches". If somebody wants to work out how to extract the server side's encoding for each table automatically (and remember that they could be different for each table) and factor that in, go ahead and we'll how it looks. I think we should include a pretty stern warning in the comments of the generated model or something, though, if the encoding isn't a safe one like UTF-8 or UTF-16. Things will go wrong in interesting and difficult to diagnose ways if/when Django passes through Unicode data that cannot be squeezed back into ASCII or Latin-1 or whatever. So tell the inspectdb user of the excitement they're in for in this case and they can make the judgement call.

This should only be done for inspectdb, though. Normal Django code assumes you can store the data you're submitting in the database and it's up to you to ensure that. If your database isn't in UTF-8, that's not our fault.

comment:5 follow-up: Changed 5 years ago by rokclimb15

There is definitely something more to this bug.

I am using Python2.6 and MySQL server 5.1.37 with MySQLdb 1.2.2. My database is utf8 and so are my tables. When I introspect, I get the symptoms described above with charfields being 3x too long. If I change the connection charset to latin1, it works properly. I am wondering if there is an additional option that should be passed to the connection to indicate collation, or perhaps just a bug in MySQLdb with cursor.description.

Can anyone else confirm this behavior?

comment:6 in reply to: ↑ 5 Changed 5 years ago by kmtracey

Replying to rokclimb15:

Can anyone else confirm this behavior?

Yes. This thread: http://groups.google.com/group/django-users/browse_thread/thread/d1f88e2d3f725e87/ describes what I found when I looked into this.

comment:7 Changed 5 years ago by rokclimb15

  • Cc rokclimb15@… added

comment:8 Changed 4 years ago by gabrielhurley

  • Component changed from django-admin.py inspectdb to Core (Management commands)

comment:9 Changed 4 years ago by gabrielhurley

  • Severity set to Normal
  • Type set to Bug

comment:10 Changed 4 years ago by ramiro

  • Easy pickings unset
  • Keywords inspectdb added
  • UI/UX unset

Changed 3 years ago by ferdonline

Patch fixing the incorrectly detected varchar field length

comment:11 Changed 3 years ago by andrewgodwin

  • Triage Stage changed from Accepted to Ready for checkin

comment:12 Changed 3 years ago by jedie

  • Cc django@… added

comment:13 Changed 3 years ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In [879b245baa727090e242c482a8f12ee52f169642]:

Fixed #5725 -- Fixed varchar column size introspection for MySQL

Thanks ferdonline for the initial patch and Karen Tracey for the
related post on django-users.

comment:14 Changed 3 years ago by mark0978

Looks like this one is back in 1.4.2

A similar problem happens with mysql 5.5 where the size is 3x greater than that specified, given the SQL:

CREATE TABLE primaryfields (

fpost int(11) NOT NULL,
index1911 char(36) DEFAULT NULL,
ssn char(11) DEFAULT NULL,
middleinitial char(1) DEFAULT NULL,
firstname char(30) DEFAULT NULL,
lastname char(30) DEFAULT NULL,
employeeid char(10) DEFAULT NULL,
PRIMARY KEY (fpost),
CONSTRAINT primaryfields_ibfk_1 FOREIGN KEY (fpost) REFERENCES filenames (fpost)

) ENGINE=InnoDB DEFAULT CHARSET=latin1 |

You get the Python:

class Primaryfields(models.Model):

fpost = models.ForeignKey(Filenames, primary_key=True, db_column='fpost')
index1911 = models.CharField(max_length=108, blank=True)
ssn = models.CharField(max_length=33, blank=True)
middleinitial = models.CharField(max_length=3, blank=True)
firstname = models.CharField(max_length=90, blank=True)
lastname = models.CharField(max_length=90, blank=True)
employeeid = models.CharField(max_length=30, blank=True)
class Meta:

db_table = u'primaryfields'

comment:15 Changed 3 years ago by claudep

This was not backported to 1.4, so please test with current code (1.5alpha or master).

Note: See TracTickets for help on using tickets.
Back to Top