Django

Ticket #5725 (new)

Opened 1 year ago

Last modified 5 months ago

Inspectdb makes too long CharFields

Reported by: anonymous
Assigned to: nobody
Milestone:
Component: django-admin.py inspectdb
Version: SVN
Keywords: introspection mysql
Cc:
Triage Stage: Accepted
Has patch: 0
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 0

Description

Using MySQL 5.0 and Python 2.4, inspectdb generates a CharField whose max_length is three times the length given in the table's varchar column definition.

Attachments

Change History

12/02/07 19:18:15 changed by Simon G <dev@simon.net.nz>

  • needs_better_patch changed.
  • stage changed from Unreviewed to Accepted.
  • needs_tests changed.
  • needs_docs changed.

Huh. Confirmed on Python 2.4, MySQL 5.0.45, @6851

models.py says this:

from django.db import models

# Create your models here.
class Fudge(models.Model):
    snork = models.CharField(max_length=10, blank=True)

The table is created in MySQL like so:

mysql> describe t5725_Fudge;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment | 
| snork | varchar(10) | NO   |     |         |                | 
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.03 sec)

and inspectdb gives this:

class T5725Fudge(models.Model):
    id = models.IntegerField(primary_key=True)
    snork = models.CharField(max_length=30)
    class Meta:
        db_table = u't5725_fudge'

06/26/08 12:36:16 changed by brockweaver@gmail.com

  • stage changed from Accepted to Design decision needed.

Somewhere near line 174 of django/db/backends/mysql/base.py, the charset is hardcoded to 'utf8':

    def _cursor(self, settings):
        if not self._valid_connection():
            kwargs = {
                'conv': django_conversions,
                'charset': 'utf8',  # this bad boy, right here
                'use_unicode': True,
            }

If your collation is not set to this in MySQL, it will report the wrong size. In my case, my table is configured to be 'latin1'. Changing the charset to 'latin1' in base.py caused inspectdb to report the correct length. However, that's obviously not a general solution. It would be best to make this caller-configurable (or, better yet, detected and corrected when pulling the description off of the cursor object).
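The numbers above (varchar(10) reported as 30 over a utf8 connection, and as 10 over latin1) are consistent with the driver reporting the column width in bytes of the connection charset rather than in characters. A minimal sketch of that arithmetic, assuming this is the cause and that MySQL's utf8 uses at most three bytes per character; `MAX_BYTES_PER_CHAR` and `char_length` are hypothetical names, not part of Django or MySQLdb:

```python
# Assumed maximum bytes per character for the two charsets discussed here.
MAX_BYTES_PER_CHAR = {
    "latin1": 1,
    "utf8": 3,
}


def char_length(reported_length, connection_charset="utf8"):
    """Convert a byte-based column width into a character count.

    Raises KeyError for a charset that is not in the table above.
    """
    width = MAX_BYTES_PER_CHAR[connection_charset]
    return reported_length // width
```

With these assumptions, `char_length(30, "utf8")` and `char_length(10, "latin1")` both give 10, matching the varchar(10) column.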

Honestly though, this hardcoded default is a very safe idea. Look at the problems this guy had when converting from a latin1 table to a utf8 table:

http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html

06/26/08 12:42:37 changed by brockweaver@gmail.com

Sorry, I meant charset above where I said collation. My bad.

07/06/08 06:26:43 changed by mtredinnick

  • stage changed from Design decision needed to Accepted.

I think this is a case of "we take patches". If somebody wants to work out how to extract the server side's encoding for each table automatically (and remember that they could be different for each table) and factor that in, go ahead and we'll see how it looks. I think we should include a pretty stern warning in the comments of the generated model or something, though, if the encoding isn't a safe one like UTF-8 or UTF-16. Things will go wrong in interesting and difficult-to-diagnose ways if/when Django passes through Unicode data that cannot be squeezed back into ASCII or Latin-1 or whatever. So tell the inspectdb user of the excitement they're in for in this case and they can make the judgement call.

This should only be done for inspectdb, though. Normal Django code assumes you can store the data you're submitting in the database and it's up to you to ensure that. If your database isn't in UTF-8, that's not our fault.
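One possible shape for the per-table check and warning described above, as a sketch rather than a patch. It assumes the table's default collation is read from `information_schema.TABLES` and that the charset is the part of the collation name before the first underscore. `TABLE_COLLATION_SQL`, `SAFE_CHARSETS`, `charset_from_collation` and `encoding_warning` are hypothetical names, and the set of "safe" charsets is an assumption:

```python
# Query for one table's default collation; the table name is passed as a
# parameter to cursor.execute(). Running it is left to the caller.
TABLE_COLLATION_SQL = (
    "SELECT TABLE_COLLATION FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = %s"
)

# Charsets assumed able to hold any Unicode text Django passes through.
SAFE_CHARSETS = frozenset(["utf8", "utf8mb4", "utf16"])


def charset_from_collation(collation):
    """Return the charset prefix of a collation, e.g. latin1_swedish_ci -> latin1."""
    return collation.split("_", 1)[0]


def encoding_warning(table_name, collation):
    """Return a comment line for the generated model, or None if the charset is safe."""
    charset = charset_from_collation(collation)
    if charset in SAFE_CHARSETS:
        return None
    return (
        "# WARNING: table %r uses the %r character set; Unicode data that "
        "cannot be encoded in it may fail to save." % (table_name, charset)
    )
```

For example, `encoding_warning("t5725_Fudge", "latin1_swedish_ci")` returns a warning line that inspectdb could emit above the model class, while a `utf8_general_ci` table yields `None`.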
