Opened 14 years ago

Closed 15 months ago

#14094 closed New feature (fixed)

Cannot define CharField with unlimited length

Reported by: Daniel Miller Owned by: Adrian Torres
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: joe@…, mightyiam, unai@…, Aron Podrigal, Rich Rauenzahn, Anvesh Mishra, Adrian Torres Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Model validation throws an error on CharField with a null max_length:

class Test(Model):
    char_field = CharField(max_length=None)


One or more models did not validate:
test.test: "char_field": CharFields require a "max_length" attribute that is a positive integer.

CharField should allow max_length=None, which intuitively means there is no maximum length. This is a perfectly valid use case. Postgres, for example, supports varchar/text columns without a length limit, but Django appears to have no way to define such a column in a model class.

The model validation code looks like this (django/core/management/validation.py:40):

    if isinstance(f, models.CharField):
        try:
            max_length = int(f.max_length)
            if max_length <= 0:
                e.add(opts, '"%s": CharFields require a "max_length" attribute that is a positive integer.' % f.name)
        except (ValueError, TypeError):
            e.add(opts, '"%s": CharFields require a "max_length" attribute that is a positive integer.' % f.name)

It should be changed to something this:

    if isinstance(f, models.CharField) and f.max_length is not None:
        ...

The FileField does not happen to throw this error because it is not a derivative of CharField. However, the SQL generated for FileField is not correct when max_length=None, so that would need to be addressed as well.

Change History (29)

comment:1 by Anssi Kääriäinen, 14 years ago

Resolution: invalid
Status: newclosed

You should use models.TextField, which is unlimited in size.

comment:2 by Ryan Fugger, 13 years ago

You can subclass CharField to get rid of max_length, like in this snippet:

http://djangosnippets.org/snippets/2328/

Alternatively, you can subclass TextField to override the form fields it generates by default so you don't get big textareas.

comment:3 by joe@…, 13 years ago

Cc: joe@… added
Easy pickings: unset
Resolution: invalid
Severity: Normal
Status: closedreopened
Type: New feature
UI/UX: unset

PostgreSQL allows char fields (which are distinct from Text fields) which are "character varying" with no preset length.

It only makes sense that these should map to a Django CharField, with max_length=None, rather than be forced to use a Django TextField, only because of this limitation of Django.

In common PostgreSQL usage, these are small text fields of 20 or 50 characters or less, NOT large text fields.

Fixing this issue would also make the introspection of existing postgres databases *much* smoother.

comment:4 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedDesign decision needed

comment:5 by Jacob, 11 years ago

Triage Stage: Design decision neededAccepted

We should do this. Postgres, at least, has no performance or storage space penalty for using varchar over varchar(n) -- the length requirement is basically a constraint, that's all. I imagine other modern DBs are similar, but even if not it's still a good idea to let users make the call, not do it unilaterally.

comment:6 by Aymeric Augustin, 11 years ago

I can't comment on performance, but I've always found choosing the length of CharFields incredibly annoying.

"How long can this be? 100 chars? Fields are always too short, let's make it 200. Oh why not 255, everyone does that..."

I'm in favor of removing the requirement to define max_length on CharFields, and making them unlimited by default — assuming this isn't a Gun Aimed At Feet.

Last edited 11 years ago by Aymeric Augustin (previous) (diff)

comment:7 by Anssi Kääriäinen, 11 years ago

Oracle for example uses nvarchar2 for CharField, NCLOB for TextField, MySQL has varchar and longtext... And these "unlimited" size text fields really are different than limited size text field without the limit.

How about just having a "large" default value, like 1024 chars?

comment:8 by Aymeric Augustin, 11 years ago

Status: reopenednew

comment:9 by mightyiam, 11 years ago

Cc: mightyiam added

comment:10 by Unai Zalakain, 10 years ago

Cc: unai@… added

comment:11 by Ben Davis, 10 years ago

I'm curious why this ticket is 4 years old. Everyone seems to be in agreement that this should change, and it seems like a simple one (unless I'm mistaken), and it's backward compatible.

I completely agree with aaugustin (comment:6). The max_length argument should be reserved for use cases where someone wants to enforce the length. Instead I'm having to pick arbitrary values for things that don't need enforcing. It adds indirection to my models because it implies that the value should be enforced when that's not really the case.

For databases like MySQL that require a length, the db field should be created as VARCHAR(255) by default. The documentation already has a note that says "some backends have restrictions on max length", so we would just need to expand on that to explain the behavior when max_length is None.

Also, using TextField is not the right answer -- it has a different meaning in Django, regardless of how it's stored in the db.

comment:12 by Aymeric Augustin, 10 years ago

The answer may be disappointing -- no one was sufficiently interested in this issue to write a patch, that's all...

comment:13 by Ben Davis, 10 years ago

Ugh. I took a stab at it and fell into the black hole that is the Django ORM. I'll try to summarize what I found here in case anyone is interested.

Fixing the validation was easy, just needed to delete a few lines of code. Now, we could just override CharField's db_type() method to return "varchar" and be done with it, but that doesn't seem like the right solution to me, especially since CharField is a core field type. So, I sought out to find the right way to fix it, but only uncovered more hurdles.

The problem starts with how django db backends define their basic mappings from django's "internal types" to the actual SQL definition. The mappings all look something like "varchar(%(max_length)s)" and are used to format a string given the field's __dict__. So the first hurdle was that Field.db_type() was returning "varchar(None)". I made some changes to get it to replace None values with ''. But then I found that "varchar()" is invalid syntax in postgres -- you have to use just "varchar" alone.

So, like Dante, I ventured deeper. The db_type() method relies on the backend's DatabaseCreation class, which is where these mappings are defined. One option I thought of was to have two internal types for CharField:

    data_types = {
        #...
        'CharField': 'varchar(%(max_length)s)',
        'CharFieldNoLimit': 'varchar',
        #..
    }

But, there's no precident right now for having two internal types for the same model field. Besides, these mappings don't help us differentiate between "data type" and "modifier", which is what really needed to happen. So, I ventured even deeper.

The base Field defines db_type(), db_parameters(), and db_type_suffix(). I would argue that Field shouldn't even implement these methods, as they produce back-end specific results. Those methods should really live in a back-end specific class. That's where I pretty much threw my hands up. Django's whole schema system is in flux right now due to migrations, and it's not really clear what the right answer is.

Well, that was fun. C'est la vie.

comment:14 by Sasha Romijn, 10 years ago

Thanks for digging into that. However, I think this isn't the biggest problem. If I read the oracle docs correctly, specifying a length is mandatory for VARCHAR2, and the maximum is 4000 bytes. For MySQL, it's 65K bytes, for PostgreSQL and SQLite it's very large.

Therefore, we can't set it to an unlimited length for all databases: we must pick something up to 4000 as maximum for Oracle. I don't think it's acceptable to only require max_length for Oracle backends. Essentially, I only see one option: default max_length to to the Oracle maximum. This works the same in all databases, and will work for many cases. Anyone else can still override it trivially. I'm not too enthusiastic about this solution though.

Another challenge with this solution that, if I remember correctly, max_length defines characters, whereas the oracle limit is 4000 bytes. How characters translate to bytes depends on characters and encoding. However, this might be solved already somewhere: after all, if you define max_length on a field now, that value must not end up as the limit in varchar2(n) due to the same concern.

I should add that I know very little about Oracle, so it's entirely possible this is not the complete story for Oracle.

To summarise:

  • An explicit max_length defined somewhere (either in your own code or in Django) is always required for Oracle compatibility, if I read the docs correctly.
  • The maximum size can be no longer than 4000 bytes, which may not fit 4000 characters, depending on encoding. We may already have code in place that resolves this concern.
  • My suggestion is to give max_length on CharField a default that matches 4000 bytes, but I'm not sure whether that's the best solution.

comment:15 by Simon Charette, 10 years ago

My suggestion is to give max_length on CharField a default that matches 4000 bytes, but I'm not sure whether that's the best solution.

What about providing a backend specific max_length just like we do with IntegerField and friends? See #12030 and 1506c71a95cd7f58fbc6363edf2ef742c58d2487.

in reply to:  15 comment:16 by Sasha Romijn, 10 years ago

Replying to charettes:

What about providing a backend specific max_length just like we do with IntegerField and friends? See #12030 and 1506c71a95cd7f58fbc6363edf2ef742c58d2487.

Yes, that makes sense to me as well.

comment:17 by Aymeric Augustin, 10 years ago

Component: Core (Other)Database layer (models, ORM)

comment:18 by Aron Podrigal, 8 years ago

Cc: Aron Podrigal added
Has patch: set
Owner: changed from nobody to Aron Podrigal
Status: newassigned
Version: 1.2master

I started to work on this here https://github.com/django/django/compare/master...ar45:max_length_optional I would appriciate any comments.

comment:19 by Aymeric Augustin, 8 years ago

Perhaps you should wait until the discussion on django-developers comes to a consensus; new ideas are still being proposed.

Last edited 8 years ago by Tim Graham (previous) (diff)

comment:20 by Tim Graham, 8 years ago

Patch needs improvement: set

The consensus on the mailing list seems to be that making max_length optional isn't a good idea.

comment:21 by Sven R. Kunze, 8 years ago

Recently, I helped a friend bootstrapping a new Django project.

This issue (TextField vs CharField without max_length) again popped up because I had trouble explaining why exactly there needs to be a limit. Because I have some reasonable experience with Django I know about the internal relationship between models and forms which again lead to some questionable design decisions in the past.

My point here is there is not easy way for (new) Django users to decide for either a CharField or a TextField.

CharField -> one line text data but limited
TextField -> multi-line text data

What most projects actually need is
CharField -> one line, unlimited text data

comment:22 by geekscrapy, 8 years ago

Agree that this should be made optional, but at the very least the docs for 1.9 need amending as it describes this attribute as optional.

comment:23 by Tim Graham, 8 years ago

Please be more specific about the docs inaccuracy. I see the following:

CharField has one extra required argument: ... CharField.max_length

Perhaps you're looking at the form field docs where the argument is indeed option.

comment:24 by Rich Rauenzahn, 5 years ago

Cc: Rich Rauenzahn added

comment:25 by Anvesh Mishra, 21 months ago

Cc: Anvesh Mishra added

comment:26 by Adrian Torres, 16 months ago

Cc: Adrian Torres added
Owner: changed from Aron Podrigal to Adrian Torres
Patch needs improvement: unset

comment:27 by Mariusz Felisiak, 15 months ago

Patch needs improvement: set

comment:28 by Mariusz Felisiak, 15 months ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:29 by Mariusz Felisiak <felisiak.mariusz@…>, 15 months ago

Resolution: fixed
Status: assignedclosed

In 7eee1dc:

Fixed #14094 -- Added support for unlimited CharField on PostgreSQL.

Co-authored-by: Mariusz Felisiak <felisiak.mariusz@…>

Note: See TracTickets for help on using tickets.
Back to Top