Code

Opened 4 years ago

Last modified 3 years ago

#14904 new Bug

TextField with unique (or in unique_together) constraint breaks for large inputs in Postgres

Reported by: jorn Owned by: nobody
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords: postgresql, index, textfield
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hi,

i have a model like this one:

class foo(Model):
  a = CharField(max_length=8)
  b = TextField()
  
  class Meta:
    unique_together = (("a", "b"),)

Now this leads to a sql create table statement like the following in Postgresql:

CREATE TABLE "myapp_foo" (
    "id" serial NOT NULL PRIMARY KEY,
    "a" varchar(8) NOT NULL,
    "b" text NOT NULL,
    UNIQUE ("a", "b")
);

The problem here is that b is a textfield which can get arbitrarily big (that's what they are there for), while values in a btree index can not:
If the input for b is above a certain size postgresql will complain:

django.db.utils.DatabaseError: index row size 3032 exceeds maximum 2712 for index "myapp_a_key"
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

The problem here is that the used unique constraint as of Postgres 8.4.5 uses a btree internally, which as stated above doesn't work with values above a certain size.

The solution for this problem would be: If a TextField has a / is part of a unique constraint or has an explicit index_db=True: _always_ index an md5 or better sha256 sum of TextField columns only (note that sha256 is in pgcrypto module only).
In this case:
UNIQUE ("a", md5("b"))

As mentioned this also applies to the universal index_db=True field argument.

Attachments (0)

Change History (3)

comment:1 Changed 4 years ago by ramiro

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

It would be helpful to have a complete simple sample case description especially a hint at how much text content should be assigned to the field for this condition triggering, and thus can be reproduced and/or tested against other postgresql versions.

comment:2 Changed 3 years ago by jaddison

  • Severity set to Normal
  • Type set to Bug

comment:3 Changed 3 years ago by anonymous

  • Easy pickings unset
  • UI/UX unset

Have the same problem with mysql,

_mysql_exceptions.OperationalError: (1071, 'Specified key was too long; max key length is 767 bytes')

With

name = models.CharField(null=True, blank=True, max_length=300)
user = models.ForeignKey(User,null=True, blank=True)    

class Meta :
        unique_together = (('name', 'user'),)

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as new
The owner will be changed from nobody to anonymous. Next status will be 'assigned'
as The resolution will be set. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.