Code

Opened 6 years ago

Closed 5 years ago

Last modified 3 years ago

#9253 closed (fixed)

auto-generated db identifiers (eg. constraint names) should be of consistent length

Reported by: crucialfelix Owned by: krzych
Component: Database layer (models, ORM) Version: 1.0
Severity: Keywords:
Cc: crucialfelix@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: UI/UX:

Description

where hashes are used to create auto-generated db identifier names, the hash should produce a string of consistent length regardless of the environment or OS.

otherwise SQL generated on one system may work, while generating it on another system won't (mysql does not allow identifiers longer than 64 chars).

eg.
in django.db.creation.backends.py line 127

r_name = '%s_refs_%s_%x' % (r_col, col, abs(hash((r_table, table))))

and all other places like that

# mac
ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT artistmailout_id_refs_absmail_ptr_id_1347fdf FOREIGN KEY (`artistmailout_id`) REFERENCES `website_artistmailout` (`absmail_ptr_id`);
ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT accountlist_id_refs_id_7f9cd4c6 FOREIGN KEY (`accountlist_id`) REFERENCES `mailings_accountlist` (`id`);
CREATE TABLE `website_artistmailout_users` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `artistmailout_id` integer NOT NULL,
    `user_id` integer NOT NULL,
    UNIQUE (`artistmailout_id`, `user_id`)
)

#ubuntu
ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT artistmailout_id_refs_absmail_ptr_id_607b768a01347fdf FOREIGN KEY (`artistmailout_id`) REFERENCES `website_artistmailout` (`absmail_ptr_id`);
ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT accountlist_id_refs_id_695205280632b3a FOREIGN KEY (`accountlist_id`) REFERENCES `mailings_accountlist` (`id`);
CREATE TABLE `website_artistmailout_users` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `artistmailout_id` integer NOT NULL,
    `user_id` integer NOT NULL,
    UNIQUE (`artistmailout_id`, `user_id`)
)

note: in this particular case the contraint name is below 64 chars, but only because I changed the names of the models to avoid the problem.
the ouch was that it previously worked on the dev server but not the deployment.

Attachments (2)

hexdigest.patch (1.6 KB) - added by krzych 5 years ago.
9253-consistent_short_constraint_names.diff (2.4 KB) - added by ramiro 5 years ago.
Patch from krzych updated to also fix sql_remove_table_constraints()

Download all attachments as: .zip

Change History (14)

comment:1 Changed 5 years ago by jacob

  • milestone set to 1.1
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 5 years ago by mboersma

Is the length of the identifier the real issue here, or the fact that hash() might not return the same value on different systems? Django does take into account the max length for a given database backend, but we probably wouldn't want to limit everything to the lowest common denominator (Oracle, which only allows 30-character identifiers).

Perhaps using md5.hexdigest() (as we do in django.db.backends.util.truncate_name) instead of the builtin hash() function would be more portable?

comment:3 Changed 5 years ago by mtredinnick

The problem is that Python's hash() function returns a value that is essentially based on the system's integer size. So moving on 64-bit systems, problems arise. The solution is to move to something with a consistent length, such as hexdigest(), as you observer.

comment:4 Changed 5 years ago by anonymous

  • Owner changed from nobody to ales.zoulek@…

Changed 5 years ago by krzych

comment:5 Changed 5 years ago by krzych

  • Owner changed from ales.zoulek@… to krzych

comment:6 Changed 5 years ago by krzych

  • Has patch set

comment:7 Changed 5 years ago by krzych

  • Resolution set to fixed
  • Status changed from new to closed

comment:8 Changed 5 years ago by Alex

  • Resolution fixed deleted
  • Status changed from closed to reopened

Please don't close tickets because you uploaded a patch, an issue is not fixed until the code is in Django itself.

comment:9 Changed 5 years ago by ramiro

  • Needs tests set

Changed 5 years ago by ramiro

Patch from krzych updated to also fix sql_remove_table_constraints()

comment:10 Changed 5 years ago by russellm

There's almost no way to handle this in a backwards compatible way. The constraint names are generated when tables are created; they are then generated again to produce the SQL for a management reset/sqlreset. This means that the way we generate constraint names needs to be backwards compatible.

As best as I can make out, our options are:

  1. Do nothing. This means there will be tables that are valid on 32 bit machines that won't be valid on 64 bit machines. However, any existing working installs will be unaffected.
  1. Change to using md5()[:8] instead of hash(). This would produce a digest of consistent length across platforms, but any historically created constraint will become invisible to the new code.
  1. Change to using hash() % 232. This would be backwards compatible for 32 bit installs, but backwards incompatible for 64 bit machines. However, it would mean that a database instantiated using a 32 bit machine could be reset using a 64 bit machine.

If we were starting with a clean sheet of paper, (2) would be the right choice, but implementing that option now screws absolutely everybody.

In the interests of retaining _some_ backwards compatibility, (3) is probably the best option. I've discussed this with Malcolm and Jacob in private, and they concurred. Patch coming soon.

comment:11 Changed 5 years ago by russellm

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [10966]) Fixed #9253 -- Modified the method used to generate constraint names so that it is consistent regardless of machine word size.

NOTE: This change is backwards incompatible for some users. If you are using a 32-bit platform, you will observe no differences as a result of this change. However, users on 64-bit platforms may experience some problems using the reset management command.

Prior to this change, 64-bit platforms would generate a 64-bit, 16 character digest in the constraint name; for example:

ALTER TABLE myapp_sometable ADD CONSTRAINT object_id_refs_id_5e8f10c132091d1e FOREIGN KEY ...

Following this change, all platforms, regardless of word size, will generate a 32-bit, 8 character digest in the constraint name; for example:

ALTER TABLE myapp_sometable ADD CONSTRAINT object_id_refs_id_32091d1e FOREIGN KEY ...

As a result of this change, you will not be able to use the reset management command on any table created with 64-bit constraints. This is because the the new generated name will not match the historically generated name; as a result, the SQL constructed by the reset command will be invalid.

If you need to reset an application that was created with 64-bit constraints, you will need to manually drop the old constraint prior to invoking reset.

comment:12 Changed 3 years ago by jacob

  • milestone 1.1 deleted

Milestone 1.1 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.