#9253 closed (fixed)
auto-generated db identifiers (eg. constraint names) should be of consistent length
Reported by: | crucialfelix | Owned by: | krzych |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.0 |
Severity: | Keywords: | ||
Cc: | crucialfelix@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
where hashes are used to create auto-generated db identifier names, the hash should produce a string of consistent length regardless of the environment or OS.
otherwise SQL generated on one system may work, while generating it on another system won't (mysql does not allow identifiers longer than 64 chars).
eg.
in django.db.creation.backends.py line 127
r_name = '%s_refs_%s_%x' % (r_col, col, abs(hash((r_table, table))))
and all other places like that
# mac ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT artistmailout_id_refs_absmail_ptr_id_1347fdf FOREIGN KEY (`artistmailout_id`) REFERENCES `website_artistmailout` (`absmail_ptr_id`); ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT accountlist_id_refs_id_7f9cd4c6 FOREIGN KEY (`accountlist_id`) REFERENCES `mailings_accountlist` (`id`); CREATE TABLE `website_artistmailout_users` ( `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY, `artistmailout_id` integer NOT NULL, `user_id` integer NOT NULL, UNIQUE (`artistmailout_id`, `user_id`) ) #ubuntu ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT artistmailout_id_refs_absmail_ptr_id_607b768a01347fdf FOREIGN KEY (`artistmailout_id`) REFERENCES `website_artistmailout` (`absmail_ptr_id`); ALTER TABLE `website_artistmailout_lists` ADD CONSTRAINT accountlist_id_refs_id_695205280632b3a FOREIGN KEY (`accountlist_id`) REFERENCES `mailings_accountlist` (`id`); CREATE TABLE `website_artistmailout_users` ( `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY, `artistmailout_id` integer NOT NULL, `user_id` integer NOT NULL, UNIQUE (`artistmailout_id`, `user_id`) )
note: in this particular case the contraint name is below 64 chars, but only because I changed the names of the models to avoid the problem.
the ouch was that it previously worked on the dev server but not the deployment.
Attachments (2)
Change History (14)
comment:1 by , 16 years ago
milestone: | → 1.1 |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 16 years ago
comment:3 by , 16 years ago
The problem is that Python's hash()
function returns a value that is essentially based on the system's integer size. So moving on 64-bit systems, problems arise. The solution is to move to something with a consistent length, such as hexdigest()
, as you observer.
comment:4 by , 16 years ago
Owner: | changed from | to
---|
by , 16 years ago
Attachment: | hexdigest.patch added |
---|
comment:5 by , 16 years ago
Owner: | changed from | to
---|
comment:6 by , 16 years ago
Has patch: | set |
---|
comment:7 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:8 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Please don't close tickets because you uploaded a patch, an issue is not fixed until the code is in Django itself.
comment:9 by , 16 years ago
Needs tests: | set |
---|
by , 16 years ago
Attachment: | 9253-consistent_short_constraint_names.diff added |
---|
Patch from krzych updated to also fix sql_remove_table_constraints()
comment:10 by , 15 years ago
There's almost no way to handle this in a backwards compatible way. The constraint names are generated when tables are created; they are then generated again to produce the SQL for a management reset/sqlreset. This means that the way we generate constraint names needs to be backwards compatible.
As best as I can make out, our options are:
- Do nothing. This means there will be tables that are valid on 32 bit machines that won't be valid on 64 bit machines. However, any existing working installs will be unaffected.
- Change to using md5()[:8] instead of hash(). This would produce a digest of consistent length across platforms, but any historically created constraint will become invisible to the new code.
- Change to using hash() % 232. This would be backwards compatible for 32 bit installs, but backwards incompatible for 64 bit machines. However, it would mean that a database instantiated using a 32 bit machine could be reset using a 64 bit machine.
If we were starting with a clean sheet of paper, (2) would be the right choice, but implementing that option now screws absolutely everybody.
In the interests of retaining _some_ backwards compatibility, (3) is probably the best option. I've discussed this with Malcolm and Jacob in private, and they concurred. Patch coming soon.
comment:11 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
(In [10966]) Fixed #9253 -- Modified the method used to generate constraint names so that it is consistent regardless of machine word size.
NOTE: This change is backwards incompatible for some users. If you are using a 32-bit platform, you will observe no differences as a result of this change. However, users on 64-bit platforms may experience some problems using the reset management command.
Prior to this change, 64-bit platforms would generate a 64-bit, 16 character digest in the constraint name; for example:
ALTER TABLE myapp_sometable ADD CONSTRAINT object_id_refs_id_5e8f10c132091d1e FOREIGN KEY ...
Following this change, all platforms, regardless of word size, will generate a 32-bit, 8 character digest in the constraint name; for example:
ALTER TABLE myapp_sometable ADD CONSTRAINT object_id_refs_id_32091d1e FOREIGN KEY ...
As a result of this change, you will not be able to use the reset management command on any table created with 64-bit constraints. This is because the the new generated name will not match the historically generated name; as a result, the SQL constructed by the reset command will be invalid.
If you need to reset an application that was created with 64-bit constraints, you will need to manually drop the old constraint prior to invoking reset.
Is the length of the identifier the real issue here, or the fact that hash() might not return the same value on different systems? Django does take into account the max length for a given database backend, but we probably wouldn't want to limit everything to the lowest common denominator (Oracle, which only allows 30-character identifiers).
Perhaps using md5.hexdigest() (as we do in django.db.backends.util.truncate_name) instead of the builtin hash() function would be more portable?