Refactored manage.py inspectdb
|Reported by:||Daniel Pope <dan@…>||Owned by:||nobody|
|Component:||Core (Management commands)||Version:||master|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||yes||Patch needs improvement:||yes|
I have inherited a database developed in Microsoft Access that contains 1985 columns in 85 tables.
To import this into Django, I tried the existing inspectdb tool, but I found numerous bugs that would be too time-consuming to fix manually.
I have refactored the inspectdb tool to do a better job of importing. Aside from doing more to ensure that the output is syntactically valid, the tool now constructs an object-oriented structure, which is then post-processed and serialized, rather than serializing on the fly. This approach makes it easier to rename models procedurally and to include heuristics that fix up semantic problems with the models.
- model class names are sanitized and converted to CamelCase
- model field names are sanitized and converted to lowercase_with_underscores
- db_column is always specified: this allows refactoring of field names in the generated code
- no comment is issued where field names differ from database column names. This should now be assumed for all fields, but the use of db_column makes it explicit
- AutoFields are always explicit: again, this is useful when refactoring
- blank lines are added in the output, so that field definitions are visually grouped into keys, relations and other fields
- a warning comment is written if no primary_key is detected, as such models cannot be used with Django
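As a rough illustration of the naming rules listed above, here is a minimal sketch in Python. It is not taken from the attached patch: the helper names (sanitize_words, model_class_name, model_field_name, render_field) and the fallback prefixes "Table" and "field_" are assumptions made for this example only.

```python
import keyword
import re


def sanitize_words(name):
    """Split an arbitrary database identifier into lowercase alphanumeric words."""
    # Break between a lowercase letter or digit and a following capital,
    # so that "OrderID" is treated as the two words "Order" and "ID".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", spaced)]


def model_class_name(table_name):
    """Hypothetical helper: table name -> CamelCase class name."""
    name = "".join(word.capitalize() for word in sanitize_words(table_name))
    if not name or name[0].isdigit():
        name = "Table" + name  # assumed prefix; identifiers cannot start with a digit
    return name


def model_field_name(column_name):
    """Hypothetical helper: column name -> lowercase_with_underscores field name."""
    name = "_".join(sanitize_words(column_name))
    if not name or name[0].isdigit():
        name = "field_" + name  # assumed prefix, same reason as above
    if keyword.iskeyword(name):
        name += "_field"  # avoid Python reserved words such as "class"
    return name


def render_field(column_name, field_type="CharField", **options):
    """Serialize one field line, always spelling out db_column."""
    options["db_column"] = column_name
    args = ", ".join(f"{key}={value!r}" for key, value in options.items())
    return f"    {model_field_name(column_name)} = models.{field_type}({args})"


print(model_class_name("tbl Customer Orders"))
print(render_field("Order ID#", "IntegerField", primary_key=True))
```

Because db_column always carries the original column name, the Python-side field name can later be renamed freely without touching the database.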
I've added the following heuristics. These heuristics are not guaranteed to produce the right results, but they should produce the right results more often than if they were not included. For very large database schemas this is very important.
- heuristically detect field names that are potentially keys but not defined as such, grouping these too (this is safe)
- if no primary key is set, heuristically choose a candidate field if one matches the name of the model (this is unsafe but a warning comment is issued; besides, the alternative cannot work)
- heuristically upgrade IntegerFields with primary_key=True to AutoField if the model has no AutoField (this is unsafe; a warning comment is issued)
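The two primary-key heuristics above could be applied as a post-processing pass over the intermediate model objects, roughly as sketched below. This is an assumption-laden illustration, not the patch's code: the Field and Model classes, the apply_primary_key_heuristics function, and the rule that a field "matches" the model when its normalized name equals the model name with or without an "id" suffix are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    # Hypothetical intermediate representation of one inspected column.
    name: str
    field_type: str
    options: dict = field(default_factory=dict)


@dataclass
class Model:
    # Hypothetical intermediate representation of one inspected table.
    name: str
    fields: list
    warnings: list = field(default_factory=list)


def _normalize(name):
    """Lowercase a name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def apply_primary_key_heuristics(model):
    """Guess a primary key if none is set, then upgrade it to AutoField."""
    has_pk = any(f.options.get("primary_key") for f in model.fields)
    if not has_pk:
        target = _normalize(model.name)
        # Assumed matching rule: "customer" or "customer_id" for model Customer.
        candidates = {target, target + "id"}
        for f in model.fields:
            if _normalize(f.name) in candidates:
                f.options["primary_key"] = True
                model.warnings.append(f"primary key guessed heuristically: {f.name}")
                has_pk = True
                break
    if not has_pk:
        model.warnings.append("no primary key detected; this model cannot be used as is")
        return model
    if not any(f.field_type == "AutoField" for f in model.fields):
        for f in model.fields:
            if f.field_type == "IntegerField" and f.options.get("primary_key"):
                f.field_type = "AutoField"
                model.warnings.append(f"IntegerField upgraded to AutoField: {f.name}")
    return model


customer = apply_primary_key_heuristics(
    Model("Customer", [Field("customer_id", "IntegerField"), Field("name", "CharField")])
)
for warning in customer.warnings:
    print("#", warning)
```

Each unsafe guess is recorded as a warning, so that a serializer can emit it as a comment next to the generated model.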
The tool still does not guarantee that conflicts between field or model names are resolved (a very insidious issue should it occur), although the refactor was designed partly with this goal in mind.
Change History (10)
Changed 7 years ago by Daniel Pope <dan@…>
comment:1 Changed 7 years ago by mtredinnick
- Needs documentation unset
- Needs tests unset
- Patch needs improvement set
- Triage Stage changed from Unreviewed to Accepted
comment:6 Changed 4 years ago by gabrielhurley
- Component changed from django-admin.py inspectdb to Core (Management commands)