Changes between Version 33 and Version 34 of SchemaEvolution
- Timestamp: Jul 20, 2007, 12:38:45 AM (17 years ago)
SchemaEvolution
 v33 v34
-  1      {{{
-  2      #!rst
-  3      Django Schema evolution
-  4      =======================
-  5
   6   1  Schema migration is one of those "hard" problems that I think will be impossible
   7   2  to get right for all the cases. What follows are my thoughts on how to get
   …   …
  10   5  (please leave comments at the bottom and not inline to the proposal! Also, please add your name or some handle after the comment; everybody using "I" and then not identifying themselves is terribly unclear.)
  11   6
- 12      .. contents::
- 13
- 14      Prior art
- 15      ---------
+      7  == Prior art ==
  16   8
  17   9  A few bits of prior arts I examined.
   …   …
  22  14  * http://divmod.org/trac/wiki/DivmodAxiom/Reference#Upgraders
  23  15
- 24      Use cases
- 25      ---------
+     16  == Use cases ==
  26  17
  27  18  1. Alice has a blog application written in Django. A blog entry looks
  28  19  like::
  29  20
+     21  {{{
  30  22  class Entry(models.Model):
  31  23      title = models.CharField(maxlength=30)
  32  24      body = models.TextField()
+     25  }}}
  33  26
  34  27  After using this to write some blog entries, she realizes that she
   …   …
  42  35  2. Ben has a simple ticket tracker with a ticket class that contains::
  43  36
+     37  {{{
  44  38  class Ticket(models.Model):
  45  39      reporter = models.EmailField()
   …   …
  47  41      status = models.IntegerField(choices=STATUS_CHOICES)
  48  42      description = models.TextField()
- 49
+     43  }}}
  50  44  After two years of use, Ben wants to add product information to his ticket
  51  45  tracker and has decided he doesn't care about who reported a ticket, so
  52  46  he writes these models::
  53  47
+     48  {{{
  54  49  class Product(models.Model):
  55  50      ...
   …   …
  60  55      description = models.TextField()
  61  56      product = models.ForeignKey(Product)
- 62
+     57  }}}
  63  58  Ben knows quite a bit of SQL; he's also the company DBA.
  64  59
  65  60  3. Carol supports a newsroom with a set of models that look like::
  66  61
+     62  {{{
  67  63  class Reporter(models.Model):
  68  64      ...
   …   …
  75  71      stories = models.ManyToManyField(Article)
  76  72      ...
- 77
+     73  }}}
  78  74  As it turns out, her conception of ``Section`` was naive, and on top of
  79  75  that she wants to separate certain types of articles into multiple
  80  76  classes. She rewrites her models to look like::
  81  77
+     78  {{{
  82  79  class Reporter(models.Model):
  83  80      ...
   …   …
  95  92
  96  93  class WeddingAnnouncement(models.Model):
- 97          ...
+     94
+     95      ...
  98  96
  99  97  class Section(models.Model):
 100  98      included_categories = models.ManyToManyField(Category)
 101  99      ...
-102
+    100  }}}
 103 101  Carol obviously needs to keep all the data in her system. Some articles
 104 102  will need to be moved into ``Obituary`` or ``WeddingAnnouncement``,
   …   …
 113 111  for almost ten years now.
 114 112
-115      Possible solutions
-116      ------------------
-117
-118      Write SQL
-119      `````````
+    113  == Possible solutions ==
+    114
+    115  === Write SQL ===
 120 116
 121 117  This is the current situation: you write a bunch of SQL and bung it into
   …   …
 132 128  to deal with the data migration.
 133 129
-134      Automatic db introspection
-135      ``````````````````````````
+    130  === Automatic db introspection ===
 136 131
 137 132  In this scenario, Django inspects your models and your database, and
   …   …
 140 135  like::
 141 136
+    137  {{{
 142 138  $ ./manage.py sqlupdate | psql mydb
-143
+    139  }}}
 144 140  Ramifications:
 145 141  This works for Alice; she does ``django-admin sqlupdate | mysql mydb`` and
   …   …
 158 154  the SQL she writes quickly.
 159 155
-160      Automatically applied migration code
-161      ````````````````````````````````````
+    156  === Automatically applied migration code ===
 162 157
 163 158  In this scenario, you give your models a version. When it comes time to upgrade,
   …   …
 190 185  since she has lots of custom code she needs to run.
 191 186
-192      Introspection + migration
-193      `````````````````````````
+    187  === Introspection + migration ===
 194 188
 195 189  This approach is a combination of `Automatic db introspection`_ and
   …   …
 206 200  slightly, but not all that much, really.
 207 201
-208      Conclusions
-209      -----------
+    202  == Conclusions ==
 210 203
 211 204  * "Just write SQL" sucks and is obviously wrong and needs to be fixed.
   …   …
 220 213  bonus of getting Alice to save her migrations in source control.
 221 214
-222      Proposal
-223      --------
+    215  == Proposal ==
 224 216
 225 217  So.
   …   …
 229 221  * An optional module, ``django.contrib.evolution`` provides a
 230 222  way to store versions in the database::
-231
+    223  {{{
 232 224  class ModelVersion(models.Model):
 233 225      content_type = models.ForeignKey(ContentType)
 234 226      version = models.PositiveIntegerField()
-235
+    227  }}}
 236 228  This makes evolution optional and thus doesn't clutter up your tablespace
 237 229  if you don't need it.
 238 230
 239 231  * Models grow a ``version`` attribute::
-240
+    232  {{{
 241 233  class MyModel(models.Model):
 242 234      ...
   …   …
 244 236      class Meta:
 245 237          version = 3
-246
+    238  }}}
 247 239  * ``django-admin syncdb`` applies evolutions using the following steps:
 248 240
   …   …
 270 262  package(s). Users will then upgrade the version number and run ``syncdb``.
 271 263
-272      Comments/questions
-273      ------------------
+    264  == Comments/questions ==
 274 265
 275 266  Asked by nwp in the chat room: "what are the rules for checking whether you are
   …   …
 291 282
 292 283
-293      Something Completely Different
-294      ------------------------------
+    284  == Something Completely Different ==
 295 285
 296 286  I'd like to suggest a different approach altogether.
   …   …
 312 302  .. _RefactoringDatabases: http://www.bookpool.com/sm/0321293533
 313 303
-314      Why something different
-315      -----------------------
+    304  == Why something different ==
 316 305
 317 306  I agree with "Something Completely Different" above. I worked on a Java-based product that used a similar approach to numbered schema versioning (although we worked with serialized objects in the backend instead of a relational DB). We had trouble merging changes with conflicting version numbers.
          This was not initially a huge problem for us, because we had very little overlap within and between our projects. However, the times that it happened it was a big deal, and as soon as we started real parallel development, it became a critical issue. I think for Django teams this is a killer, because deployments are so quite common and testing tends to be a part of the developers roles.
   …   …
 319 308  An example is svnA user creates schema version 5, while concurrently svnB makes version 5. Everything is fine until merge, then either user has an unusable database. This is bad enough, but consider branches! It gets ugly, and can easily break the production database at unexpected times.
 320 309
-321      Handling concurrent development -- multiple users and branches
-322      --------------------------------------------------------------
+    310  == Handling concurrent development -- multiple users and branches ==
 323 311
 324 312  Concurrent development requires communication, especially with a shared data store like the database. The above example should never break the production database -- if two developers commit conflicting code, it's up to them to work out the conflict. Preventing the conflicting code from reaching production is an organizational policy issue, not a web framework issue.
   …   …
 326 314  You do raise a good point regarding concurrent development with branches. It's still not possible to get rid of communication -- when it comes time to do a branch merge, discussing database schema will be mandatory. However, the tool should allow for branch merges; why not store a branch name AND a version number for each schema revision? When planning for a branch merge, one can implement a "from branch 1 version 6 to branch 2 version 3" migration. Obviously, migration points will need to be well defined, but supporting branches at least allows for the possibility.
 327 315
-328      Another way to concurrent development -- migrations as distributed version control systems
-329      ------------------------------------------------------------------------------------------
+    316  == Another way to concurrent development -- migrations as distributed version control systems ==
 330 317
 331 318  One of the issues with data base migrations, one raised by the previous section and one which has often bitten users of Rails migrations, is that they don't scale with the number of users: while rails-like migrations are wonderful for single users or very small teams (up to 3 or so members), the point is soon reached when different programmers will create different migrations with the same version number, and hell ensues (manual merges/renames of some of the migrations etc...).
   …   …
 337 324  Since the problem has already been solved, this should be handled by learning from the various decentralized VCS (darcs, Git, Mercurial, Bazaar) and introducing one of their patch-merging and diff-tracking strategies into the migration tool. Of course this wouldn't necessarily resolve all the issues of incompatible migrations, but conflicting ones could at least be tracked.
 338 325
-339      Automatic interactive db introspection that generates migration scripts
-340      -----------------------------------------------------------------------
+    326  == Automatic interactive db introspection that generates migration scripts ==
+    327
 341 328  I think that the nicest way for the user that would work for 99% of the cases is an automatic and (if needed) interactive command, that will generate the migration steps it took and store them in some script. Also, it could automatically update the version number in the suggestion on migration scripts above.
          So for a model that changes from::
-342
+    329  {{{
 343 330  class Ticket(models.Model):
 344 331      reporter = models.EmailField()
 345 332      owner = models.EmailField()
 346 333      stat = models.IntegerField(choices=STATUS_CHOICES)
-347
+    334  }}}
 348 335  to::
-349
+    336  {{{
 350 337  class Ticket(models.Model):
 351 338      reporter = models.EmailField()
 352 339      status = models.IntegerField(choices=STATUS_CHOICES)
 353 340      description = models.TextField()
-354
+    341  }}}
 355 342  This is one removed field (owner), one added field (description), and one renamed field (stat->status). The renamed is the trickiest one obviously since it looks like one has been deleted and another added. So in this case the user should be given the menu::
-356
+    343  {{{
 357 344  The following fields are new:
 358 345  1 status
   …   …
 373 360  Warning: removing the field owner will permanently delete data. Write "delete" and press enter to continue: delete
 374 361  Database update complete. Modifications written to Ticket-migration-1.sql
-375
+    362  }}}
 376 363  If there are only additions, then the entire menu can be totally skipped and just the additions added. If there are no renames the entire thing also becomes much shorter. I think this covers most cases, and any other cases can be handled manually by writing custom SQL migration scripts as above.
 377 364
   …   …
 382 369  Mike H (mike@mugwuffin.com)
 383 370
-384      You can't handle stupidity
-385      ------------------------------
+    371  === You can't handle stupidity ===
+    372
 386 373  To be honest I would not like a system that tried to cater to Carol. She is an idiot and needs to learn more. If the answer to
 387 374  something is "you don't have to know what you are doing" then its not a good answer. Django.. to use Django, its not that you don't
   …   …
 407 394  -- Anders Hovmöller
 408 395
-409      }}}
+    396
 410 397
 411 398  == Implementation: Automatic DB Introspection ==
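
The field comparison described in the "Automatic interactive db introspection" comment (the `Ticket` model losing `owner`, gaining `description`, and renaming `stat` to `status`) could be sketched roughly as follows. This is only an illustration, not code from the proposal: `diff_fields` and the plain `{name: type}` dictionaries are hypothetical stand-ins for real model and database introspection, and matching on field type is just one assumed heuristic for spotting a possible rename.

```python
def diff_fields(old_fields, new_fields):
    """Compare two {field_name: field_type} mappings (hypothetical representation).

    Returns (added, removed, rename_candidates).
    """
    added = sorted(set(new_fields) - set(old_fields))
    removed = sorted(set(old_fields) - set(new_fields))
    # A removed field and an added field of the same type MAY be a rename.
    # Only the user can confirm this, so they are reported as candidates.
    rename_candidates = [
        (old_name, new_name)
        for old_name in removed
        for new_name in added
        if old_fields[old_name] == new_fields[new_name]
    ]
    return added, removed, rename_candidates


# The two Ticket definitions from the comment, reduced to name -> type.
old_ticket = {
    "reporter": "EmailField",
    "owner": "EmailField",
    "stat": "IntegerField",
}
new_ticket = {
    "reporter": "EmailField",
    "status": "IntegerField",
    "description": "TextField",
}

added, removed, rename_candidates = diff_fields(old_ticket, new_ticket)
print("added:", added)
print("removed:", removed)
print("possible renames:", rename_candidates)
```

For the two `Ticket` definitions this reports `owner` and `stat` as removed, `description` and `status` as added, and `stat` to `status` as the only rename candidate, which is the case the interactive menu would ask the user to confirm before any data is dropped.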