Multiple Database Support
The most recent (September 2008) discussion about multi-db support can be seen on this thread: http://groups.google.com/group/django-developers/browse_thread/thread/9f0353fe0682b73
An ideal solution would address all of the following:
- Different models/applications living on different databases, for example a 'blog' application on db1 and a forum application on db2. This should include the ability to assign a different database to an existing application without modifying it, e.g. telling Django where to keep the django.contrib.auth.User table.
- Master-slave replication, where writes go to a single master while reads are spread across one or more slaves. Due to replication lag it may sometimes be necessary for reads that directly follow a related write to be directed to the master.
- Talking to existing or legacy databases while still allowing newer functionality to be developed on a different database dedicated solely to Django.
- Sharding, where rows within a single model are spread across multiple databases for improved write performance.
- Moving models between databases - e.g. for converting from MySQL to Postgres, or for shuffling items between shards.
- Transparently handling database failure - many large applications handle unavailable databases in their application code, for example by switching to another slave when the first one is unavailable. See Digg Database Architecture for a high-profile example.
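To make the replication, failover and sharding requirements above concrete, here is a minimal sketch in plain Python. It is not an existing Django API: the class `ReplicationRouter`, its methods `for_read`, `for_write` and `mark_down`, the `lag_window` parameter and the helper `shard_for` are all hypothetical names chosen for illustration, and database aliases are just strings.

```python
import random
import time
import zlib


class ReplicationRouter:
    """Hypothetical read/write splitter; an illustration, not Django code."""

    def __init__(self, master, slaves, lag_window=2.0, clock=time.monotonic):
        self.master = master
        self.slaves = list(slaves)
        # Assumed policy: pin reads to the master for this many seconds
        # after a write, to hide replication lag.
        self.lag_window = lag_window
        self.clock = clock
        self.unavailable = set()
        self._last_write = None

    def for_write(self):
        # All writes go to the single master.
        self._last_write = self.clock()
        return self.master

    def for_read(self):
        # Reads that closely follow a write are directed to the master.
        recently_wrote = (
            self._last_write is not None
            and self.clock() - self._last_write < self.lag_window
        )
        if recently_wrote:
            return self.master
        # Otherwise spread reads across the slaves not marked as down.
        healthy = [s for s in self.slaves if s not in self.unavailable]
        # With no healthy slave left, fall back to the master.
        return random.choice(healthy) if healthy else self.master

    def mark_down(self, alias):
        # Application code calls this when a slave stops responding.
        self.unavailable.add(alias)


def shard_for(key, shards):
    """Pick a shard for a row key using a stable hash (CRC32)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return shards[digest % len(shards)]
```

A time-based window is only one possible way to decide when a read "directly follows" a write; tracking this per request or per session would be an equally plausible design. Likewise, modulo hashing is the simplest sharding scheme and makes reshuffling rows between shards expensive when the shard count changes.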
Problems to solve
- Connection definitions - how are multiple database connections defined? Does this new method replace Django's existing DATABASE_* family of settings?
- Connection selection - what is the API for telling Django which database a query should be executed against?
- Associating connections with models - for the common case where a model is assigned to a different database, how is that assignment made? How can existing apps such as contrib.auth be assigned to a different database in a clean way? This is one facet of the connection selection problem.
- Related managers - how do we deal with the case when the User model lives on one database but its related BlogEntrys live on a different database?
- Joins across different databases - do we try to get these working? If not, how do we detect them and what kind of error or warning do we present?
- Should models be aware of which database they were loaded from? - if so, model.save() will be able to automatically remember the correct database connection. The ability to override this (e.g. with model.save(using='archive')) would be useful for moving models between databases.
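One possible shape for the connection definitions and the save(using=...) override discussed above can be sketched with the standard library's sqlite3 module. Everything here is an assumption made for illustration rather than a proposed Django API: the `DATABASE_CONNECTIONS` mapping, the `get_connection` helper, the toy `Entry` class and its `_db` attribute are hypothetical, and each alias points at its own in-memory SQLite database.

```python
import sqlite3

# Hypothetical multi-connection setting: one entry per alias.
DATABASE_CONNECTIONS = {
    "default": ":memory:",
    "archive": ":memory:",
}

_open_connections = {}


def get_connection(alias):
    """Open (once) and return the connection registered under alias."""
    if alias not in _open_connections:
        conn = sqlite3.connect(DATABASE_CONNECTIONS[alias])
        conn.execute(
            "CREATE TABLE IF NOT EXISTS entry "
            "(id INTEGER PRIMARY KEY, title TEXT)"
        )
        _open_connections[alias] = conn
    return _open_connections[alias]


class Entry:
    """Toy model that remembers which database it was loaded from."""

    def __init__(self, title, id=None, db=None):
        self.title = title
        self.id = id
        self._db = db  # alias this instance was loaded from or saved to

    @classmethod
    def get(cls, id, using="default"):
        row = get_connection(using).execute(
            "SELECT id, title FROM entry WHERE id = ?", (id,)
        ).fetchone()
        if row is None:
            raise LookupError("no entry %r in %r" % (id, using))
        return cls(row[1], id=row[0], db=using)

    def save(self, using=None):
        # An explicit alias wins, then the remembered one, then the default.
        alias = using or self._db or "default"
        conn = get_connection(alias)
        cur = conn.execute(
            "INSERT OR REPLACE INTO entry (id, title) VALUES (?, ?)",
            (self.id, self.title),
        )
        conn.commit()
        if self.id is None:
            self.id = cur.lastrowid
        self._db = alias


entry = Entry("hello")
entry.save()                      # stored in "default"
loaded = Entry.get(entry.id)      # remembers it came from "default"
loaded.save(using="archive")      # copies the row into "archive"
```

In this sketch, save(using='archive') also updates the remembered alias, so later saves go to the archive. Whether an override should be sticky like this or apply to a single call only is one of the open design questions.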
- Jan Oberst demonstrates part of his sharding API
- mysql_replicated implements master/slave replication, triggered by HTTP GET vs. POST, as a custom database backend. It works against Django 1.0 with no patching required.
- Django Multiple Database support - most recent (last modified May 2008)
- PreviousMultipleDatabaseBranch - obsolete branch, last modified 2006
- p4.diff (36.2 KB), added 10 years ago: current (incomplete) patch as of 2006-06-08.
- naive-transaction.diff (3.1 KB), added 10 years ago: naive first pass at allowing transactions to work.
- p5.diff (52.8 KB), added 10 years ago: updated patch including transaction support.
- naive-related.diff (2.4 KB), added 10 years ago: naive first pass at getting related fields to work.
- multidbsample.tar.gz (5.9 KB), added 10 years ago: small sample of connecting to two databases and pulling data back from both of them.
- djangomultidbhowto.doc (52.0 KB), added 10 years ago: a guide to installing the Multiple DB Branch and then creating a project to work with multiple legacy databases.