Database Dump Script

  • Author: limodou <limodou AT>
  • Current Version: 1.5 2007-02-08


This tool dumps and restores the database of a Django project. It also handles some simple Model changes, so it can be used to import data after a Model has been migrated.

It has two parts: dump and restore.


Command Line:

    python [-svdh] [--settings] dump [applist]

If applist is omitted, all apps are dumped. applist can be one or more app names.

Description of options:

  • -s Write the output to the console; by default it is written to files
  • -v Display execution information; by default nothing is displayed
  • -d Output directory; the default is datadir in the current directory. If the path does not exist, it is created automatically.
  • -h Display help information.
  • --settings Settings module; by default the one in the current directory is used.

Only the Python format is supported for now. The tool creates a standard Python source file, for example:

    dump = {'table': 'tablename', 'records': [[...]], 'fields': [...]} 

table is the table name in the database. records holds all records of the table as a list of lists, that is, each record is a list. fields holds the field names of the table.
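
To make the layout concrete, here is a minimal sketch of how each record lines up with fields. The table name, field names, and values are invented for illustration and do not come from a real dump.

```python
# Hypothetical dump data for an illustrative table; all names and values are made up.
dump = {
    'table': 'blog_entry',
    'fields': ['id', 'title', 'pub_date'],
    'records': [
        [1, 'First post', '2007-01-18'],
        [2, 'Second post', '2007-01-20'],
    ],
}

# Pair each record with the field names, position by position.
rows = [dict(zip(dump['fields'], record)) for record in dump['records']]
print(rows[0]['title'])
```

Because the values in a record are matched to field names purely by position, the order of fields matters as much as its content.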


Command Line:

    python [-svdrh] [--settings] load [applist]

Refer to the description above for the shared options. The additional option is:

  • -r Do not empty the table before loading; by default the table is emptied first and then the data is loaded

With this tool you can not only restore the database but also handle simple database changes. It automatically selects the suitable fields from the backup data file according to the changed Model, and it honors the default values defined in the Model, such as the default parameter and the auto_now and auto_now_add parameters of date-like fields. You can also edit the backup data file manually and add a default key that specifies default values for some fields. The basic format is:

    'default':{'fieldname':('type', 'value')}

default is a dict. Each key is a field name of the table, and each value is a two-element tuple: the first element is the type and the second is the value. The types are described below:

type | value | Description
'value' | real value | Use the value directly
'reference' | referred field name | The field takes the value of the referred field. Use this when a field name has changed
'date' | 'now' or 'yyyy-mm-dd' | A date. If the value is 'now', the current date is used; otherwise the value is a string in 'yyyy-mm-dd' format
'datetime' | 'now' or 'yyyy-mm-dd hh:mm:ss' | Same as above, with the time included
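
The four types above can be read as a small dispatch on the first element of the tuple. The following is a minimal sketch of that idea, not the script's own code; the function name resolve_default and the record argument are assumptions made for illustration.

```python
import datetime

def resolve_default(kind, value, record):
    """Return the value for a missing field from a ('type', 'value') pair.

    `record` maps the field names found in the backup file to their values.
    The function name and signature are hypothetical.
    """
    if not kind or kind == 'value':
        return value                      # use the value directly
    if kind == 'reference':
        return record[value]              # KeyError if the referred field is absent
    if kind == 'date':
        if not value or value == 'now':
            return datetime.date.today().strftime('%Y-%m-%d')
        return value                      # already a 'yyyy-mm-dd' string
    if kind == 'datetime':
        if not value or value == 'now':
            return datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        return value                      # already a 'yyyy-mm-dd hh:mm:ss' string
    raise ValueError("unsupported default type %r" % kind)

# A renamed field takes its value from the old field name.
print(resolve_default('reference', 'title', {'title': 'Hello'}))
```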

A field's default value is chosen as follows: first a default value dict is built from the Model, then it is updated with the default key of the backup data file. So if a field is defined in both the Model and the backup data file, the definition in the backup data file is used.
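
The precedence rule amounts to a plain dict update. A minimal sketch, with made-up field names and values:

```python
# Defaults derived from the Model (hypothetical field names and values).
model_defaults = {
    'status': ('value', 'draft'),
    'created': ('datetime', 'now'),
}
# The 'default' key read from the backup data file (also hypothetical).
file_defaults = {
    'status': ('value', 'published'),
    'headline': ('reference', 'title'),
}

default = dict(model_defaults)
default.update(file_defaults)   # entries from the backup data file win

print(default['status'])
```

Fields that appear only in the Model keep their Model default, and fields that appear only in the backup data file are simply added.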

Thanks to this default value handling, the tool supports changes such as renaming a field and adding or removing a field. So you can use it for simple database update work.
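
As an illustration of those three cases, the sketch below maps one backup row onto a changed field list. It is a simplified stand-in for what the loader does, handling only the 'value' and 'reference' types; the function name convert_record and all field names are assumptions.

```python
def convert_record(model_fields, file_fields, row, default):
    """Map one backup row onto the current field list (hypothetical helper)."""
    old = dict(zip(file_fields, row))
    new = {}
    for name in model_fields:
        if name in old:
            new[name] = old[name]        # field unchanged: copy it over
        elif name in default:
            kind, value = default[name]
            if kind == 'reference':
                new[name] = old[value]   # field renamed: read the old name
            else:
                new[name] = value        # this sketch treats every other type as a literal value
    return new                           # fields removed from the Model are never copied

file_fields = ['id', 'title', 'obsolete']        # fields stored in the backup file
row = [1, 'Hello', 'x']
model_fields = ['id', 'headline', 'status']      # fields of the changed Model
default = {'headline': ('reference', 'title'), 'status': ('value', 'draft')}

result = convert_record(model_fields, file_fields, row, default)
print(result)
```

Here title was renamed to headline, status was added, and obsolete was removed.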

I have not tested it much, and my own environment is sqlite3. Downloads and testing are welcome, and I hope you can give me some advice for improvement.

You can also find this file in my personal project SharePlat, which is at



Write down what you think


# Author: limodou (
# This tool is used for dump and reload data from and into database
# You can see the help info through:
#     python -h
# For now, it only supports the .py format, so the output will
# be saved as Python source code, and you can import it.
# Version 1.5 2007-02-08
# Update 1.0 2007-01-18
# Update 1.1 2007-01-19
#    * if no arguments are given, it'll show help information
# Update 1.2 2007-01-20
#    * change dumpdb to use model info but not cursor.description,
#      because some database backends do not support cursor.description
# Update 1.3 2007-01-20
#    * change the output format of data file, and improve the
#      efficiency of dumping and loading
# Update 1.4 2007-01-21
#    * support mysql
# Update 1.5 2007-02-08
#    * If the file does not exist, then skip it

import os, sys
from optparse import OptionParser
import datetime

def _get_table_order(app_labels):
    from django.db.models import get_app, get_apps, get_models
    from django.db.models import ForeignKey, OneToOneField

    if not app_labels:
        app_list = get_apps()
    else:
        app_list = [get_app(app_label) for app_label in app_labels] 
    models = {}
    for app in app_list: 
        for model in get_models(app): 
            models[model._meta.db_table] = model
    s = []      
    rules = [] 
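    def order(s, rule):
        # rule (a, b) means table a must be loaded before table b
        a, b = rule
        try:
            i = s.index(a)
            try:
                j = s.index(b)
                if j<i:
                    del s[i]
                    s.insert(j, a)
            except ValueError:
                s.append(b)
        except ValueError:
            s.append(a)
            try:
                j = s.index(b)
                del s[j]
                s.append(b)
            except ValueError:
                s.append(b)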
    for i, table in enumerate(models.keys()[:]):
        for field in models[table]._meta.fields:
            if isinstance(field, (ForeignKey, OneToOneField)):
                tname = field.rel.to._meta.db_table
                if not models.has_key(tname) or tname == table:
                    continue
                rules.append((tname, table))
                order(s, (tname, table))

    # Tables without any relation rule go after the ordered ones
    n = []
    for k, v in models.items():
        if s.count(k) == 0:
            n.append(k)
    return [models[k] for k in s+n]

def _find_key(d, key):
    if not d:
        return None
    for k, v in d.items()[:]:
        if k == key:
            return d
        else:
            result = _find_key(v, key)
            if result is not None:
                return result

def loaddb(app_labels, format, options):
    from django.db import connection, transaction, backend

    if options.verbose: 
        print "Begin to load data for %s format...\n" % format 
    models = _get_table_order(app_labels)

    cursor = connection.cursor()

    errornum = 0

    if not options.remain and not options.stdout:
        m = models[:]
        for model in m:
            cursor.execute('DELETE FROM %s WHERE 1=1;' % backend.quote_name(model._meta.db_table))
            for table, fields in get_model_many2many_stru(model):
                cursor.execute('DELETE FROM %s WHERE 1=1;' % backend.quote_name(table))
    success = True
    for model in models: 
        try:
            load_model(cursor, model, format, options)
            for table, fields in get_model_many2many_stru(model):
                load_model(cursor, (table, fields), format, options)
        except Exception, e: 
            success = False
            errornum += 1
    if success:
        transaction.commit_unless_managed()
    else:
        transaction.rollback_unless_managed()
    if errornum:
        print "There are %d errors found! The database has been rolled back!" % errornum
    else:
        print "Successful!"

def load_model(cursor, model, format, options): 
    from django.db import backend

    datadir, verbose, stdout = options.datadir, options.verbose, options.stdout
    sql = 'INSERT INTO %s (%s) VALUES (%s);'

    if isinstance(model, (tuple, list)):
        # A (table, fields) pair describing a many-to-many table
        filename = os.path.join(datadir, model[0] + '.%s' % format)
        fields, default = model[1], {}
    else:
        opts = model._meta
        filename = os.path.join(datadir, opts.db_table + '.%s' % format)
        fields, default = get_model_stru(model)
    if verbose:
        print '..Dealing %s for %s format...\n' % (filename, format)
    if not os.path.exists(filename):
        if verbose:
            print '..%s does not exist, so skip it..\n' % filename
        return
    try:
        objs = {}
        if format == 'py':
            s = []
            f = file(filename, 'rb')
            for line in f:
                varname = line.split('=')[0]
                if varname.strip() != 'records':
                    # Header line (table, fields, default): collect it
                    s.append(line)
                else:
                    # Reached the records: evaluate the header and leave
                    # the file positioned at the first record
                    d = {}
                    exec ''.join(s) in d
                    objs['table'] = d.get('table', '')
                    objs['fields'] = d.get('fields', [])
                    objs['default'] = d.get('default', {})
                    objs['records'] = f
                    break
#            f = file(filename, 'rb') 
#            objs =
#            records = objs['records']
#            f.close()
        else:
            raise Exception('Not support this format %s' % format)
        fs = objs['fields']
        table = objs['table']
        default.update(objs.get('default', {}))
        count = 0
        for row in objs["records"]:
            if row.strip() == ']':
                break
            row = eval(row)
            d = dict(zip(fs, row))
            sql_fields = []
            sql_values = []
            for fd in fields:
                v = None
                if d.has_key(fd):
                    v = d[fd]
                else:
                    if default.get(fd, None) is not None:
                        kind, value = default[fd]
                        if not kind or kind == 'value':
                            v = value
                        elif kind == 'reference':
                            try:
                                v = d[value]
                            except KeyError:
                                sys.stderr.write("Referenced field [%s] does not exist\n" % value) 
                                raise
                        elif kind == 'date':
                            if not value or value == 'now':
                                v = datetime.datetime.now().strftime('%Y-%m-%d')
                            else:
                                v = value
                        elif kind == 'datetime':
                            if not value or value == 'now':
                                v = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
                            else:
                                v = value
                        else:
                            raise Exception, "Can't support this default type [%s]\n" % kind
                if v is not None:
                    sql_fields.append(fd)
                    sql_values.append(v)
            e_sql = sql % (backend.quote_name(table), 
                ','.join(map(backend.quote_name, sql_fields)), ','.join(['%s'] * len(sql_fields)))
            if stdout:
                print e_sql, sql_values, '\n'
            else:
                try:
                    cursor.execute(e_sql, sql_values)
                    count += 1
                except:
                    sys.stderr.write("Error sql: %s %s\n" % (e_sql, sql_values))
                    raise
        if verbose:
            print '(Total %d records)\n' % count
    except Exception, e:
        import traceback
        traceback.print_exc()
        sys.stderr.write("Problem loading %s format '%s' : %s\n" %  
                 (format, filename, str(e))) 
        # Re-raise so that loaddb can count the error and roll back
        raise

def get_model_stru(model):
    from django.db.models.fields import DateField, DateTimeField, TimeField
    fields = []
    default = {}
    for f in model._meta.fields:
        fields.append(f.column)
        v = f.get_default()
        if v is not None:
            default[f.column] = ('value', v)
        if isinstance(f, (DateTimeField, DateField, TimeField)):
            if f.auto_now or f.auto_now_add:
                v = datetime.datetime.now()
                default[f.column] = ('value', f.get_db_prep_save(v))
    return fields, default

def get_model_many2many_stru(model):
    from django.db.models import GenericRel
    opts = model._meta
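    for f in opts.many_to_many:
        fields = []
        if not isinstance(f.rel, GenericRel):
            # Columns of the intermediate many-to-many table
            fields.append('id')
            fields.append(f.m2m_column_name())
            fields.append(f.m2m_reverse_name())
            yield f.m2m_db_table(), fields
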
def dumpdb(app_labels, format, options): 
    from django.db.models import get_app, get_apps, get_models

    datadir, verbose, stdout = options.datadir, options.verbose, options.stdout
    if verbose: 
        print "Begin to dump data for %s format...\n" % format 
    if len(app_labels) == 0: 
        app_list = get_apps() 
    else:
        app_list = [get_app(app_label) for app_label in app_labels] 
    if not os.path.exists(datadir):
        os.makedirs(datadir)
    errornum = 0
    for app in app_list: 
        for model in get_models(app): 
            try:
                write_result(dump_model(model), format, options)

                for result in dump_many2many(model):
                    write_result(result, format, options)
            except Exception, e: 
                import traceback
                traceback.print_exc()
                sys.stderr.write("Unable to dump database: %s\n" % e) 
                errornum += 1

    if errornum:
        print "There are %d errors found!" % errornum
    else:
        print "Successful!"

def dump_model(model):
    from django.db import connection, backend

    opts = model._meta
    cursor = connection.cursor()
    fields, default = get_model_stru(model)
    cursor.execute('select %s from %s' % 
        (','.join(map(backend.quote_name, fields)), backend.quote_name(opts.db_table)))
    return call_cursor(opts.db_table, fields, cursor)

def call_cursor(table, fields, cursor):
    yield table
    yield fields
    while 1:
        rows = cursor.fetchmany(100)
        if rows:
            for row in rows:
                yield _pre_data(row)
        else:
            break

def _pre_data(row):
    row = list(row)
    for i, fd in enumerate(row):
        if isinstance(fd, datetime.datetime):
            row[i] = row[i].strftime('%Y-%m-%d %H:%M:%S') # + '.' + str(row[i].microsecond).rstrip('0')
        elif isinstance(fd, datetime.date):
            row[i] = row[i].strftime('%Y-%m-%d')
    return row

def dump_many2many(model):
    from django.db import connection, backend
    cursor = connection.cursor()

    for table, fields in get_model_many2many_stru(model):
        cursor.execute('select %s from %s' % 
            (','.join(map(backend.quote_name, fields)), backend.quote_name(table)))
        yield call_cursor(table, fields, cursor)

def write_result(result, format, options):
    # call_cursor yields the table name first, then the field list
    table = result.next()
    fields = result.next()
    filename = os.path.join(options.datadir, table + '.%s' % format)
    if options.verbose:
        print '..Dumping %s ...\n' % filename
    if not options.stdout:
        f = file(filename, 'wb')
    else:
        f = sys.stdout
    print >>f, 'table = %r' % table
    print >>f, 'fields = %r' % fields
    print >>f, 'records = ['
    i = 0
    for t in result:
        print >>f, repr(t)
        i += 1
    print >>f, ']'
    if options.verbose:
        print '(Total %d records)\n' % i
    if not options.stdout:
        f.close()

def get_usage():
    usage = """
  %prog [options] action [applist]:
      action: dump load
"""
    return usage

def execute_from_command_line(argv=None):
    # Use sys.argv if we've not passed in a custom argv
    if argv is None:
        argv = sys.argv

    # Parse the command-line arguments. optparse handles the dirty work.
    parser = OptionParser(usage=get_usage())
    parser.add_option('--settings',
        help='Python path to settings module, e.g. "myproject.settings.main". If this isn\'t provided, the DJANGO_SETTINGS_MODULE environment variable will be used.')
    parser.add_option('-d', '--dir', help='Output/Input directory.', default="datadir", dest="datadir")
#    parser.add_option('-f', '--format', help='Data format(json, xml, python).', type="choice",
#        choices=['json', 'xml', 'python'], default='json')
    parser.add_option('-v', '--verbose', help='Verbose mode', action='store_true')
    parser.add_option('-s', '--stdout', help='Output the data to stdout', action='store_true')
    parser.add_option('-r', '--remain', help='Keep the existing records in the tables; by default all records are deleted first. Only used for loading.', action='store_true')

    options, args = parser.parse_args(argv[1:])
    if len(args) == 0:
        parser.print_help()
        sys.exit(0)
    action = args[0]
    apps = args[1:]
    if options.settings:
        os.environ['DJANGO_SETTINGS_MODULE'] = options.settings
    else:
        from django.core.management import setup_environ
        try:
            import settings
        except ImportError:
            print "You don't appear to have a settings file in this directory!"
            print "Please run this from inside a project directory"
            sys.exit(1)
        setup_environ(settings)
    if action == 'dump':
        dumpdb(apps, 'py', options)
    elif action == 'load':
        loaddb(apps, 'py', options)

if __name__ == '__main__':
    execute_from_command_line()