id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
29801	Allow using fixtures in data migrations	Calvin DeBoer	nobody	"=== Background
Django's current recommendation for adding data to an application is the well-documented ""DataMigration"" via `RunSQL` and `RunPython` calls. This is an improvement over fixture loading for a variety of reasons (it doesn't force developers to keep their fixtures up to date, it gives control over overwriting data, etc.). Fixtures would be great if they didn't have to be kept up to date with the DB's current state (more on that later). As the applications we've been building with Django have grown in complexity, we've seen an increasing need to control which data is created in different application environments. For example, in production we want to add a large dataset `foo` (production data), but in development we only want to load a smaller dataset `bar` (for developer experience); furthermore, in tests we don't want to add ANY data, because we want each test's `setUp` method to control precisely what data is available in the DB for that set of tests. For the test piece, we investigated setting the `MIGRATION_MODULES` setting to None for all the apps in a `test_settings` file, but that seemed heavy-handed, and it only handled a single application environment.


=== Suggestion

1. Add a method called `loaddata` that takes a fixture and `apps` (the model context within a migration, from which one can obtain a `FakeModel` instance representing the model at the state of that migration).

{{{#!python
import os
from django.db import migrations
import environize

PATH = 'path/to/fixtures/'

def load_fixture(apps, schema_editor):
    fixture_file = os.path.join(PATH, 'myfixture.json')
    environize.loaddata(apps, fixture_file)


class Migration(migrations.Migration):

    dependencies = [
        ('app', '0003_auto_20180916_1122'),
    ]

    operations = [
        migrations.RunPython(load_fixture, migrations.RunPython.noop)
    ]

}}}
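
A state-aware `loaddata` like the one in example 1 could be sketched roughly as follows. This is not taken from `environize`; it is a minimal illustration that assumes a JSON fixture in Django's serialized layout (a list of objects with `model`, `pk` and `fields` keys) and models whose fields are all plain, non-relational columns. The name `load_fixture_into_state` is hypothetical.

```python
import json


def load_fixture_into_state(apps, fixture_path):
    # Hypothetical helper: create fixture rows through the historical models
    # exposed by the migration's app registry instead of the current models.
    with open(fixture_path, encoding='utf-8') as handle:
        records = json.load(handle)

    loaded = 0
    for record in records:
        # apps.get_model accepts the 'app_label.model_name' form that
        # fixtures use in their 'model' key.
        model = apps.get_model(record['model'])
        # Assumption: every entry in 'fields' is a plain column; foreign keys
        # and many-to-many values would need extra handling.
        model.objects.update_or_create(pk=record['pk'], defaults=record['fields'])
        loaded += 1
    return loaded
```

Because the models come from `apps`, a fixture written for the schema at that migration keeps loading even after later migrations rename or remove fields.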

2. Allow `RunPython` and `RunSQL` to take an additional keyword that tells Django which environments to include or exclude in a `DataMigration`.

{{{#!python
class Migration(migrations.Migration):

    dependencies = [
        ('app', '0002_auto_20180916_1122'),
    ]

    operations = [
        migrations.RunPython(add_prod_data, remove_hams, only_in=['production']),
        migrations.RunSQL(""INSERT blah;"", ""REMOVE blah;"", except_in=['test'])
    ]
}}}
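
The `only_in` / `except_in` filtering in example 2 comes down to a small predicate. A minimal sketch, assuming the current environment name is read from a hypothetical `DJANGO_ENV` environment variable; neither that variable nor the `should_run` and `environment_aware` helpers are part of Django, and this is not taken from `environize`:

```python
import os


def should_run(only_in=None, except_in=None, env=None):
    # Hypothetical predicate: decide whether a data-migration operation
    # applies to the current environment.
    if only_in is not None and except_in is not None:
        raise ValueError('pass only_in or except_in, not both')
    if env is None:
        # Assumption: the environment name lives in DJANGO_ENV.
        env = os.environ.get('DJANGO_ENV', 'development')
    if only_in is not None:
        return env in only_in
    if except_in is not None:
        return env not in except_in
    return True


def environment_aware(func, only_in=None, except_in=None):
    # Wrap a RunPython callable so it does nothing outside the selected
    # environments.
    def wrapper(apps, schema_editor):
        if should_run(only_in=only_in, except_in=except_in):
            return func(apps, schema_editor)
        return None
    return wrapper
```

With such a wrapper, `migrations.RunPython(environment_aware(add_prod_data, only_in=['production']), ...)` would approximate the keyword form without changing `RunPython` itself. Note that a skipped operation is still recorded as applied, so the data would not be loaded later if the environment changed.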

=== Implementation

https://github.com/cgdeboer/environize

Here is a hastily put-together library that allows environments inside migrations and provides a state-specific `loaddata`. The environment handling is implemented as decorators on the function at the moment; it probably makes more sense to subclass `RunPython` and `RunSQL` and provide the keywords shown above in example 2.

It probably makes the most sense to continue developing this as a package outside of Django, but before I do that, I thought I'd see if there is any appetite for it in Django core.
"	New feature	closed	Migrations	dev	Normal	duplicate	environments migrations	Alex Dehnert	Unreviewed	0	0	0	0	0	0