Opened 3 years ago

Last modified 17 months ago

#24778 assigned New feature

Data Migration from Fixture

Reported by: Eugene
Owned by: Victor
Component: Migrations
Version: master
Severity: Normal
Keywords:
Cc: Eugene, vccabral@…
Triage Stage: Someday/Maybe
Has patch: yes
Needs documentation: yes
Needs tests: yes
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description (last modified by Eugene)

Providing initial data via fixtures has been deprecated. In the past, we used to run loaddata manually. Since Django introduced migrations, the recommended way to import data is to create an empty migration and use a RunPython operation to load the data.

Loading a fixture from a data migration is a very common use case. We end up writing a function whose only job is to call call_command('loaddata', ...). http://stackoverflow.com/a/25981899/764592

In my opinion, instead of having to create the function, we can actually simplify this into a migration operation on its own.

As follows:

# Module: django.db.migrations.operations.special
from django.core.management import call_command
from django.db.migrations.operations.base import Operation


class LoadFixture(Operation):
    # Loading a fixture cannot be expressed as plain SQL and is not undone on reverse.
    reduces_to_sql = False
    reversible = False

    def __init__(self, *fixtures):
        self.fixtures = fixtures

    def state_forwards(self, app_label, state):
        # Loading data does not change the project state.
        pass

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        for fixture in self.fixtures:
            call_command('loaddata', fixture, app_label=app_label)

    def database_backwards(self, app_label, schema_editor, from_state, to_state):
        pass

    def describe(self):
        return "Load Fixture Operation"

The use of the LoadFixture operation is shown in the following example.

Assuming we have the fixture in foobar/fixtures/book_data.json:

# File: foobar/migrations/0002_auto_load_book.py
from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [
        ('foobar', '0001_initial'),
    ]
    operations = [
        migrations.LoadFixture('book_data'),
    ]

The migration script is now much simpler.

PS: This is my first time creating a ticket and getting involved in Django internals. Let me know if I should make a PR for this feature.

Change History (10)

comment:1 Changed 3 years ago by Eugene

Cc: Eugene added
Description: modified (diff)

comment:2 Changed 3 years ago by Eugene

Description: modified (diff)

comment:3 Changed 3 years ago by Markus Holtermann

Needs documentation: set
Needs tests: set
Patch needs improvement: set
Triage Stage: changed from Unreviewed to Someday/Maybe

I'm a bit torn about this feature. I can understand people's requests to easily load (existing) fixtures. It's convenient. On the other hand, this will inevitably lead to the very same problem we wanted to prevent by using RunPython in its intended form:

MyModel = apps.get_model('myapp', 'MyModel')
MyModel.objects.create(...)

Apart from that, as soon as a model changes, your LoadFixture operation will fail: call_command will use the current model myapp.models.MyModel, whereas the database is still in an older state because the corresponding schema changes would only happen in a later migration such as "0003_add_somefield".
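To illustrate the "intended form", a fixture could in principle be read by hand and written through the historical models rather than through loaddata. The following is only a sketch under assumptions: the helper name load_fixture_via_historical_models is hypothetical and not part of Django, and it assumes a JSON fixture whose fields are plain column values (foreign keys would have to be passed as <name>_id, and many-to-many fields and natural keys are not handled).

```python
import json
from pathlib import Path


def load_fixture_via_historical_models(apps, schema_editor, fixture_path):
    """Create rows from a JSON fixture using the migration's historical models.

    Hypothetical helper, not part of Django. Expects a JSON list of
    {"model": "app_label.modelname", "pk": ..., "fields": {...}} entries.
    Returns the number of entries processed.
    """
    db_alias = schema_editor.connection.alias
    entries = json.loads(Path(fixture_path).read_text(encoding="utf-8"))
    for entry in entries:
        app_label, model_name = entry["model"].split(".")
        # apps.get_model returns the model as it existed at this migration,
        # not the current class from models.py.
        Model = apps.get_model(app_label, model_name)
        Model.objects.using(db_alias).update_or_create(
            pk=entry["pk"], defaults=entry["fields"]
        )
    return len(entries)
```

Inside a migration, such a helper would be wrapped in a two-argument function and passed to RunPython. Because the models come from apps, the fixture is matched against the schema as of that migration, which avoids the mismatch described above but also means the fixture file must stay consistent with that historical schema.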

comment:4 Changed 3 years ago by Simon Charette

Markus, what do you think about allowing an apps kwarg to be passed to the loaddata command and the serializers' initializers?

If supplied, the command and the serializers would use the provided apps; otherwise they would default to django.apps.apps.

From that point we could provide a RunPython subclass that simply calls call_command with apps=apps?

Last edited 3 years ago by Simon Charette

comment:5 Changed 3 years ago by Markus Holtermann

That would probably work. I even started going down the rabbit hole of adding apps to the serializers, but reverted the changes because I didn't see the benefit and it got kind of ugly.

comment:6 Changed 3 years ago by Markus Holtermann

There are already apps out there that try to implement fixture loading during migrations, with, of course, the exact same problem I mentioned before.

There is even a library that provides that feature for South: https://github.com/sebleier/django-alpaca

comment:7 Changed 17 months ago by Victor

I ran into the app state not matching the models.py state while loading fixtures and ended up creating a branch of Django to handle this use case. I opened a PR against a feature branch in my fork of Django. I am more than willing to work on this issue given the appropriate guidance. I will finish reading the contributor guidelines.

https://github.com/vccabral/django/pull/2

comment:8 Changed 17 months ago by Victor

Cc: vccabral@… added
Owner: changed from nobody to Victor
Status: changed from new to assigned

comment:9 Changed 17 months ago by Tim Graham

You should probably write to the DevelopersMailingList about this. I'm uncertain about the design and whether or not it should be included in Django.

comment:10 Changed 17 months ago by Markus Holtermann

As mentioned above in comments 3 to 5 and suggested by charettes, the "right" approach to this issue is likely adding apps as an optional argument to the serializers and using that in a next step.

A command that dumps a fixture file or a database as a database migration feels ugly here. You're essentially creating (arbitrary) Python code that follows absolutely no patterns (how do you handle multi-line strings, byte strings, UUIDs, ...?).
