Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#22436 closed Bug (fixed)

migrations fail on custom upload_to on ImageField

Reported by: David Binetti <dbinetti@…> Owned by: Andrew Godwin
Component: Migrations Version: 1.7-beta-2
Severity: Release blocker Keywords:
Cc: info@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

One of my models has an ImageField with a dynamically created upload_to:

class Photo(models.Model):


    account = models.ForeignKey(Account)
    
    ...    

    def upload_thumb(instance, filename):
        return 'photos/{0}/dropbox{1}'.format(instance.account.id, filename)

    thumbnail = models.ImageField(upload_to=upload_thumb, null=True)

When I run makemigrations it produces the following (without error):

# encoding: utf8
from django.db import models, migrations
import apps.dinadesa.models


class Migration(migrations.Migration):

    dependencies = [
        ('dinadesa', '0006_auto_20140414_0836'),
    ]

    operations = [
        migrations.AlterField(
            model_name='photo',
            name='thumbnail',
            field=models.ImageField(null=True, upload_to=apps.dinadesa.models.upload_thumb),
        ),
    ]

(I keep my apps in an apps directory in the project root, in case that matters.)

When I try to run the migration, I get:

(dinadesa)$ django-admin migrate
Traceback (most recent call last):
  File "/Users/dbinetti/.virtualenvs/dinadesa/bin/django-admin", line 11, in <module>
    sys.exit(execute_from_command_line())
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/core/management/__init__.py", line 427, in execute_from_command_line
    utility.execute()
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/core/management/__init__.py", line 419, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/core/management/base.py", line 288, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/core/management/base.py", line 337, in execute
    output = self.handle(*args, **options)
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/core/management/commands/migrate.py", line 62, in handle
    executor = MigrationExecutor(connection, self.migration_progress_callback)
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/db/migrations/executor.py", line 14, in __init__
    self.loader = MigrationLoader(self.connection)
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/db/migrations/loader.py", line 48, in __init__
    self.build_graph()
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/db/migrations/loader.py", line 145, in build_graph
    self.load_disk()
  File "/Users/dbinetti/.virtualenvs/dinadesa/lib/python2.7/site-packages/django/db/migrations/loader.py", line 103, in load_disk
    migration_module = import_module("%s.%s" % (module_name, migration_name))
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Users/dbinetti/Repos/dinadesa/project/apps/dinadesa/migrations/0007_auto_20140414_0850.py", line 6, in <module>
    class Migration(migrations.Migration):
  File "/Users/dbinetti/Repos/dinadesa/project/apps/dinadesa/migrations/0007_auto_20140414_0850.py", line 16, in Migration
    field=models.ImageField(null=True, upload_to=apps.dinadesa.models.upload_thumb),
AttributeError: 'module' object has no attribute 'upload_thumb'

Which I can work around by changing the field line in my migration to:

field=models.ImageField(null=True, upload_to=apps.dinadesa.models.Photo.upload_thumb),   # Note the addition of the `Photo` model in the path.

Then, the migrate command works fine.

BUT if I then try to run another migration it detects the altered field, and I have to do it all again.

I hope this is sufficient information. It is my first bug report and I'm not certain how to create a test case, but it is 100% reproduce-able on my system.

Thanks.

Change History (18)

comment:1 by Simon Charette, 11 years ago

Owner: changed from nobody to Simon Charette
Status: newassigned
Triage Stage: UnreviewedAccepted

comment:2 by chris cauley, 11 years ago

Owner: changed from Simon Charette to chris cauley

comment:3 by chris cauley, 11 years ago

Your specific problem can be easily fixed by moving the upload_thumb method outside of the Photo class. Unless Photo.upload_thumb is called anywhere there's no difference between the two. There's nothing wrong with how you did it before, but there's no way to implement migrations as typed above (except manually typing qualname for Photo.upload_to like you did). I'll expand on this in a second comment, but for now here's the solution.

def upload_thumb(instance, filename):
    return 'photos/{0}/dropbox{1}'.format(instance.account.id, filename)

class Photo(models.Model):

    account = models.ForeignKey(Account)

    ...

    thumbnail = models.ImageField(upload_to=upload_thumb, null=True)
Last edited 11 years ago by chris cauley (previous) (diff)

comment:4 by chris cauley, 11 years ago

The migration serializer previously was using the __name__ of the function and the module, which is not the full import path if the function is a method of a class.

            ...
            else:
                module = value.__module__
                return"%s.%s" % (module, six.get_qualname(value)), set(["import %s" % module])
                # formerly
                # return"%s.%s" % (module, value.__name__), set(["import %s" % module])

six.get_qualname(func) is a new function that returns func.__qualname__ for python 3.X and fakes it for 2.X (sort of). It relies on func.im_class, which isn't always available in python 2.X:

def upload1(instance,self):
    return 'photos/{0}/dropbox{1}'.format(instance.account.id, filename)
class NotAModel:
    def upload2(instance,self):
        return 'photos/{0}/dropbox{1}'.format(instance.account.id, filename)
    class Subclass:
        def upload3(instance,self):
            return 'photos/{0}/dropbox{1}'.format(instance.account.id, filename)

class MyModel(models.Model):
    def upload4(instance,self):
        return 'photos/{0}/dropbox{1}'.format(instance.account.id, filename)
    photo_1 = models.ImageField(upload_to=upload1)
    photo_2 = models.ImageField(upload_to=NotAModel.upload2)
    photo_3 = models.ImageField(upload_to=NotAModel.Subclass.upload3)
    photo_4 = models.ImageField(upload_to=upload4)

Here photo_1 and photo_2 will work in Python 2 and 3 (with the modifications on the branch mentioned below). photo_3 can't be serialized in 2.X because there's no way to access NotAModel from Subclass. photo_4 can't be serialized because upload_to is passed in the unbound function upload4 which doesn't have the attribute im_class.

But this all kind of unnecessary because migrations don't actually need upload_to. Additionally, if you rename upload_to it will cause past migrations to break. The best way out I can see is to set the upload_to="IN_MIGRATIONS" in the migrations file and modify the FileField to ignore upload_to when in migrations. If anyone is interested, here is a link to the changes listed above and six.get_qualname:

https://github.com/chriscauley/django/commit/2a6f27451f9cfe2f76f92a8806b2939dc46ce2a5

Last edited 11 years ago by chris cauley (previous) (diff)

comment:5 by David Binetti <dbinetti@…>, 11 years ago

Thank you for responding. I did move my function outside of the class, and it works (although it did break migrations as you indicated it would.) I'm still in dev so no harm, no foul.

As an aside, I am using upload_to to create a dynamically generated path which is respected by django-storages (for S3, in my case.) Other attempts to create a path from the object itself failed (for reasons I can't remember -- but a properly configured upload_to was the solution.)

Thanks again.

comment:6 by chris cauley, 11 years ago

It sounds like you fixed your issue, but here's the solution for anyone who has this problem before it is patched.

The upload_to field should not be used in the migrations at all. If you go in and change the upload_to=path.to.some.function to a string in the migrations they will work again. The important thing is that the upload to in the last migration matches the current state since then inspector will (wrongly) think a change in upload_to is a change in the database. You can also do python manage.py squashmigrations APPNAME to nuke old broken migrations.

comment:7 by chris cauley, 11 years ago

Has patch: set
Needs tests: set

https://github.com/chriscauley/django/commit/5f1707eee60d9a742cfd7f86fb81bebdfca9b01c

That's my proposed solution. The only place where Field.deconstruct() is used is in migrations. Since schema migrations don't use upload_to it's best to replace it with a dummy function. This is future safe (in the event that the field's upload_to is altered) and doesn't have the qualname issues that Python 2.X introduces.

Let me know if this is acceptable and which tests to write and I'll write them and make a pull request.

Last edited 11 years ago by chris cauley (previous) (diff)

comment:8 by Markus Holtermann, 11 years ago

Cc: info@… added
Severity: NormalRelease blocker
Version: 1.7-beta-1master

Django's migrations should represent the exact state the model has at a given time. This includes all arguments defined on fields. I therefore disagree with your change. I'd stick with defining a function outside the class.

I furthermore set this as a release blocker, as @andrewgodwin should have a look at this as well.

comment:9 by chris cauley, 11 years ago

Thank you for upping the severity. I don't understand why it should "represent the exact state the model has at a given time", and I can think of strong arguments as to why it shouldn't. Using django==1.3 and south I can create a model that uses a URLField. If I change the value of the verify_exists keyword argument, that does not require me to create a new migration file. If I go up to django==1.4 then suddenly the verify exists keyword argument is removed and I have to go through and delete it from every instance that uses it. No biggy though, just one line here and there. However, I don't have to go back and change my migrations because verify_exists has no impact on the database and therefore doesn't trigger a migration.

Now imagine that migrations were built into django back in 1.3 and that they were made to "represent the exact state the model has at a given time". I create a URLField, make a migration. I decide I don't like verify_exists and set it to False, make another migration (not a huge deal, but pointless). Then I update to 1.4 and so I go through and remove verify_exists. A migration is now impossible because models.URLField(verify_exists=False) (which is the previous state of the model) is not valid django any more (and raises a TypeError as of django==1.5). In order to make my migrations work again I need to go back and remove every reference to verify_exists from every migration file. The migration where I went from verify_exists=True to False now does nothing! Then again the migration did nothing in the first place because verify_exists doesn't change the database.

One of my django projects (an online news magazine) has 425 migrations. grep URLField * -r|grep migrations/0| wc -l shows that this represents 6439 lines of code for the 50 URLFields in this project. That's insanity. That project might still be on 1.3 if upgrading it to 1.4 required modifying 6.5k lines of code.

This is just one example but I can come up with plenty other like it. In every case we either have to support legacy code indefinitely (for example leave verify_exists as a do-nothing key word argument) or else modifying the Field's API would require going back and editing every existing migration file. Neither option is pleasant and either way none of this has any effect on the database. I think migrations should be minimal and independent of non-database code.

comment:10 by Markus Holtermann, 11 years ago

For the why and how to prevent problems with custom fields, I want to point you to #21499

comment:11 by Andrew Godwin, 11 years ago

chriscauley: The contract is more precise than that; when fields are used in a migration you're guaranteeing that they'll be backwards-compatible for as long as the migration exists. That's easy enough for Django itself, especially now we have tools to let you reset migrations and remove old declarations; you'd be able to easy move from your 425 to just a few new initial ones.

Migrations just being for the database is also not true; as we support data migrations we need to support all arguments, even those which seem useless at first (e.g. upload_to, validators). These fields are needed to have the field truly represented in the migration, otherwise a migration that added new rows wouldn't be able to do it properly.

Plus, what if a subclass of FileField actually uses upload_to to affect the database? We can't rule it out for everything.

It's painful, but we've got literally everything else in Django working well in this regard, pointers to functions on classes is one of the very few missing things - your suggested use of six.get_qualname seems sensible and I'll investigate it in a bit.

comment:12 by Andrew Godwin, 11 years ago

Owner: changed from chris cauley to Andrew Godwin

I can't find this six.get_qualname function, and even then, if it uses im_func it's not going to work on function references passed in from within the class body (as these are not yet methods).

I think this may be impossible to solve on Python 2; we might have to add a check where the serializer tries to re-import it and fails if it can't find it, so we can at least give a nice error message.

comment:13 by Althalus, 11 years ago

I'm using 1.7 now and found that migrations fail too because my upload_to function is generated dynamically.
Consider the following example:

# utils/files.py
def gen_upload_to(fieldname, path, .......):
    def upload_to(instance, filename):
        pass  # some custom logic
    return upload_to

# app/models.py
class MyModel(models.Model):
    file = models.FileField(upload_to=gen_upload_to('file', 'path/to/files'))

# app/migrations/0001_initial.py
class Migration(migrations.Migration):
    operations = [
        migrations.CreateModel(
            name='MyModel',
            fields=[
                ('file', models.FileField(upload_to=utils.files.upload_to)),
            ],
        ),
    ]

Obviously my utils/files.py module doesn't contain upload_to function with results in errors.
I had to create dummy upload_to in there to prevent import errors, but anyway every time I use makemigrations command django detect changes in my file field.

Version 2, edited 11 years ago by Althalus (previous) (next) (diff)

comment:14 by Aymeric Augustin, 11 years ago

Version: master1.7-beta-2

comment:15 by Ilya Semenov, 10 years ago

@Althalus, a very ugly workaround for this would be:

# utils/files.py
def gen_upload_to(fieldname, path, .......):
    import sys
    if len(sys.argv) > 1 and sys.argv[1] in ('makemigrations', 'migrate'):
        return None # Hide ourselves from Django migrations
    def upload_to(instance, filename):
        pass  # some custom logic
    return upload_to

Too bad Django monitors non-db-related field attributes for no apparent reason. It's not only upload_to. For instance, it creates a useless migration even when a field's help_text is changed.

comment:16 by Andrew Godwin, 10 years ago

You do need the upload_to attribute in migrations, as well as help text, as they can affect the usage of fields inside RunPython segments, or other field subclasses might use help_text for something that's not just display. South tried the whitelist and blacklist approaches here and both were flawed.

I've consulted with people and there's literally no sane way to get a pointer to a function like that if it's reused directly from the class body; the fix is to move the function outside of the class body.

The fix for this will be a check which makes sure the referenced function is importable and gives the above advice if it is not.

comment:17 by Andrew Godwin <andrew@…>, 10 years ago

Resolution: fixed
Status: assignedclosed

In 6fd455adfcc85f6bd390bce784a1b5dfe5d610ea:

Fixed #22436: More careful checking on method ref'ce serialization

comment:18 by Andrew Godwin <andrew@…>, 10 years ago

In 98949e3b108100a40cb6ad59be78f33580439fef:

[1.7.x] Fixed #22436: More careful checking on method ref'ce serialization

Note: See TracTickets for help on using tickets.
Back to Top