Opened 2 years ago

Closed 19 months ago

#20143 closed Bug (fixed)

Lazy loading of related fields does not work for non-loaded models

Reported by: andreas_pelme Owned by: nobody
Component: Database layer (models, ORM) Version: 1.5
Severity: Normal Keywords: app-loading
Cc: djangoproject.com@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

There is a problem with the lazy loading mechanism of related fields.

Steps to reproduce (reproduced with 1.4 and 1.5):

$ django-admin.py startproject lazy_model_loading
$ cd lazy_model_loading/
lazy_model_loading $ python manage.py startapp foo
lazy_model_loading $ python manage.py startapp bar
lazy_model_loading $ echo "class Foo(models.Model): bar = models.ForeignKey('bar.Bar')" >> foo/models.py
lazy_model_loading $ echo "class Bar(models.Model): pass" >> bar/models.py
lazy_model_loading $ echo "INSTALLED_APPS = ['foo', 'bar']" >> lazy_model_loading/settings.py

This is the expected behavior (this is the way everything should work)

lazy_model_loading $ python manage.py shell --plain
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from foo.models import Foo
>>> Foo()
<Foo: Foo object>

However, when running python directly, the issue is triggered:

lazy_model_loading $ DJANGO_SETTINGS_MODULE=lazy_model_loading.settings python
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from foo.models import Foo
>>> Foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andreas/.virtualenvs/removeme/lib/python2.7/site-packages/django/db/models/base.py", line 397, in __init__
    val = field.get_default()
  File "/Users/andreas/.virtualenvs/removeme/lib/python2.7/site-packages/django/db/models/fields/related.py", line 1038, in get_default
    if isinstance(field_default, self.rel.to):
TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types

The difference here is that manage.py shell calls get_models which forces all models to load (which, judging by the comment is an old workaround for another lazy load bug):
https://github.com/django/django/blob/master/django/core/management/commands/shell.py#L58

If get_models is removed from the shell management command - it will show the same problem.

This is probably the same bug as described here:

http://stackoverflow.com/questions/5776047/django-modelform-giving-isinstance-arg-2-must-be-a-class-type-or-tuple-of-cl

http://stackoverflow.com/questions/14386536/instantiating-django-model-raises-typeerror-isinstance-arg-2-must-be-a-class (altough the accepted answer does not seem to be 100 % correct - I am pretty sure it does this bug does not depend on settings.DEBUG)

This bug can lead to very hard-to-debug problems since it depends on the the implicit import order before the model being initiated. The exception that is thrown is also not very helpful.

I would be happy to provide a fix, but I am not sure where to start fixing this... Pointers would be appreciated.

Change History (12)

comment:1 Changed 2 years ago by nip3o

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Reproduced on Ubuntu 12.04 LTS, Django 1.4.3 and Django 1.5

$ DJANGO_SETTINGS_MODULE=lazy_model_loading.settings python
Python 2.7.3 (default, Aug  1 2012, 05:14:39) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from foo.models import Foo
>>> Foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 397, in __init__
    val = field.get_default()
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/fields/related.py", line 1038, in get_default
    if isinstance(field_default, self.rel.to):
TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types

comment:2 Changed 2 years ago by andreas_pelme

I have created a pull request that fixes these issues by calling get_model() when it is not yet loaded in ForeignKey.get_default and in ReverseSingleRelatedObjectDescriptor.__set__: https://github.com/django/django/pull/966

ReverseSingleRelatedObjectDescriptor.__set__ suffers from the same problem, although it is not really an issue in practice

from foo.models import Foo
Foo(bar=1234)

The expected behavior here is to get the "ValueError: Cannot assign "1234": "Foo.bar" must be a "Bar" instance.". The user code is clearly already broken and this error will probably be fixed in the user code before this shows up. Nonetheless, it is a bug and it should also be fixed.

ReverseSingleRelatedObjectDescriptor.__get__ also references rel.to, but with the __set__ fix and given that __get__ can not really be called before __set__, this can likely not be a problem.

rel.to is used in other places than these, it is hard to figure out where it might lead to bugs. The approach in my pull request is to carefully load models where needed, but it might be a slippery slope... Is there a better approach altogheter?

All tests pass with the changes in the pull request, but how can this be tested reliably?

comment:3 Changed 2 years ago by andreas_pelme

  • Summary changed from Lazy loading of related fields does not work to Lazy loading of related fields does not work for non-loaded models

comment:4 Changed 2 years ago by andreas_pelme

I think the workaround in the shell management command just makes this much worse since things seems to be working when playing in the shell. runserver (and all management commands which uses model validation) also makes this a non-problem, since the model-validation forcefully loads all models modules).

This means that when this actually becomes an issue, it will probably be in production. I did a quick test with python manage.py runserver vs gunicorn lazy_model_loading.wsgi and the same models and this urlconf:

from django.conf.urls import patterns
from django.http import HttpResponse

from foo.models import Foo

urlpatterns = patterns('', ('', lambda request: HttpResponse(unicode(Foo()))))

This runs just fine with runserver, but crashes with gunicorn.

comment:5 Changed 2 years ago by andreas_pelme

When digging into this, I also found another similar problem with reverse relations.

This works as expected since all models are forcefully-loaded:

lazy_model_loading $ python manage.py shell
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from bar.models import Bar
>>> Bar().foo_set
<django.db.models.fields.related.RelatedManager object at 0x1080cea10>

This does not work:

lazy_model_loading $ DJANGO_SETTINGS_MODULE=lazy_model_loading.settings python
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from bar.models import Bar
>>> Bar().foo_set
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Bar' object has no attribute 'foo_set'

Shall I open a new ticket for this, or is it better to track it here too?

comment:6 Changed 2 years ago by andreas_pelme

  • Cc djangoproject.com@… added

comment:7 Changed 2 years ago by andreas_pelme

The pull request is now updated with a test case which shows this problem for .get_default().

comment:8 Changed 2 years ago by akaariai

This seems to be another instance of app-loading problems. Still, I think the two changes of doing a get_model() are correct - Django should safe-guard that you don't get the errors from still unloaded/partially unloaded apps. Another approach is to make sure that you can't end up in the problematic spots with partially unloaded state, but that seems hard to do.

The tests should not muck with app-cache state, instead use subprocess to run Python scripts in the tests. I think admin_scripts has some code that can be used for this.

comment:9 Changed 2 years ago by akaariai

  • Triage Stage changed from Unreviewed to Accepted

comment:10 Changed 2 years ago by timo

  • Has patch set
  • Patch needs improvement set

comment:11 Changed 19 months ago by timo

I'm not sure if this ticket is still valid after the merge of app loading. The patch messes around with AppCache which is no longer a thing.

comment:12 Changed 19 months ago by aaugustin

  • Keywords app-loading added
  • Resolution set to fixed
  • Status changed from new to closed

Django 1.7 requires you to run django.setup() when you don't go through manage.py or wsgi.py, precisely to avoid this problem.

Note: See TracTickets for help on using tickets.
Back to Top