Opened 16 years ago

Closed 16 years ago

Last modified 13 years ago

#8193 closed (fixed)

`__import__(mod, {}, {}, [''])` causes double import of modules ('module' and 'module' + '.')

Reported by: Ivan Illarionov Owned by: Jacob
Component: Uncategorized Version: dev
Severity: Keywords: import
Cc: Gonzalo Saavedra Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

As discussed at http://groups.google.com/group/django-developers/browse_thread/thread/d27261561bc36d96
and http://bugs.python.org/issue2090, dynamically loaded modules with fromlist=[''] are initialized twice (as 'module' and 'module.').

Attachments (8)

import_fixes_8272.diff (15.1 KB ) - added by Ivan Illarionov 16 years ago.
Fixes all imports
import_fixes_8272.2.diff (15.7 KB ) - added by Ivan Illarionov 16 years ago.
Added another import fix to the previous patch
import_fixes_8272.3.diff (17.1 KB ) - added by Ivan Illarionov 16 years ago.
Added simple test
import_fixes.2.diff (15.9 KB ) - added by mrts 16 years ago.
update patch to r9333
import_fixes.5.diff (16.2 KB ) - added by mrts 16 years ago.
missing init.py in tests/regressiontests/dynamic_imports/
import_fixes-with-sys.modules.diff (19.1 KB ) - added by mrts 16 years ago.
patch with the sys.modules[modname] idiom, against trunk
import_fixes-with-sys.modules_updated-test.diff (19.1 KB ) - added by mrts 16 years ago.
update the test as well
importlib.diff (34.0 KB ) - added by Brett Cannon 16 years ago.
Add importlib to django.utils and move all import usage over to it.

Download all attachments as: .zip

Change History (48)

by Ivan Illarionov, 16 years ago

Attachment: import_fixes_8272.diff added

Fixes all imports

by Ivan Illarionov, 16 years ago

Attachment: import_fixes_8272.2.diff added

Added another import fix to the previous patch

comment:1 by Ivan Illarionov, 16 years ago

Needs tests: set

comment:2 by mrts, 16 years ago

Thanks Ivan. Don't you think a wrapper would be more adequate, DRY?

Note that the DRY argument directly hits your patch as you have redundant brackets around module_path.split() everywhere (split() always returns a list). It's lots of trouble to fix that without a central point for doing it. Moreover, the overhead of the extra function call is negligible.

django/core/utils/import_wrapper:

def import_module(module_path, module_name=None):
  if not module_name: # note that we land here on '' as well
    return __import__(module_path, fromlist=module_path.split('.')[-1])
  return __import__(module_path, fromlist=[module_name])

Example use:

---
django/conf/__init__.py:
---

from django.utils.import_wrapper import import_module
...
    mod =  import_module(self.SETTINGS_MODULE)
...

---
django/db/__init__.py
---

from django.utils.import_wrapper import import_module
...
   backend = import_module('%s%s.base' % (_import_path, settings.DATABASE_ENGINE), 'base') 
...

As for tests, something should be done at module level that affects a particular testcase's state so that duplicate initialization can be caught. Nontrivial IMHO, I'd say not worth the trouble.

comment:3 by mrts, 16 years ago

Oh, well... your [] was of course correct. So the wrapper should be:

django/core/utils/import_wrapper.py:

def import_module(module_path, module_name=None):
  if not module_name: # note that we land here on '' as well
    return __import__(module_path, fromlist=[module_path.split('.')[-1]])
  return __import__(module_path, fromlist=[module_name])

Perhaps using a wrapper is overdoing it and your patch should remain as-is.

by Ivan Illarionov, 16 years ago

Attachment: import_fixes_8272.3.diff added

Added simple test

in reply to:  3 comment:4 by Ivan Illarionov, 16 years ago

Needs tests: unset

Replying to mrts:

Oh, well... your [] was of course correct. So the wrapper should be:

fromlist keyword argument is Python 2.5 only

It should be:

django/core/utils/import_wrapper.py:

def import_module(module_path, module_name=None):
    if not module_name:
        return __import__(module_path, {}, {}, [module_path.split('.')[-1]])
    return __import__(module_path, {}, {}, [module_name])

Perhaps using a wrapper is overdoing it and your patch should remain as-is.

I'm -0 for using the wrapper.

comment:5 by Malcolm Tredinnick, 16 years ago

milestone: 1.01.0 maybe
Triage Stage: UnreviewedAccepted

Note that this isn't actually a bug, since modules should be able to handle being imported multiple times in any case. That happens quite a bit in Django (since __import__ is equivalent to Python's reload()). Thus this is probably worth doing at some point, but it's not a 1.0 showstopper.

comment:6 by Ivan Illarionov, 16 years ago

I should add that without this patch we can see really scary things if we print

[k for k in sys.modules if k.endswith('.') or '..' in k]

inside any Django app.

in reply to:  5 comment:7 by Ivan Illarionov, 16 years ago

Replying to mtredinnick:

Note that this isn't actually a bug, since modules should be able to handle being imported multiple times in any case. That happens quite a bit in Django (since __import__ is equivalent to Python's reload()). Thus this is probably worth doing at some point, but it's not a 1.0 showstopper.

Multiple imports isn't a bug, but extra keys with dots in sys.modules is a bug.

comment:8 by Ivan Illarionov, 16 years ago

Summary: `__import__(mod, {}, {}, [''])` causes double initialization of modules`__import__(mod, {}, {}, [''])` causes double import of modules ('module' and 'module' + '.')

comment:9 by Jacob, 16 years ago

milestone: 1.0 maybepost-1.0
Patch needs improvement: set

I agree with Malcolm -- this is a bit ugly, but it's not a bug. Moving post-1.0.

Also, the patch has some, um, bugs: think about what happens if settings.SESSION_ENGINE doesn't contain a ".", for example.

in reply to:  9 comment:10 by Ivan Illarionov, 16 years ago

Replying to jacob:

Also, the patch has some, um, bugs: think about what happens if settings.SESSION_ENGINE doesn't contain a ".", for example.

No bugs here.

>>> 'string_without_dots'.split('.')
['string_without_dots']
>>> 'string_without_dots'.split('.')[-1]
'string_without_dots'

Of course I need to update the patch to the current trunk, I will do it as soon as I have some free time.

comment:11 by Ivan Illarionov, 16 years ago

__import__(mod, {}, {}, [mod.split('.')[-1]]) works perfectly in any situation:

>>> for mod in ('django', 'django.utils', 'django.utils.encoding'):
...     print __import__(mod, {}, {}, [mod.split('.')[-1]])
... 
<module 'django' from '/usr/lib/python2.5/site-packages/django/__init__.pyc'>
<module 'django.utils' from '/usr/lib/python2.5/site-packages/django/utils/__init__.pyc'>
<module 'django.utils.encoding' from '/usr/lib/python2.5/site-packages/django/utils/encoding.pyc'>

Just like __import__(mod, {}, {}, ['']) only without ugly side-effects.

The change is *very* simple and working/tested, I don't understand the reasons to wait.

in reply to:  5 comment:12 by Tai Lee, 16 years ago

Replying to mtredinnick:

Note that this isn't actually a bug, since modules should be able to handle being imported multiple times in any case. That happens quite a bit in Django (since __import__ is equivalent to Python's reload()). Thus this is probably worth doing at some point, but it's not a 1.0 showstopper.

It seems that any module that registers ModelAdmin objects cannot handle being imported multiple times (raises AlreadyRegistered. This seems to happen right now if you've bundled your admin-specific forms, views, and urls into an admin module instead of an admin.py, it will be imported once by admin.autodiscover and again by include('myproject.admin.urls').

comment:13 by Henrik Vendelbo, 16 years ago

Ok, perhaps it isn't a show stopper.
This is causing problems for me for the third time though. I am adding a bit of settings defaults, and my defaults seem to be set 4 times. Even though I only set them if the attribute doesn't already exist. Sounds to me that django.conf.settings is being loaded 4 times. And then the "slight" problem that admin can't load. If I disable my code things appear fine.
Now there might be a bug in my code, but I still cant find it among my massive 6 lines of code after looking over it for 30 mins.

comment:14 by mrts, 16 years ago

As the patch is not up to date, I generated a new one as follows:

grep rule for reviewing all invalid imports:

$ grep -Er "__import__.*\['']" . | grep -v '\.svn'

Small shell script for fixing most of them:

FILES_TO_FIX=`grep -lEr "__import__.*\['']" . | grep -v '\.svn'`
sed -i "s/\(__import__(\)\([[:alnum:]._]\+\), {\(.*\)\[''])/\1\2, {\3[\2.split('.')[-1]])/" $FILES_TO_FIX
echo "Fix the following manually:"
grep -Er "__import__.*\['']" . | grep -v '\.svn'

Note that once Python 2.3 support is discontinued, foo.rsplit('.', 1)[-1] should be used (rsplit was added in 2.4).

by mrts, 16 years ago

Attachment: import_fixes.2.diff added

update patch to r9333

comment:15 by mrts, 16 years ago

hvendelbo: could you please test if this fixes your problem (I assume it does)?

comment:16 by anonymous, 16 years ago

Patch needs improvement: unset

by mrts, 16 years ago

Attachment: import_fixes.5.diff added

missing init.py in tests/regressiontests/dynamic_imports/

comment:17 by mrts, 16 years ago

Whoo, good news on the mailing list -- Python 2.3 support will be dropped for 1.1. I'll upload a separate patch for trunk with rsplit('.', 1)[-1] ASAP.

comment:18 by mrts, 16 years ago

See also #6579

comment:19 by mrts, 16 years ago

Seems 2.3 support might not be dropped, so won't update the patch until final agreement is met.

comment:20 by Malcolm Tredinnick, 16 years ago

Owner: changed from nobody to Malcolm Tredinnick
Patch needs improvement: set
Status: newassigned

Since all of these patches replace one form of undocumented (by working consistently for years until it changed) behaviour with a different sort, we shouldn't do that. The proposed patched rely on the fact that importing a missing name from a module doesn't raise and error and does import the module -- this is what Jacob was identifying when he pointed out it was doing the equivalent of "from settings import settings", for example. Rather, we should use the stable, documented way.

I've had a local branch in progress for a while to fix this up and it has revealed problems in some of the assumptions we are making about imports, so they have to be fixed, too. That's why I haven't committed anything here, yet, in case anybody was wondering.

comment:21 by mrts, 16 years ago

Malcolm, I filed this upstream as well with a patch, in http://bugs.python.org/issue4438 . Perhaps you care to take a look and comment (either here or there)?

comment:22 by mrts, 16 years ago

Hrvoje Niksic provides by far the cleanest solution in http://mail.python.org/pipermail/python-dev/2008-November/083728.html

comment:23 by mrts, 16 years ago

In short:

  • as __import__ is not meant to be used directly, there will be a much simpler API for runtime loading of modules in imp in Python 3.1 (and perhaps 2.7)
  • until that the following idiom should be used:
    import sys
    __import__(modname)
    mod = sys.modules[modname]
    

comment:25 by mrts, 16 years ago

And also available in the trunk version of the docs: http://docs.python.org/dev/library/functions.html#__import

by mrts, 16 years ago

patch with the sys.modules[modname] idiom, against trunk

by mrts, 16 years ago

update the test as well

comment:26 by Jacob, 16 years ago

#10144 is another dup.

comment:27 by mrts, 16 years ago

as is #10095

comment:28 by mrts, 16 years ago

Replying to Malcolm's second comment on #10095:

Note, also, that we don't guarantee things will only be imported once. Python's __import__ statement is the same as reload(), it reimports the module, even if it already exists. It's up to code authors to make any allowances for double execution of code lines.

Malcolm, are you sure about that? The claim is neither justified by tests:

$ cat foo.py
print "foo imported."

$ python
Python 2.5.2 (r252:60911, Oct  5 2008, 19:29:17) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> __import__('foo')
foo imported.
<module 'foo' from 'foo.py'>
>>> __import__('foo')
<module 'foo' from 'foo.py'>

nor by docs:

This function is invoked by the import statement. It can be replaced (by importing the builtins module and assigning to builtins.__import__) in order to change semantics of the import statement, but nowadays it is usually simpler to use import hooks (see PEP 302). -- (i.e. if __import__ would import modules more than once, then so would plain import which clearly is not the case).

So, unless you can prove me wrong, the ever-recurring problems of initialization code (signals, whatnot) running more than once will be gone after this is fixed.

comment:29 by Gary Wilson, 16 years ago

An importlib module and import_module function (API not yet final though) were recently checked into Python trunk:

http://svn.python.org/view/python/trunk/Lib/importlib.py?view=markup

comment:30 by Gonzalo Saavedra, 16 years ago

Cc: Gonzalo Saavedra added

comment:31 by (none), 16 years ago

milestone: post-1.0

Milestone post-1.0 deleted

comment:32 by Malcolm Tredinnick, 16 years ago

Resolution: fixed
Status: assignedclosed

(In [9947]) Removed some import-time dependencies on Django's settings.

Now you can import the file storage stuff and still call settings.configure()
afterwards. There is still one import-time usage of settings in
django.contrib.comments, but that's unavoidable.

Backport of r9945 and r9946 from trunk (this is needed in order to fix #8193 in
a clean fashion, which is why it's being backported).

comment:33 by Alex Gaynor, 16 years ago

Resolution: fixed
Status: closedreopened

This wasn't actually fixed, it just had the text fix in it :/

comment:34 by Brett Cannon, 16 years ago

Owner: changed from Malcolm Tredinnick to Brett Cannon
Status: reopenednew

I am working on a patch to port importlib to Django and switch over all uses of import to django.utils.importlib.import_module(). Beyond ridding Django of the need to call import directly it will also fix the original issue that led to this bug report. =)

by Brett Cannon, 16 years ago

Attachment: importlib.diff added

Add importlib to django.utils and move all import usage over to it.

comment:35 by Brett Cannon, 16 years ago

I have attached the patch that adds django.utils.importlib and moves all uses of import over to import_module(). Patch is named importlib.diff.

comment:36 by Brett Cannon, 16 years ago

Owner: changed from Brett Cannon to Jacob
Patch needs improvement: unset

Assigning to jacob as he said he would review the patch when I was finished with it.

comment:37 by Alex Gaynor, 16 years ago

milestone: 1.1

comment:38 by Jacob, 16 years ago

Resolution: fixed
Status: newclosed

(In [10088]) Fixed #8193: all dynamic imports in Django are now done correctly. I know this because Brett Cannon borrowed the time machine and brought Python 2.7's 'importlib back for inclusion in Django. Thanks for the patch-from-the-future, Brett!

in reply to:  38 comment:39 by Antti Kaihola, 15 years ago

Replying to jacob:

(In [10088]) Fixed #8193: all dynamic imports in Django are now done correctly. I know this because Brett Cannon borrowed the time machine and brought Python 2.7's 'importlib back for inclusion in Django. Thanks for the patch-from-the-future, Brett!

See #11264 for a small problem this changeset introduced.

comment:40 by Jacob, 13 years ago

milestone: 1.1

Milestone 1.1 deleted

Note: See TracTickets for help on using tickets.
Back to Top