Code

Opened 6 years ago

Closed 5 years ago

Last modified 3 years ago

#8193 closed (fixed)

`__import__(mod, {}, {}, [''])` causes double import of modules ('module' and 'module' + '.')

Reported by: i_i Owned by: jacob
Component: Uncategorized Version: master
Severity: Keywords: import
Cc: gonz Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

As discussed at http://groups.google.com/group/django-developers/browse_thread/thread/d27261561bc36d96
and http://bugs.python.org/issue2090, dynamically loaded modules with fromlist=[''] are initialized twice (as 'module' and 'module.').

Attachments (8)

import_fixes_8272.diff (15.1 KB) - added by i_i 6 years ago.
Fixes all imports
import_fixes_8272.2.diff (15.7 KB) - added by i_i 6 years ago.
Added another import fix to the previous patch
import_fixes_8272.3.diff (17.1 KB) - added by i_i 6 years ago.
Added simple test
import_fixes.2.diff (15.9 KB) - added by mrts 5 years ago.
update patch to r9333
import_fixes.5.diff (16.2 KB) - added by mrts 5 years ago.
missing init.py in tests/regressiontests/dynamic_imports/
import_fixes-with-sys.modules.diff (19.1 KB) - added by mrts 5 years ago.
patch with the sys.modules[modname] idiom, against trunk
import_fixes-with-sys.modules_updated-test.diff (19.1 KB) - added by mrts 5 years ago.
update the test as well
importlib.diff (34.0 KB) - added by brettcannon 5 years ago.
Add importlib to django.utils and move all import usage over to it.

Download all attachments as: .zip

Change History (48)

Changed 6 years ago by i_i

Fixes all imports

Changed 6 years ago by i_i

Added another import fix to the previous patch

comment:1 Changed 6 years ago by i_i

  • Needs documentation unset
  • Needs tests set
  • Patch needs improvement unset

comment:2 Changed 6 years ago by mrts

Thanks Ivan. Don't you think a wrapper would be more adequate, DRY?

Note that the DRY argument directly hits your patch as you have redundant brackets around module_path.split() everywhere (split() always returns a list). It's lots of trouble to fix that without a central point for doing it. Moreover, the overhead of the extra function call is negligible.

django/core/utils/import_wrapper:

def import_module(module_path, module_name=None):
  if not module_name: # note that we land here on '' as well
    return __import__(module_path, fromlist=module_path.split('.')[-1])
  return __import__(module_path, fromlist=[module_name])

Example use:

---
django/conf/__init__.py:
---

from django.utils.import_wrapper import import_module
...
    mod =  import_module(self.SETTINGS_MODULE)
...

---
django/db/__init__.py
---

from django.utils.import_wrapper import import_module
...
   backend = import_module('%s%s.base' % (_import_path, settings.DATABASE_ENGINE), 'base') 
...

As for tests, something should be done at module level that affects a particular testcase's state so that duplicate initialization can be caught. Nontrivial IMHO, I'd say not worth the trouble.

comment:3 follow-up: Changed 6 years ago by mrts

Oh, well... your [] was of course correct. So the wrapper should be:

django/core/utils/import_wrapper.py:

def import_module(module_path, module_name=None):
  if not module_name: # note that we land here on '' as well
    return __import__(module_path, fromlist=[module_path.split('.')[-1]])
  return __import__(module_path, fromlist=[module_name])

Perhaps using a wrapper is overdoing it and your patch should remain as-is.

Changed 6 years ago by i_i

Added simple test

comment:4 in reply to: ↑ 3 Changed 6 years ago by i_i

  • Needs tests unset

Replying to mrts:

Oh, well... your [] was of course correct. So the wrapper should be:

fromlist keyword argument is Python 2.5 only

It should be:

django/core/utils/import_wrapper.py:

def import_module(module_path, module_name=None):
    if not module_name:
        return __import__(module_path, {}, {}, [module_path.split('.')[-1]])
    return __import__(module_path, {}, {}, [module_name])

Perhaps using a wrapper is overdoing it and your patch should remain as-is.

I'm -0 for using the wrapper.

comment:5 follow-ups: Changed 6 years ago by mtredinnick

  • milestone changed from 1.0 to 1.0 maybe
  • Triage Stage changed from Unreviewed to Accepted

Note that this isn't actually a bug, since modules should be able to handle being imported multiple times in any case. That happens quite a bit in Django (since __import__ is equivalent to Python's reload()). Thus this is probably worth doing at some point, but it's not a 1.0 showstopper.

comment:6 Changed 6 years ago by i_i

I should add that without this patch we can see really scary things if we print

[k for k in sys.modules if k.endswith('.') or '..' in k]

inside any Django app.

comment:7 in reply to: ↑ 5 Changed 6 years ago by i_i

Replying to mtredinnick:

Note that this isn't actually a bug, since modules should be able to handle being imported multiple times in any case. That happens quite a bit in Django (since __import__ is equivalent to Python's reload()). Thus this is probably worth doing at some point, but it's not a 1.0 showstopper.

Multiple imports isn't a bug, but extra keys with dots in sys.modules is a bug.

comment:8 Changed 6 years ago by i_i

  • Summary changed from `__import__(mod, {}, {}, [''])` causes double initialization of modules to `__import__(mod, {}, {}, [''])` causes double import of modules ('module' and 'module' + '.')

comment:9 follow-up: Changed 6 years ago by jacob

  • milestone changed from 1.0 maybe to post-1.0
  • Patch needs improvement set

I agree with Malcolm -- this is a bit ugly, but it's not a bug. Moving post-1.0.

Also, the patch has some, um, bugs: think about what happens if settings.SESSION_ENGINE doesn't contain a ".", for example.

comment:10 in reply to: ↑ 9 Changed 6 years ago by i_i

Replying to jacob:

Also, the patch has some, um, bugs: think about what happens if settings.SESSION_ENGINE doesn't contain a ".", for example.

No bugs here.

>>> 'string_without_dots'.split('.')
['string_without_dots']
>>> 'string_without_dots'.split('.')[-1]
'string_without_dots'

Of course I need to update the patch to the current trunk, I will do it as soon as I have some free time.

comment:11 Changed 6 years ago by i_i

__import__(mod, {}, {}, [mod.split('.')[-1]]) works perfectly in any situation:

>>> for mod in ('django', 'django.utils', 'django.utils.encoding'):
...     print __import__(mod, {}, {}, [mod.split('.')[-1]])
... 
<module 'django' from '/usr/lib/python2.5/site-packages/django/__init__.pyc'>
<module 'django.utils' from '/usr/lib/python2.5/site-packages/django/utils/__init__.pyc'>
<module 'django.utils.encoding' from '/usr/lib/python2.5/site-packages/django/utils/encoding.pyc'>

Just like __import__(mod, {}, {}, ['']) only without ugly side-effects.

The change is *very* simple and working/tested, I don't understand the reasons to wait.

comment:12 in reply to: ↑ 5 Changed 6 years ago by mrmachine

Replying to mtredinnick:

Note that this isn't actually a bug, since modules should be able to handle being imported multiple times in any case. That happens quite a bit in Django (since __import__ is equivalent to Python's reload()). Thus this is probably worth doing at some point, but it's not a 1.0 showstopper.

It seems that any module that registers ModelAdmin objects cannot handle being imported multiple times (raises AlreadyRegistered. This seems to happen right now if you've bundled your admin-specific forms, views, and urls into an admin module instead of an admin.py, it will be imported once by admin.autodiscover and again by include('myproject.admin.urls').

comment:13 Changed 6 years ago by hvendelbo

Ok, perhaps it isn't a show stopper.
This is causing problems for me for the third time though. I am adding a bit of settings defaults, and my defaults seem to be set 4 times. Even though I only set them if the attribute doesn't already exist. Sounds to me that django.conf.settings is being loaded 4 times. And then the "slight" problem that admin can't load. If I disable my code things appear fine.
Now there might be a bug in my code, but I still cant find it among my massive 6 lines of code after looking over it for 30 mins.

comment:14 Changed 5 years ago by mrts

As the patch is not up to date, I generated a new one as follows:

grep rule for reviewing all invalid imports:

$ grep -Er "__import__.*\['']" . | grep -v '\.svn'

Small shell script for fixing most of them:

FILES_TO_FIX=`grep -lEr "__import__.*\['']" . | grep -v '\.svn'`
sed -i "s/\(__import__(\)\([[:alnum:]._]\+\), {\(.*\)\[''])/\1\2, {\3[\2.split('.')[-1]])/" $FILES_TO_FIX
echo "Fix the following manually:"
grep -Er "__import__.*\['']" . | grep -v '\.svn'

Note that once Python 2.3 support is discontinued, foo.rsplit('.', 1)[-1] should be used (rsplit was added in 2.4).

Changed 5 years ago by mrts

update patch to r9333

comment:15 Changed 5 years ago by mrts

hvendelbo: could you please test if this fixes your problem (I assume it does)?

comment:16 Changed 5 years ago by anonymous

  • Patch needs improvement unset

Changed 5 years ago by mrts

missing init.py in tests/regressiontests/dynamic_imports/

comment:17 Changed 5 years ago by mrts

Whoo, good news on the mailing list -- Python 2.3 support will be dropped for 1.1. I'll upload a separate patch for trunk with rsplit('.', 1)[-1] ASAP.

comment:18 Changed 5 years ago by mrts

See also #6579

comment:19 Changed 5 years ago by mrts

Seems 2.3 support might not be dropped, so won't update the patch until final agreement is met.

comment:20 Changed 5 years ago by mtredinnick

  • Owner changed from nobody to mtredinnick
  • Patch needs improvement set
  • Status changed from new to assigned

Since all of these patches replace one form of undocumented (by working consistently for years until it changed) behaviour with a different sort, we shouldn't do that. The proposed patched rely on the fact that importing a missing name from a module doesn't raise and error and does import the module -- this is what Jacob was identifying when he pointed out it was doing the equivalent of "from settings import settings", for example. Rather, we should use the stable, documented way.

I've had a local branch in progress for a while to fix this up and it has revealed problems in some of the assumptions we are making about imports, so they have to be fixed, too. That's why I haven't committed anything here, yet, in case anybody was wondering.

comment:21 Changed 5 years ago by mrts

Malcolm, I filed this upstream as well with a patch, in http://bugs.python.org/issue4438 . Perhaps you care to take a look and comment (either here or there)?

comment:22 Changed 5 years ago by mrts

Hrvoje Niksic provides by far the cleanest solution in http://mail.python.org/pipermail/python-dev/2008-November/083728.html

comment:23 Changed 5 years ago by mrts

In short:

  • as __import__ is not meant to be used directly, there will be a much simpler API for runtime loading of modules in imp in Python 3.1 (and perhaps 2.7)
  • until that the following idiom should be used:
    import sys
    __import__(modname)
    mod = sys.modules[modname]
    

comment:25 Changed 5 years ago by mrts

And also available in the trunk version of the docs: http://docs.python.org/dev/library/functions.html#__import

Changed 5 years ago by mrts

patch with the sys.modules[modname] idiom, against trunk

Changed 5 years ago by mrts

update the test as well

comment:26 Changed 5 years ago by jacob

#10144 is another dup.

comment:27 Changed 5 years ago by mrts

as is #10095

comment:28 Changed 5 years ago by mrts

Replying to Malcolm's second comment on #10095:

Note, also, that we don't guarantee things will only be imported once. Python's __import__ statement is the same as reload(), it reimports the module, even if it already exists. It's up to code authors to make any allowances for double execution of code lines.

Malcolm, are you sure about that? The claim is neither justified by tests:

$ cat foo.py
print "foo imported."

$ python
Python 2.5.2 (r252:60911, Oct  5 2008, 19:29:17) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> __import__('foo')
foo imported.
<module 'foo' from 'foo.py'>
>>> __import__('foo')
<module 'foo' from 'foo.py'>

nor by docs:

This function is invoked by the import statement. It can be replaced (by importing the builtins module and assigning to builtins.__import__) in order to change semantics of the import statement, but nowadays it is usually simpler to use import hooks (see PEP 302). -- (i.e. if __import__ would import modules more than once, then so would plain import which clearly is not the case).

So, unless you can prove me wrong, the ever-recurring problems of initialization code (signals, whatnot) running more than once will be gone after this is fixed.

comment:29 Changed 5 years ago by gwilson

An importlib module and import_module function (API not yet final though) were recently checked into Python trunk:

http://svn.python.org/view/python/trunk/Lib/importlib.py?view=markup

comment:30 Changed 5 years ago by gonz

  • Cc gonz added

comment:31 Changed 5 years ago by anonymous

  • milestone post-1.0 deleted

Milestone post-1.0 deleted

comment:32 Changed 5 years ago by mtredinnick

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [9947]) Removed some import-time dependencies on Django's settings.

Now you can import the file storage stuff and still call settings.configure()
afterwards. There is still one import-time usage of settings in
django.contrib.comments, but that's unavoidable.

Backport of r9945 and r9946 from trunk (this is needed in order to fix #8193 in
a clean fashion, which is why it's being backported).

comment:33 Changed 5 years ago by Alex

  • Resolution fixed deleted
  • Status changed from closed to reopened

This wasn't actually fixed, it just had the text fix in it :/

comment:34 Changed 5 years ago by brettcannon

  • Owner changed from mtredinnick to brettcannon
  • Status changed from reopened to new

I am working on a patch to port importlib to Django and switch over all uses of import to django.utils.importlib.import_module(). Beyond ridding Django of the need to call import directly it will also fix the original issue that led to this bug report. =)

Changed 5 years ago by brettcannon

Add importlib to django.utils and move all import usage over to it.

comment:35 Changed 5 years ago by brettcannon

I have attached the patch that adds django.utils.importlib and moves all uses of import over to import_module(). Patch is named importlib.diff.

comment:36 Changed 5 years ago by brettcannon

  • Owner changed from brettcannon to jacob
  • Patch needs improvement unset

Assigning to jacob as he said he would review the patch when I was finished with it.

comment:37 Changed 5 years ago by Alex

  • milestone set to 1.1

comment:38 follow-up: Changed 5 years ago by jacob

  • Resolution set to fixed
  • Status changed from new to closed

(In [10088]) Fixed #8193: all dynamic imports in Django are now done correctly. I know this because Brett Cannon borrowed the time machine and brought Python 2.7's 'importlib back for inclusion in Django. Thanks for the patch-from-the-future, Brett!

comment:39 in reply to: ↑ 38 Changed 5 years ago by akaihola

Replying to jacob:

(In [10088]) Fixed #8193: all dynamic imports in Django are now done correctly. I know this because Brett Cannon borrowed the time machine and brought Python 2.7's 'importlib back for inclusion in Django. Thanks for the patch-from-the-future, Brett!

See #11264 for a small problem this changeset introduced.

comment:40 Changed 3 years ago by jacob

  • milestone 1.1 deleted

Milestone 1.1 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.