Opened 5 months ago

Last modified 30 hours ago

#36864 assigned New feature

Add support for modules to import_string()

Reported by: Jacob Walls Owned by: Leland Boeman
Component: Core (Management commands) Version: 5.2
Severity: Normal Keywords: import_string, submodule
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Adjust the test for customizing automatic shell imports to include a submodule of a module not normally imported, like django.views.decorators:

diff --git a/tests/shell/tests.py b/tests/shell/tests.py
index 78e61e6a9d..95041e455e 100644
--- a/tests/shell/tests.py
+++ b/tests/shell/tests.py
@@ -366,7 +366,7 @@ class ShellCommandAutoImportsTestCase(SimpleTestCase):
     def test_message_with_stdout_one_object(self):
         class TestCommand(shell.Command):
             def get_auto_imports(self):
-                return ["django.db.connection"]
+                return ["django.views.decorators.gzip"]
 
         with captured_stdout() as stdout:
             TestCommand().get_namespace(verbosity=2)
@@ -376,7 +376,7 @@ class ShellCommandAutoImportsTestCase(SimpleTestCase):
             1: "1 object imported automatically (use -v 2 for details).",
             2: (
                 "1 object imported automatically:\n\n"
-                "  from django.db import connection"
+                "  from django.views.decorators import gzip"
             ),
         }
         for verbosity, expected in cases.items():

This fails for the reason developed in #35402 that import_string() has trouble with modules: top-level modules always fail, and submodules sometimes succeed by happenstance. Here, django.db.connection only works as an automatic import because we depend on coincidentally importing django.db when running management commands (by importing django.core.checks). This seems fragile.

AssertionError: '1 object could not be automatically imported:\n\n [62 chars]lly.' != '1 object imported automatically:\n\n  from django.[24 chars]gzip'
- 1 object could not be automatically imported:
+ 1 object imported automatically:
  
-   django.views.decorators.gzip
?                          ^
+   from django.views.decorators import gzip
?  +++++                        ^^^^^^^^
- 
- 0 objects imported automatically.

If you're interested to see why:

>>> import django
>>> getattr(django, "db")  # fails
AttributeError ...
>>> import django.db  # coincidental import somewhere
>>> getattr(django, "db")  # succeeds
<module 'django.db' ...

import_string() is documented to "return the attribute/class designated by the last name in the path". I guess a submodule is an "attribute" of a parent module, so it's suggested that this works, but looking at existing usage I found that often the assumption is that only classes and callables succeed. (I added tests that fail if this assumption is changed in b99c608ea10cabc97a6b251cdb6e81ef2a83bdcf.)

I'm suggesting we make import_string() fail more consistently when given a submodule, to surface these sneaky problems (after a deprecation).


Or, we could make import_string() handle either submodules or top-level modules as well.

Tim Graham observed that werkzeug's implementation of import_string() handles both modules and submodules:

        except ImportError:
            # support importing modules not yet set up by the parent module
            # (or package for that matter)
            module = import_string(module_name)

But that's not the history we have for this util. At the time, I thought it would be a bigger lift to audit all the call sites of import_string() and make sure they're OK with modules & submodules now succeeding. We also have several places coping with the ambiguity of modules and classes correctly:

        # If import_module succeeds, entry points to the app module.
        try:
            app_module = import_module(entry)
...
        # If import_string succeeds, entry is an app config class.
        if app_config_class is None:
            try:
                app_config_class = import_string(entry)
...
        # If both import_module and import_string failed, it means that entry
        # doesn't have a valid value.

...but I'm alright with that assumption being questioned. I just think we're on the hook here for one behavior change or the other, since now it affects automatic shell imports.

PR incoming for the "reliably raise" option to aid the triage.

Attachments (1)

image-20260529-084659.png (57.2 KB ) - added by Leland Boeman 5 days ago.

Download all attachments as: .zip

Change History (18)

comment:1 by Jacob Walls, 5 months ago

Has patch: set
Patch needs improvement: set

PR

Needs a test for the deprecation warning, but waiting for acceptance before polishing that.

comment:2 by Clifford Gama, 5 months ago

Triage Stage: UnreviewedAccepted

Thanks for the report!

comment:3 by Clifford Gama, 5 months ago

Or, we could make import_string() handle either submodules or top-level modules as well.

I'd be +1 to this because it makes the interface simpler. Pointing out that this doesn't work when the string points to a module instead of a callable seems like information leakage to me for a function whose name doesn't seem to have any type preferences for what the import string is pointing to.

Version 0, edited 5 months ago by Clifford Gama (next)

comment:4 by Tim Graham, 5 months ago

Cc: Tim Graham removed

comment:5 by Tim Schilling, 3 months ago

Looking at werkzeug's implementation, it hasn't been touched in the last 6 years. Based on that, I'd say the risk of follow-up changes to extend import_string is low. While someone could be making use of import_string failing to import modules, it seems a bit unlikely especially considering it sometimes work and Jacob didn't mention tickets about people complaining about that inconsistency.

The failure mode I could see from making this more flexible is import_string fails to raise an exception and instead returns a module instead of the expected attribute/class. That seems like something that would quickly crop up in CI and is unlikely to unexpectedly make its way into a production web app.

For me, my preference would be to extend import_string to work with modules since we have a stable reference point to base it on and the downsides are both minimal and seem unlikely.

comment:6 by Jacob Walls, 3 months ago

Has patch: unset
Patch needs improvement: unset

comment:7 by Natalia Bidart, 3 months ago

+1 to extending import_string() to handle modules reliably. Falling back to importlib.import_module() when getattr() fails feels like the correct fix. It makes import_string() consistent regardless of prior import state, and as Tim said werkzeug's long-standing implementation of the same pattern demonstrates it is stable in practice.

comment:8 by Leland Boeman, 2 weeks ago

Owner: changed from Jacob Walls to Leland Boeman

Hello! I'm here at PyCon sprints, and Jacob suggested this issue might benefit from new set of eyes, so I thought I'd take a look.

comment:9 by Leland Boeman, 2 weeks ago

Has patch: set

comment:10 by Leland Boeman, 11 days ago

I realized I missed that in #35402 https://github.com/django/django/blob/main/django/db/backends/base/creation.py#L378-L387 an except block was added to handle this bug.

            except ImportError:
                # import_string() can raise ImportError if a submodule's parent
                # module hasn't already been imported during test discovery.
                # This can happen in at least two cases:
                # 1. When running a subset of tests in a module, the test
                #    runner won't import tests in that module's other
                #    submodules.
                # 2. When the parallel test runner spawns workers with an empty
                #    import cache.
                test_to_mark = import_string(test_name)
                test_frame = sys.modules.get(test_to_mark.__module__)

It looks like adopting this functionality would not break that code because any modules that don't exist would just result in a second ImportError being raised on the subsequent calls to import_string/sys.modules.get . But given that the behavior noted in the comments shouldn't occur if support for modules is added, should this except block be removed as to not cause confusion?

comment:11 by Jacob Walls, 6 days ago

Needs documentation: set
Patch needs improvement: set
Type: BugNew feature

Given the above consensus, we're solving a bug with automatic shell imports by enhancing import_string(). Left some minor comments on the PR.

comment:12 by Tim McCurrach, 5 days ago

I have a question about how import_string() should work when it is able to import both modules and the various attributes of modules. I think the ability to do that introduces some ambiguity - and would potentially be a more significant breaking change than someone relying on exceptions being raised when importing modules and the like.

Suppose we have:

├─ a.py
└─ b/
   ├─ __init__.py    # Contains a function called c
   └─ c.py

then:

from b import c # imports the function
import b.c # imports the module

And so it's not obvious what import_string("b.c") should do.

If my reading of the linked werkzueg solution above, and of the currently open PR are correct; they both first attempt to import a module, and only when that fails do they then look for an attribute to import instead. This is different to Natalia's suggestion of maintaining existing behaviour and dropping back to module import only when that fails - which would be less breaking.

EDIT: The docs also currently document that c = import_string("b.c") should be equivalent to from b import c

Last edited 5 days ago by Tim McCurrach (previous) (diff)

by Leland Boeman, 5 days ago

Attachment: image-20260529-084659.png added

comment:13 by Tim McCurrach, 5 days ago

I think the behaviour you have in the attached image will be because you put import b.c first, so the module c will already be cached against b. If you swap the order round you'll get the behaviour described above.

comment:14 by Leland Boeman, 5 days ago

I agree with your point, and I dislike the ambiguity. I think the answer is follow Natalia's suggestion more closely, which I think mimics the behavior of from b import c in the documentation. I did not understand the nuance in behavior between how the import statements resolve their targets.

Thanks for the feedback. I'll have some time tomorrow to dig into this more.

comment:15 by Leland Boeman, 4 days ago

I've poked around here a bit and it has been fun to get to know Python's import system a little better. Thanks for calling that out Tim. I hadn't considered that case before.

I have implemented the suggested behavior of falling back to load_module after getattr fails. This should follow the behavior of from b import c as suggested in the docs and I believe it is the best way forward and least breaking as has been previously discussed.

I was surprised by Python's import behavior around name collisions between objects defined by a module and it's submodules.

Given the example from above:

├─ a.py
└─ b/
   ├─ __init__.py    # Contains a function called c
   └─ c.py

What the user receives from the from b import c statement is determined by whether or not the module b.c has been previously loaded. This is also true for the existing implementation of import_string, with the exception that currently import_string raises when b.c has not been previously loaded.

If we adopt this behavior, I think it would be helpful to users to call out this case in the docs. It could also be argued that equivalence to from module import x noted in the docs is enough of an explanation.

I have some house keeping updates and some comments on the PR to address, but wanted to check if this all seems agreeable before proceeding.

in reply to:  15 comment:16 by Jacob Walls, 30 hours ago

Replying to Leland Boeman:

If we adopt this behavior, I think it would be helpful to users to call out this case in the docs. It could also be argued that equivalence to from module import x noted in the docs is enough of an explanation.

I think we should make the priority explicit in the docs. Nice catch, Tim!

I have some house keeping updates and some comments on the PR to address, but wanted to check if this all seems agreeable before proceeding.

+1 for going with the least breaking change when implementing this new feature.

comment:17 by Jacob Walls, 30 hours ago

Summary: Automatic shell imports cannot reliably handle submodulesAdd support for modules to import_string()
Note: See TracTickets for help on using tickets.
Back to Top