Opened 4 years ago

Last modified 6 weeks ago

#30950 new Bug

Replace __file__ with importlib.resources

Reported by: John Vandenberg Owned by: nobody
Component: Core (Other) Version: dev
Severity: Normal Keywords: freezers
Cc: William Schwartz, Ülgen Sarıkavak Triage Stage: Someday/Maybe
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

__file__ is an optional attribute of modules, and it either is added/modified by magic logic like Nuitka, PyInstaller, cx_Freeze, and each new freezer needs to add per-package hacks to get Django working, and need to be fixed whenever Django internals change, or isnt provided by freezers like PyOxidizer which refuse to play that cat & mouse game (c.f. https://github.com/indygreg/PyOxidizer/issues/69).

The solution is to use importlib.resources where-ever the code is accessing data files inside the Django package, or other modules.

e.g. under PyOxidizer:
`

File "django.test", line 3, in <module>
File "django.test.client", line 14, in <module>
File "django.core.handlers.base", line 8, in <module>
File "django.urls", line 1, in <module>
File "django.urls.base", line 8, in <module>
File "django.urls.exceptions", line 1, in <module>
File "django.http", line 5, in <module>
File "django.http.response", line 15, in <module>
File "django.core.serializers", line 23, in <module>
File "django.core.serializers.base", line 7, in <module>
File "django.db.models", line 3, in <module>
File "django.db.models.aggregates", line 5, in <module>
File "django.db.models.expressions", line 8, in <module>
File "django.db.models.fields", line 11, in <module>
File "django.forms", line 6, in <module>
File "django.forms.boundfield", line 4, in <module>
File "django.forms.widgets", line 26, in <module>
File "django.forms.renderers", line 16, in <module>

NameError: name 'file' is not defined
`

To make this feature contained, I would limit it to runtime code of Django, and exclude instances of __file__ in docs and templates deployed into new projects which need to be "manage.py-relative-paths", such as those in https://github.com/django/django/pull/12005

Change History (13)

comment:1 by Carlos_Mir_de_Souza, 4 years ago

Type: UncategorizedBug

comment:2 by Carlos_Mir_de_Souza, 4 years ago

I am changing the type to bug based on your comment, if you can or want to add more info, it would be useful.

comment:3 by Mariusz Felisiak, 4 years ago

Component: UncategorizedCore (Other)
Triage Stage: UnreviewedSomeday/Maybe
Type: BugNew feature
Version: 2.2master

It is not a bug, using __file__ doesn't cause any issues in internal Django purposes. Moreover Django supports Python 3.6. We can reconsider this ticket when Python 3.7 becomes the minimal Python supported by Django.

comment:4 by Carlos_Mir_de_Souza, 4 years ago

Sorry for the mistake, i understood this as an error in the code:

  File "django.test", line 3, in <module>
  File "django.test.client", line 14, in <module>
  File "django.core.handlers.base", line 8, in <module>
  File "django.urls", line 1, in <module>
  File "django.urls.base", line 8, in <module>
  File "django.urls.exceptions", line 1, in <module>
  File "django.http", line 5, in <module>
  File "django.http.response", line 15, in <module>
  File "django.core.serializers", line 23, in <module>
  File "django.core.serializers.base", line 7, in <module>
  File "django.db.models", line 3, in <module>
  File "django.db.models.aggregates", line 5, in <module>
  File "django.db.models.expressions", line 8, in <module>
  File "django.db.models.fields", line 11, in <module>
  File "django.forms", line 6, in <module>
  File "django.forms.boundfield", line 4, in <module>
  File "django.forms.widgets", line 26, in <module>
  File "django.forms.renderers", line 16, in <module>
NameError: name '__file__' is not defined
Last edited 2 months ago by Natalia Bidart (previous) (diff)

comment:5 by John Vandenberg, 4 years ago

Type: New featureBug

__file__ is an optional attribute.

This has nothing to do with Python 3.7. importlib.resources had a backport before it was included in Python core , documented in the help https://docs.python.org/3.7/library/importlib.html#module-importlib.resources.

comment:6 by William Schwartz, 4 years ago

Cc: William Schwartz added

Support for PyOxidizer is important to my use case.

In addition to the issue of package data, for which importlib.resources is a good solution, there's also the matter of management commands. The current algorithm uses the standard library's pkgutil.iter_modules, which only works with modules located by path-entry finders (cf. indygreg/PyOxidizer#337). A good alternative that would work for non-file code would be to use the import mechanism as follows. The following code would replace django.core.management.get_commands() and django.core.management.find_commands(). I made some effort to keep the code similar to the existing code. I dressed up some of the docstring for Sphinx in case a description of the algorithm would need to go into the docs.

import functools
from importlib import import_module
from types import ModuleType
from typing import Dict, List

from django.apps import apps
from django.conf import settings
from django.core.management.base import BaseCommand


def find_commands(management_pkg_name: str) -> List[str]:
    """Given the name of a management package, return a list of all the command
    names that are available.
    """
    commands_pkg_name = '.'.join([management_pkg_name, 'commands'])
    try:
        pkg = import_module(commands_pkg_name)
    except ModuleNotFoundError:
        continue
    return [name for name, obj in pkg.__dict__.items()
        if isinstance(obj, ModuleType)
        # Django's rules ignore modules whose names start with _ or is a package
        # A module has a __path__ attribute if and only if it's a package
        and not hasattr(obj, '__path__')
        and not obj.__name__.startswith('_')
        # Ignore incidental imports from other packages into __init__.py
        and obj.__package__ == commands_pkg_name
        # We only care if there's a BaseCommand subclass named Command
        and hasattr(obj, 'Command') 
        and issubclass(obj.Command, BaseCommand)
    ]


@functools.lru_cache(maxsize=None)
def get_commands_by_import() -> Dict[str, str]:
    r"""Find :doc:`django:ref/django-admin` commands via the import system.

    Look for a management.commands package in django.core, and in each
    installed application — if a commands package exists, register all
    commands in that package.

    Core commands are always included. If a settings module has been
    specified, also include user-defined commands.

    The dictionary is in the format {command_name: app_name}. Key-value
    pairs from this dictionary can then be used in calls to
    load_command_class(app_name, command_name)

    If a specific version of a command must be loaded (e.g., with the
    startapp command), the instantiated module can be placed in the
    dictionary in place of the application name.

    The dictionary is cached on the first call and reused on subsequent
    calls.

    :returns:
        A dictionary mapping :samp:`{command}` names to the :samp:`{app}` they
        come from. For each :samp:`{app}` in :setting:`INSTALLED_APPS`, we
        search for an :samp:`{app}.management.commands.{command}` module by
        trying to import :samp:`{app}.management.commands`, iterating through
        the module's :attr:`~object.__dict__`,  and testing its values. We
        include any :class:`types.ModuleType` :samp:`{command}` if its name does
        not start with an underscore (``_``), it's not a :term:`python:package`,
        and it has a :class:`django.core.management.BaseCommand` subclass called
        ``Command``.
    """
    commands = {name: 'django.core' for name in find_commands(__package__)}

    if not settings.configured:
        return commands

    for app_config in reversed(list(apps.get_app_configs())):
        pkg_name = '.'.join([app_config.name, 'management'])
        commands.update({name: app_config.name for name in find_commands(pkg_name)})

    return commands

The main backward incompatibility is that this would require apps' management.commands packages to import all command modules in __init__.py. If this is unacceptable, find_commands could fall back to the existing pkgutil-based search algorithm.

A minor backward incompatibility is that this search is more conservative than the existing algorithm. In particular, if a module doesn't contain a BaseCommand subclass called Command, no attempt is made to include it.

A minor performance consideration here is that we're importing more modules' code at command-line start. Maybe some use of importlib.util.LazyLoader and foregoing to the check for the presence of the Command class could mitigate this performance penalty.

Last edited 3 years ago by William Schwartz (previous) (diff)

comment:7 by William Schwartz, 3 years ago

Django can add support for environments supporting pkgutil.iter_modules() much more quickly than it can add support for more restrictive environments. So I am working on adding iter_modules() support to PyOxidizer: https://github.com/indygreg/PyOxidizer/pull/343

I'm not sure whether iter_modules() works in other environments such as Nuitka or cx_Freeze.

comment:8 by Mariusz Felisiak, 3 years ago

Keywords: freezers added

comment:9 by William Schwartz, 3 years ago

As I wrote here, if #32316 gets fixed in time for Django 3.2, I'll be able to experiment with actually embedding Django in production. Then I'll be able to write up at least an experience report, if not a complete DEP, for the mailing list. One of my goals for that process would be to take advantage of #32355's dropping support for Python < 3.8 in Django 4.0. That would allow Django to use importlib.resources directly.

comment:10 by Joachim Jablon, 9 months ago

I could be wrong but I believe that whatever we do to tackle this ticket, it's important ensure that __file__ isn't added back by new contributions in the future. Ideally this shouldn't be a burden on reviewers, there should be a linting rule, or a test or something that ensures that any contributions following the resolution of this ticket isn't adding this back.

comment:11 by Natalia Bidart, 2 months ago

Related to the latest comment: https://github.com/django/django/pull/17870

comment:12 by Mariusz Felisiak <felisiak.mariusz@…>, 2 months ago

In 6feaad9:

Refs #30950, Refs #35187 -- Added tests for byte-compiled Django to daily builds.

comment:13 by Ülgen Sarıkavak, 6 weeks ago

Cc: Ülgen Sarıkavak added
Note: See TracTickets for help on using tickets.
Back to Top