Context Navigation

← Previous Ticket
Next Ticket →

#36083 assigned Bug

LiveServerTestCase fails in parallel test runner if django.contrib.auth.backends has not yet been imported

Reported by:	Adam Zapletal	Owned by:	Adam Zapletal
Component:	Testing framework	Version:	dev
Severity:	Normal	Keywords:	TransactionTestCase setupclass available_apps
Cc:	Adam Zapletal, Adam Johnson	Triage Stage:	Accepted
Has patch:	yes	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description (last modified by Adam Zapletal)

The full test suite run with multiple threads passes for me, but when I run the tests.file_storage.tests file in isolation (./runtests.py file_storage.tests), I get a failure. It passes in isolation if I run the tests with only one thread (./runtests.py --parallel 1 file_storage.tests). This happened to others when I asked about it in the Django Discord server, but not everyone. For what it's worth, I'm on the latest macOS running Python v3.12.

When a failure happens, I get the following error:

django.core.exceptions.ImproperlyConfigured: AUTH_USER_MODEL refers to model 'auth.User' that has not been installed

I have narrowed the issue down to the fact that this test file contains a class that inherits from LiveServerTestCase. It seems like any test file that mixes LiveServerTestCase and something more normal like unittest.TestCase will fail for me when run in isolation with more than one thread.

Here's a minimal example if anyone is interested:

from unittest import TestCase

from django.test import LiveServerTestCase


class TestCaseTests(TestCase):
    def test(self):
        self.assertEqual(2, 2)


class LiveServerTestCaseTests(LiveServerTestCase):
    available_apps = []

    def test(self):
        self.assertEqual(2, 2)

I have a fix, but I'm not sure if it's a good one. Setting available_apps = ['django.contrib.auth'] on the LiveServerTestCase-based class fixes it as one would expect from the error message, but I wonder if this is hiding something like a race condition. I wonder why it passes the full test suite when run with multiple threads, but it can't pass in isolation when run with multiple threads.

Should I open a pull request to fix the test in the way I mentioned above, or is something deeper going on here? It could be that I'm misunderstanding something about how LiveServerTestCase is isolated during test runs.

Change History (20)

comment:1 by Jacob Walls, 6 months ago

Keywords:	TransactionTestCase setupclass available_apps added
Severity:	Normal → Release blocker
Summary:	tests.file_storage.tests can fail when run in isolation → TransactionTestCase available_apps not set correctly for parallel test runner
Triage Stage:	Unreviewed → Accepted

Thanks, I remember replying on Discord that I could reproduce this.

Bisected to a060a22ee2dde7aa29a5a29120087c4864887325. Hey, that was me!

Looks like some debugging is needed around why apps.set_available_apps(cls.available_apps) is not being called at a point that works for the parallel test runner.

Would you like to look into this a little bit? I'm marking as a release blocker, and since the alpha is scheduled to go out this week, I can make some time to investigate as well. Having a couple people look at it will be helpful.

RE: your original proposal, I think we have another option, which would be to do this, but I think you're right this is just masking the deeper issue:

tests/file_storage/tests.py

diff --git a/tests/file_storage/tests.py b/tests/file_storage/tests.py
index 5f0024b81a..86429ec3a8 100644

                from django.test import (
     SimpleTestCase,
     TestCase,
     ignore_warnings,
+    modify_settings,
     override_settings,
+)
 from django.test.utils import requires_tz_support
-…
+               class ContentFileStorageTestCase(unittest.TestCase):
 @override_settings(ROOT_URLCONF="file_storage.urls")
+@modify_settings(MIDDLEWARE={"remove": "django.contrib.auth.middleware.AuthenticationMiddleware"})
 class FileLikeObjectTestCase(LiveServerTestCase):
     """
     Test file-like objects (#15644).

comment:2 by Jacob Walls, 6 months ago

available_apps is a private API, so we may have some discretion about whether to block the alpha over it, but I'm hoping a fix develops to make deciding that moot.

comment:3 by Jacob Walls, 6 months ago

Summary:	TransactionTestCase available_apps not set correctly for parallel test runner → LiveServerTestCase fails in parallel test runner if django.contrib.auth.backends has not yet been imported

This was a fun one. Here's what I found:

The commit I blamed exposed this by lifting the first call to set_available_apps() up into TransactionTestCase.setUpClass() (via _pre_setup()), and so the order of operations in LiveServerTestCase became different.
The reported issue goes away by moving super().setUpClass() to the bottom of LiveServerTestCase.setUpClass().

Why?

The check_middleware system check registered by django.contrib.auth has the side effect of importing django.contrib.auth.backends
The parallel test runner does not run the system checks per-process
When a parallel test runner worker tries to setUpClass() a LiveServerTestCase, it may be importing django.contrib.auth.backends for the first time and thus running get_user_model()
get_user_model() will fail (as designed) if django.contrib.auth is not in available_apps, which is now the case based on the change in the order of operations

I tested two solutions, both of which work:

Running system checks inside each worker, to ensure the side effect described above is executed in each process:

django/test/runner.py

diff --git a/django/test/runner.py b/django/test/runner.py
index b83cd37343..360bba4fd7 100644

                def _init_worker(
     process_setup_args=None,
     debug_mode=None,
     used_aliases=None,
+    verbosity=1,
 ):
     """
     Switch to databases dedicated to this worker.
-…
+               def _init_worker(
             process_setup(*process_setup_args)
         django.setup()
         setup_test_environment(debug=debug_mode)
+        call_command("check", verbosity=verbosity, databases=used_aliases)
     db_aliases = used_aliases if used_aliases is not None else connections
     for alias in db_aliases:
-…
+               class ParallelTestSuite(unittest.TestSuite):
     runner_class = RemoteTestRunner
     def __init__(
+        self, subsuites, processes, failfast=False, debug_mode=False, buffer=False
+        self,
+        subsuites,
+        processes,
+        failfast=False,
+        debug_mode=False,
+        buffer=False,
+        verbosity=1,
     ):
         self.subsuites = subsuites
         self.processes = processes
-…
+               class ParallelTestSuite(unittest.TestSuite):
         self.initial_settings = None
         self.serialized_contents = None
         self.used_aliases = None
+        self.verbosity = verbosity
         super().__init__()
     def run(self, result):
-…
+               class ParallelTestSuite(unittest.TestSuite):
                 self.process_setup_args,
                 self.debug_mode,
                 self.used_aliases,
+                self.verbosity,
             ],
+        )
         args = [
-…
+               class DiscoverRunner:
                     self.failfast,
                     self.debug_mode,
                     self.buffer,
+                    self.verbosity,
+                )
         return suite

Adjusting the order of operations in LiveServerTestCase.setUpClass() to act as before:

django/test/testcases.py

diff --git a/django/test/testcases.py b/django/test/testcases.py
index 36366bd777..d641bb3c6b 100644

                class LiveServerTestCase(TransactionTestCase):
     @classmethod
     def setUpClass(cls):
-        super().setUpClass()
         cls.enterClassContext(
             modify_settings(ALLOWED_HOSTS={"append": cls.allowed_host})
+        )
         cls._start_server_thread()
+        super().setUpClass()
     @classmethod
     def _start_server_thread(cls):

Then of course there is:

Fiddle with available_apps and/or MIDDLEWARE so that LiveServerTestCase subclasses always either have contrib.auth in available_apps or else no middleware that relies on it.

Option 1 feels like a workaround for the fact that a certain system check has a side effect. It's a more invasive diff, and it adds performance drag.
Option 2 feels like the simplest thing to do for now to preserve prior behavior, although it's a workaround for the fact that a system check has a side effect
Option 3 feels more semantically correct but would need some thought around the design to make it ergonomic (fiddling with this in every subclass doesn't seem right)

Adam, what do you think? Would you like to prepare a patch with a regression test?

If folks cannot reproduce the issue, I would venture a guess that it's due to the difference between "spawn" and "fork" defaults. I tested on MacOS where the default is spawn.

Last edited 6 months ago by Jacob Walls (previous) (diff)

comment:4 by Adam Zapletal, 6 months ago

Great job researching this, and thanks for the quick response!

For context, I found this issue while working on #10244, and this is not my area of expertise in the codebase. However, here are my initial thoughts on the options you presented:

Yes, the diff is a bit unfortunate, but running system checks in each worker seems more correct to me if I'm understanding the design correctly. It seems like we'd want to run them in every context or none.
This seems like a simple band-aid fix that would work for now if this ticket has a deadline. I'm not sure of all the implications here.
I agree that this seems correct in a way similar to the first option. We don't want to evade system checks on accident.

I'm a bit thrown off by the fact that the system check in question, via ‎_subclass_index, has the side effect of importing a module. I'm wondering the following:

Are there other system checks with side effects?
Is there a way to implement this system check without the side effect?

I'm not familiar enough with the system check framework to know what's reasonable here, though I'd be interested to know.

This may be irrelevant, but if I grep the test suite for uses of LiveServerTestCase, I find the following:

Tests testing aspects LiveServerTestCase itself (and StaticLiveServerTestCase)
A test class for the startproject command that sets available_apps correctly
The file storage test class that this ticket is about, which sets available_apps to an empty list (for context, available_apps was set here.)

So we can say that the test class in question is the only one in the whole test suite that uses LiveServerTestCase but doesn't include django.contrib.auth in available_apps. I'm not sure what that implies, but this test is an outlier in multiple ways. Should the test itself be questioned?

comment:5 by Jacob Walls, 6 months ago

Thanks Adam.

Is there a way to implement this system check without the side effect?

To find out whether a middleware class declared by a string is a subclass of something else, I think we have to import the string.

I may have skewed things by using the phrase "side effect": I think it's good to admit flexibility in the order by which modules are imported. As the long as the module-level caches are designed well, then in practice it's not really a problem. It does raise test isolation concerns.

The test framework goes to great effort to reset the database layer between tests, but the python layer is left up to the user. It might be worthy to try out adding an --isolate-imports flag to the test runner in a third-party package that would reset sys.modules between test case classes. (I don't know if there's a test framework out there that already implements such a thing.)

Here is where I would cc Adam Johnson if I had the right permission bit.

Should the test itself be questioned?

Thanks for the grep -- it's true Django itself only has one test case class in this situation, but a user could be doing this. I think there would have been a similar problem with SeleniumTestCase if it were supported by the parallel test runner. (It doesn't today but might in the future.)

This seems like a simple band-aid fix that would work for now if this ticket has a deadline.

I tenatively marked this as a release blocker since it smelled like a regression in Django 5.2: your test case class breaks if it depends on django.contrib.auth via LiveServerTestCase (or SeleniumTestCase if they have a way to run it in parallel or skip checks) but sets available_apps = []. The wrinkle being that available_apps is a private API we can change at will. I'll let the fellows speak to urgency.

Option 3 feels more semantically correct but would need some thought around the design to make it ergonomic (fiddling with this in every subclass doesn't seem right)

We could also just doc that LiveServerTestCase subclasses should be careful to to include auth in available_apps, given we've already documented this as an advanced/private feature. Or add fanciness to to LiveServerTestCase._pre_setup() to sniff for this misconfiguration, but the complexity tradeoff looks bad if we need to be resilient against subclasses of the auth middleware (hence the whole reason for _subclass_index in the system check in the first place).

This seems like a simple band-aid fix that would work for now if this ticket has a deadline. I'm not sure of all the implications here.

Re: shuffling the cheese around in LiveServerTestCase.setUpClass(), I guess it has the benefit of clarifying that available_apps is really only about signals and the database layer, and we could say that it's not a way to adjust the app registry for other purposes like skipping imports.

I'm fine with any of the three options. Would like to hear from others.

comment:6 by Adam Zapletal, 6 months ago

I appreciate your insight here. Please know that I wasn't questioning your approach...I'm just trying to understand the problem!

One point of clarification on this comment:

Thanks for the grep -- it's true Django itself only has one test case class in this situation, but a user could be doing this. I think there would have been a similar problem with SeleniumTestCase if it were supported by the parallel test runner. (It doesn't today but might in the future.)

I was assuming that in a real project (not Django itself), the user wouldn't have this problem. They'd be opting into setting available_apps on their test class, so we can expect them to do it correctly. The attribute is only required in Django's test suite.

When I remove the linked check from runtests.py or set available_apps to None, the test in question passes.

I'd like to hear from others as well. Do you want to ping people on this ticket, or should we open a forum thread?

comment:7 by Adam Zapletal, 6 months ago

Description:	modified (diff)

comment:8 by Jacob Walls, 6 months ago

No worries Adam, I didn't think you were questioning me at all. I appreciate you thinking it through with me! We'll hear from others soon, as it's still the weekend in my timezone.

They'd be opting into setting available_apps on their test class, so we can expect them to do it correctly.

I agree with that. I guess I'm taking the opportunity to wonder about what is correct: if LiveServerTestCase is what creates the dependency on contrib.auth purely by instantiating WSGIHandler(), should we expect that to leak to the subclasses, meaning they have to either not use the available_apps attribute or discover for themselves this tricky edge case and guard against it for the first time in 5.2? Again, not necessarily a regression/release blocker given that it's a private API, just polling for views since we at least have the option of preventing an unintentional behavior change here.

comment:9 by Sarah Boyce, 6 months ago

Cc:	Adam Johnson added

comment:10 by Sarah Boyce, 6 months ago

I think in the interest of time, we should do option 2. We can create a followup cleanup/optimization ticket for option 3

comment:11 by Jacob Walls, 6 months ago

Owner:	set to Jacob Walls
Status:	new → assigned

Sounds great, I can put together a PR for this this morning.

comment:12 by Jacob Walls, 6 months ago

No worries if someone else beats me to it

comment:13 by Sarah Boyce, 6 months ago

One thought I had was whether we want to call checks when the installed apps settings change, something like:

django/test/signals.py

                from asgiref.local import Local
 from django.apps import apps
 from django.core.exceptions import ImproperlyConfigured
+from django.core.management import call_command
 from django.core.signals import setting_changed
 from django.db import connections, router
 from django.db.utils import ConnectionRouter
-…
+               def update_installed_apps(*, setting, **kwargs):
         from django.utils.translation import trans_real
         trans_real._translations = {}
+        call_command("check")

For the specific test failure, we should probably set the middleware to be []
This would have some performance impact though 🤔

comment:14 by Sarah Boyce, 6 months ago

I can put together a PR for this this morning.

Thank you so much

comment:15 by Jacob Walls, 6 months ago

Unfortunately there is a new test failure with option 2, so it's probably less safe than doing nothing:

======================================================================
ERROR: test_media_files (servers.tests.LiveServerViews.test_media_files)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/user/django/tests/servers/tests.py", line 310, in test_media_files
    with self.urlopen("/media/example_media_file.txt") as f:
         ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/django/tests/servers/tests.py", line 44, in urlopen
    return urlopen(self.live_server_url + url)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/urllib/request.py", line 189, in urlopen
    return opener.open(url, data, timeout)
           ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/urllib/request.py", line 495, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/urllib/request.py", line 604, in http_response
    response = self.parent.error(
        'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/urllib/request.py", line 533, in error
    return self._call_chain(*args)
           ~~~~~~~~~~~~~~~~^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/urllib/request.py", line 466, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/urllib/request.py", line 613, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

----------------------------------------------------------------------
Ran 30 tests in 5.779s

For reference here is the test I was intending to add:

class LiveServerAvailableAppsTests(LiveServerTestCase):
    available_apps = []

    @classmethod
    def setUpClass(cls):
        # Simulate the system checks having not run their side effect of
        # importing these modules via check_middleware().
        auth_backends = sys.modules.pop("django.contrib.auth.backends", None)
        cls.addClassCleanup(sys.modules.__setitem__, "django.contrib.auth.backends", auth_backends)
        auth_middleware = sys.modules.pop("django.contrib.auth.middleware", None)
        cls.addClassCleanup(sys.modules.__setitem__, "django.contrib.auth.middleware", auth_middleware)

        super().setUpClass()

    def test_server_starts_when_system_checks_have_not_run(self):
        # Mostly to exercise setUpClass() but also to verify test setup.
        self.assertEqual(self.available_apps, [])

Option 3 (Adam's original patch, or the variants Sarah and I mentioned around popping middleware) may be the most practical thing at this point (and doesn't need to be a blocker, since it's just changing our existing test.)

Thoughts?

Last edited 6 months ago by Jacob Walls (previous) (diff)

comment:16 by Adam Zapletal, 6 months ago

Let me know how I can help. I'm happy to open a PR with the fix from my original post if that's the direction we want to go in.

comment:17 by Jacob Walls, 6 months ago

Owner:	changed from Jacob Walls to Adam Zapletal
Severity:	Release blocker → Normal

Thanks, Adam. I can pass the baton to you.

Now that we know option 2 isn't very safe I don't think this is urgent anymore. Sarah, let me know if you disagree.

I'm starting to like Option 1 again (run system checks per process).

comment:18 by Sarah Boyce, 6 months ago

I feel like the commit exposed an existing issue which we hadn't noticed before, that we can define available_apps to remove installed apps, which can make the middleware invalid
So to me, it doesn't feel like a release blocker.
We can still backport to 5.2 as we can backport bug fixes prior to the beta release. If a user encounters a problem with 5.2, we can treat this as a release blocker

comment:19 by Adam Zapletal, 5 months ago

I opened a draft PR and pinged Jacob to get some guidance on how to test this.

comment:20 by Adam Zapletal, 5 months ago

Has patch:	set

After Jacob and I interacted on the draft PR, I squashed my commits on the branch, updated the commit message, and made the PR ready for review.

Note: See TracTickets for help on using tickets.

Download in other formats:

Issues