Context Navigation

← Previous Ticket
Next Ticket →

#31286 closed Bug (fixed)

Database specific fields checks should be databases aware.

Reported by:	Hongtao Ma	Owned by:	Hongtao Ma
Component:	Database layer (models, ORM)	Version:	dev
Severity:	Normal	Keywords:
Cc:	Simon Charette, Jacob Walls	Triage Stage:	Ready for checkin
Has patch:	yes	Needs documentation:	no
Needs tests:	yes	Patch needs improvement:	no
Easy pickings:	yes	UI/UX:	no

Description

In an attempt to trigger the check Error mentioned in Tickets: #31144, I found that the check for database backend doesn't check against the backend specified, e.g. --database mysql, rather, it always checks against the 'default' backend.

After diving into the source code, I found the following method defined in django/db/models/fields/init.py

    def _check_backend_specific_checks(self, **kwargs):
        app_label = self.model._meta.app_label
        for db in connections:
            if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name):
                return connections[db].validation.check_field(self, **kwargs)
        return []

It checks the first db defined in connections rather those provided by users.

A proposed change would be:

    def _check_backend_specific_checks(self, **kwargs):
        app_label = self.model._meta.app_label
        errors = []
        for db in kwargs['databases'] or ['default']:
            if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name):
                errors.extend(connections[db].validation.check_field(self, **kwargs))
        return errors

It worked as intended on my local machine.
I would happily provide a patch for this one.

Change History (12)

follow-up: 2 comment:1 by Mariusz Felisiak, 6 years ago

Cc:	Simon Charette added
Easy pickings:	set
Summary:	Database specific checks always goes to the 'default' backend → Database specific fields checks should be databases aware.
Triage Stage:	Unreviewed → Accepted

Thanks, this was missed in 0b83c8cc4db95812f1e15ca19d78614e94cf38dd. We should use here the same approach and pass databases to these checks:

diff --git a/django/db/models/fields/__init__.py b/django/db/models/fields/__init__.py
index 6bd95f542c..c54b8f6e31 100644
--- a/django/db/models/fields/__init__.py
+++ b/django/db/models/fields/__init__.py
@@ -335,11 +335,13 @@ class Field(RegisterLookupMixin):
         else:
             return []
 
-    def _check_backend_specific_checks(self, **kwargs):
+    def _check_backend_specific_checks(self, databases=None, **kwargs):
         app_label = self.model._meta.app_label
-        for db in connections:
-            if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name):
-                return connections[db].validation.check_field(self, **kwargs)
+        if databases is None:
+            return []
+        for alias in databases:
+            if router.allow_migrate(alias, app_label, model_name=self.model._meta.model_name):
+                return connections[alias].validation.check_field(self, **kwargs)
         return []
 
     def _check_validators(self):

in reply to: 1 comment:2 by Hongtao Ma, 6 years ago

Owner:	changed from nobody to Hongtao Ma
Status:	new → assigned

Applied the patch, two testcases failed, however:

FAIL: test_checks_called_on_the_default_database (check_framework.test_multi_db.TestMultiDBChecks)
FAIL: test_checks_called_on_the_other_database (check_framework.test_multi_db.TestMultiDBChecks)

        for alias in databases:
            if router.allow_migrate(alias, app_label, model_name=self.model._meta.model_name):
                return connections[alias].validation.check_field(self, **kwargs)
        return []

The early return will cause only a single backend be checked, in case of multiple backends provided:

(djangodev) ➜  my_test_proj git:(iss_check) ✗ python manage.py check --database sqlite3 --database mysql
System check identified no issues (0 silenced).
(djangodev) ➜  my_test_proj git:(iss_check) ✗ python manage.py check --database mysql        
SystemCheckError: System check identified some issues:

ERRORS:
issue.AnotherIndex.name: (mysql.E001) MySQL does not allow unique CharFields to have a max_length > 255.
issue.Networkservergroup.groupname: (mysql.E001) MySQL does not allow unique CharFields to have a max_length > 255.

System check identified 2 issues (0 silenced).
(djangodev) ➜  my_test_proj git:(iss_check) ✗

comment:3 by Mariusz Felisiak, 6 years ago

You have to pass databases={'default', 'other'} to model.check() calls in TestMultiDBChecks.

comment:4 by Hongtao Ma, 6 years ago

Has patch:	set
Needs tests:	set

comment:5 by Simon Charette, 6 years ago

Triage Stage:	Accepted → Ready for checkin

Thanks for the patch @Taoup!

comment:6 by Mariusz Felisiak <felisiak.mariusz@…>, 6 years ago

Resolution:	→ fixed
Status:	assigned → closed

In 271fdab:

Fixed #31286 -- Made database specific fields checks databases aware.

Follow up to 0b83c8cc4db95812f1e15ca19d78614e94cf38dd.

comment:7 by Tim Graham, 10 months ago

I'm not following the analysis in the ticket description. My understanding is that before this change, the database-specific field checks always ran for all databases (for db in connections). Now that it uses databases as specified by the check command, these checks are skipped unless check --database=... is specified. While migrate's check does specify a database, this change postpones these errors/warnings from the makemigrations stage to the migrate stage, at which point the user needs to take an extra corrective step of recreating the migration file. (In my case, I wrote a check to prohibit AutoField for the MongoDB backend.)

Although it's documented that "Database checks are not run by default because they do more than static code analysis as regular checks do," the database-specific field checks are only static analysis (as opposed to something like the MySQL check for strict mode which does query the database), so I see no reason they shouldn't be included as part of the normal checks. I think this patch should be reverted, but let me know if I've missed something.

comment:8 by Jacob Walls, 10 months ago

Cc:	Jacob Walls added

comment:9 by Simon Charette, 10 months ago

Thanks for sharing your thoughts Tim, having you work on that MongoDB backend has been a great source of feedback for the rough edges of build such a third-party app.

The intent of #31055 (0b83c8cc4db95812f1e15ca19d78614e94cf38dd) was two fold

Ensure that the check framework is provided the exact set of databases it should perform checks against (as not all of them might be available all the time, e.g. migrations)
Have the test framework take advantage of 1. as since #28478 test database creation is skipped if unneeded.

Now in retrospect I think that making the check command default databases=None to [] instead of list(connections) might have been a mistake but it's also not clear whether [DEFAULT_DB_ALIAS] might not have been also a good candidate (that's what other database related commands do)

In summary I wouldn't be opposed to reverting part of this patch that have check default to all databases (or only the default one) but we must maintain the part that made check databases aware as it's something the migration and testing framework rely on.

Although it's documented that "Database checks are not run by default because they do more than static code analysis as regular checks do," the database-specific field checks are only static analysis (as opposed to something like the MySQL check for strict mode which does query the database), so I see no reason they shouldn't be included as part of the normal checks.

I don't disagree in principle but that seems like a somewhat different can of worm isn't it? Database checks were never ran by default even prior to this patch so it has little to do this with this particular ticket beyond the fact that specifying a --database is now the opt-in for enabling those instead of --tag database.

Last edited 10 months ago by Simon Charette (previous) (diff)

comment:10 by Tim Graham, 5 months ago

I think there's a misunderstanding with your statement: "Database checks were never ran by default even prior to this patch." The only database check (designated by @register(Tags.database)) invokes DatabaseValidation.check(). This invokes, for example, MySQL's on strict mode. This check queries the database server, but as far as I can tell, it doesn't require the presence of the test database. Nonetheless, I see no issue with the test runner skipping these checks.

Separately, the DatabaseValidation class also has a check_field() hook which calls DatabaseValidation.check_field_type() (if implemented by the backend, e.g. MySQL's). These checks do static analysis (checking field attributes), and while some of these checks may rely on a connection to the database server to populate connection.features, for example, I don't believe these checks require the presence of the test database. The calling chain is Field.check() -> Field._check_backend_specific_checks(). These checks were run by makemigrations (and even by runserver which I'll get in to below) until 271fdab8b78af558238df51c64b4d1c8dd0792bb, which stopped iterating through all the databases (django.db.connections) and instead uses the list of databases aliases passed from (typically) the management command's --database argument (e.g. for migrate).

I believe it's useful to have field-specific checks run for all databases at the makemigrations stage rather than deferring them until migrate, at which point a potentially invalid migration has already been created. Besides Field._check_backend_specific_checks(), the other checks that rely on databases are therefore are currently deferred are:

My proposal is to make makemigrations run these checks for all connections:

django/core/management/commands/makemigrations.py

diff --git a/django/core/management/commands/makemigrations.py b/django/core/management/commands/makemigrations.py
index 7f711ed7ae..5860c462c2 100644

                class Command(BaseCommand):
     def log(self, msg):
         self.log_output.write(msg)
+    def get_check_kwargs(self, options):
+        kwargs = super().get_check_kwargs(options)
+        return {**kwargs, "databases": connections}
     @no_translations
     def handle(self, *app_labels, **options):
         self.written_files = []

While thinking through this, I noticed another surprising, perhaps inadvertent regression / behavior change. Currently, in a multi-db setup (I'll use the aliases 'default' and 'other' as an example), unless I missed a place that passes all db aliases to check(), the only way to get the pre-Django 3.1 behavior for Model._check_long_column_names() (which consults all database aliases to find the maximum allowed column name) would be to invoke: check --database=default --database=other.

Essentially, #31055 started treating the bulleted list of checks above as "database checks" which was perhaps overly aggressive and unneeded. They are really static checks that don't require the presence of a database. These checks used to be a lot more visible since they were even invoked by runserver (right?). So arguably runserver (and even check) could get the same get_check_kwargs() treatment proposed above, and then we may want to go back to excluding database tagged checks unless --database is specified. What do you think?

Last edited 4 months ago by Tim Graham (previous) (diff)

comment:11 by Simon Charette, 4 months ago

Thanks for the analysis Tim

This check queries the database server, but as far as I can tell, it doesn't require the presence of the test database.

The mode is effectively not a database specific setting but it is nonetheless a value that must be retrieved using a query (SELECT @@sql_mode to be precise) which means that a connection must exists against the server and if no test database is setup this leaves the testing machinery no choice but to connect directly to the non-test database which is frowned upon in a testing scenario. Now that we don't systematically create / setup for reuse test databases when the suite subset doesn't need it (#28478) we need a way to ensure we don't issue database queries against un-prepared databases.

These checks do static analysis (checking field attributes), and while some of these checks may rely on a connection to the database server to populate connection.features, for example, I don't believe these checks require the presence of the test database.

Similar scenario in this case, they may or may not require to issue database queries to perform these checks and we likely don't want them to be performed against non-test databases.

They are really static checks that don't require the presence of a database.

That's the crux of the issue I believe. They may or may not as they are opaque to the caller and its this opacity that forces setups where only a subset of the databases to be safely available (e.g. test runs) to limit their scope.

As I've shared in comment:9 I'm in agreement with you here though, I do believe that we should default to having unspecified database subset mean all databases instead of no databases and not only for makemigrations. What I mean is that we should be doing something like the following

django/core/checks/database.py

diff --git a/django/core/checks/database.py b/django/core/checks/database.py
index d125f1eaa6..91265ac966 100644

-              a
 @register(Tags.database)
+def check_database_backends(databases=None, **kwargs):
+    if databases is None:
+        return []
+def check_database_backends(databases, **kwargs):
     issues = []
     for alias in databases:
         conn = connections[alias]

django/core/checks/registry.py

diff --git a/django/core/checks/registry.py b/django/core/checks/registry.py
index 3139fc3ef4..a285abbc3f 100644

-              a
 from collections.abc import Iterable
 from itertools import chain
+from django.db import connections
 from django.utils.inspect import func_accepts_kwargs
-…
+               def run_checks(
         if tags is not None:
             checks = [check for check in checks if not set(check.tags).isdisjoint(tags)]
+        if databases is None:
+            databases = list(connections)
         for check in checks:
             new_errors = check(app_configs=app_configs, databases=databases)
             if not isinstance(new_errors, Iterable):

django/core/management/commands/check.py

diff --git a/django/core/management/commands/check.py b/django/core/management/commands/check.py
index e61cff79f3..531e059c37 100644

                def add_arguments(self, parser):
             action="append",
             choices=tuple(connections),
             dest="databases",
             help="Run database related checks against these aliases.",
+            help="Run database related checks only against these aliases.",
+        )
     def handle(self, *app_labels, **options):

I've given the full suite a go here. Note that this preserves the behaviour of commands that directly operate against a database and default to [DB_DEFAULT_ALIAS] such as migrate as it's likely that not all databases are available in the context they are run.

Last edited 4 months ago by Simon Charette (previous) (diff)

comment:12 by Tim Graham, 4 months ago

Follow up ticket: #36663.

Note: See TracTickets for help on using tickets.

Download in other formats:

Issues