#31286 closed Bug (fixed)
Database specific fields checks should be databases aware.
Reported by: | Hongtao Ma | Owned by: | Hongtao Ma |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Simon Charette, Jacob Walls | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
In an attempt to trigger the check Error mentioned in Tickets: #31144, I found that the check for database backend doesn't check against the backend specified, e.g. --database mysql, rather, it always checks against the 'default' backend.
After diving into the source code, I found the following method defined in django/db/models/fields/init.py
def _check_backend_specific_checks(self, **kwargs): app_label = self.model._meta.app_label for db in connections: if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name): return connections[db].validation.check_field(self, **kwargs) return []
It checks the first db defined in connections rather those provided by users.
A proposed change would be:
def _check_backend_specific_checks(self, **kwargs): app_label = self.model._meta.app_label errors = [] for db in kwargs['databases'] or ['default']: if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name): errors.extend(connections[db].validation.check_field(self, **kwargs)) return errors
It worked as intended on my local machine.
I would happily provide a patch for this one.
Change History (11)
follow-up: 2 comment:1 by , 6 years ago
Cc: | added |
---|---|
Easy pickings: | set |
Summary: | Database specific checks always goes to the 'default' backend → Database specific fields checks should be databases aware. |
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 6 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Applied the patch, two testcases failed, however:
- FAIL: test_checks_called_on_the_default_database (check_framework.test_multi_db.TestMultiDBChecks)
- FAIL: test_checks_called_on_the_other_database (check_framework.test_multi_db.TestMultiDBChecks)
for alias in databases: if router.allow_migrate(alias, app_label, model_name=self.model._meta.model_name): return connections[alias].validation.check_field(self, **kwargs) return []
The early return will cause only a single backend be checked, in case of multiple backends provided:
(djangodev) ➜ my_test_proj git:(iss_check) ✗ python manage.py check --database sqlite3 --database mysql System check identified no issues (0 silenced). (djangodev) ➜ my_test_proj git:(iss_check) ✗ python manage.py check --database mysql SystemCheckError: System check identified some issues: ERRORS: issue.AnotherIndex.name: (mysql.E001) MySQL does not allow unique CharFields to have a max_length > 255. issue.Networkservergroup.groupname: (mysql.E001) MySQL does not allow unique CharFields to have a max_length > 255. System check identified 2 issues (0 silenced). (djangodev) ➜ my_test_proj git:(iss_check) ✗
comment:3 by , 6 years ago
You have to pass databases={'default', 'other'}
to model.check()
calls in TestMultiDBChecks
.
comment:7 by , 6 months ago
I'm not following the analysis in the ticket description. My understanding is that before this change, the database-specific field checks always ran for all databases (for db in connections
). Now that it uses databases
as specified by the check
command, these checks are skipped unless check --database=...
is specified. While migrate
's check does specify a database, this change postpones these errors/warnings from the makemigrations
stage to the migrate
stage, at which point the user needs to take an extra corrective step of recreating the migration file. (In my case, I wrote a check to prohibit AutoField for the MongoDB backend.)
Although it's documented that "Database checks are not run by default because they do more than static code analysis as regular checks do," the database-specific field checks are only static analysis (as opposed to something like the MySQL check for strict mode which does query the database), so I see no reason they shouldn't be included as part of the normal checks. I think this patch should be reverted, but let me know if I've missed something.
comment:8 by , 6 months ago
Cc: | added |
---|
comment:9 by , 6 months ago
Thanks for sharing your thoughts Tim, having you work on that MongoDB backend has been a great source of feedback for the rough edges of build such a third-party app.
The intent of #31055 (0b83c8cc4db95812f1e15ca19d78614e94cf38dd) was two fold
- Ensure that the check framework is provided the exact set of databases it should perform checks against (as not all of them might be available all the time, e.g. migrations)
- Have the test framework take advantage of 1. as since #28478 test database creation is skipped if unneeded.
Now in retrospect I think that making the check
command default databases=None
to []
instead of list(connections)
might have been a mistake but it's also not clear whether [DEFAULT_DB_ALIAS]
might not have been also a good candidate (that's what other database related commands do)
In summary I wouldn't be opposed to reverting part of this patch that have check
default to all databases (or only the default
one) but we must maintain the part that made check databases aware as it's something the migration and testing framework rely on.
Although it's documented that "Database checks are not run by default because they do more than static code analysis as regular checks do," the database-specific field checks are only static analysis (as opposed to something like the MySQL check for strict mode which does query the database), so I see no reason they shouldn't be included as part of the normal checks.
I don't disagree in principle but that seems like a somewhat different can of worm isn't it? Database checks were never ran by default even prior to this patch so it has little to do this with this particular ticket beyond the fact that specifying a --database
is now the opt-in for enabling those instead of --tag database
.
comment:10 by , 3 weeks ago
I think there's a misunderstanding with your statement: "Database checks were never ran by default even prior to this patch." The only database check (designated by @register(Tags.database)
) invokes DatabaseValidation.check(). This invokes, for example, MySQL's on strict mode. This check queries the database server, but as far as I can tell, it doesn't require the presence of the test database. Nonetheless, I see no issue with the test runner skipping these checks.
Separately, the DatabaseValidation
class also has a check_field() hook which calls DatabaseValidation.check_field_type()
(if implemented by the backend, e.g. MySQL's). These checks do static analysis (checking field attributes), and while some of these checks may rely on a connection to the database server to populate connection.features
, for example, I don't believe these checks require the presence of the test database. The calling chain is Field.check()
-> Field._check_backend_specific_checks()
. These checks were run by makemigrations
(and even by runserver
which I'll get in to below) until 271fdab8b78af558238df51c64b4d1c8dd0792bb, which stopped iterating through all the databases (django.db.connections
) and instead uses the list of databases
aliases passed from (typically) the management command's --database
argument (e.g. for migrate).
I believe it's useful to have field-specific checks run for all databases at the makemigrations
stage rather than deferring them until migrate
, at which point a potentially invalid migration has already been created. Besides Field._check_backend_specific_checks()
, the other checks that rely on databases
are therefore are currently deferred are:
- Field._check_db_default()
- Field._check_db_comment()
- Model._check_long_column_names()
- Model._check_indexes()
- Model._check_constraints()
- Model._check_db_table_comment()
My proposal is to make makemigrations
run these checks for all connections:
-
django/core/management/commands/makemigrations.py
diff --git a/django/core/management/commands/makemigrations.py b/django/core/management/commands/makemigrations.py index 7f711ed7ae..5860c462c2 100644
a b class Command(BaseCommand): 102 102 def log(self, msg): 103 103 self.log_output.write(msg) 104 104 105 def get_check_kwargs(self, options): 106 kwargs = super().get_check_kwargs(options) 107 return {**kwargs, "databases": connections} 108 105 109 @no_translations 106 110 def handle(self, *app_labels, **options): 107 111 self.written_files = []
While thinking through this, I noticed another surprising, perhaps inadvertent regression / behavior change. Currently, in a multi-db setup (I'll use the aliases 'default' and 'other' as an example), unless I missed a place that passes all db aliases to check()
, the only way to get the pre-Django 3.1 behavior for Model._check_long_column_names()
(which consults all database aliases to find the maximum allowed column name) would be to invoke: check --database=default --database=other
.
Essentially, #31055 started treating the bulleted list of checks above as "database checks" which was perhaps overly aggressive and unneeded. They are really static checks that don't require the presence of a database. These checks used to be a lot more visible since they were even invoked by runserver
(right?). So arguably runserver
(and even check
) could get the same get_check_kwargs()
treatment proposed above, and then we may want to go back to excluding database tagged checks unless --database
is specified. What do you think?
comment:11 by , 11 days ago
Thanks for the analysis Tim
This check queries the database server, but as far as I can tell, it doesn't require the presence of the test database.
The mode is effectively not a database specific setting but it is nonetheless a value that must be retrieved using a query (SELECT @@sql_mode
to be precise) which means that a connection must exists against the server and if no test database is setup this leaves the testing machinery no choice but to connect directly to the non-test database which is frowned upon in a testing scenario. Now that we don't systematically create / setup for reuse test databases when the suite subset doesn't need it (#28478) we need a way to ensure we don't issue database queries against un-prepared databases.
These checks do static analysis (checking field attributes), and while some of these checks may rely on a connection to the database server to populate connection.features, for example, I don't believe these checks require the presence of the test database.
Similar scenario in this case, they may or may not require to issue database queries to perform these checks and we likely don't want them to be performed against non-test databases.
They are really static checks that don't require the presence of a database.
That's the crux of the issue I believe. They may or may not as they are opaque to the caller and its this opacity that forces setups where only a subset of the databases to be safely available (e.g. test runs) to limit their scope.
As I've shared in comment:9 I'm in agreement with you here though, I do believe that we should default to having unspecified database
subset mean all databases instead of no databases and not only for makemigrations
. What I mean is that we should be doing something like the following
-
django/core/checks/database.py
diff --git a/django/core/checks/database.py b/django/core/checks/database.py index d125f1eaa6..91265ac966 100644
a b 4 4 5 5 6 6 @register(Tags.database) 7 def check_database_backends(databases=None, **kwargs): 8 if databases is None: 9 return [] 7 def check_database_backends(databases, **kwargs): 10 8 issues = [] 11 9 for alias in databases: 12 10 conn = connections[alias] -
django/core/checks/registry.py
diff --git a/django/core/checks/registry.py b/django/core/checks/registry.py index 3139fc3ef4..a285abbc3f 100644
a b 1 1 from collections.abc import Iterable 2 2 from itertools import chain 3 3 4 from django.db import connections 4 5 from django.utils.inspect import func_accepts_kwargs 5 6 6 7 … … def run_checks( 85 86 if tags is not None: 86 87 checks = [check for check in checks if not set(check.tags).isdisjoint(tags)] 87 88 89 if databases is None: 90 databases = list(connections) 91 88 92 for check in checks: 89 93 new_errors = check(app_configs=app_configs, databases=databases) 90 94 if not isinstance(new_errors, Iterable): -
django/core/management/commands/check.py
diff --git a/django/core/management/commands/check.py b/django/core/management/commands/check.py index e61cff79f3..531e059c37 100644
a b def add_arguments(self, parser): 46 46 action="append", 47 47 choices=tuple(connections), 48 48 dest="databases", 49 help="Run database related checks against these aliases.",49 help="Run database related checks only against these aliases.", 50 50 ) 51 51 52 52 def handle(self, *app_labels, **options):
I've given the full suite a go here. Note that this preserves the behaviour of commands that directly operate against a database and default to [DB_DEFAULT_ALIAS]
such as migrate
as it's likely that not all databases are available in the context they are run.
Thanks, this was missed in 0b83c8cc4db95812f1e15ca19d78614e94cf38dd. We should use here the same approach and pass
databases
to these checks: