Opened 6 years ago

Closed 6 years ago

Last modified 11 days ago

#31286 closed Bug (fixed)

Database specific fields checks should be databases aware.

Reported by: Hongtao Ma Owned by: Hongtao Ma
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Simon Charette, Jacob Walls Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

In an attempt to trigger the check Error mentioned in Tickets: #31144, I found that the check for database backend doesn't check against the backend specified, e.g. --database mysql, rather, it always checks against the 'default' backend.

After diving into the source code, I found the following method defined in django/db/models/fields/init.py

    def _check_backend_specific_checks(self, **kwargs):
        app_label = self.model._meta.app_label
        for db in connections:
            if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name):
                return connections[db].validation.check_field(self, **kwargs)
        return []

It checks the first db defined in connections rather those provided by users.

A proposed change would be:

    def _check_backend_specific_checks(self, **kwargs):
        app_label = self.model._meta.app_label
        errors = []
        for db in kwargs['databases'] or ['default']:
            if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name):
                errors.extend(connections[db].validation.check_field(self, **kwargs))
        return errors

It worked as intended on my local machine.
I would happily provide a patch for this one.

Change History (11)

comment:1 by Mariusz Felisiak, 6 years ago

Cc: Simon Charette added
Easy pickings: set
Summary: Database specific checks always goes to the 'default' backendDatabase specific fields checks should be databases aware.
Triage Stage: UnreviewedAccepted

Thanks, this was missed in 0b83c8cc4db95812f1e15ca19d78614e94cf38dd. We should use here the same approach and pass databases to these checks:

diff --git a/django/db/models/fields/__init__.py b/django/db/models/fields/__init__.py
index 6bd95f542c..c54b8f6e31 100644
--- a/django/db/models/fields/__init__.py
+++ b/django/db/models/fields/__init__.py
@@ -335,11 +335,13 @@ class Field(RegisterLookupMixin):
         else:
             return []
 
-    def _check_backend_specific_checks(self, **kwargs):
+    def _check_backend_specific_checks(self, databases=None, **kwargs):
         app_label = self.model._meta.app_label
-        for db in connections:
-            if router.allow_migrate(db, app_label, model_name=self.model._meta.model_name):
-                return connections[db].validation.check_field(self, **kwargs)
+        if databases is None:
+            return []
+        for alias in databases:
+            if router.allow_migrate(alias, app_label, model_name=self.model._meta.model_name):
+                return connections[alias].validation.check_field(self, **kwargs)
         return []
 
     def _check_validators(self):

in reply to:  1 comment:2 by Hongtao Ma, 6 years ago

Owner: changed from nobody to Hongtao Ma
Status: newassigned

Applied the patch, two testcases failed, however:

  • FAIL: test_checks_called_on_the_default_database (check_framework.test_multi_db.TestMultiDBChecks)
  • FAIL: test_checks_called_on_the_other_database (check_framework.test_multi_db.TestMultiDBChecks)
        for alias in databases:
            if router.allow_migrate(alias, app_label, model_name=self.model._meta.model_name):
                return connections[alias].validation.check_field(self, **kwargs)
        return []

The early return will cause only a single backend be checked, in case of multiple backends provided:

(djangodev) ➜  my_test_proj git:(iss_check) ✗ python manage.py check --database sqlite3 --database mysql
System check identified no issues (0 silenced).
(djangodev) ➜  my_test_proj git:(iss_check) ✗ python manage.py check --database mysql        
SystemCheckError: System check identified some issues:

ERRORS:
issue.AnotherIndex.name: (mysql.E001) MySQL does not allow unique CharFields to have a max_length > 255.
issue.Networkservergroup.groupname: (mysql.E001) MySQL does not allow unique CharFields to have a max_length > 255.

System check identified 2 issues (0 silenced).
(djangodev) ➜  my_test_proj git:(iss_check) ✗ 

comment:3 by Mariusz Felisiak, 6 years ago

You have to pass databases={'default', 'other'} to model.check() calls in TestMultiDBChecks.

comment:4 by Hongtao Ma, 6 years ago

Has patch: set
Needs tests: set

comment:5 by Simon Charette, 6 years ago

Triage Stage: AcceptedReady for checkin

Thanks for the patch @Taoup!

comment:6 by Mariusz Felisiak <felisiak.mariusz@…>, 6 years ago

Resolution: fixed
Status: assignedclosed

In 271fdab:

Fixed #31286 -- Made database specific fields checks databases aware.

Follow up to 0b83c8cc4db95812f1e15ca19d78614e94cf38dd.

comment:7 by Tim Graham, 6 months ago

I'm not following the analysis in the ticket description. My understanding is that before this change, the database-specific field checks always ran for all databases (for db in connections). Now that it uses databases as specified by the check command, these checks are skipped unless check --database=... is specified. While migrate's check does specify a database, this change postpones these errors/warnings from the makemigrations stage to the migrate stage, at which point the user needs to take an extra corrective step of recreating the migration file. (In my case, I wrote a check to prohibit AutoField for the MongoDB backend.)

Although it's documented that "Database checks are not run by default because they do more than static code analysis as regular checks do," the database-specific field checks are only static analysis (as opposed to something like the MySQL check for strict mode which does query the database), so I see no reason they shouldn't be included as part of the normal checks. I think this patch should be reverted, but let me know if I've missed something.

comment:8 by Jacob Walls, 6 months ago

Cc: Jacob Walls added

comment:9 by Simon Charette, 6 months ago

Thanks for sharing your thoughts Tim, having you work on that MongoDB backend has been a great source of feedback for the rough edges of build such a third-party app.

The intent of #31055 (0b83c8cc4db95812f1e15ca19d78614e94cf38dd) was two fold

  1. Ensure that the check framework is provided the exact set of databases it should perform checks against (as not all of them might be available all the time, e.g. migrations)
  2. Have the test framework take advantage of 1. as since #28478 test database creation is skipped if unneeded.

Now in retrospect I think that making the check command default databases=None to [] instead of list(connections) might have been a mistake but it's also not clear whether [DEFAULT_DB_ALIAS] might not have been also a good candidate (that's what other database related commands do)

In summary I wouldn't be opposed to reverting part of this patch that have check default to all databases (or only the default one) but we must maintain the part that made check databases aware as it's something the migration and testing framework rely on.

Although it's documented that "Database checks are not run by default because they do more than static code analysis as regular checks do," the ​database-specific field checks are only static analysis (as opposed to something like ​the MySQL check for strict mode which does query the database), so I see no reason they shouldn't be included as part of the normal checks.

I don't disagree in principle but that seems like a somewhat different can of worm isn't it? Database checks were never ran by default even prior to this patch so it has little to do this with this particular ticket beyond the fact that specifying a --database is now the opt-in for enabling those instead of --tag database.

Last edited 6 months ago by Simon Charette (previous) (diff)

comment:10 by Tim Graham, 3 weeks ago

I think there's a misunderstanding with your statement: "Database checks ​were never ran by default even prior to this patch." The only database check (designated by @register(Tags.database)) invokes DatabaseValidation.check(). This invokes, for example, MySQL's on strict mode. This check introspects the database and fails if it doesn't exist, so I understand that the test runner may need to skip these checks.

Separately, the DatabaseValidation class also has a check_field() hook which calls DatabaseValidation.check_field_type() (if implemented by the backend, e.g. MySQL's). This method does static analysis (checking field attributes) and doesn't require the presence of a database. The calling chain is Field.check() -> Field._check_backend_specific_checks(). These checks were run by makemigrations (and even by runserver which I'll get in to below) until 271fdab8b78af558238df51c64b4d1c8dd0792bb, which stopped iterating through all the databases (django.db.connections) and instead uses the list of databases aliases passed from (typically) the management command's --database argument (e.g. for migrate).

I believe it's useful to have field-specific checks run for all databases at the makemigrations stage rather than deferring them until migrate, at which point a potentially invalid migration has already been created. Besides Field._check_backend_specific_checks(), the other checks that rely on databases are therefore are currently deferred are:

My proposal is to make makemigrations run these checks for all connections:

  • django/core/management/commands/makemigrations.py

    diff --git a/django/core/management/commands/makemigrations.py b/django/core/management/commands/makemigrations.py
    index 7f711ed7ae..5860c462c2 100644
    a b class Command(BaseCommand):  
    102102    def log(self, msg):
    103103        self.log_output.write(msg)
    104104
     105    def get_check_kwargs(self, options):
     106        kwargs = super().get_check_kwargs(options)
     107        return {**kwargs, "databases": connections}
     108
    105109    @no_translations
    106110    def handle(self, *app_labels, **options):
    107111        self.written_files = []

While thinking through this, I noticed another surprising, perhaps inadvertent regression / behavior change. Currently, in a multi-db setup (I'll use the aliases 'default' and 'other' as an example), unless I missed a place that passes all db aliases to check(), the only way to get the pre-Django 3.1 behavior for Model._check_long_column_names() (which consults all database aliases to find the maximum allowed column name) would be to invoke: check --database=default --database=other.

Essentially, #31055 started treating the bulleted list of checks above as "database checks" which was overly aggressive. They are really static checks that don't require the presence of a database. These checks used to be a lot more visible since they were even invoked by runserver (right?). So arguably runserver (and even check) could get the same get_check_kwargs() treatment proposed above, and then we may want to go back to excluding database tagged checks by default.

Version 0, edited 3 weeks ago by Tim Graham (next)

comment:11 by Simon Charette, 11 days ago

Thanks for the analysis Tim

This check queries the database server, but as far as I can tell, it doesn't require the presence of the test database.

The mode is effectively not a database specific setting but it is nonetheless a value that must be retrieved using a query (SELECT @@sql_mode to be precise) which means that a connection must exists against the server and if no test database is setup this leaves the testing machinery no choice but to connect directly to the non-test database which is frowned upon in a testing scenario. Now that we don't systematically create / setup for reuse test databases when the suite subset doesn't need it (#28478) we need a way to ensure we don't issue database queries against un-prepared databases.

These checks do static analysis (checking field attributes), and while some of these checks may rely on a connection to the database server to populate connection.features, for example, I don't believe these checks require the presence of the test database.

Similar scenario in this case, they may or may not require to issue database queries to perform these checks and we likely don't want them to be performed against non-test databases.

They are really static checks that don't require the presence of a database.

That's the crux of the issue I believe. They may or may not as they are opaque to the caller and its this opacity that forces setups where only a subset of the databases to be safely available (e.g. test runs) to limit their scope.

As I've shared in comment:9 I'm in agreement with you here though, I do believe that we should default to having unspecified database subset mean all databases instead of no databases and not only for makemigrations. What I mean is that we should be doing something like the following

  • django/core/checks/database.py

    diff --git a/django/core/checks/database.py b/django/core/checks/database.py
    index d125f1eaa6..91265ac966 100644
    a b  
    44
    55
    66@register(Tags.database)
    7 def check_database_backends(databases=None, **kwargs):
    8     if databases is None:
    9         return []
     7def check_database_backends(databases, **kwargs):
    108    issues = []
    119    for alias in databases:
    1210        conn = connections[alias]
  • django/core/checks/registry.py

    diff --git a/django/core/checks/registry.py b/django/core/checks/registry.py
    index 3139fc3ef4..a285abbc3f 100644
    a b  
    11from collections.abc import Iterable
    22from itertools import chain
    33
     4from django.db import connections
    45from django.utils.inspect import func_accepts_kwargs
    56
    67
    def run_checks(  
    8586        if tags is not None:
    8687            checks = [check for check in checks if not set(check.tags).isdisjoint(tags)]
    8788
     89        if databases is None:
     90            databases = list(connections)
     91
    8892        for check in checks:
    8993            new_errors = check(app_configs=app_configs, databases=databases)
    9094            if not isinstance(new_errors, Iterable):
  • django/core/management/commands/check.py

    diff --git a/django/core/management/commands/check.py b/django/core/management/commands/check.py
    index e61cff79f3..531e059c37 100644
    a b def add_arguments(self, parser):  
    4646            action="append",
    4747            choices=tuple(connections),
    4848            dest="databases",
    49             help="Run database related checks against these aliases.",
     49            help="Run database related checks only against these aliases.",
    5050        )
    5151
    5252    def handle(self, *app_labels, **options):

I've given the full suite a go here. Note that this preserves the behaviour of commands that directly operate against a database and default to [DB_DEFAULT_ALIAS] such as migrate as it's likely that not all databases are available in the context they are run.

Last edited 11 days ago by Simon Charette (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top