Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#34009 closed Uncategorized (invalid)

migrations.RunPython runs queries against wrong database

Reported by: Marcel Moreaux Owned by: nobody
Component: Uncategorized Version: 4.1
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In an application with multiple databases, RunPython in migrations appears to always run queries on the default database, instead of the currently selected database, resulting in database errors. This can be observed with manage.py migrate --database, but the problem is also triggered by manage.py test:

(venv) [marcelm@carbon foo .]% ./manage.py test
Found 1 test(s).
Creating test database for alias 'default'...
Creating test database for alias 'extra'...
Traceback (most recent call last):
  File "/home/marcelm/meh/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/home/marcelm/meh/venv/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 357, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: table bar_bar has no column named name

The relevant migration:

from django.db import migrations

# BUG: This always uses the default database, even when the migrations are
# run to set up the extra database.
def add_bar(apps, schema_editor):
    print('RunPython starts here')
    Bar = apps.get_model('bar', 'Bar')
    Bar.objects.create(name='hello')
    print('RunPython ends here')

class Migration(migrations.Migration):
    dependencies = [ ('bar', '0001_initial'), ]

    operations = [
        migrations.RunPython(add_bar, reverse_code=migrations.RunPython.noop),
    ]

(migration 0001 adds the name field)

More debug info

Adding debug code to print the DB queries and the database on which they're run confirms this. Selected output:

% QDEBUG=1 ./manage.py test -v3
Creating test database for alias 'default' ('file:memorydb_default?mode=memory&cache=shared')...
Found 1 test(s).
CONNECTING TO {'database': 'file:memorydb_default?mode=memory&cache=shared', 'detect_types': 3, 'check_same_thread': False, 'uri': True}
QUERY AGAINST file:memorydb_default?mode=memory&cache=shared: 
            SELECT name, type FROM sqlite_master
            WHERE type in ('table', 'view') AND NOT name='sqlite_sequence'
            ORDER BY name
[...]
Creating test database for alias 'extra' ('file:memorydb_extra?mode=memory&cache=shared')...
CONNECTING TO {'database': 'file:memorydb_extra?mode=memory&cache=shared', 'detect_types': 3, 'check_same_thread': False, 'uri': True}
QUERY AGAINST file:memorydb_extra?mode=memory&cache=shared: 
            SELECT name, type FROM sqlite_master
            WHERE type in ('table', 'view') AND NOT name='sqlite_sequence'
            ORDER BY name
[...]
  Applying bar.0002_add_bar...QUERY AGAINST file:memorydb_extra?mode=memory&cache=shared: PRAGMA foreign_keys = OFF
QUERY AGAINST file:memorydb_extra?mode=memory&cache=shared: PRAGMA foreign_keys
QUERY AGAINST file:memorydb_extra?mode=memory&cache=shared: BEGIN
RunPython starts here
QUERY AGAINST file:memorydb_default?mode=memory&cache=shared: INSERT INTO "bar_bar" ("name") VALUES (%s)
QUERY AGAINST file:memorydb_extra?mode=memory&cache=shared: PRAGMA foreign_key_check
QUERY AGAINST file:memorydb_extra?mode=memory&cache=shared: PRAGMA foreign_keys = ON
Traceback (most recent call last):
  File "/home/marcelm/meh/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/home/marcelm/meh/foo/foo/settings.py", line 70, in wrapped
    return func(self, query, params)
  File "/home/marcelm/meh/venv/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 357, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: table bar_bar has no column named name

(note QUERY AGAINST file:memorydb_default just after RunPython starts here)

Similar problems can be observed by running manage.py migrate --database extra.

Reproducing

I attached the full output of both QDEBUG=1 manage.py test -v3 and QDEBUG=1 manage.py migrate --database extra. Also attached is a tar archive of a minimal project that reproduces the bug. I tested on Django 3.2.15, 4.1.1, and the main branch on github.

To reproduce, extract the minimal reproducible example somewhere appropriate and follow these steps:

tar xvzf foo.tgz
virtualenv venv
. venv/bin/activate
pip install Django
cd foo
QDEBUG=1 ./manage.py test
QDEBUG=1 ./manage.py migrate --database=extra

QDEBUG triggers some monkey patching in settings.py to print database queries. Queries on default are printed in magenta, queries on extra are printed in cyan, and all other queries are printed in yellow. This makes the problematic queries easy to spot.

Attachments (3)

test-full-output.txt (73.2 KB ) - added by Marcel Moreaux 2 years ago.
migrate-full-output.txt (30.2 KB ) - added by Marcel Moreaux 2 years ago.
foo.tgz (2.2 KB ) - added by Marcel Moreaux 2 years ago.

Download all attachments as: .zip

Change History (8)

by Marcel Moreaux, 2 years ago

Attachment: test-full-output.txt added

by Marcel Moreaux, 2 years ago

Attachment: migrate-full-output.txt added

by Marcel Moreaux, 2 years ago

Attachment: foo.tgz added

comment:1 by Alex Morega, 2 years ago

A quick search turned up this bit of the docs: https://docs.djangoproject.com/en/4.1/howto/writing-migrations/#data-migrations-and-multiple-databases

In order to do that you can check the database connection’s alias inside a RunPython operation by looking at the schema_editor.connection.alias attribute

So, IIUC, you're supposed to select the database yourself in a RunPython operation. Or maybe I've missed something entirely, in which case, I apologise :)

comment:2 by Mariusz Felisiak, 2 years ago

Resolution: invalid
Status: newclosed

comment:3 by Marcel Moreaux, 2 years ago

Resolution: invalid
Status: closednew

Hi Alex, thanks for you feedback!

That's not how I'm reading the documentation though. What it says there is that if you want your migrations to only run on a specific database, you need to perform that check.

That's not what's happening here; all migrations are correctly run against the extra database, except the RunPython migration, which is run against default. The documentation you link does not imply that RunPython migrations must select the database themselves, only that they may check the selected database.

In fact, the documentation for database routers (https://docs.djangoproject.com/en/4.1/topics/db/multi-db/#automatic-database-routing) explicitly mentions that the Django's default database router ensures 'stickyness' to objects' original database, which seems violated here.

And finally, how would the migrations even set this? When running the migrations in the context of the default database, the database should be default, and when the migrations are run to set up the extra database, the database should be extra. Hardcoding either database in the migration won't work.

I maintain this is a bug, so I'm reopening this ticket.

comment:4 by Mariusz Felisiak, 2 years ago

Resolution: invalid
Status: newclosed

I maintain this is a bug, so I'm reopening this ticket.

It's not a bug but a documented behavior:

"RunPython does not magically alter the connection of the models for you; any model methods you call will go to the default database unless you give them the current database alias (available from schema_editor.connection.alias, where schema_editor is the second argument to your function)."

comment:5 by Marcel Moreaux, 2 years ago

Ah, I see, I missed that.

That's still very unfortunate behaviour though; it means RunPython methods should be littered with explicit .using() clauses to ensure they actually use the DB specified in schema_editor.connection.alias. Miss one, and it may inadvertently wreak havoc on the default database... (as was happening in our project)

Note: See TracTickets for help on using tickets.
Back to Top