#30490 closed Bug (wontfix)
migrations unique_index on (app, name).
Reported by: | Richard Kojedzinszky | Owned by: | nobody |
---|---|---|---|
Component: | Uncategorized | Version: | dev |
Severity: | Normal | Keywords: | migrations parallel run |
Cc: | Adam Johnson | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I've a django based api service. I run it in containers, in kubernetes. When I update my image, basically I run the migiration process, then start the application. With k8s for example, when I start the deployment with 2 or more replicas, actually 2 parallel running migration process will start. With the following very simple migration process, unfortunately the migration will be applied twice:
from django.db import migrations from django.db.models import F import time def forward(apps, schema_editor): TT = apps.get_model('center', 'TestTable') time.sleep(10) TT.objects.all().update(value=F('value') + 1) def backward(apps, schema_editor): TT = apps.get_model('center', 'TestTable') TT.objects.all().update(value=F('value') - 1) class Migration(migrations.Migration): dependencies = [ ('center', '0007_testtable'), ] operations = [ migrations.RunPython(forward, backward) ]
Even if the migration process is ran in a transaction, due to the unique index missing, it will be applied multiple times.
Even though my setup may be rare, I think it would really be necessary to prevent migrations to be applied more than once. Db unique_together could be utilized for that.
Change History (12)
comment:1 by , 5 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Summary: | migrations unique_index on (app, name) → migrations unique_index on (app, name). |
Version: | 1.11 → master |
comment:2 by , 5 years ago
That is why I wrote, that my approach might not be the best practice. With an automated deployments, I would really like the case when Django itself would prevent applying the migrations multiple times. That would even help preventing errors made by humans. You are right, the whole migration step should be in one database transaction for this to work. And as it is now, I rather think it is a bug.
follow-up: 4 comment:3 by , 5 years ago
Resolution: | wontfix |
---|---|
Status: | closed → new |
Unfortunately, knowing that the migration tasks and recording its existence is not run in the same transaction, in a worst case scenario, the info that a migration has been applied can be lost at all. So, in any circumstances, they should be in the same transaction.
Replying to Richard Kojedzinszky:
I've a django based api service. I run it in containers, in kubernetes. When I update my image, basically I run the migiration process, then start the application. With k8s for example, when I start the deployment with 2 or more replicas, actually 2 parallel running migration process will start. With the following very simple migration process, unfortunately the migration will be applied twice:
from django.db import migrations from django.db.models import F import time def forward(apps, schema_editor): TT = apps.get_model('center', 'TestTable') time.sleep(10) TT.objects.all().update(value=F('value') + 1) def backward(apps, schema_editor): TT = apps.get_model('center', 'TestTable') TT.objects.all().update(value=F('value') - 1) class Migration(migrations.Migration): dependencies = [ ('center', '0007_testtable'), ] operations = [ migrations.RunPython(forward, backward) ]Even if the migration process is ran in a transaction, due to the unique index missing, it will be applied multiple times.
Even though my setup may be rare, I think it would really be necessary to prevent migrations to be applied more than once. Db unique_together could be utilized for that.
comment:4 by , 5 years ago
Perhaps, a similar patch would be a good starting point:
https://github.com/rkojedzinszky/django/commit/12477598fb3711f50ff482328ddcda62934caee6
Replying to Richard Kojedzinszky:
Unfortunately, knowing that the migration tasks and recording its existence is not run in the same transaction, in a worst case scenario, the info that a migration has been applied can be lost at all. So, in any circumstances, they should be in the same transaction.
Replying to Richard Kojedzinszky:
I've a django based api service. I run it in containers, in kubernetes. When I update my image, basically I run the migiration process, then start the application. With k8s for example, when I start the deployment with 2 or more replicas, actually 2 parallel running migration process will start. With the following very simple migration process, unfortunately the migration will be applied twice:
from django.db import migrations from django.db.models import F import time def forward(apps, schema_editor): TT = apps.get_model('center', 'TestTable') time.sleep(10) TT.objects.all().update(value=F('value') + 1) def backward(apps, schema_editor): TT = apps.get_model('center', 'TestTable') TT.objects.all().update(value=F('value') - 1) class Migration(migrations.Migration): dependencies = [ ('center', '0007_testtable'), ] operations = [ migrations.RunPython(forward, backward) ]Even if the migration process is ran in a transaction, due to the unique index missing, it will be applied multiple times.
Even though my setup may be rare, I think it would really be necessary to prevent migrations to be applied more than once. Db unique_together could be utilized for that.
comment:5 by , 5 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Hey Richard, I understand your desire to get this issue fixed but it's more complex than wrapping apply_migration
and unapply_migration
in a transaction.
For example, some backends don't support performing schema alteration within a transaction (e.g. MySQL will silently commit the current transaction on such operation) and some migrations are explicitly marked as non-atomic to prevent long running transactions (e.g. large back-filling data migrations).
For these reasons holding a database level lock for the duration of a migration application is probably not a feasible option. Altering the django_migrations
table is also problematic given it's not tracked by migrations itself which is another kind of worms.
In your particular case I'd suggest you explore using another form of distributed lock (e.g. Redis) namespaced by your deploy hash to prevent concurrent migration but you're probably better off getting support from our support channels of k8s ones where other users might have already come up with a solution to your problem.
For these reasons I agree with Mariusz' resolution and ask that you follow triaging guidelines with regards to wontfix tickets. Thanks.
comment:6 by , 5 years ago
You don't need to use Redis for distributed locking, you can use advisory locking in your database server. They can live outside of transactions.
In PostgreSQL they are called advisory locks - https://hashrocket.com/blog/posts/advisory-locks-in-postgres . They are by an integer key.
In MySQL/MariaDB they are called user locks - https://mariadb.com/kb/en/library/get_lock/ . They are by a string key.
I don't know about SQLite or Oracle. Quick internet searches don't reveal anything
Potentially Django could use these advisory locks.
comment:7 by , 5 years ago
Cc: | added |
---|
comment:8 by , 5 years ago
The way we solved this with Kubernetes is to create and launch a Job which will run manage.py migrate as part of the deployment pipeline. Once that Job is complete, update the Deployment/ReplicaSet as needed.
As always you have to be completely sure the migration can run concurrently with your app, but in genera running migrations as an initpod or as part of the web app pod is really not advised, and adding a hack like a distributed lock will only bring you pain.
comment:9 by , 17 months ago
Resolution: | wontfix |
---|---|
Status: | closed → new |
@simon As the migration script and recording it now runs in the same transaction, the only thing left here is to add the unique key on the records table. Am I right?
comment:10 by , 17 months ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I appreciate you'd like to reopen the ticket, but please follow the triaging guidelines with regards to wontfix tickets and don't reopen closed tickets before reaching a consensus on the forum.
#34610 was a duplicate.
Thanks for the report, however it is rather a support issue, IMO. Adding unique on
(app, name)
will not fix your issue, because Django adds entry to the table with migrations history outside the main transaction.First of all I would use
select_for_update()
to lock table before update (to prevent race condition). Secondly you should run migrations (on the same database) only on one instance.Closing per TicketClosingReasons/UseSupportChannels.