#31320 closed New feature (wontfix)
Prevent BEGIN and COMMIT in RunSQL in atomic migrations.
Reported by: | Adam Johnson | Owned by: | nobody |
---|---|---|---|
Component: | Migrations | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Migrations are atomic
by default, which means they use a transaction on backends that support them (PostgreSQL).
RunSQL
allows running arbitrary SQL, including BEGIN
/ COMMIT
which break the atomic transaction handling.
PostgreSQL will warn about BEGIN
within a transaction ( https://www.postgresql.org/docs/current/sql-begin.html ) but this warning won't be displayed on a standard Django setup. Equally it will warn about a COMMIT
when no transaction is running.
(SQLite at least raises an error for a BEGIN
within a transaction).
Beginners to the migration framework can miss that migrations are 'atomic' by default, and use their SQL knowledge to write BEGIN
/COMMIT
, not realizing they're actually breaking the transactional integrity of their migrations.
It would be good to detect this and refuse to run such RunSQL
operations.
Change History (8)
comment:1 by , 5 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Summary: | Prevent BEGIN and COMMIT in RunSQL in atomic migrations → Prevent BEGIN and COMMIT in RunSQL in atomic migrations. |
comment:2 by , 5 years ago
I'm pretty sure sqlparse parses the BEGIN
statement differently ot the keyword in DO ... BEGIN .. END
. And we already parse the SQL in RunSQL for splitting, so it doesn't seem like it would be much overhead to me to make a pass over the tokens at that point and warn/error. Migrations using BEGIN
/ COMMIT
could possibly cause data loss so I think it's a fair case to guard against.
comment:3 by , 5 years ago
Cc: | added |
---|
And we already parse the SQL in RunSQL for splitting, so it doesn't seem like it would be much overhead to me to make a pass over the tokens at that point and warn/error.
Yes, but not on PostgreSQL. For a long statements this can introduce a performance regression, IMO. Do you want to prepare a patch? I'm not convinced that we need to be more protective in RunSQL()
, I've used it many times but always for a complicated data migrations.
comment:4 by , 5 years ago
I've gone for a docs PR instead: https://github.com/django/django/pull/12511
The reason I made this was when squashing migrations on a client project, it had some RunSQL operations in migrations that worked fine alone, but when the operatiosn were squashed together it broke the squashed migration's transactional state.
comment:5 by , 5 years ago
The reason I made this was when squashing migrations on a client project, it had some RunSQL operations in migrations that worked fine alone, but when the operatiosn were squashed together it broke the squashed migration's transactional state.
Interesting, were these migrations marked atomic = False
initially? It looks like the current code completely ignores this flag. I guess the command could at least warn about it.
Migration squashing is definitely an area that would benefit from a bit of love wrt to dependency resolving and could be a good GSOC project in the next years if we ever apply.
From the docs themselves
To manually resolve a CircularDependencyError, break out one of the ForeignKeys in the circular dependency loop into a separate migration, and move the dependency on the other app with it. If you’re unsure, see how makemigrations deals with the problem when asked to create brand new migrations from your models. In a future release of Django, squashmigrations will be updated to attempt to resolve these errors itself.
It looks like non-atomic migrations pose a similar problem where they'd need to be considered an optimization barrier and require the creation of a standalone migration.
comment:6 by , 5 years ago
Interesting, were these migrations marked atomic = False initially?
No
It looks like non-atomic migrations pose a similar problem where they'd need to be considered an optimization barrier and require the creation of a standalone migration.
Yes, that would be great
Thanks for this ticket, however
RunSQL()
is "useful for more advanced features of database backends that Django doesn’t support directly", moreover you can useBEGIN
not only to start a transaction, e.g. with DO. IMO parsing a custom SQL provided by users inRunSQL()
is out of the scope of Django.