Opened 2 months ago

Last modified 2 months ago

#31632 new Bug

Single queries without transaction do not follow AUTOCOMMIT

Reported by: Filipe Pina Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Florian Apolloner, Simon Charette, Adam (Chainz) Johnson Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The change in this PR https://github.com/django/django/pull/10448 removed the transaction from single queries in Model.save_base.

This broke an application we tried to upgrade from Django 2.1 to 2.2 last year as some operations were not persisted across DB connections.
We ended up rolling back but now, taking a better look, we pinned the change to this PR and the fact that our MySQL instance has the global autocommit set to 0.

As Django <2.2 always used transactions, this was never an issue. With this PR introduced, I believe this raises confusion:

  • if Django AUTOCOMMIT defaults to True, we'd expect it to behave the same as DB autocommit global, even if it's not exactly the same.

As we need global autocommit to 0 to avoid careless manual operations on the DB (that involve related queries), we'll be working around it by using OPTIONS init_command with set autocommit=1.

If this is accepted as an issue, I'd like to leave some suggestions as well (and I won't mind doing the PR after one is chosen):

  • MySQL global autocommit setting detected as database feature
  • If said feature is False and OPTIONS AUTOCOMMIT is True, init connection with set autocommit=1

Or

  • Always init connection with set autocommit=OPTIONS[AUTOCOMMIT]

Or, the most agnostic one, I suppose

  • In Model.save_base, if AUTOCOMMIT=True, wrap single queries with transaction as well.

Change History (19)

comment:1 Changed 2 months ago by Simon Charette

Cc: Florian Apolloner added

We ended up rolling back but now, taking a better look, we pinned the change to this PR and the fact that our MySQL instance has the global autocommit set to 0.

Just to make sure, does that mean your server was configured to disable autocommit but you didn't disable Django's transaction management? That could certainly lead to strange behaviour if Django thinks it's in autocommit mode but it actually isn't. The documentation is also pretty clear about why this setup is not recommended.

Last edited 2 months ago by Simon Charette (previous) (diff)

comment:2 in reply to:  1 Changed 2 months ago by Filipe Pina

Replying to Simon Charette:

The documentation is also pretty clear about why this setup is not recommended.

Yes, autocommit in DB is off and autocommit in Django is ON. Could you highlight where does it state this setup is not recommended?
The only comment I see is saying that Django overrides database default autocommit behavior which was indeed true until this PR: as everything was wrapped in a transaction by the ORM itself, it didn't matter what was the db backend configuration.

I don't see any highlight mentioning weird behavior coming from unaligned settings. The only warning is if disabling Django AUTOCOMMIT you'll leave it up to the db backend, but that's not the case.

And, once again, if every query was wrapped in a transaction before 2.2, this setting in MySQL is completely ignored.

Plus statements like https://docs.djangoproject.com/en/3.0/topics/db/transactions/#django-s-default-transaction-behavior are confusing as it states uses autocommit by default but apparently doesn't, because single queries are not committed anymore, unless the database (not Django) commits them.

comment:3 Changed 2 months ago by Florian Apolloner

@Simon, thank you for the CC much appreciated.

Uff that is a tricky one. What is the reason for running with autocommit=off? IMO it is usually not the behavior you want. You accumulate locks over the length of the implicit transaction and generally there are queries which don't need transactions or can't be in transactions (alter table on mysql, at least for earlier versions, probably getting better now)

In Model.save_base, if AUTOCOMMIT=True, wrap single queries with transaction as well.

With actual AUTOCOMMIT=True (if it is working), there is no need for a transaction. If Django has AUTOCOMMIT=True configured it should do whatever it can to put the database session into such a state, independent of global server configuration. In that sense Django should execute set autocommit=1 or whatever the equivalent is because otherwise further assumptions will not hold true…

comment:4 Changed 2 months ago by Simon Charette

In that sense Django should execute set autocommit=1 or whatever the equivalent is because otherwise further assumptions will not hold true…

Something that puzzles me is that Django already sets autocommit=1 on connection initialization unless DATABASES_AUTOCOMMIT=False which the reporter explicitly stated they didn't do. Not sure what I'm missing here but even if their server is configured to have autocommit=0 the connections created by Django should all be in autocommit=1 in this case.

Filipe, if you open a Django shell against for your project and input from django.db import connection;print(connection.get_autocommit()) what do you get?

Last edited 2 months ago by Simon Charette (previous) (diff)

comment:5 Changed 2 months ago by Florian Apolloner

if Django AUTOCOMMIT defaults to True, we'd expect it to behave the same as DB autocommit global, even if it's not exactly the same.

I disagree, I'd expect autocommit behavior no matter what the database defaults are.

Something that puzzles me is that Django already sets autocommit=1 ​

Mhm, I am not really a MySQL user, but I guess an SQL log from a MySQL server illustrating that issue might shed some light on it?

comment:6 Changed 2 months ago by Filipe Pina

Thanks for the questions and feedback guys, no wonder I haven't stopped using Django since 0.13 with this kind of responsiveness to community issues!

In that sense Django should execute set autocommit=1 or whatever the equivalent is because otherwise further assumptions will not hold true…

Exactly Florian, that was my point, and expectation.

Filipe, if you open a Django shell against for your project and input from django.db import connection;print(connection.get_autocommit()) what do you get?

After removing set autocommit=1 from init_command (as the fix we used), it still returns True, Simon. Which goes in line with the code line that Florian pointed out I suppose..

But something must be missing then (maybe that does not translate into an actual set autocommit in MySQL backend implementation?), because adding set autocommit=1 to DATABASES[default][OPTIONS][init_command] made all the difference.

I have a test that we used with git bissect to find the commit/change, I'll try to run it again and take a look at what happens at that line. I'll also try to replicate it with a clean app and MySQL docker to easily demonstrate, as currently I'm using a dev database that I do not control (nor I can see the query logs).

What I have in the test logs, as I've used django.db.backends logger DEBUG level to see queries and also monkeypatched MySQLdb Connect class/method to capture connection initialization, is that there is no SET AUTOCOMMIT=1 sent nor any such parameter in the connection instantiation.

I'll update with more details later tonight.

comment:7 Changed 2 months ago by Filipe Pina

Added full example (including mysql docker with autocommit off) in https://github.com/fopina/tmp_docker_bug

comment:8 Changed 2 months ago by Florian Apolloner

Something is going very wrong here (I enabled mysql query logging):

2020-05-28T14:34:26.959959Z	    3 Connect	root@127.0.0.1 on mysql using TCP/IP
2020-05-28T14:34:26.960183Z	    3 Query	SET autocommit=0
2020-05-28T14:34:26.960334Z	    3 Query	SELECT @@SQL_AUTO_IS_NULL
2020-05-28T14:34:26.960490Z	    3 Query	SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-05-28T14:34:26.960630Z	    3 Query	SET autocommit=0
2020-05-28T14:34:26.962107Z	    3 Query	INSERT INTO `polls_notapoll` (`text`) VALUES (1)
2020-05-28T14:34:26.962351Z	    3 Query	COMMIT
2020-05-28T14:34:26.977407Z	    3 Query	UPDATE `polls_notapoll` SET `text` = 2 WHERE `polls_notapoll`.`id` = 1
2020-05-28T14:34:26.979122Z	    3 Query	SELECT `polls_notapoll`.`id`, `polls_notapoll`.`text` FROM `polls_notapoll` WHERE `polls_notapoll`.`id` = 1
2020-05-28T14:34:26.980213Z	    3 Quit	
2020-05-28T14:34:26.981963Z	    4 Connect	root@127.0.0.1 on mysql using TCP/IP
2020-05-28T14:34:26.982421Z	    4 Query	SET autocommit=0
2020-05-28T14:34:26.982909Z	    4 Query	SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-05-28T14:34:26.983500Z	    4 Query	SELECT `polls_notapoll`.`id`, `polls_notapoll`.`text` FROM `polls_notapoll` WHERE `polls_notapoll`.`id` = 1
2020-05-28T14:34:27.031868Z	    4 Quit	

Why it sends "SET autocommit=0" is beyond my understanding, but lets see what I can dig up :)

Last edited 2 months ago by Florian Apolloner (previous) (diff)

comment:9 Changed 2 months ago by Florian Apolloner

If you ask me this is a mysqlclient bug:

#!/usr/bin/env python
import MySQLdb as Database

db1 = Database.connect(
    host="127.0.0.1",
    user="root",
    charset="utf8",
    db="mysql",
    client_flag=2,
    autocommit=True,
)

print(db1.get_autocommit())

db2 = Database.connect(
    host="127.0.0.1",
    user="root",
    charset="utf8",
    db="mysql",
    client_flag=2,
    autocommit=False,
)

print(db2.get_autocommit())

The result is true in both cases!

comment:10 Changed 2 months ago by Florian Apolloner

Tested this:

#!/usr/bin/env python
import MySQLdb as Database

db1 = Database.connect(
    host="127.0.0.1",
    user="root",
    charset="utf8",
    db="mysql",
    client_flag=2,
    autocommit=True,
)

print(db1.get_autocommit())
db1.autocommit(False)
print(db1.get_autocommit())
db1.autocommit(True)
print(db1.get_autocommit())
db1.autocommit(False)
print(db1.get_autocommit())

against MySQL 8, it always prints True and the query log is like YOLO:

2020-05-28T15:07:43.224129Z	   14 Connect	root@127.0.0.1 on mysql using TCP/IP
2020-05-28T15:07:43.224440Z	   14 Query	SET autocommit=0
2020-05-28T15:07:43.224629Z	   14 Query	SET autocommit=0
2020-05-28T15:07:43.225759Z	   14 Quit	

Related bug reports for MySQL:

And the code in mysqlclient which exhibits this behavior:
https://github.com/PyMySQL/mysqlclient-python/blob/4c1950148ba10e23017bd5190f9da6bdb347e6b3/MySQLdb/_mysql.c#L755-L765

Django generally works (that is if the global autocommit is not set to off) since it manually tracks autocommit and as such is not affected by this bug.

mysql-connector-python exhibits the same broken behavior.

This fixes it for Django (don't ask me if that would be sane to include):

diff --git a/django/db/backends/mysql/base.py b/django/db/backends/mysql/base.py
index eb0ba18cda..6fd4364cd6 100644
--- a/django/db/backends/mysql/base.py
+++ b/django/db/backends/mysql/base.py
@@ -260,8 +260,8 @@ class DatabaseWrapper(BaseDatabaseWrapper):
             pass
 
     def _set_autocommit(self, autocommit):
-        with self.wrap_database_errors:
-            self.connection.autocommit(autocommit)
+        with self.cursor() as cursor:
+            cursor.execute('SET autocommit=%s', [1 if autocommit else 0])
 
     def disable_constraint_checking(self):
         """

And results in:

2020-05-28T15:16:10.911555Z	   22 Connect	root@127.0.0.1 on mysql using TCP/IP
2020-05-28T15:16:10.911893Z	   22 Query	SET autocommit=0
2020-05-28T15:16:10.912164Z	   22 Query	SET autocommit=1
2020-05-28T15:16:10.912389Z	   22 Query	SELECT @@SQL_AUTO_IS_NULL
2020-05-28T15:16:10.912568Z	   22 Query	SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-05-28T15:16:10.912818Z	   22 Query	SET autocommit=0
2020-05-28T15:16:10.920255Z	   22 Query	SELECT VERSION()
2020-05-28T15:16:10.920560Z	   22 Query	INSERT INTO `polls_notapoll` (`text`) VALUES (1)
2020-05-28T15:16:10.921004Z	   22 Query	COMMIT
2020-05-28T15:16:10.935950Z	   22 Query	SET autocommit=1
2020-05-28T15:16:10.936680Z	   22 Query	UPDATE `polls_notapoll` SET `text` = 2 WHERE `polls_notapoll`.`id` = 3
2020-05-28T15:16:10.943920Z	   22 Query	SELECT `polls_notapoll`.`id`, `polls_notapoll`.`text` FROM `polls_notapoll` WHERE `polls_notapoll`.`id` = 3 LIMIT 21
2020-05-28T15:16:10.945997Z	   22 Quit	
2020-05-28T15:16:10.949059Z	   23 Connect	root@127.0.0.1 on mysql using TCP/IP
2020-05-28T15:16:10.949937Z	   23 Query	SET autocommit=0
2020-05-28T15:16:10.950920Z	   23 Query	SET autocommit=1
2020-05-28T15:16:10.951961Z	   23 Query	SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED
2020-05-28T15:16:10.952983Z	   23 Query	SELECT `polls_notapoll`.`id`, `polls_notapoll`.`text` FROM `polls_notapoll` WHERE `polls_notapoll`.`id` = 3 LIMIT 21
2020-05-28T15:16:10.986958Z	   23 Quit	

The overhead of set autocommit=0 is still there for every connection because mysqldb does this to be dbapi compliant. One could remove this by adding autocommit=None to the OPTIONS dict.

comment:11 Changed 2 months ago by Simon Charette

Cc: Simon Charette added

If you ask me this is a mysqlclient bug

yeah I came up to the same conclusion locally as well. It's crazy though given mysqlclient calls directly into mysql.h's mysql_autocommit.

comment:12 Changed 2 months ago by Florian Apolloner

Correction, mysql-connector-python works since it manually send SET @@session.autocommit = OFF|ON

yeah I came up to the same conclusion locally as well. It's crazy though given mysqlclient calls directly into mysql.h's ​mysql_autocommit.

That is not the problem, the problem is that before it calls into mysql it checks the server state to see whether autocommit is on or off, it never even gets that far (at least not for autocommit=1):
https://github.com/PyMySQL/mysqlclient-python/blob/4c1950148ba10e23017bd5190f9da6bdb347e6b3/MySQLdb/connections.py#L243

comment:13 Changed 2 months ago by Simon Charette

Ahh that makes sense.

Correction, mysql-connector-python works since it manually send SET @@session.autocommit = OFF|ON

I guess we could then adjust set_autocommit to call directly _mysql.connection.autocommit(self, value) when mysqlclient is used under a certain version since I assume they'll be interested in addressing that on their side?

comment:14 Changed 2 months ago by Filipe Pina

I don't usually run with MySQL (PG favorite choice), so I don't know how often these sort of bugs come up or how you tackle them.

But as a developer that uses Django, I wouldn't mind if instead of "working around" external bugs, a warning would be raised (with unique name to be suppressed, if desired), something along the lines:

`
mysqlclient <=1.4.6 (TBC) always reports server autocommit as True so Django is unable to choose whether to change it or not.
To avoid unexpected issues, set it to 1 globally in your MySQL instance or add set autocommit = 1 to your init_command.
`

comment:15 Changed 2 months ago by Florian Apolloner

Cc: Adam (Chainz) Johnson added

I don't usually run with MySQL (PG favorite choice), so I don't know how often these sort of bugs come up or how you tackle them.

Same here, so I'll defer to our MySQL users on how to handle this. Adding Adam since he is also a heavy MySQL user. I am fine with fixing it in Django and/or mysqlclient but I will not write the code for it.

comment:16 Changed 2 months ago by Florian Apolloner

Triage Stage: UnreviewedAccepted
Type: UncategorizedBug
Version: 2.2master

comment:17 Changed 2 months ago by Adam (Chainz) Johnson

We can require a later mysqlclient version imo, after the bug was fixed.

Imo the "real" solution is to move Django to use PyMySQL for MySQL connections. The same developer supports both but hasn't earnestly supported mysqlclient for years. PyMySQL seems to have more correct autocommit behaviour, defaulting to current server value: https://github.com/PyMySQL/PyMySQL/blob/c3c87a7e773dbb09def0b081c70dd55fe83b9633/pymysql/connections.py#L158 .

comment:18 Changed 2 months ago by Florian Apolloner

What the docs say is one thing, but if you look at get_autocommit in https://github.com/PyMySQL/PyMySQL/blob/c3c87a7e773dbb09def0b081c70dd55fe83b9633/pymysql/connections.py#L391-L393 it still uses server_status like the C variant which simply is broken to my understanding. So unless the mysql bugs are fixed server_status will still be wrong to the best of my knowledge -- but that would require actually testing Django against PyMySQL which is more than I want to do ;)

comment:19 Changed 2 months ago by Florian Apolloner

Just to clarify: Both (ie PyMYSQL and mysqlclient) have autocommit=None which should use the server default. The problem is that MySQL itself has very old bugs (as documented in https://code.djangoproject.com/ticket/31632#comment:10 ), so this approach will never work (even if it would be the best & most efficient way to do things). The only client getting this right currently is mysql-connector which actually queries the server state (at the expense of an extra query)

Note: See TracTickets for help on using tickets.
Back to Top