Opened 3 weeks ago

Last modified 2 weeks ago

#36877 assigned Cleanup/optimization

Order of update operations behaves differently on MySQL compared to other databases

Reported by: Samir Shah Owned by: Samir Shah
Component: Documentation Version: 6.0
Severity: Normal Keywords: mysql
Cc: Samir Shah Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Samir Shah)

There is a quirk in how MySQL handles update queries that means you can get inconsistent results when updating fields that derive values from one another. Here is an example:

class Entity:
    name = models.CharField(max_length=100)
    name_length = models.PositiveSmallIntegerField(default=0)

Now say you have an object in the database:

Entity.objects.create(name="Bob", name_length=3)

and then you run this update query:

from django.db.models.functions import Length
Entity.objects.update(name="Alice", name_length=Length("name"))

As per the SQL specification, the update is atomic, so that the length is computed using the *original* value of name. Thus in databases like PostgreSQL and SQLite this query will result in name_length being assigned the value 3. MySQL however, behaves differently, and uses the *new* value of name, resulting in the name_length being assigned the value 5.

This is documented:

If you access a column from the table to be updated in an expression, UPDATE uses the current value of the column. For example, the following statement sets col1 to one more than its current value:

UPDATE t1 SET col1 = col1 + 1;

The second assignment in the following statement sets col2 to the current (updated) col1 value, not the original col1 value. The result is that col1 and col2 have the same value. This behaviour differs from standard SQL.

UPDATE t1 SET col1 = col1 + 1, col2 = col1;

Single-table UPDATE assignments are generally evaluated from left to right.

Of note is the last comment in particular. This means the following two queries will yield different results:

Entity.objects.update(name="Alice", name_length=Length("name"))  # name_length will be set to 5 in MySQL
Entity.objects.update(name_length=Length("name"), name="Alice")  # name_length will be set to 3 in MySQL, because it was updated before name was changed

So we have two issues:

  1. ORM queries run in this fashion behave differently on MySQL
  2. MySQL is sensitive to the order in which kwargs are supplied to update(). This can very easily lead to gotchas that are hard to track down.

I do not know whether it is possible to apply any workarounds in Django so that the ORM behaviour is consistent with other databases - if it is, they may be worth attempting. If not, this might be at least worth documenting as a quirk with MySQL? I have tentatively categorised this as a bug but recognise that it may not be fixable as one.

Change History (5)

comment:1 by Samir Shah, 3 weeks ago

Description: modified (diff)

comment:2 by Simon Charette, 2 weeks ago

Component: Database layer (models, ORM)Documentation
Needs documentation: set
Patch needs improvement: set
Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization

Thank you for your report Samir, I could swear I dropped a message here already but I think we should won't fix or focus it as a documentation admonition

Because update(**fields) maintains the user specified order I think we must keep things as is to allow either NEW or OLD tuple values to be targetted. Changing the default behaviour on MySQL seems clever but it suffers from a few problems

  1. It's backward incompatible, users might be relying on this MySQL'ism with explicit **fields ordering just like they do with order_by (see #31573)
  2. Forcing a particular order is error prone as there could be inter-dependency that only the user know about between the updated fields
  3. It opens a can of worms for other interfaces that use update such as Model.save which defauls to the order of definition of the model fields

For these reasons I think we're better off keeping things as they are as they allow the user to have more control over the expressiveness of the resulting UPDATE query which is a property the ORM maintained since dict preserve insertion order (Python 3.7+ in 2018).

comment:3 by Samir Shah, 2 weeks ago

Owner: set to Samir Shah
Status: newassigned

comment:4 by Samir Shah, 2 weeks ago

Has patch: set

comment:5 by Samir Shah, 2 weeks ago

Note: See TracTickets for help on using tickets.
Back to Top