Opened 11 years ago

Closed 9 years ago

Last modified 9 years ago

#20708 closed Bug (wontfix)

QuerySet.update() ignores order_by() clause

Reported by: Jacob Rief Owned by: nobody
Component: Database layer (models, ORM) Version: 1.7
Severity: Normal Keywords:
Cc: slachinger Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Given a database model, say 'MyModel', with a unique integer field, say 'position'.

Attempt to shift all values of 'position' by a given positive offset:

ascending:
MyModel.objects.order_by('position').update('position': F('position') - offset)
or descending
MyModel.objects.order_by('-position').update('position': F('position') + offset)

The ORM ignores the order_by clause. The generated SQL is (offset = 1):
UPDATE `myapp_mymodel` SET `position` = `myapp_mymodel`.`position` - 1;

This is a problem with MySQL, because for bulk updates on a unique key, you must specify the working order, otherwise the above statement can cause an IntegrityError: (1062, "Duplicate entry...") on column 'position' since these values are updated out of order.

A solution would be to add ORDER BY position ASC / DESC (http://dev.mysql.com/doc/refman/5.1/en/update.html).
An alternative solution would be to ALTER TABLE ... DISABLE KEYS before updating and ALTER TABLE ... ENABLE KEYS after the update (http://dev.mysql.com/doc/refman/5.1/en/alter-table.html).

Postgres does not need/offer an ORDER BY clause on bulk updates.

Change History (5)

comment:1 by Shai Berger, 11 years ago

Resolution: needsinfo
Status: newclosed

I don't think this should be fixed. Fixing it would add complexity to the ORM for all backends, just for the benefit of a very weird quirk of one backend. IMO, the fact that MySQL checks constraints in mid-statement is nothing less than a bug, and the Order-By "feature" is nothing more than a partial workaround. For example, in the same situation as described, try to reverse the order of positions:

MyModel.objects.update('position': max_position - F('position') )

No order-by can save you there; you need to find another way, like adding a constant larger than max_position to all positions first, than setting the right values. And you can do the same for the simpler shift case.

So, the reasoning given in the report is not enough to justify fixing, IMO; feel free to re-open with more persuasive arguments.

comment:2 by anonymous, 11 years ago

Just to complete this issue:

To compare ORM's I did some tests with SQLAlchemy. On invokation of

query.order_by('position').update()
sqlalchemy.exc.InvalidRequestError: Can't call Query.update() when order_by() has been called

Im comparison to that, the Django ORM just ignores the order_by() clause.

Then I tested with SQLite. Here we have the same problem as with MySQL, you can't update a table with a unique column, with a statement as above. Even worse, SQLite does not accept an ORDER BY clause during updates.

sqlite> UPDATE users SET position=position+1 ORDER BY position;
Error: near "ORDER": syntax error

To conclude this issue, Postgres is the only of the 3 supported databases, which does it "right", ie. in one transaction.

comment:3 by slachinger, 9 years ago

Cc: slachinger added
Resolution: needsinfo
Status: closednew
Version: 1.51.7

More compelling arguments:

  • An ORM should shield the user from the quirks of each DBMS, making it transparent from the underlying DBMS. In this case the Django ORM fails miserably to do so. And this is just addressing the issue that mysql checks constraints mid-statement.
  • Adressing the actual bug report: the order_by() should be hould be honored by the update command. If the developer using the ORM explicitly requests ORM to use order_by it should not just silently ignore that.

Just ran into a very similar issue as the OP describes and by having to revert to raw SQL the benefit of using an ORM is gone.

comment:4 by Carl Meyer, 9 years ago

Resolution: wontfix
Status: newclosed

Hi! Thanks for the follow-up. I'm afraid, though, that I agree with Shai, and I don't find those arguments compelling.

  1. It is not possible for an ORM to paper over every DBMS bug. See Shai's example: there are many situations where there is no way the Django ORM could shield the user from the consequences of MySQL's broken constraint handling. And in some where it could, the benefit may not be worth the cost in added complexity.
  1. It's common to create a queryset that is used for both querying and updating, so the presence of an order_by in a queryset that is used for an update call does not necessarily indicate that the developer wanted the update to be ordered (particularly since at the semantic level of the Django ORM that's a meaningless request). Making an update after an order_by an error (as SQLAlchemy does) would break lots of perfectly-valid working code.

Following on (2), since there are likely many cases of people calling update() on querysets with an ordering in MySQL without ever considering that the order might be applied to the update, it's possible that adding this could actually break bulk updates which are currently working with the default ordering.

So all in all, I'm closing this wontfix. I'd reconsider that if I saw a) an actual patch which is non-invasive outside the MySQL backend, and b) strong evidence that there is no plausible scenario where the change in behavior could break currently-working code.

comment:5 by slachinger, 9 years ago

I agree that a) would be the way to go to fix 1).

Regarding 2) I do not think that raising an error would be the correct way, but actually adding the order_by as requested by the developer. However if the consens is to continue "assuming that the developer did not acutally meant to do what he/she actually coded, thus it should be ignored" due to backwards compatibility, I think that should be a hint in the documentation that this is actually the desired behaviour.

Note: See TracTickets for help on using tickets.
Back to Top