#15566 closed Bug (fixed)
.update queries should affect Date*Field's with auto_now
Reported by: | Jeremy Dunck | Owned by: | agiliq |
---|---|---|---|
Component: | Documentation | Version: | 1.2 |
Severity: | Normal | Keywords: | |
Cc: | Chris Streeter, charette.s@…, anssi.kaariainen@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
Right now, calling Manager.update will update records without affecting date fields w/ auto_now specified.
This is a bug. Perhaps UpdateQuery should generate an update to any fields with auto_now.
Attachments (1)
Change History (21)
by , 14 years ago
Attachment: | ticket_15566_test.patch added |
---|
comment:1 by , 14 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Patch needs improvement: | set |
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
follow-up: 7 comment:2 by , 14 years ago
Triage Stage: | Accepted → Design decision needed |
---|
It's a backwards incompatible change and there are a lot of people depending on it. Quoting the docs:
Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn't run any save() methods on your models, or emit the pre_save or post_save signals (which are a consequence of calling save()).
It doesn't explicitly refer to the auto_now
option, but that was always a hack, so you don't have to implement it on pre_save
yourself. I would be in favour of leaving the current behaviour and add a documentation note to docs about how it works with auto_now
.
comment:3 by , 14 years ago
Sure, a decision on whether to keep auto_now needs to be made in general, but either way, that's a 2.x decision. If they are kept, I think this is a good feature at that point.
comment:4 by , 14 years ago
Severity: | → Normal |
---|
comment:5 by , 14 years ago
Type: | → Bug |
---|
comment:6 by , 13 years ago
Component: | Database layer (models, ORM) → Documentation |
---|---|
Easy pickings: | set |
Has patch: | unset |
Patch needs improvement: | unset |
Triage Stage: | Design decision needed → Accepted |
UI/UX: | unset |
I think we should keep the current behavior and document it, as explained in comment 2.
comment:7 by , 13 years ago
Component: | Documentation → Database layer (models, ORM) |
---|
Replying to lrekucki:
It's a backwards incompatible change and there are a lot of people depending on it. Quoting the docs:
Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn't run any save() methods on your models, or emit the pre_save or post_save signals (which are a consequence of calling save()).
It doesn't explicitly refer to the
auto_now
option, but that was always a hack, so you don't have to implement it onpre_save
yourself. I would be in favour of leaving the current behaviour and add a documentation note to docs about how it works withauto_now
.
This seems like a weak argument to me. As acknowledged, the docs say nothing about auto_now
and qs.update()
. If one reads all the relevant docs, I think one would reasonably assume that auto_now
should work in case of a queryset update too, and I see no technical argument against it doing so. To argue that the feature should remain broken when it could be fixed because it was "always a hack" makes no sense to me. If the feature is a hack in a seriously bad sense, then it should either be removed, or fixed so that it is no longer a hack.
I think that this is a bug, and we should fix bugs, not document them, even if someone might be depending on the buggy behavior. I have a hard time imagining the case where someone puts auto_now
on a field but is relying on that field *not* being updated in case of a bulk update, or how this fix would break that code.
comment:8 by , 13 years ago
I've pasted a basic implementation (no patch, since I did this the subqueries.py of my site-packages vs my local src copy, and because of the timezone stuff that I'm not sure how to handle).
https://gist.github.com/1872523#L83
I've tested this locally, in case it's OK as-is, and somebody wants to make a patch an add tests.
comment:9 by , 13 years ago
Cc: | added |
---|
comment:10 by , 13 years ago
Replying to carljm:
This seems like a weak argument to me.
I 100% agree with Carl here. I don't know where the contention regarding auto_now
and auto_now_add
came from, but it's completely unwarranted and ridiculous. It's a perfectly valid feature with a perfectly sane and non-hacky implementation. Everyone wants timestamps for updated_on and created_on at some point in their project, and this offers a way to do it on a base model without having to use signals or anything else (which I would deem to be overkill).
comment:11 by , 13 years ago
Cc: | added |
---|
comment:12 by , 13 years ago
I disagree with Carl on this one. I have always interpreted QuerySet.update()
as a lower-level SQL construct, and would be pretty surprised if it did anything other than update the fields that I explicitly told it to.
I think the documentation is ambiguous - it simply doesn't state how QuerySet.update()
and auto_now
interact. I had interpreted it the other way to you - that update()
does a SQL update of exactly the fields you specify, and nothing more. I think either interpretation is valid, and it is up to us to decide how to change the behaviour or clarify the docs.
I can think of ways the proposed change could break people's code. If I had an auto_now field that was used to essentially track the last modified timestamp of 'public' data in table, and then added some 'internal' state data to a table, I might make use of the current behaviour of QuerySet.update()
to avoid triggering the auto_now
behaviour (or anything else that might normally run on model save()
). In fact I have coded some models where I have a split just like this. I can't remember if they had auto_now
fields, but I wouldn't have wanted them to be updated when the internal fields were changed.
In addition, with the proposed change it's hard to see how you would execute a SQL "update-these-fields-and-only-these-fields-and-don't-do anything-special" command. I had relied on QuerySet.update()
for that until now. It's not hard to add a field to the update call, it's harder to remove one.
follow-up: 14 comment:13 by , 13 years ago
Cc: | added |
---|
One more reason for wontfixing this: the pre/post save signals are not sent for .update() queries. This means that audit-logging .update() calls is impossible for example. So, what does this have to do with this issue? It makes the argument stronger that .update() is a lower level API.
I think this should be clearly documented: .update() updates _only_ the fields specified. It does not do _anything_ else. I haven't checked if this is already in the documentation, but it should be made absolutely clear that Django does not make sure you are able to log every change done to the models by signals, or by auto_now.
comment:14 by , 13 years ago
Component: | Database layer (models, ORM) → Documentation |
---|
I agree that this could be read either way. I don't think a Field
argument is parallel to signals in terms of "API level"; it's closer to parallel to an overridden save
. I just noticed that update
doesn't even respect e.g. max_length
(you just get a raw database error/warning, depending on your backend); if that's not considered a bug, then that does strengthen the argument that update
is pretty low-level.
In any case, I don't feel strongly, and I'm swayed by Luke's argument that you can easily add your timestamp field to an update call if you want it, but you can't easily remove it if you want a way to bulk-update exactly the specified fields and no others. Changing component back to Documentation.
follow-up: 16 comment:15 by , 13 years ago
If Queryset.update
is a lower-level SQL construct, then why do we have cursor.execute
which could achieve the same thing in almost the same amount of SLOC and still remain perfectly injection-proof?
Queryset.update
is used widely in most of the apps I've worked on, and I'm pretty certain the developers who've added it didn't think of it as a low-level tool, but rather just an ORM method for updating multiple rows at the same time without having to select each and save each separately (ie, a simply efficiency tool in the ORM).
Allowing Queryset.update
to cause database max_length
errors and break the auto_now
docs is a bug and serves no helpful purpose, because A) If I want to demonstrate a Postgres error, there is cursor.execute
, and B) If a user wants to override the documented auto_now
functionality during a Queryset.update
, they can simply provide an override value, and C) If a user wants to update specific rows while not leaving any evidence behind, they can simply use cursor.execute
to update only the rows they want to update. The use-case that Luke suggested is simply an exploit of the bug and is probably a rarity in most code-bases because since auto_now
also implies editable=False
, it sort-of also implies that auto_now
is more of a back-end feature.
We don't have a documented Queryset.insert
method for performing raw INSERT
queries, or, for that matter, any documented Queryset
methods which are data-altering, low-level SQL constructs. In fact, the only documented methods that are low-level seem to provide all the functionality necessary. Queryset.raw
and cursor.execute
provide the ability to perform low level SQL queries against the database to both retrieve and modify data.
I think that creating an inconsistency in Queryset.update
with support predicated only on the potential for previous exploitation of an undocumented bug seems like a really bad idea.
To show I'm not just ranting, I'll be willing to write the necessary patch and corresponding tests with some guidance on how to handle both the aforementioned cases.
follow-up: 17 comment:16 by , 13 years ago
Replying to anonymous:
If
Queryset.update
is a lower-level SQL construct, then why do we havecursor.execute
which could achieve the same thing in almost the same amount of SLOC and still remain perfectly injection-proof?
Because .update() is database agnostic; raw SQL isn't necessarily cross-database compatible. I'd also argue that the syntax for something like a simple field value increment is cleaner in the ORM than in raw SQL, especially for a codebase that is principally written in ORM syntax.
Queryset.update
is used widely in most of the apps I've worked on, and I'm pretty certain the developers who've added it didn't think of it as a low-level tool, but rather just an ORM method for updating multiple rows at the same time without having to select each and save each separately (ie, a simply efficiency tool in the ORM).
I can't speak to how the method is perceived in the wild -- that's a documentation issue -- but a quick inspection of the implementation of the parts involved points pretty quickly to the fact that it can't be interpreted as anything *but* a low level method.
auto_now_add is implemented as a pre-save, per-field modification. The fact that the value for this modification is easily evaluated, and the value for the field will be the same for all rows in the update is convenient, but doesn't solve the underlying issue -- that what we're dealing with is a field value that is computed on a per-row basis.
To point out the underlying problem - consider another type of field with a pre-save trigger; say, a CharField whose value is a computed checksum based on other fields and the timestamp. There's no way this could be expressed as a single update statement for multi-row updates.
The only way you could reconcile this with auto_now_add being fired on .update() is to consider the handling of auto_now_add to be a special case because it is a timestamp. I'm not convinced that special case is special enough to warrant breaking the rules.
To show I'm not just ranting, I'll be willing to write the necessary patch and corresponding tests with some guidance on how to handle both the aforementioned cases.
If you can provide a patch that works for pre-save updates *in the general case*, then I'm interested. Otherwise, what you're proposing is to make a special case of date and time fields, and that's not something I'm comfortable with.
I concur with Luke and Anssi; this is a documentation issue regarding the limitations and expectations of a call to .update().
comment:17 by , 13 years ago
To point out the underlying problem - consider another type of field with a pre-save trigger; say, a CharField whose value is a computed checksum based on other fields and the timestamp. There's no way this could be expressed as a single update statement for multi-row updates.
Oops, I see now, that points to a far worse inconsistency than what I perceived as a consequence of not patching Queryset.update
. While I can think of at least one terribly Byzantinian way of performing at least general pre-save updates, I realize that it's impossible to perform something like a computed checksum or anything that would require a Python library on an update. Therefore, Queryset.update
stands acquitted :( but I much appreciate the explanation, Russell.
comment:18 by , 12 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | assigned → new |
Here is the pull request.
https://github.com/django/django/pull/338
comment:19 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Yes, I think it's reasonable to expect auto_now=True to work with .update().
Just attached a basic regressiontest for now.