Opened 16 years ago

Closed 15 years ago

#7210 closed (fixed)

Added expression support for QuerySet.update

Reported by: Sebastian Noack Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Keywords:
Cc: Russell Keith-Magee, Malcolm Tredinnick, Nicolas Lara, gav@…, Gonzalo Saavedra, Joel Watts, simon@…, Reflejo@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

I think QuerySet.update works to inflexible. In SQL you can define expressions containing the current value or the value of an other column in the update clause, for Example to incrementing a value in the result set. This is not possible by the ORM at the moment. I wrote a possible patch, with which you can do following:


from django.db.models.sql.expressions import *

# Equivalent to model.all().update(foo=42)
model.all().update(foo=LiteralExpr(42))
# Increment column 'foo' by one.
model.all().update(foo=CurrentExpr() + LiteralExpr(1))
# Swap the value of the column 'foo' and 'bar'.
model.all().update(foo=ColumnExpr('bar'), bar=ColumnExpr('foo'))

Attachments (12)

0001-Added-expression-support-for-QuerySet.update.patch (8.5 KB ) - added by Sebastian Noack 16 years ago.
0001-Added-expression-support-for-QuerySet.update.2.patch (8.2 KB ) - added by Sebastian Noack 16 years ago.
update_tests.py (958 bytes ) - added by Alex Gaynor 16 years ago.
Some quick tests, not a ton, but tests basic funcionality, should be a good starting point if anyone wants to help out.
0001-Added-expression-support-for-QuerySet.update.3.patch (15.1 KB ) - added by Sebastian Noack 16 years ago.
0001-Added-expression-support-for-QuerySet.update.4.patch (23.7 KB ) - added by Sebastian Noack 16 years ago.
0001-Added-expression-support-for-QuerySet.update.5.patch (23.8 KB ) - added by Sebastian Noack 16 years ago.
0001-Added-F-support-to-filter-exclude-and-update.patch (24.6 KB ) - added by Sebastian Noack 16 years ago.
7210-F-support.patch (14.5 KB ) - added by Collin Grady 16 years ago.
updated patch to r7631
7210-F-support.2.patch (23.4 KB ) - added by Collin Grady 16 years ago.
fixed patch - last one was missing two files
expr.2.diff (40.8 KB ) - added by Alex Gaynor 16 years ago.
Added 2 missing files.
expr.diff (23.9 KB ) - added by Alex Gaynor 16 years ago.
Added 2 missing files.
7210-F-Syntax.patch (20.0 KB ) - added by Nicolas Lara 16 years ago.
Re-write of the patch. Added docs.

Download all attachments as: .zip

Change History (27)

comment:1 by Alex Gaynor, 16 years ago

Needs documentation: set
Needs tests: set
Triage Stage: UnreviewedDesign decision needed

by Alex Gaynor, 16 years ago

Attachment: update_tests.py added

Some quick tests, not a ton, but tests basic funcionality, should be a good starting point if anyone wants to help out.

comment:2 by Sebastian Noack, 16 years ago

Cc: Russell Keith-Magee Malcolm Tredinnick added

comment:3 by Collin Grady, 16 years ago

would this patch allow ColumnExpr to be used in .filter() ?

comment:4 by Sebastian Noack, 16 years ago

Yes and exclude() too. But it is not called ColumnExpr anymore. See the discussion at the mailing list (link above).

comment:5 by Nicolas Lara, 16 years ago

Cc: Nicolas Lara added

by Collin Grady, 16 years ago

Attachment: 7210-F-support.patch added

updated patch to r7631

by Collin Grady, 16 years ago

Attachment: 7210-F-support.2.patch added

fixed patch - last one was missing two files

by Alex Gaynor, 16 years ago

Attachment: expr.2.diff added

Added 2 missing files.

by Alex Gaynor, 16 years ago

Attachment: expr.diff added

Added 2 missing files.

comment:6 by Malcolm Tredinnick, 16 years ago

I've read through this patch in detail for the first time. It smells a bit over-engineered in places, so I'm going to have to think about that (there seem to be too many extra classes involved). For now, though, there are some more fundamental problems I'd like to bring up:

  1. You've somewhat arbitrarily removed the get_placeholder() stuff. That is there because it's needed by certain extensions (in particular, geo-django, but it will also be useful in other cases). Remember that this is all new code; it's not like things are hanging around for historical reasons. Basically the "placeholder" is either "%s" in the normal case or some other format string (e.g. "SOME_FUNC(%s)" in cases like the GIS situations. So treat the placeholder as an opaque string that you use instead of %s to indicate parameters.
  2. The WhereNode class is now a proper tree of other WhereNode classes (after [7835]), so that might affect some things in this patch. In particular, converting things passed in to values should be done in WhereNode.add() so that no references to fields or models are stored in the WhereNode class. This avoids infinite loops and pickling problems.
  3. Calling curry() in make_atom() doesn't looks useful. You just want to save it to use as a function later when you still have the pieces of information you use. Query construction takes long enough without extra overhead like this, so just call the function directly at the right moment.
  4. Having to pass opts to as_sql() -- which will now be add() after [7835] -- feels wrong. The where-class itself shouldn't need to care about that, it's purely for the benefit of the "smart objects", so they should contain the information they need. This feeds into the next point (and below) that, in general, these classes should know how to convert themselves to SQL fragments.
  5. Having to convert normal values to LiteralValues just to convert them back looks like unneeded overhead to me. Follow the pattern elsewhere: the default case will be normal values (the things we do now). That will be by far the most common stuff and it gets handled directly in make_atom(). Anything else, such as any smart objects, should convert themselves via a common method (their interface) and return the resulting string and list. They should have something like their own as_sql() or make_atom() method that is called if we detect the "value" has such a method. So it will be similar to the approach in as_sql(): if the thing we're processing has its own as_sql() method, call that; otherwise, it's a basic piece of data and we'll handle it via make_atom(). In this case with Expression derivatives, maybe make_atom() needs to check if value() has its own make_atom() or make_value() method or something. The idea here is to keep the level of coupling between the normal WhereNode code and any objects like F() to an absolute minimum. F and Expression don't need to be treated specially. They are examples of a class of smart objects that should know how to prepare themselves.

I realise points 4 and 5 above sounds a bit less concrete than the others, but they're actually pretty major. Right now, this patch introduces some pretty tight coupling between these new classes and the query construction code. My intuition is that this coupling is not necessary. If you have to mention something like Expression in WhereNode, you've probably got a leaky abstraction. Use the interface that something like Expression should have to call it and that way we're not tied to only using the Expression class.

I suspect portions of this could probably be broken up into separate steps. I like the idea of an F() object; that's certainly necessary. All this stuff to allow additions and subtraction and other manipulations of values might be useful at some point, but it's probably less important than the ability to refer to a field in an expression and working out the correct way to call smart objects as values. It can certainly be added later and, if the changes I'm talking about above are done right, doesn't have to be part of core immediately, which buys us some room to experiment. So don't try to do too much at once here.

One thought I had is that the bit in make_atom() that currently says

if isinstance(params, QueryWrapper):
   ...

could take over the general role of smart objects here and become something like

if hasattr(params, 'make_value'):
   ...

and then we have to work out how to weave general return values into the result (the current (extra, params) tuple probably won't be enough). In any case, that's the sort of lines I'd work along to make things work like the rest of the code throughout the query construction: we provide the means to "shell out" to advanced objects and handle only the base case in the core code.

Note that this shows we've already got this slightly leaky abstraction in the code, as a once-off to handle SQL subqueries as values. Designing this particular patch correctly should actually help plug that leak by possibly removing the need for WhereNode to know or care about QueryWrapper.

That's where I'm sitting at the moment. The idea is certainly worth pursuing. There are some implementation issues that need to be solved, as well as some broader design ones before we can go further. I'm not completely convinced it's "1.0-beta" material. Nice to have, but not a showstopper if we don't have it in 1.0. We can certainly add this stuff at any point without breaking the external API. Getting it right is therefore more important than getting it done quickly.

comment:7 by Nicolas Lara, 16 years ago

Another thing I've noticed with this patch is that it handles the fields based on the model's _meta attribute. This is a problem if F (or any expression) is going to be used with extra(), related fields or aggregates. I wanted to re-write it to avoid the dependency of the model at least for the quering case.

Also, the latest trunk revision introduces some changes that heavily conflict with this patch. Mostly the in "where.py" and "sql/query.py".

The only problem I see with the objects converting themselves to SQL fragments is that sometimes the objects don't have all the necessary information to do that and would require information of the state of the query class (e.g. aliases). The same problem would arrise if we want to allow relation spaning fields to be used without explicitly joining the tables beforehand. The second point (relation spaning fields) is a far fetched case and I'm not sure it should even be allowed, but the first one is a common case that could easily be solved by setting a placeholder for the field information and letting the queryset code handle the propper name selection.

I am not sure either that this belongs in "1.0-beta" but I have been working on it and would like to get it working after EuroPython's sprint.

comment:8 by Russell Keith-Magee, 16 years ago

milestone: 1.0 beta
Patch needs improvement: set
Triage Stage: Design decision neededAccepted

Officially taking this off the 1.0 beta list. Nicolas has been working on this as part of the aggregation work, but it won't be ready for a merge.

by Nicolas Lara, 16 years ago

Attachment: 7210-F-Syntax.patch added

Re-write of the patch. Added docs.

comment:9 by George Vilches, 15 years ago

Cc: gav@… added

comment:10 by Ivan Sagalaev, 15 years ago

A typo in the patch at db-api.txt:1754

"When filtering you can refer to you can refer to other attributes of " -- double "you can refer to".

P.S. The trick with F-objects overloading math operators is very clever!

comment:11 by Gonzalo Saavedra, 15 years ago

Cc: Gonzalo Saavedra added

comment:12 by Joel Watts, 15 years ago

Cc: Joel Watts added

comment:13 by anonymous, 15 years ago

Cc: simon@… added

comment:14 by Martín Conte Mac Donell, 15 years ago

Cc: Reflejo@… added

comment:15 by Russell Keith-Magee, 15 years ago

Resolution: fixed
Status: newclosed

(In [9792]) Fixed #7210 -- Added F() expressions to query language. See the documentation for details on usage.

Many thanks to:

  • Nicolas Lara, who worked on this feature during the 2008 Google Summer of Code.
  • Alex Gaynor for his help debugging and fixing a number of issues.
  • Malcolm Tredinnick for his invaluable review notes.
Note: See TracTickets for help on using tickets.
Back to Top