Opened 7 years ago

Closed 6 years ago

#7210 closed (fixed)

Added expression support for QuerySet.update

Reported by: sebastian_noack Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Keywords:
Cc: russellm, mtredinnick, nicolas, gav@…, gonz, jpwatts, simon@…, Reflejo@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: UI/UX:

Description

I think QuerySet.update works to inflexible. In SQL you can define expressions containing the current value or the value of an other column in the update clause, for Example to incrementing a value in the result set. This is not possible by the ORM at the moment. I wrote a possible patch, with which you can do following:


from django.db.models.sql.expressions import *

# Equivalent to model.all().update(foo=42)
model.all().update(foo=LiteralExpr(42))
# Increment column 'foo' by one.
model.all().update(foo=CurrentExpr() + LiteralExpr(1))
# Swap the value of the column 'foo' and 'bar'.
model.all().update(foo=ColumnExpr('bar'), bar=ColumnExpr('foo'))

Attachments (12)

0001-Added-expression-support-for-QuerySet.update.patch (8.5 KB) - added by sebastian_noack 7 years ago.
0001-Added-expression-support-for-QuerySet.update.2.patch (8.2 KB) - added by sebastian_noack 7 years ago.
update_tests.py (958 bytes) - added by Alex 7 years ago.
Some quick tests, not a ton, but tests basic funcionality, should be a good starting point if anyone wants to help out.
0001-Added-expression-support-for-QuerySet.update.3.patch (15.1 KB) - added by sebastian_noack 7 years ago.
0001-Added-expression-support-for-QuerySet.update.4.patch (23.7 KB) - added by sebastian_noack 7 years ago.
0001-Added-expression-support-for-QuerySet.update.5.patch (23.8 KB) - added by sebastian_noack 7 years ago.
0001-Added-F-support-to-filter-exclude-and-update.patch (24.6 KB) - added by sebastian_noack 7 years ago.
7210-F-support.patch (14.5 KB) - added by cgrady 7 years ago.
updated patch to r7631
7210-F-support.2.patch (23.4 KB) - added by cgrady 7 years ago.
fixed patch - last one was missing two files
expr.2.diff (40.8 KB) - added by Alex 7 years ago.
Added 2 missing files.
expr.diff (23.9 KB) - added by Alex 7 years ago.
Added 2 missing files.
7210-F-Syntax.patch (20.0 KB) - added by nicolas 7 years ago.
Re-write of the patch. Added docs.

Download all attachments as: .zip

Change History (27)

Changed 7 years ago by sebastian_noack

comment:1 Changed 7 years ago by Alex

  • Needs documentation set
  • Needs tests set
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Design decision needed

Changed 7 years ago by Alex

Some quick tests, not a ton, but tests basic funcionality, should be a good starting point if anyone wants to help out.

comment:3 Changed 7 years ago by cgrady

would this patch allow ColumnExpr to be used in .filter() ?

comment:4 Changed 7 years ago by sebastian_noack

Yes and exclude() too. But it is not called ColumnExpr anymore. See the discussion at the mailing list (link above).

comment:5 Changed 7 years ago by nicolas

  • Cc nicolas added

Changed 7 years ago by sebastian_noack

Changed 7 years ago by cgrady

updated patch to r7631

Changed 7 years ago by cgrady

fixed patch - last one was missing two files

Changed 7 years ago by Alex

Added 2 missing files.

Changed 7 years ago by Alex

Added 2 missing files.

comment:6 Changed 7 years ago by mtredinnick

I've read through this patch in detail for the first time. It smells a bit over-engineered in places, so I'm going to have to think about that (there seem to be too many extra classes involved). For now, though, there are some more fundamental problems I'd like to bring up:

  1. You've somewhat arbitrarily removed the get_placeholder() stuff. That is there because it's needed by certain extensions (in particular, geo-django, but it will also be useful in other cases). Remember that this is all new code; it's not like things are hanging around for historical reasons. Basically the "placeholder" is either "%s" in the normal case or some other format string (e.g. "SOME_FUNC(%s)" in cases like the GIS situations. So treat the placeholder as an opaque string that you use instead of %s to indicate parameters.
  2. The WhereNode class is now a proper tree of other WhereNode classes (after [7835]), so that might affect some things in this patch. In particular, converting things passed in to values should be done in WhereNode.add() so that no references to fields or models are stored in the WhereNode class. This avoids infinite loops and pickling problems.
  3. Calling curry() in make_atom() doesn't looks useful. You just want to save it to use as a function later when you still have the pieces of information you use. Query construction takes long enough without extra overhead like this, so just call the function directly at the right moment.
  4. Having to pass opts to as_sql() -- which will now be add() after [7835] -- feels wrong. The where-class itself shouldn't need to care about that, it's purely for the benefit of the "smart objects", so they should contain the information they need. This feeds into the next point (and below) that, in general, these classes should know how to convert themselves to SQL fragments.
  5. Having to convert normal values to LiteralValues just to convert them back looks like unneeded overhead to me. Follow the pattern elsewhere: the default case will be normal values (the things we do now). That will be by far the most common stuff and it gets handled directly in make_atom(). Anything else, such as any smart objects, should convert themselves via a common method (their interface) and return the resulting string and list. They should have something like their own as_sql() or make_atom() method that is called if we detect the "value" has such a method. So it will be similar to the approach in as_sql(): if the thing we're processing has its own as_sql() method, call that; otherwise, it's a basic piece of data and we'll handle it via make_atom(). In this case with Expression derivatives, maybe make_atom() needs to check if value() has its own make_atom() or make_value() method or something. The idea here is to keep the level of coupling between the normal WhereNode code and any objects like F() to an absolute minimum. F and Expression don't need to be treated specially. They are examples of a class of smart objects that should know how to prepare themselves.

I realise points 4 and 5 above sounds a bit less concrete than the others, but they're actually pretty major. Right now, this patch introduces some pretty tight coupling between these new classes and the query construction code. My intuition is that this coupling is not necessary. If you have to mention something like Expression in WhereNode, you've probably got a leaky abstraction. Use the interface that something like Expression should have to call it and that way we're not tied to only using the Expression class.

I suspect portions of this could probably be broken up into separate steps. I like the idea of an F() object; that's certainly necessary. All this stuff to allow additions and subtraction and other manipulations of values might be useful at some point, but it's probably less important than the ability to refer to a field in an expression and working out the correct way to call smart objects as values. It can certainly be added later and, if the changes I'm talking about above are done right, doesn't have to be part of core immediately, which buys us some room to experiment. So don't try to do too much at once here.

One thought I had is that the bit in make_atom() that currently says

if isinstance(params, QueryWrapper):
   ...

could take over the general role of smart objects here and become something like

if hasattr(params, 'make_value'):
   ...

and then we have to work out how to weave general return values into the result (the current (extra, params) tuple probably won't be enough). In any case, that's the sort of lines I'd work along to make things work like the rest of the code throughout the query construction: we provide the means to "shell out" to advanced objects and handle only the base case in the core code.

Note that this shows we've already got this slightly leaky abstraction in the code, as a once-off to handle SQL subqueries as values. Designing this particular patch correctly should actually help plug that leak by possibly removing the need for WhereNode to know or care about QueryWrapper.

That's where I'm sitting at the moment. The idea is certainly worth pursuing. There are some implementation issues that need to be solved, as well as some broader design ones before we can go further. I'm not completely convinced it's "1.0-beta" material. Nice to have, but not a showstopper if we don't have it in 1.0. We can certainly add this stuff at any point without breaking the external API. Getting it right is therefore more important than getting it done quickly.

comment:7 Changed 7 years ago by nicolas

Another thing I've noticed with this patch is that it handles the fields based on the model's _meta attribute. This is a problem if F (or any expression) is going to be used with extra(), related fields or aggregates. I wanted to re-write it to avoid the dependency of the model at least for the quering case.

Also, the latest trunk revision introduces some changes that heavily conflict with this patch. Mostly the in "where.py" and "sql/query.py".

The only problem I see with the objects converting themselves to SQL fragments is that sometimes the objects don't have all the necessary information to do that and would require information of the state of the query class (e.g. aliases). The same problem would arrise if we want to allow relation spaning fields to be used without explicitly joining the tables beforehand. The second point (relation spaning fields) is a far fetched case and I'm not sure it should even be allowed, but the first one is a common case that could easily be solved by setting a placeholder for the field information and letting the queryset code handle the propper name selection.

I am not sure either that this belongs in "1.0-beta" but I have been working on it and would like to get it working after EuroPython's sprint.

comment:8 Changed 7 years ago by russellm

  • milestone 1.0 beta deleted
  • Patch needs improvement set
  • Triage Stage changed from Design decision needed to Accepted

Officially taking this off the 1.0 beta list. Nicolas has been working on this as part of the aggregation work, but it won't be ready for a merge.

Changed 7 years ago by nicolas

Re-write of the patch. Added docs.

comment:9 Changed 7 years ago by gav

  • Cc gav@… added

comment:10 Changed 7 years ago by isagalaev

A typo in the patch at db-api.txt:1754

"When filtering you can refer to you can refer to other attributes of " -- double "you can refer to".

P.S. The trick with F-objects overloading math operators is very clever!

comment:11 Changed 7 years ago by gonz

  • Cc gonz added

comment:12 Changed 7 years ago by jpwatts

  • Cc jpwatts added

comment:13 Changed 6 years ago by anonymous

  • Cc simon@… added

comment:14 Changed 6 years ago by Reflejo

  • Cc Reflejo@… added

comment:15 Changed 6 years ago by russellm

  • Resolution set to fixed
  • Status changed from new to closed

(In [9792]) Fixed #7210 -- Added F() expressions to query language. See the documentation for details on usage.

Many thanks to:

  • Nicolas Lara, who worked on this feature during the 2008 Google Summer of Code.
  • Alex Gaynor for his help debugging and fixing a number of issues.
  • Malcolm Tredinnick for his invaluable review notes.
Note: See TracTickets for help on using tickets.
Back to Top