Opened 2 years ago

Last modified 2 years ago

#25425 new Cleanup/optimization

Enforce calling resolve_expression before as_sql on all expressions

Reported by: Josh Smeaton Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

There are many places in the ORM that will try to compile an expression without first having ensured it was resolved. The contract of expressions is as follows:

resolved = expression.resolve_expression(*kwargs)
sql, params = compiler.compile(resolved)

Resolving the expression does two major things. First, it creates a copy so that expressions can be shared without clashing with each other. Secondly, the resolve_expression method may do necessary validation or extra query work.

There are a few known places where expression like objects are used without first being resolved.

  • Most Where nodes are not resolved correctly (anywhere where self.where or self.query.where is compiled)
  • SQLUpdateCompiler
  • SQLDeleteCompiler
  • django.db.models.lookups.Lookup
  • django.db.models.related_lookups.RelatedIn

I wonder if we should consider enforcing this at the code level. Refuse to compile the expression unless it has been resolved first. A naive way of doing this would be to maintain a resolved boolean flag, and switch it to True when resolved. The as_sql method could then error if the expression has not been resolved.

Change History (4)

comment:1 Changed 2 years ago by Anssi Kääriäinen

We have two interfaces:

  • Resolvable: resolved to an expression with resolve_expression
  • Compilable: can be turned into SQL

Not all compilables are resolvables, nor the other way around. So, it doesn't make sense to enforce that all compilables have been resolved.

I'm not even sure what WhereNode.resolve_expression() would do? We have other similar types that are not resolved ever, for example BaseTable and Join.

We could do the check for those cases where we have the resolve + compile contract.

comment:2 Changed 2 years ago by Josh Smeaton

Yes, but what about a resolvable nested within a non-resolvable. The docs also (perhaps incorrectly) point out that resolve_expression is a place to do validation https://docs.djangoproject.com/en/1.8/ref/models/expressions/#django.db.models.Expression.resolve_expression

Provides the chance to do any pre-processing or validation of the expression before it’s added to the query. resolve_expression() must also be called on any nested expressions. A copy() of self should be returned with any necessary transformations.

If we want to support expressions as filters, then we're probably going to have to support F() within a WhereNode():

Model.objects.filter(Greater(F('field'), 3)) -> WhereNode()

That WhereNode will have to be resolved at some point so that the nested F() is also resolved. This was the major source of problems when converting transforms into expressions:

https://github.com/django/django/blob/f2975c021d247bf8c6a5fc23988639c636da86f5/django/db/models/lookups.py#L76

I'm resolving the transform during the as_sql phase above which is wrong. But it was way too hard trying to track down all of the places that WhereNodes (and Lookups within the where node) were being rendered without first being resolved. Which is the very reason for this ticket.

A WhereNode does not have to resolve to anything itself, but it has to give nested expressions the chance to resolve.

comment:3 Changed 2 years ago by Anssi Kääriäinen

The way expressions in filter() calls work is that if you do .filter(Greater(F('field'), 3)), then Greater is resolved to an expression, and that expression is added to WhereNode.

Similarly, if you do .filter(Q(Greater(F('field'), 3)) & Q(Greater(F('other_field'), 4))), then the Q is resolved, and this also resolves the nested expressions. Finally the resolved expression is added to the query's WhereNode.

Notably you can't do .filter(WhereNode(Greater(F('field', 3)))). But this is alright, the user facing API is Q-objects, not WhereNode objects. Also, you are free to generate an expression that resolves to a WhereNode, but WhereNode itself isn't resolvable.

What we should do is enforce that any expression added to a WhereNode is already resolved.

Now, we could also add a required attribute resolved to all compilables. Then we could check if that attribute is set in compiler.compile(). Those objects that aren't resolvable could set it always to True. I'm not convinced this is needed, but I won't object to doing this.

comment:4 Changed 2 years ago by Tim Graham

Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization
Note: See TracTickets for help on using tickets.
Back to Top