Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#24485 closed New feature (fixed)

Support for annotate(end=F('start') + F('duration'), output_field=DateTimeField)

Reported by: Michael Angeletti Owned by: Josh Smeaton
Component: Database layer (models, ORM) Version: 1.8beta2
Severity: Release blocker Keywords: durationfield, aggregation, annotation
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I've modeled a Ticket as such:

class Ticket(models.Model):
    # ... other fields omitted
    active_at = models.DateTimeField()
    duration = models.DurationField()

Using duration this way is more valuable than simply defining an expires_at DateTimeField, because it has the same amount of information and also allows me to do things like Ticket.objects.filter(duration__gt=timedelta(days=1)). However, I ran into a problem when I tried to run the following:

Ticket.objects.annotate(
    expires_at=models.Sum(
        models.F('active_at') + models.F('duration'),
        output_field=models.DateTimeField()
    )
)

The exception occurs during datetime parsing when a float is encountered during an attempted regexp search. Does this imply that all Sum results will be of the type float? If so, does that disqualify supporting the above example as a possible feature?

Change History (23)

comment:1 by Michael Angeletti, 9 years ago

Keywords: aggregation annotation added

comment:2 by Josh Smeaton, 9 years ago

What would you expect the SQL output of SUM(date_field) to be? Most databases (all?) do not allow Sum(date). The argument to Sum has to be numeric or convertable to numeric. Sum(interval) works for some backends.

What I suspect you really want though is this:

Ticket.objects.annotate(
    expires_at=F('active_at') + F('duration', output_field=models.DateTimeField())
)

Note that I've removed the Sum. We're now just adding two fields. I'm not 100% sure that this is supported on all backends, but if it's not it should be. Also note that I've provided a datetime output field to the duration F expression. That's so that both expressions have the same output type, so that the entire expression will have an output type of DateTimeField. We should probably have a new type whose purpose is to provide an output field for the entire expression.

Can you give that a go and report back please?

Last edited 9 years ago by Josh Smeaton (previous) (diff)

comment:3 by Michael Angeletti, 9 years ago

@jarshwah F only takes a name argument, so adding an output_field kwarg results in a TypeError. Without the ability to define which output field to use, I'm left with:

Ticket.objects.annotate(expires_at=F('active_at') + F('duration'))

Which results in:

FieldError: Expression contains mixed types. You must set output_field

I did, for the sake of completion, try:

Ticket.objects.annotate(expires_at=F('active_at') + F('duration'), output_field=models.DateTimeField())

This resulted in:

AttributeError: 'DateTimeField' object has no attribute 'resolve_expression'

comment:4 by Josh Smeaton, 9 years ago

So this has convinced me even more that we need some kind of wrapper type to provide a top level output_field attribute to the expression, unless someone has a nicer idea?

I'd suggest something along the lines of:

Ticket.objects.annotate(expires=Output(F('active_at')+F('duration'), output_field=DurationField())

But I'm not sold on the name (or the look of it to be honest).

In the mean time though, you can capture the output of F()+F() outside the queryset construction, and modify the output_field directly:

In [16]: expires = F('active_at')+F('duration')

In [17]: expires.output_field = models.DurationField()

In [18]: qs = Ticket.objects.annotate(expires_at=expires)

In [19]: for q in qs:
   ....:     print(q.expires_at)
   ....:
2015-03-15 23:53:18.065764+00:00
2015-03-16 02:53:53.953573+00:00
2015-03-15 18:53:56.391507+00:00
2015-03-15 21:53:58.441390+00:00

comment:5 by Carl Meyer, 9 years ago

If we went with a wrapper type (and I don't see a better solution), I would prefer a name like Expression (since that's what it is/wraps), rather than Output (which seems very purpose-specific).

comment:6 by Josh Smeaton, 9 years ago

I partially agree, although the only reason (that I can foresee now.. later hindsight may show this to be totally wrong) to have a wrapper type like this is purely to designate the output type.

The name Expression is already used, and has the signature Expression(left, connector, right, output_field):

Expression(F('active_at'), '+', F('duration'), output_field=DurationField())

There is already a type that could be used although it's not exported in django.db.models:

ExpressionNode(F('active_at')+F('duration'), output_field=DurationField())

I don't like the idea of exposing ExpressionNode purely because of the name. It's an internal implementation detail, but it fits the signature. I guess we could import it under any name we choose though:

# django.db.models.py

from django.db.expressions import ExpressionNode as Expr?

comment:7 by Josh Smeaton, 9 years ago

Owner: changed from nobody to Josh Smeaton
Status: newassigned
Triage Stage: UnreviewedAccepted

comment:8 by Josh Smeaton, 9 years ago

Has patch: set
Triage Stage: AcceptedReady for checkin

Pull Request: https://github.com/django/django/pull/4329

I think documenting is enough at this point. We already have a type exposed that is capable of wrapping the expression, and the use-case of combining different types is small enough to not need anything complicated. If there are others that run into this problem and it seems to cause a bit of pain, then we can investigate a different approach then.

comment:9 by Josh Smeaton, 9 years ago

Severity: NormalRelease blocker
Version: master1.8beta2

comment:10 by Anssi Kääriäinen, 9 years ago

Documentation should be good enough for now. If we need something more, then a possibility is to extend usage of F(). Maybe F(F('active_at')+F('duration'), output_field=DurationField()) could be made to work?

comment:11 by Michael Angeletti, 9 years ago

@jarshwah @akaariai - excellent, thanks for explaining that and adding a documentation patch for it. I had a suspicion that a missing wrapper was the only problem, so I found myself trying to use Aggregate, etc. I agree that a documentation patch is perfectly sufficient, since it's is already possible to do. This documentation also helps the reader better comprehend the result of operating on two F instances.

comment:12 by Tim Graham, 9 years ago

Resolution: fixed
Status: assignedclosed

Commit messages to fix this referenced the wrong ticket number.

In 820381d3:

Fixed #24486 -- Documented method to provide output_field to mixed F expressions

In 09062e95:

[1.8.x] Fixed #24486 -- Documented method to provide output_field to mixed F expressions

Backport of 820381d38bc02ea8b92837ce869e7332a7db9913 from master

comment:13 by Carl Meyer, 9 years ago

I think the documented solution is fine from an API perspective (the general approach of wrapping the entire construct in something with an output_field argument seems right), except that ExpressionNode is a bit of an internal-feeling name.

I don't think using F would be a clarity improvement - the expression may include fields, but as a whole it is not a field.

I still think the best name for the wrapper would be simply Expression, since that's what the whole thing is; it's too bad that name is already taken by something with an incompatible signature. From an API perspective (not considering internals), I think it would be quite sensible to be able to instantiate an Expression with nothing more than a sub-expression and an output field (rather than requiring a lhs, connector and rhs), but in terms of the implementation that would essentially mean having the Expression constructor return an ExpressionNode, which doesn't work. The implementation of Expression is quite tied to having both an lhs and rhs.

Importing ExpressionNode as Expr might be a small improvement, but results in having both Expr and Expression, which differ in a way that's not at all indicated by their names. That's not great either.

I don't have a better proposal, and I agree that if this is an edge use case we can just live with documenting ExpressionNode for now, and maybe for good. Just wanted to get these API thoughts down in case we ever do revisit.

comment:14 by Tim Graham <timograham@…>, 9 years ago

In 88d798d:

Refs #24485 -- Renamed some expression types

comment:15 by Tim Graham <timograham@…>, 9 years ago

In a0cebe8:

[1.8.x] Refs #24485 -- Renamed some expression types

Backport of 88d798d71a20662bdf5335f0586fb9eb6e660c57 from master

comment:16 by Michael Angeletti, 9 years ago

Resolution: fixed
Status: closednew

Sorry to reopen this ticket, guys. I've been so busy the past few days I didn't get a chance to try out the new code / docs until after work tonight. The issue with the docs update and code updates, as I see them, is that Expression takes only 2 arguments, namely self and output_field, so the example in the docs (fitted to my schema):

from django.db import models
from myapp.models import Ticket

Ticket.objects.annotate(
    expires_at=models.Expression(
        models.F('active_at') + models.F('duration'),
        output_field=models.DateTimeField()
    )
)

results in TypeError: __init__() got multiple values for argument 'output_field'

The aforementioned (by @jarshwah) method of simply adding an output_field to an a combined F didn't work either:

expires_at = models.F('active_at') + models.F('duration')
expires_at.output_field = models.DateTimeField()
Ticket.objects.annotate(expires_at=expires_at)

That resulted in:

django/db/models/query.py in iterator(self)
    259             if annotation_col_map:
    260                 for attr_name, col_pos in annotation_col_map.items():
--> 261                     setattr(obj, attr_name, row[col_pos])
    262 
    263             # Add the known related objects to the model, if there are any

AttributeError: can't set attribute

comment:17 by Josh Smeaton, 9 years ago

You're right, I thought I had tested and written the result here, but that was for 24486. Doing both tickets at the same time was a mistake. I'll take a look at this now, and any fix I come up (and document) will also be replicated in the tests to ensure I don't not-fix it again.

comment:18 by Tim Graham, 9 years ago

Has patch: unset
Triage Stage: Ready for checkinAccepted

comment:19 by Josh Smeaton, 9 years ago

Has patch: set

comment:20 by Tim Graham, 9 years ago

Triage Stage: AcceptedReady for checkin

comment:21 by Josh Smeaton <josh.smeaton@…>, 9 years ago

Resolution: fixed
Status: newclosed

In 02a2943e:

Fixed #24485 -- Allowed combined expressions to set output_field

comment:22 by Josh Smeaton <josh.smeaton@…>, 9 years ago

In e654123f:

Fixed #24485 -- Allowed combined expressions to set output_field

comment:23 by Tim Graham <timograham@…>, 9 years ago

In ce6062db:

[1.8.x] Fixed backport of refs #24485 tests.

Note: See TracTickets for help on using tickets.
Back to Top