Opened 9 months ago

Last modified 3 days ago

#35444 assigned New feature

Add generic support for order by to Aggregate

Reported by: Chris M Owned by: Chris M
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: orderableaggmixin aggregate
Cc: Simon Charette Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

We would like to add general support for ordering within SQL aggregate functions that allow it. Currently, this behavior is only supported in PostgreSQL via the OrderableAggMixin. With changes made in #35339 , it should be possible to merge the behaviors of that mixin into the core behaviors of the Aggregate class and then apply it to functions that support order by like Sqlite's STRING_AGG, MySQL's GROUP_CONCAT, and Oracle's LIST_AGG WITHIN GROUP (ORDER BY).

These are some implementation notes from Simon discussed on the original bug ticket in this comment ticket:35339#comment:13 that I'm capturing some of here to avoid jumping around.

A few things I believe we should focus on along the way

  1. We should make the argument Aggregate(order_by) instead of ordering to be consistent with Window
  2. Just like we do with allow_distinct we should use a allow_order_by attribute for aggregates that support this feature
  3. We should make OrderableAggMixin a shim that sets allow_order_by=True, redirect init(ordering) to order_by, and emit a deprecation warning when doing so to allow a proper transition for contrib.postgres.ArrayAgg and friends
  4. We should reuse OrderByList as much as possible

Change History (15)

comment:1 by Simon Charette, 9 months ago

Cc: Simon Charette added
Triage Stage: UnreviewedAccepted

As discussed on the other ticket I think that moving this capability to the generic Aggregate class makes sense.

comment:2 by Simon Charette, 7 months ago

Hey Chris, have you run into any issues working on this ticket? Do you need any support.

comment:3 by Chris M, 7 months ago

Thanks for the bump, Simon. I got into a busy period at work, but I was just talking to my team about how I needed to get back to work on this PR if I wanted to get it in for 5.2. So the timing works out well.

https://github.com/django/django/compare/main...camuthig:django:merge-orderable-agg-mixin

I did hit a couple of bumps along the way but nothing blocking yet. I have what appears to be a working feature tested locally against Sqlite and Postgres and need to get the time to update to documentation and write tests. In my code I am porting over the StringAgg function into the standard DB module and I am shifting the postgres specific implementation to be a subclass of it, so the existing postgres StringAgg tests work as a full regression test for that database backend, but we don't have that same coverage for the other backends yet.

I'm curious your thoughts on a few things:

  1. Generally if you think the current path of the code is right or needs changes. I can convert this diff link into a draft PR if that would help.
  2. If you can point me in the right direction for any documentation you think should be updated. I haven't started working through that just yet.
  3. What do you think the best way to test this is? The Postgres aggregate tests are very thorough, so I was thinking it might work to take all of the string agg related tests from there and move them into a new test file that isn't just run for the Postgres backend?

comment:4 by Simon Charette, 7 months ago

Thanks for the bump, Simon. I got into a busy period at work, but I was just talking to my team about how I needed to get back to work on this PR if I wanted to get it in for 5.2. So the timing works out well.

No problem Chris, thanks for the update! I reaching out because I'm also want to find a way to make this land in 5.2.

Generally if you think the current path of the code is right or needs changes. I can convert this diff link into a draft PR if that would help.

From a cursory look at the branch it seems great, I would encourage you to open a draft PR when you have the chance though (no rush) as it makes collaboration through review much easier.

If you can point me in the right direction for any documentation you think should be updated. I haven't started working through that just yet.

Sure, since the patch includes a new feature and deprecation I can think of a few places.

  1. The 5.2 release notes.
  2. The documentation for ArrayAgg and JSONBAgg to document the change of ordering to order_by
  3. The complete deprecation of postgres.StringAgg, you could refer to how the documentation change was made when we deprecated postgres.JSONField in favor or models.JSONField
  4. Something similar to the documentation for Aggregate.allow_distinct for .allow_order_by.
  5. Documentation for models.StringAgg

What do you think the best way to test this is? The Postgres aggregate tests are very thorough, so I was thinking it might work to take all of the string agg related tests from there and move them into a new test file that isn't just run for the Postgres backend?

I think that's a good approach! Creating a PR out of your work should help with ensuring the per-backend implementations are working fine.

Maybe one last point of feedback looking at the MR is to try to split your work in logical commits that can be merged serially and built on top of each other. If you're not familiar with git-rebase I could help you slice it at first. I think a good logical breakdown could be

  1. Deprecate OrderableAggMixin's subclasses usage of ordering in favor of order_by
  2. Deprecate OrderableAggMixin and move support to Aggregate
  3. Deprecate postgres.StringAgg in favor of models.aggregate.StringAgg

Such a sequence should ease the review and allow merger to progressively merge slices of the work into main and reduces the risks of having to completely revert work.

comment:5 by Chris M, 7 months ago

Thanks for the feedback. I'll start working on some of that documentation. I'll definitely do a full rebase before I finish up and get that testing together with the last part of it.

comment:6 by Chris M, 4 months ago

Has patch: set

comment:7 by Simon Charette, 6 weeks ago

Patch appears to be in a good state and a solution was provided to adequately deal with composite primary key counts.

comment:8 by Sarah Boyce, 5 weeks ago

Patch needs improvement: set

comment:9 by Jacob Walls, 4 weeks ago

Patch needs improvement: unset

comment:10 by Sarah Boyce, 3 weeks ago

Triage Stage: AcceptedReady for checkin

Pulled the first commit to a separate PR which I aim to land: https://github.com/django/django/pull/18996

comment:11 by Sarah Boyce <42296566+sarahboyce@…>, 2 weeks ago

In d734f16:

Refs #35444 -- Deprecated contrib.postgres aggregates ordering for order_by.

Aligns the argument with SQL standards already used in Window.order_by and
sets up for adding support to Aggregate.

comment:12 by Sarah Boyce, 2 weeks ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

comment:13 by Sarah Boyce, 2 weeks ago

Patch needs improvement: unset

comment:14 by Sarah Boyce, 2 weeks ago

Needs tests: set

comment:15 by Simon Charette, 3 days ago

Needs tests: unset
Note: See TracTickets for help on using tickets.
Back to Top