Opened 3 weeks ago
Closed 8 days ago
#36890 closed New feature (fixed)
Allow providing distinct=True to StringAgg on SQLite by specifying the default delimiter
| Reported by: | AJ Slater | Owned by: | Varun Kasyap Pentamaraju |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 6.0 |
| Severity: | Normal | Keywords: | Aggregate, sqlite3, StringAgg |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Attempting to use the new generic StringAgg with distinct=True raises exceptions because any Aggregate with multiple parameters cannot be distinct with an sqlite3 connection. Running StringAgg without distinct produces many duplicate entries aggregated together.
However this Aggregate runs fine under sqlite3, allows distinct and does not produce duplicates.
class GroupConcat(Aggregate):
"""Sqlite3 GROUP_CONCAT."""
# Defaults to " " separator which is all I need for now.
allow_distinct = True
allow_order_by = True
function = "GROUP_CONCAT"
name = "GroupConcat"
def __init__(self, *args, **kwargs):
"""output_field is set in the constructor."""
super().__init__(*args, output_field=CharField(), **kwargs)
It occurs to me that this may happen because delimiter can be an expression. If delimiter is submitted as some invariant like Value() it would be nice to escape this limitation.
Change History (10)
comment:1 by , 3 weeks ago
comment:2 by , 3 weeks ago
I'm not sure I understand your report here.
The code you provided states that " " is the default separator for GROUP_CONCAT but the documentation says otherwise.
The
group_concat()function returns a string which is the concatenation of all non-NULLvalues of X. If parameter Y is present then it is used as the separator between instances of X. A comma (",") is used as the separator if Y is omitted.
which I was able to confirm with
SQLite version 3.37.0 2021-12-09 01:34:53
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE ticket_36890 (value text);
sqlite> INSERT INTO ticket_36890 (value) VALUES ('foo'), ('bar'), ('foo');
sqlite> SELECT GROUP_CONCAT(value), GROUP_CONCAT(DISTINCT value) FROM ticket_36890;
foo,bar,foo|foo,bar
which means that we could support StringAgg("field", Value(","), distinct=True) on SQLite but not StringAgg("field", Value(" "), distinct=True) which I believe is what you are requesting?
comment:3 by , 3 weeks ago
| Keywords: | StringAgg added |
|---|---|
| Summary: | StringAgg doesn't allow DISTINCT in sqlite3 while native GROUP_CONCAT does → Allow providing distinct=True to StringAgg on SQLite by specifying the default delimiter |
| Triage Stage: | Unreviewed → Accepted |
| Type: | Uncategorized → New feature |
Seems reasonable, we should be able to drop the delimiter if self.delimiter.value == models.Value(",") similar to what the MySQL code path does:
# Drop the delimiter from the source expressions. c.source_expressions = c.source_expressions[:-1]
That should evade the check:
if ( self.distinct and not connection.features.supports_aggregate_distinct_multiple_argument and len(super().get_source_expressions()) > 1 ):
comment:4 by , 3 weeks ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
I'm interested to submit patch
comment:5 by , 3 weeks ago
Thanks Simon, for catching my mistake. The default is not " ", it's actually ",".
Jacob's proposed solution would work with the default case.
comment:6 by , 2 weeks ago
| Has patch: | set |
|---|
comment:7 by , 2 weeks ago
| Patch needs improvement: | set |
|---|
The added test is failing on everything but SQLite as it relies on the dataabase string representation of floats which differs between backends.
It also lacks support for ORDER BY and FILTER and coudl simplify it's eligibility heuristic logic.
comment:8 by , 13 days ago
| Needs documentation: | set |
|---|---|
| Patch needs improvement: | unset |
comment:9 by , 9 days ago
| Needs documentation: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
can i it assigned to me if it get's reviewed