Opened 6 years ago
Closed 6 years ago
#31228 closed Bug (fixed)
DISTINCT with GROUP_CONCAT() and multiple expressions raises NotSupportedError on SQLite.
| Reported by: | Andy Terra | Owned by: | Hongtao Ma |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | dev |
| Severity: | Normal | Keywords: | sqlite |
| Cc: | Simon Charette | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Contrary to what is suggested in lines 60-64 of django.db.backends.sqlite3.operations.py, SQLite does support DISTINCT on aggregate functions.
One such example is GROUP_CONCAT, which is quite similar to PostgreSQL's STRING_AGG.
I can't find any canonical links which provide a useful explanation of GROUP_CONCAT, but this should be good enough: https://www.w3resource.com/sqlite/aggregate-functions-and-grouping-group_concat.php
I came across this issue when trying to create my own GroupConcat function subclassing Aggregate (similar to the StringAgg implementation from postgres) and noticed that skipping the check in django.db.backends.sqlite3.operations.py would allow my queries to run as advertised.
My code for GroupConcat is:
from django.db.models import Value
from django.db.models.aggregates import Aggregate
class GroupConcat(Aggregate):
function = 'GROUP_CONCAT'
template = '%(function)s(%(distinct)s %(expressions)s)'
allow_distinct = True
def __init__(self, expression, delimiter=None, **extra):
if delimiter is not None:
self.allow_distinct = False
delimiter_expr = Value(str(delimiter))
super().__init__(expression, delimiter_expr, **extra)
else:
super().__init__(expression, **extra)
def as_sqlite(self, compiler, connection, **extra_context):
return super().as_sql(
compiler, connection,
function=self.function,
template=self.template,
**extra_context
)
For the record, as the code above suggests, a separate issue is that GROUP_CONCAT only allows DISTINCT when a delimiter isn't specified.
After some discussion on #django, an argument was raised in favor of changing the message to say "Django doesn't support...", but I would imagine that skipping the check entirely would simply result in an OperationalError for malformed queries while still allowing users to extend the ORM as needed.
Change History (10)
comment:1 by , 6 years ago
| Cc: | added |
|---|---|
| Summary: | SQLite does support DISTINCT on aggregate functions → DISTINCT with GROUP_CONCAT() and multiple expressions raises NotSupportedError on SQLite. |
| Triage Stage: | Unreviewed → Accepted |
| Version: | 3.0 → master |
comment:2 by , 6 years ago
| Cc: | added |
|---|
comment:3 by , 6 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:4 by , 6 years ago
There seems to be problem for concatenation style aggregation in general, here is what i experimented on postgresql:
In [18]: Blog.objects.annotate(StringAgg('title', delimiter='x'))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~/django/django/db/models/query.py in annotate(self, *args, **kwargs)
1071 try:
-> 1072 if arg.default_alias in kwargs:
1073 raise ValueError("The named annotation '%s' conflicts with the "
~/django/django/db/models/aggregates.py in default_alias(self)
64 return '%s__%s' % (expressions[0].name, self.name.lower())
---> 65 raise TypeError("Complex expressions require an alias")
66
TypeError: Complex expressions require an alias
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-18-b924dae7712a> in <module>
----> 1 Blog.objects.annotate(StringAgg('title', delimiter='x'))
~/django/django/db/models/manager.py in manager_method(self, *args, **kwargs)
80 def create_method(name, method):
81 def manager_method(self, *args, **kwargs):
---> 82 return getattr(self.get_queryset(), name)(*args, **kwargs)
83 manager_method.__name__ = method.__name__
84 manager_method.__doc__ = method.__doc__
~/django/django/db/models/query.py in annotate(self, *args, **kwargs)
1075 % arg.default_alias)
1076 except TypeError:
-> 1077 raise TypeError("Complex annotations require an alias")
1078 annotations[arg.default_alias] = arg
1079 annotations.update(kwargs)
TypeError: Complex annotations require an alias
In [19]:
follow-up: 6 comment:5 by , 6 years ago
An error is quite descriptive IMO, "TypeError: Complex expressions require an alias".
comment:6 by , 6 years ago
Replying to felixxm:
An error is quite descriptive IMO, "TypeError: Complex expressions require an alias".
Not quite sure about what kind of 'expressions' deserve being called complex. But I assume StringAgg('title', delimiter='x') shouldn't be one of them. The problem here is that the delimiter part of StringAgg is included in the expressions, which probably makes it complex.
From the syntax defined by Postgresql, the expression is explicitly separated from the delimiter:
string_agg(expression, delimiter) string_agg
comment:7 by , 6 years ago
Yes, because delimiter is treated as an expression in StringAgg. Please don't add unrelated comments/questions to tickets.
comment:9 by , 6 years ago
| Needs tests: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
Thanks for this ticket. I agree that it shouldn't raise an error in some cases, but removing this check is not an option, IMO.
Regression in bc05547cd8c1dd511c6b6a6c873a1bc63417b111.