Opened 17 months ago

Last modified 17 months ago

#21251 new New feature

Not all database backends support grouping by a column number

Reported by: manfre Owned by: manfre
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: mssql
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

"SELECT ... GROUP BY 1" is not valid for all databases. This can happen in ORM generated queries when Django moves anything found in the ORDER BY over to the GROUP BY. E.g. "SELECT ... ORDER BY 1".

There are two ways of proceeding:

1. Don't promote the positional value from ORDER BY to GROUP BY, if the database doesn't support it.

It's possible that not including the positional values from the GROUP BY will cause different backends (those that support positional columns and those that do not) to return different results for certain queries. I cannot think of any example queries and the only usage of positional columns that I've noticed in the test suite are due to "ORDER BY 1". This change allows django-mssql to pass many aggregation and aggregation_regres tests that it previously failed with an error.

This change is trivial and contained to a few lines in SQLCompiler.get_grouping().

2. Look up the appropriate entry from the select by its position (indexed from 1, not 0) and include the select + params in the GROUP BY, instead of the position number.

This will not generate the correct query in all situations, but it should generate a valid query for the backend. Depending on the specifics of the query and how columns are front loaded to the select, it's quite possible that the intended ORDER BY position no longer lines up with the desired position when attempting to do the look up for GROUP BY.

Change History (2)

comment:1 Changed 17 months ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

I'm moderately surprised that Django does this -- firstly because I wasn't aware it was valid SQL in the first place, but secondly because I don't remember implementing anything like that when I committed the aggregation codebase. It's entirely possible something has changed since I last looked at the code, but it doesn't strike me as a very "Django" thing to find in a codebase.

Can you give a specific example of a query (ideally one in Django's test suite) that does this?

comment:2 Changed 17 months ago by akaariai

  • Triage Stage changed from Unreviewed to Accepted

ORDER BY 1 isn't available to user, but it is used internally for at least date queries. See sql/subqueries.py:L238 in current master. The aggregate case is hit by aggregation_regress.test_more_more_more():L573. The query is this:

qs = Book.objects.annotate(num_authors=Count('authors')).filter(num_authors=2).dates('pubdate', 'day')

There is another similar test, aggregation.test_dates_with_aggregation(). These seem to be the only case where ORDER BY 1 ends up in GROUP BY, so this isn't a common issue at all.

The queries generated by .dates() have used ORDER BY 1 from at least Django 1.2. I didn't check further.

Option 2) might be hard to implement, but would of course be a better solution than 1).

Note: See TracTickets for help on using tickets.
Back to Top