Opened 4 years ago
Closed 4 years ago
#32862 closed Cleanup/optimization (needsinfo)
Order By in Postgres When Using Annotations Causes Ambiguous Field Name
| Reported by: | Cody Williams | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.0 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Unreviewed | |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Running a query on a table with an annotation, joining a table via select_related that includes a field with the same name as the annotation, and ordering on the annotation Postgres will raise a ProgrammingError due to an ambiguous field name. This issue does not occur with SQLite.
To Reproduce: Using the django.db.backends.postgresql_psycopg2 database engine and the following models:
from django.db import models
from django.db.models.functions import Concat
from django.db.models import Value
class Company(models.Model):
name = models.CharField(max_length=200)
class PersonManager(models.Manager):
def get_queryset(self):
return super().get_queryset().annotate(
name=Concat('first_name', Value(' '), 'last_name')
)
class Person(models.Model):
objects = PersonManager()
first_name = models.CharField(max_length=200)
last_name = models.CharField(max_length=200)
company = models.ForeignKey('Company', models.PROTECT, null=True)
Running a query such as:
models.Person.objects.filter(name__contains='Smith').select_related('company').order_by('name')
Produces the following SQL:
SELECT
"people_person"."id",
"people_person"."first_name",
"people_person"."last_name", "people_person"."company_id",
CONCAT("people_person"."first_name", CONCAT(' ', "people_person"."last_name")) AS "name",
"people_company"."id",
"people_company"."name"
FROM "people_person"
LEFT OUTER JOIN "people_company" ON ("people_person"."company_id" = "people_company"."id")
WHERE CONCAT("people_person"."first_name", CONCAT(' ', "people_person"."last_name"))::text LIKE '%Smith%'
ORDER BY "name" ASC;
In Postgres this query will fail because "name" is ambiguous between the name field on the Company model and the name annotation on the Person model. If rather than referencing the annotation by name, the query produced the following SQL, using the annotation definition rather than the name (as it does for the WHERE clause) the query works perfectly:
SELECT
"people_person"."id",
"people_person"."first_name",
"people_person"."last_name", "people_person"."company_id",
CONCAT("people_person"."first_name", CONCAT(' ', "people_person"."last_name")) AS "name",
"people_company"."id",
"people_company"."name"
FROM "people_person"
LEFT OUTER JOIN "people_company" ON ("people_person"."company_id" = "people_company"."id")
WHERE CONCAT("people_person"."first_name", CONCAT(' ', "people_person"."last_name"))::text LIKE '%Smith%'
ORDER BY CONCAT("people_person"."first_name", CONCAT(' ', "people_person"."last_name")) ASC;
Change History (2)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
| Component: | Uncategorized → Database layer (models, ORM) |
|---|---|
| Resolution: | → needsinfo |
| Status: | new → closed |
| Type: | Bug → Cleanup/optimization |
Thanks for this report, however, I agree with Simon, IMO it's not worth additional complexity. We can reconsider this decision if someone provides PoC.
I guess the ORM could detect when a reference is ambiguous and inline the expression when it's the case but I'm not sure of the benefits versus the complexity it introduces.
If you want to give a shot at writing a patch yourself the logic should live in
sql.SQLCompiler.get_order_byand act upon expressions whereis_refis true.https://github.com/django/django/blob/225d96533a8e05debd402a2bfe566487cc27d95f/django/db/models/sql/compiler.py#L381-L420
I guess the logic could then be added in a
django.db.models.expressions.Ref.is_ambiguous(query: sql.Query) -> boolmethod that would introspectquery.selectandquery.annotation_selectand be relied onget_order_by.