Opened 9 years ago

Closed 5 years ago

#24386 closed Bug (duplicate)

Querysets with filters and exclusions based on deep relations build invalid queries.

Reported by: Raphael Gaschignard Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Can Sarıgöl Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I've hit this bug in 1.6, and reproduced it on master (django v1.9.dev20150221182749)

I have the following models:

class Owner(models.Model):
    pass

class Employee(models.Model):
    owner = models.ForeignKey(Owner)
    status = models.CharField(max_length=100)
    start_date = models.DateField()

When trying to find the oldest active employees for each owner, I tried the following query:

>>> query = Employee.objects.filter(status='active').exclude(owner__employee__start_date__lte=F('start_date'),owner__employee__status='active').distinct()
>>> query
  [...]
  File "/Users/rtpg/proj/test_django/django-trunk/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/Users/rtpg/proj/test_django/django-trunk/django/db/utils.py", line 95, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/Users/rtpg/proj/test_django/django-trunk/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
ProgrammingError: missing FROM-clause entry for table "u1"
LINE 1: ...e_employee" U0 INNER JOIN "core_employee" U2 ON ( U1."id" = ...
                                                             ^

The SQL query generated by the queryset seems to be at fault

>>> print query.query
SELECT DISTINCT "core_employee"."id", "core_employee"."owner_id", "core_employee"."status", "core_employee"."start_date" FROM "core_employee" WHERE ("core_employee"."status" = active AND NOT ("core_employee"."owner_id" IN (SELECT U2."owner_id" AS Col1 FROM "core_employee" U2 WHERE U2."status" = active) AND "core_employee"."owner_id" IN (SELECT U2."owner_id" AS Col1 FROM "core_employee" U0 INNER JOIN "core_employee" U2 ON ( U1."id" = U2."owner_id" ) WHERE U2."start_date" <= (U0."start_date"))))

I'm not very good at reading complex SQL queries, so I'm not sure what this is supposed to look like but in any case this query is refused by PostgreSQL

Change History (9)

comment:1 by Shai Berger, 9 years ago

Needs tests: set
Triage Stage: UnreviewedAccepted

Accepting based on details in report, didn't reproduce myself.

Just wanted to note for the submitter that even without the bug, the query you wrote would not achieve the stated goals -- it would return no employees. To do what you intended, you need to replace your lte with lt. Also, another approach for this relies on annotations (annotate employee with Min(owner__employee__startdate) and filter on that).

comment:2 by Raphael Gaschignard, 9 years ago

I don't know if annotations allow me to work around this. I need to find the minimum of only active employees, but to my knowledge the aggregates over related fields don't really work in combination with filters:

I added a number field to my Employee model:

In [8]: Employee.objects.annotate(Min('owner__employee__number')).values()
Out[8]: [{'status': u'not_active', 'number': 3, u'id': 2, 'owner__employee__number__min': 3, 'start_date': datetime.date(2015, 2, 23), u'owner_id': 1}, {'status': u'active', 'number': 4, u'id': 1, 'owner__employee__number__min': 3, 'start_date': datetime.date(2015, 2, 23), u'owner_id': 1}]

In [9]: Employee.objects.filter(number=4).annotate(Min('owner__employee__number')).values()
Out[9]: [{'status': u'active', 'number': 4, u'id': 1, 'owner__employee__number__min': 3, 'start_date': datetime.date(2015, 2, 23), u'owner_id': 1}]
Last edited 9 years ago by Raphael Gaschignard (previous) (diff)

comment:3 by Simon Charette, 9 years ago

To find the oldest active employees for each owner you could use a combination of DISTINCT ON and ORDER BY:

Employee.objects.filter(status='active').order_by('owner', 'start_time').distinct('owner')

comment:4 by Raphael Gaschignard, 9 years ago

Seemed I lacked a bit of imagination on that one, thanks for the workaround for my specific use case charettes. Though I guess that wouldn't work on a non-Postgres DB.

comment:5 by David Gouldin, 9 years ago

It looks like this is as a result of using the combination of an exclude filter with multiple joins and an F object. The multi-join exclude causes the ORM to represent the filter as a subquery, but since that Query object doesn't know about its parent, it has to try to make sense of the F object in the context of the subquery.

comment:6 by Can Sarıgöl, 5 years ago

Owner: changed from nobody to Can Sarıgöl
Status: newassigned

this issue has been solved by this code of Simon's PR

Last edited 5 years ago by Can Sarıgöl (previous) (diff)

comment:7 by Can Sarıgöl, 5 years ago

Cc: Can Sarıgöl added
Owner: Can Sarıgöl removed
Status: assignednew

comment:8 by Can Sarıgöl, 5 years ago

I think, this test covers this.

comment:9 by Mariusz Felisiak, 5 years ago

Needs tests: unset
Resolution: duplicate
Status: newclosed
Summary: Querysets with filters and exclusions based on deep relations build invalid queriesQuerysets with filters and exclusions based on deep relations build invalid queries.
Note: See TracTickets for help on using tickets.
Back to Top