Opened 13 months ago

Closed 12 months ago

Last modified 12 months ago

#22429 closed Bug (fixed)

Wrong SQL generated when using ~Q and F

Reported by: sipp11 Owned by: superemily
Component: Database layer (models, ORM) Version: master
Severity: Release blocker Keywords:
Cc: loic@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I found this problem on 1.6.2 and also on 1.8.0alpha (master as of today commit b82f30785ff0f9fedf38fc79624a9064a903de4a)

How to reproduce:
create school app. Then add models and run a query.

models.py

Code highlighting:

class School(models.Model):
    school_id = models.CharField('School ID', max_length=10,
                                 unique=True) 


class Student(models.Model):
    school = models.ForeignKey(School, related_name='students')
    prefix = models.CharField('Prefix', max_length=3, blank=True, default='')
    student_id = models.CharField('Student ID', max_length=15)


class Classroom(models.Model):
    school = models.ForeignKey(School, related_name='classrooms')
    students = models.ManyToManyField(Student, related_name='classroom', blank=True)

Then go to django shell

Code highlighting:

>>> Student.objects.filter(~Q(classroom__school=F('school')))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/models/query.py", line 116, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/models/query.py", line 141, in __iter__
    self._fetch_all()
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/models/query.py", line 961, in _fetch_all
    self._result_cache = list(self.iterator())
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/models/query.py", line 265, in iterator
    for row in compiler.results_iter():
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/models/sql/compiler.py", line 698, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/models/sql/compiler.py", line 784, in execute_sql
    cursor.execute(sql, params)
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/backends/utils.py", line 74, in execute
    return super(CursorDebugWrapper, self).execute(sql, params)
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/backends/utils.py", line 59, in execute
    return self.cursor.execute(sql, params)
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/backends/utils.py", line 59, in execute
    return self.cursor.execute(sql, params)
  File "/Users/sipp11/.virtualenvs/psis/lib/python2.7/site-packages/Django-1.8.dev20140413012625-py2.7.egg/django/db/backends/sqlite3/base.py", line 480, in execute
    return Database.Cursor.execute(self, query, params)
OperationalError: no such column: U1.student_id
>>>

Apparently SQL was missing U1 declaration.

Code highlighting:

>>> print Student.objects.filter(~Q(classroom__school=F('school'))).query
SELECT "school_student"."id", "school_student"."school_id", "school_student"."prefix", "school_student"."student_id" 
FROM "school_student" 
WHERE NOT ("school_student"."id" IN (
      SELECT U1."student_id" AS "student_id" 
      FROM "school_student" U0
      INNER JOIN "school_classroom_students" U2 ON ( U0."id" = U2."student_id" ) 
      INNER JOIN "school_classroom" U3 ON ( U2."classroom_id" = U3."id" ) 
      WHERE U3."school_id" = (U0."school_id"))
)

Above is information I got from testing with master (1.8.0alpha)

This is another trackback from 1.6.2, https://dpaste.de/Yzja, which is in pretty much the same.

Change History (11)

comment:1 Changed 13 months ago by bmispelon

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.6 to master

Hi,

I can indeed reproduce this on master and 1.6 but the example code seems to work in 1.5 (at least it doesn't trigger any traceback).

Using git bisect, I found that commit b4492a8ca4a7ae4daa3a6b03c3d7a845fad74931 is the one that introduced the change.

Thanks.

comment:2 Changed 13 months ago by superemily

  • Owner changed from nobody to superemily
  • Status changed from new to assigned

comment:3 Changed 12 months ago by timo

  • Has patch set
  • Severity changed from Normal to Release blocker

Here's a PR. Since it's a regression 1.6, it qualifies for a backport there so a mention in the 1.6.4 release notes would be good.

comment:4 Changed 12 months ago by loic84

I updated the patch from @superemily: https://github.com/loic/django/compare/review22429

We discussed this with @akaariai on IRC a couple of days ago, he'll do a final review.

The release notes will need to be pushed to 1.6.5, if that doesn't make it on time for 1.6.4.

comment:5 Changed 12 months ago by loic84

  • Cc loic@… added

comment:6 Changed 12 months ago by timo

  • Triage Stage changed from Accepted to Ready for checkin

RFC pending final review as noted above.

comment:7 Changed 12 months ago by akaariai

  • Patch needs improvement set
  • Triage Stage changed from Ready for checkin to Accepted

I don't think the patch is correct. Patched code works a little better than current code, but I am sure it will fail for more complex queries.

Below is some background information needed for understanding how the code is failing.

When an exclude() filter condition needs to be pushed into a subquery, Django proceeds as follows:

  • Construct a new query (that is, start from a freshly created Query object) for the filter condition using build_filter(). The new Query object is going to be used as subquery.
    • In build_filter() the F() object is added to the query first
    • Then, the joins for the lookup path are generated
    • End result is that Q(classroom__school=F('school')) generates first one join for the F('school'), then needed joins for the classroom__school part
  • Next, joins from the beginning of the newly generated query are trimmed. The aim is to find a split position as deep as possible in the new query. (That is, we try to add all possible joins to the original query, and remove as many joins as possible from the start of the new query).
  • The code that does the trimming assumes that the joins generated for the classroom__school path are the first joins generated for the query, that is, the join paths used for clasroom__school are first joins in self.tables. As noted above, this is not true when using F() objects.
  • End result is that due to the incorrect assumption, the code ends up trimming the join for F('school'), thus the reference to unused alias.

The proposed fix doesn't actually correct the join trimming logic to do the right thing. The trimming logic in beginning of trim_start() still might unref joins that it shouldn't.

I tried to fix this by changing the order of join generation for F() objects and the lookup path, that is first generate the joins needed for lookup path, then joins needed for the F() object. This results in numerous failures, so I didn't investigate that option further.

I am going to investigate how to make the join trimming a bit more reliable, that is, make sure it trims only the correct tables. I am afraid the changes are going to look somewhat ugly.

comment:8 Changed 12 months ago by akaariai

  • Patch needs improvement unset

Another try available from https://github.com/akaariai/django/compare/ticket_22429

The idea this time is to explicitly communicate the joins generated for the lookup path to trim_start(). This is done in an ugly way, that is the joins generated are stored in self._lookup_joins in build_filter(). Works, but this isn't pretty.

When the joins generated for the lookup path are known in trim_start() the code can operate on the tables used for the lookup path, and thus avoid trimming wrong joins. I think the fix is correct, but extra eyes are very welcome.

comment:9 Changed 12 months ago by Anssi Kääriäinen <akaariai@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 5e1f4656b98816c96a1cc051224c1b699db480e0:

Fixed #22429 -- Incorrect SQL when using ~Q and F

comment:10 Changed 12 months ago by Anssi Kääriäinen <akaariai@…>

In 402fc4f6c9921f56eecbb6e3fc69154270be6468:

[1.7.x] Fixed #22429 -- Incorrect SQL when using ~Q and F

Backport of 5e1f4656b9 from master

comment:11 Changed 12 months ago by Anssi Kääriäinen <akaariai@…>

In 0e3704963693ddfaaa525054c42333f6ffdb4b47:

[1.6.x] Fixed #22429 -- Incorrect SQL when using ~Q and F

Backpatch of 5e1f4656b98816c96a1cc051224c1b699db480e0 from master.

Conflicts:

django/db/models/sql/query.py
tests/queries/models.py
tests/queries/tests.py

Note: See TracTickets for help on using tickets.
Back to Top