Code

Opened 5 years ago

Closed 5 years ago

Last modified 3 years ago

#11082 closed (fixed)

exclude subquery executed twice

Reported by: handrews Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description (last modified by Alex)

When excluding using a subquery, I'm seeing the subquery being

executed first as its own query, then seeing the correct query being
run, including the subquery. Using filter instead of exclude, this
does not happen.

Here's what I mean (Select fields edited out of the MySQL log for
readability- connection 374 is from the Python prompt, 332 is me
poking at the mysql command line in another window to make sure I
don't confuse which queries go with which python statements). Version
and platform info follows:

>>> baz=Publisher.objects.all() 
>>> for s in Series.objects.filter(publisher__in=baz): 

...   pass 
... 
>>> for s in Series.objects.exclude(publisher__in=baz): 

...   pass 
... 
090509 19:30:22     332 Query       select count(*) from core_issue 
090509 19:30:28     374 Query       SELECT * FROM `core_series` WHERE `core_series`.`publisher_id` IN (SELECT U0.`id` FROM `core_publisher` U0) ORDER BY `core_series`.`name` ASC, `core_series`.`year_began` ASC 
090509 19:30:41     332 Query       select count(*) from core_issue 
090509 19:30:54     374 Query       SELECT * FROM `core_publisher` ORDER BY `core_publisher`.`name` ASC 
090509 19:30:55     374 Query       SELECT * FROM `core_series` WHERE NOT (`core_series`.`publisher_id` IN (SELECT U0.`id` FROM `core_publisher` U0)) ORDER BY `core_series`.`name` ASC, `core_series`.`year_began` ASC 

Version stuff:
Django 1.1 beta 1
Mac OS X 10.4.11
MySQL 5.0.45
Python 2.4.4
MySQLdb 1.2.1_p2

I checked on django-users first, where Alex Gaynor suggested that the issue is here:
http://code.djangoproject.com/browser/django/trunk/django/db/models/sql/query.py#L1622

Attachments (1)

11082_patch_r10742.git.diff (3.1 KB) - added by clement 5 years ago.

Download all attachments as: .zip

Change History (9)

comment:1 Changed 5 years ago by Henry Andrews <hha1@…>

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Forgot to include the rest of Alex's comment about the code link "with the if not value triggering the issue. It's probably solvable by
changing that to if (not hasattr(value, 'as_sql') and not hasattr(value,
'_as_sql') and not value), but I haven't spent a ton of time thinking about
it."

Also, Malcolm Tredinnick replied that he could reproduce this.

comment:2 Changed 5 years ago by Alex

  • Description modified (diff)
  • milestone set to 1.1
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.1-beta-1 to SVN

Fixed the formatting.

comment:3 Changed 5 years ago by clement

Did a bit of digging, and it seems that this extra query is issued (gets in django.db.connection.queries) when the Q object for the exclude clause is inverted (then deepcopied). See django.db.models.query and django.db.models.query_utils.

Interestingly enough, as Q._combine uses also deepcopy, the same problem exists when using a subquery in a Q object being a left hand operand on an & or | operation. Example :

>>> pub_all = Publisher.objects.all()
>>> django.db.connection.queries
[]
>>> Book.objects.filter( Q(publisher__in=pub_all) & Q(name='aa') )
[]
>>> django.db.connection.queries
[{'time': '0.001', 'sql': u'SELECT `app_publisher`.`id`, `app_publisher`.`name` FROM `app_publisher`'},
 {'time': '0.001', 'sql': u'SELECT `app_book`.`id`, `app_book`.`name`, `app_book`.`publisher_id` FROM
                            `app_book` WHERE (`app_book`.`publisher_id` IN (SELECT U0.`id` FROM
                            `app_publisher` U0) OR `app_book`.`name` = aa ) LIMIT 21'}]

Changed 5 years ago by clement

comment:4 Changed 5 years ago by clement

In attached patch :

  • Added __deepcopy__ in QuerySet to avoid having __getstate__ forcing the result cache to be populated
  • Alex's fix (on django-users mailing list) to avoid evaluating the QuerySet in add_filter
  • Simple regression test

comment:5 Changed 5 years ago by Alex

The regressiontest should monkey patch settings.DEBUG and use connection.queries to actually test the number of SQL queries.

comment:6 Changed 5 years ago by russellm

  • Resolution set to fixed
  • Status changed from new to closed

(In [10929]) Fixed #11082 -- Ensured that subqueries used in an exclude(Xin=) clause aren't pre-evaluated. Thanks to Henry Andrews for the report, and clement for the fix.

comment:7 Changed 5 years ago by ccahoon

(In [10997]) Fixed #11082 -- Ensured that subqueries used in an exclude(Xin=) clause aren't pre-evaluated. Thanks to Henry Andrews for the report, and clement for the fix.

comment:8 Changed 3 years ago by jacob

  • milestone 1.1 deleted

Milestone 1.1 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.