Code

Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#12328 closed (fixed)

subqueries in django 1.1 behave oddly

Reported by: wdoekes Owned by: marcosmoyano
Component: Database layer (models, ORM) Version: 1.1
Severity: Keywords:
Cc: galli.87@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Hi, I'm not sure what is wrong exactly. But it looks like subqueries in filters (introduced in django 1.1?) do not always work as intended.

A basic model:

from django.db import models
class M(models.Model):
    def __unicode__(self):
        return 'M(id=%d)' % self.id

The setup:

$ ./manage shell
Python 2.5.2 (r252:60911, Jan  4 2009, 21:59:32) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from osso.test.models import M
>>> M.objects.create(id=1)
<M: M(id=1)>
>>> M.objects.create(id=2)
<M: M(id=2)>
>>> M.objects.create(id=3)
<M: M(id=3)>
>>> M.objects.all()
[<M: M(id=1)>, <M: M(id=2)>, <M: M(id=3)>]
>>> M.objects.order_by('id')[:1]
[<M: M(id=1)>]
>>> M.objects.order_by('id')[:1].delete()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/django11/django/db/models/query.py", line 379, in delete
    "Cannot use 'limit' or 'offset' with delete."
AssertionError: Cannot use 'limit' or 'offset' with delete.

Bear with me. You have noticed my goal is to delete old records.

Watch django1.0:

>>> M.objects.filter(id__in=M.objects.order_by('id')[0:1])
<snip>
TypeError: int() argument must be a string or a number, not 'M'
>>> M.objects.filter(id__in=M.objects.order_by('id')[0:1].values_list('id', flat=True))
[<M: M(id=1)>]
>>> M.objects.filter(id__in=M.objects.order_by('id')[0:1].values_list('id', flat=True)).delete()
>>> M.objects.all()
[<M: M(id=2)>, <M: M(id=3)>]

That works as intended.

And now django 1.1:

>>> M.objects.filter(id__in=M.objects.order_by('id')[:1])
[<M: M(id=1)>]
>>> M.objects.filter(id__in=M.objects.order_by('id')[:1]).delete()
>>> M.objects.all()
[]

(The same happens if I use the values_list() notation as used above.)

What happens is that the del_query in QuerySet.delete() refers to a newer object every time, until all M objects have been deleted.

The following could be related, or not :)

After deleting an M object and re-adding it, I notice that ordering in subqueries does not work as intended either:

>>> M.objects.order_by('id')
[<M: M(id=1)>, <M: M(id=2)>, <M: M(id=3)>]
>>> M.objects.order_by('id')[0:1]
[<M: M(id=1)>]
>>> M.objects.filter(id__in=M.objects.order_by('id')[0:1])
[<M: M(id=2)>]
>>> M.objects.filter(id__in=M.objects.order_by('id')[0:1].values_list('id', flat=True))
[<M: M(id=2)>]

This was tested with postgresql_psycopg2. With MySQL I cannot reproduce it, as it disallows LIMIT in subqueries, sending me an early error instead of behaving errantly.

My current workaround is to wrap "M.objects.order_by('id')[:1]" in a list() so I do not get a dynamic object as id\_\_in value. Having to force early evaluation is a bit frightening though.

I assume one should be able to reproduce this easily. If not, I'd be happy to share more information about my setup.

Regards,
Walter Doekes

Attachments (2)

bug12328.patch (3.9 KB) - added by marcosmoyano 4 years ago.
New patch
t12328.diff (5.0 KB) - added by russellm 4 years ago.
Fix for slicing/order_by problems in subqueries

Download all attachments as: .zip

Change History (15)

comment:1 Changed 4 years ago by russellm

  • milestone set to 1.2
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

#12814 is a duplicate, albeit with a different presentation.

comment:2 Changed 4 years ago by anonymous

  • Owner changed from nobody to anonymous
  • Status changed from new to assigned

comment:3 Changed 4 years ago by marcosmoyano

  • Owner changed from anonymous to marcosmoyano
  • Status changed from assigned to new

comment:4 Changed 4 years ago by marcosmoyano

  • Has patch set
  • Status changed from new to assigned

Attached patch and tests.

Thanks to Fabian Gallina (fgallina) for the help

comment:5 Changed 4 years ago by fgallina

  • Cc galli.87@… added

comment:6 Changed 4 years ago by marcosmoyano

Fixed the delete() problem. The subqueries issue still happens.

comment:7 Changed 4 years ago by marcosmoyano

New patch with fixes and tests for both subqueries and delete issue.

comment:8 Changed 4 years ago by marcosmoyano

There's still an issue in the delete() method. Although modifying the while statemente solves the issue, this solution requires and extra query, which is not the desired solution.
Here's a closer look at the problem.

In [1]: from ticket_12328.models import *

In [2]: M.objects.create(id=1)
Out[2]: <M: M(id=1)>

In [3]: M.objects.create(id=2)
Out[3]: <M: M(id=2)>

In [4]: M.objects.create(id=3)
Out[4]: <M: M(id=3)>

In [5]: M.objects.all()
Out[5]: [<M: M(id=1)>, <M: M(id=2)>, <M: M(id=3)>]

In [6]: M.objects.filter(id__in=M.objects.order_by('id')[:1]).delete()
> /usr/lib/python2.6/site-packages/django/db/models/query.py(433)delete()
    432         import ipdb;ipdb.set_trace()
--> 433         while 1:
    434         #for element in xrange(del_query.count()):


ipdb> print del_query.query
SELECT "ticket_12328_m"."id" FROM "ticket_12328_m" WHERE "ticket_12328_m"."id" IN (SELECT U0."id" FROM "ticket_12328_m" U0 ORDER BY U0."id" ASC LIMIT 1)
ipdb> n
> /usr/lib/python2.6/site-packages/django/db/models/query.py(437)delete()
    436             # objects that are related to the objects that are to be deleted.

--> 437             seen_objs = CollectedObjects(seen_objs)
    438             for object in del_query[:CHUNK_SIZE]:

On line #438 every time del_query gets evaluated it will return the first element from the table (ordered by id) until there are no more elements to consume.

comment:9 Changed 4 years ago by Alex

Ok, the problem here is that you can't use LIMIT/OFFSET in a delete() query (look at the assert at the top), nesting this works around it, but clearly breaks the invariant that the query will return the same items on every execution (less those that have been deleted). An extra count() on every delete is clearly unacceptable, so I'd have to say that we need a way to raise an exception here.

Changed 4 years ago by marcosmoyano

New patch

comment:10 Changed 4 years ago by marcosmoyano

New patch. Added a private method to traverse the sql tree and verify inner QuerySets.

Changed 4 years ago by russellm

Fix for slicing/order_by problems in subqueries

comment:11 Changed 4 years ago by russellm

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [12912]) Fixed #12328 -- Corrected the handling of subqueries with ordering and slicing, especially when used in delete subqueries. Thanks to Walter Doekes for the report.

This fixes a feature that isn't available under MySQL and Oracle (Refs #10099).

comment:12 Changed 4 years ago by russellm

(In [12914]) [1.1.X] Fixed #12328 -- Corrected the handling of subqueries with ordering and slicing, especially when used in delete subqueries. Thanks to Walter Doekes for the report.

This fixes a feature that isn't available under MySQL and Oracle (Refs #10099).

Backport of r12912 from trunk.

comment:13 Changed 3 years ago by jacob

  • milestone 1.2 deleted

Milestone 1.2 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.