Code

Opened 14 months ago

Closed 13 months ago

Last modified 13 months ago

#19964 closed Bug (fixed)

A query including other query affects and breaks the first one

Reported by: Kronuz Owned by: akaariai
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: Kronuz Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This code breaks the parents query

# I have:
from django.db import models

class MyObject(models.Model):
    parent = models.ForeignKey('self', null=True, blank=True, related_name='children')
    data = models.CharField(max_length=100)
    created_at = models.DateTimeField(auto_now_add=True)


def test():
    # Then I do:
    parents = MyObject.objects.filter(models.Q(parent=models.F('id'))).order_by('-created_at')[:10]

    children = MyObject.objects.filter(parent__in=parents).exclude(parent=models.F('id'))

    # and this is what I get:

    print 'A:', parents.query
    # A: SELECT "myapp_myobject"."id", "myapp_myobject"."parent_id", "myapp_myobject"."data", "myapp_myobject"."created_at" FROM "myapp_myobject" WHERE "myapp_myobject"."parent_id" =  "myapp_myobject"."id" ORDER BY "myapp_myobject"."created_at" DESC LIMIT 10

    children.count()
    # 0

    print 'B:', parents.query
    # B: SELECT "myapp_myobject"."id", "myapp_myobject"."parent_id", "myapp_myobject"."data", "myapp_myobject"."created_at" FROM "myapp_myobject" WHERE "myapp_myobject"."parent_id" =  "U0"."id" ORDER BY "myapp_myobject"."created_at" DESC LIMIT 10

    print parents
    # missing FROM-clause entry for table "U0"
    # LINE 1: ...ll_myobject" WHERE "myapp_myobject"."parent_id" =  "U0"."id" ...

Attachments (0)

Change History (8)

comment:1 Changed 14 months ago by Kronuz

  • Cc Kronuz added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 14 months ago by akaariai

Confirmed, this is a regression caused by #16759.

A dirty patch available from: https://github.com/akaariai/django/compare/ticket_19964. It fixes the issue but isn't ready for commit...

comment:3 Changed 14 months ago by akaariai

  • Owner changed from nobody to akaariai
  • Severity changed from Normal to Release blocker
  • Status changed from new to assigned
  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Bug
  • Version changed from 1.4 to master

Quoting Aymeric from #19963: I guess I'm supposed to fix my own mess...

comment:4 Changed 14 months ago by Kronuz

  • Severity changed from Release blocker to Normal
  • Triage Stage changed from Accepted to Unreviewed
  • Type changed from Bug to Uncategorized
  • Version changed from master to 1.4

This comes from #16759, I'm adding a full patch that replaces the patch on #16759, which fixes this.

comment:5 Changed 14 months ago by claudep

  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Bug

comment:6 Changed 13 months ago by akaariai

I have been thinking about this problem somewhat. It seems the current way of relabel_aliases() + clone() is fragile. Before, when __deepcopy__ was used it was a bit less fragile.

I have a couple of ideas how to improve the situation:

  1. Rename relabel_aliases() to relabeled_clone() for those things where the only mutable element is the alias. Thus, you can't accidentally only implement relabel_aliases() but not clone. This will also make cloning more efficient, as normal cloning will not need to clone all the structures. The clone needs to be done only on relabel time, and those times are rare.
  2. Instead of storing an alias in the outer structures (like the where.Constraint, SQLEvaluator, ...), just store an ID that the query can map back to alias. This way the outer structures are immutable, and there is no way to miss writing correct relabel_aliases().
  3. Do the same as in 2. but for all aliases in the query. Generate the final aliases at compile time and convert to those at as_sql() time everywhere.

Of these 2. and 3. seem good as there is no way to miss or write incorrect relabel_aliases() methods. Both will be a lot of work, but no. 3 is even harder to implement correctly. No. 3 doesn't give any extra performance gain as the query structures need to be cloned in any case because alias is not the only mutable thing.

So, I will see if no. 2 is easy to implement. If not, then no.1 it is.

comment:7 Changed 13 months ago by Anssi Kääriäinen <akaariai@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In d744c550d51955fd623897884446b7f34318e94d:

Fixed #19964 -- Removed relabel_aliases from some structs

Before there was need to have both .relabel_aliases() and .clone() for
many structs. Now there is only relabeled_clone() for those structs
where alias is the only mutable attribute.

comment:8 Changed 13 months ago by akaariai

I solved this with 1) after all.

2) ended up being too clever (patch is available from: https://github.com/akaariai/django/commit/22c6136adb7eec3cca366511d3b4d29ff7509282). The two main problems is that it is hard to verify everything is actually using alias_id instead of alias, and that I have a feeling that sometimes you really want to have the alias instead of an ID in the leaf structures. Also, if you don't happen to have quote_name_unless_alias available, then the alias_id alone is useless (gis has this problem, and it isn't solved in the patch for 2).

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.