Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#19964 closed Bug (fixed)

A query including other query affects and breaks the first one

Reported by: German M. Bravo Owned by: Anssi Kääriäinen
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: German M. Bravo Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This code breaks the parents query

# I have:
from django.db import models

class MyObject(models.Model):
    parent = models.ForeignKey('self', null=True, blank=True, related_name='children')
    data = models.CharField(max_length=100)
    created_at = models.DateTimeField(auto_now_add=True)


def test():
    # Then I do:
    parents = MyObject.objects.filter(models.Q(parent=models.F('id'))).order_by('-created_at')[:10]

    children = MyObject.objects.filter(parent__in=parents).exclude(parent=models.F('id'))

    # and this is what I get:

    print 'A:', parents.query
    # A: SELECT "myapp_myobject"."id", "myapp_myobject"."parent_id", "myapp_myobject"."data", "myapp_myobject"."created_at" FROM "myapp_myobject" WHERE "myapp_myobject"."parent_id" =  "myapp_myobject"."id" ORDER BY "myapp_myobject"."created_at" DESC LIMIT 10

    children.count()
    # 0

    print 'B:', parents.query
    # B: SELECT "myapp_myobject"."id", "myapp_myobject"."parent_id", "myapp_myobject"."data", "myapp_myobject"."created_at" FROM "myapp_myobject" WHERE "myapp_myobject"."parent_id" =  "U0"."id" ORDER BY "myapp_myobject"."created_at" DESC LIMIT 10

    print parents
    # missing FROM-clause entry for table "U0"
    # LINE 1: ...ll_myobject" WHERE "myapp_myobject"."parent_id" =  "U0"."id" ...

Change History (8)

comment:1 Changed 4 years ago by German M. Bravo

Cc: German M. Bravo added

comment:2 Changed 4 years ago by Anssi Kääriäinen

Confirmed, this is a regression caused by #16759.

A dirty patch available from: https://github.com/akaariai/django/compare/ticket_19964. It fixes the issue but isn't ready for commit...

comment:3 Changed 4 years ago by Anssi Kääriäinen

Owner: changed from nobody to Anssi Kääriäinen
Severity: NormalRelease blocker
Status: newassigned
Triage Stage: UnreviewedAccepted
Type: UncategorizedBug
Version: 1.4master

Quoting Aymeric from #19963: I guess I'm supposed to fix my own mess...

comment:4 Changed 4 years ago by German M. Bravo

Severity: Release blockerNormal
Triage Stage: AcceptedUnreviewed
Type: BugUncategorized
Version: master1.4

This comes from #16759, I'm adding a full patch that replaces the patch on #16759, which fixes this.

comment:5 Changed 4 years ago by Claude Paroz

Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

comment:6 Changed 4 years ago by Anssi Kääriäinen

I have been thinking about this problem somewhat. It seems the current way of relabel_aliases() + clone() is fragile. Before, when __deepcopy__ was used it was a bit less fragile.

I have a couple of ideas how to improve the situation:

  1. Rename relabel_aliases() to relabeled_clone() for those things where the only mutable element is the alias. Thus, you can't accidentally only implement relabel_aliases() but not clone. This will also make cloning more efficient, as normal cloning will not need to clone all the structures. The clone needs to be done only on relabel time, and those times are rare.
  2. Instead of storing an alias in the outer structures (like the where.Constraint, SQLEvaluator, ...), just store an ID that the query can map back to alias. This way the outer structures are immutable, and there is no way to miss writing correct relabel_aliases().
  3. Do the same as in 2. but for all aliases in the query. Generate the final aliases at compile time and convert to those at as_sql() time everywhere.

Of these 2. and 3. seem good as there is no way to miss or write incorrect relabel_aliases() methods. Both will be a lot of work, but no. 3 is even harder to implement correctly. No. 3 doesn't give any extra performance gain as the query structures need to be cloned in any case because alias is not the only mutable thing.

So, I will see if no. 2 is easy to implement. If not, then no.1 it is.

comment:7 Changed 4 years ago by Anssi Kääriäinen <akaariai@…>

Resolution: fixed
Status: assignedclosed

In d744c550d51955fd623897884446b7f34318e94d:

Fixed #19964 -- Removed relabel_aliases from some structs

Before there was need to have both .relabel_aliases() and .clone() for
many structs. Now there is only relabeled_clone() for those structs
where alias is the only mutable attribute.

comment:8 Changed 4 years ago by Anssi Kääriäinen

I solved this with 1) after all.

2) ended up being too clever (patch is available from: https://github.com/akaariai/django/commit/22c6136adb7eec3cca366511d3b4d29ff7509282). The two main problems is that it is hard to verify everything is actually using alias_id instead of alias, and that I have a feeling that sometimes you really want to have the alias instead of an ID in the leaf structures. Also, if you don't happen to have quote_name_unless_alias available, then the alias_id alone is useless (gis has this problem, and it isn't solved in the patch for 2).

Note: See TracTickets for help on using tickets.
Back to Top