Opened 12 years ago

Closed 11 years ago

Last modified 11 years ago

#18785 closed Bug (fixed)

Extra-Join Regression from ticket #15316

Reported by: famousactress@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: trbs@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The change to prevent over-trimming in trim_joins() for https://code.djangoproject.com/ticket/15316 appears to have introduced new, unneeded joins (LEFT OUTERs) as a side effect. The result is that in certain cases Django 1.4+ creates less efficient SQL than Django 1.3.

The following test passes in 1.3.x and fails w/ #15316's change. The failure is due to a new LEFT OUTER JOIN appearing against Document's delete_log.

queries/models.py

class AuditLog(models.Model):
    info = models.CharField(max_length=50)

class Person(models.Model):
    name = models.CharField(max_length=50)
    delete_log = models.ForeignKey(AuditLog)

class Document(models.Model):
    status = models.CharField(max_length=50)
    delete_log = models.ForeignKey(AuditLog)
    person = models.ForeignKey(Person)

queries/tests:

class OneFourRegressionProof(unittest.TestCase):
    """
      Should evolve this into a better test case that proves something positive, but for now this
      is attempting to provide evidence that for some queries we got less efficient in Django 1.4
    """
    def setUp(self):
        pass
    
    def test_it(self):
        """
         Ideally, this would only create on INNER join on person.
        """
        qs = Document.objects.exclude(delete_log__isnull=False).filter(status='something', person__delete_log__isnull=True)
        self.assertEquals(1, str(qs.query).count('INNER JOIN'))
        self.assertEquals(0, str(qs.query).count('OUTER JOIN'))

git bisect results:

(django-base)~/projects/elation/django-fork  ((2e56066...)|BISECTING) philltornroth → git bisect bad
2e56066a5b93b528fbce37285bac591b44bc6ed7 is the first bad commit
commit 2e56066a5b93b528fbce37285bac591b44bc6ed7
Author: Malcolm Tredinnick <malcolm.tredinnick@gmail.com>
Date:   Tue Aug 23 03:38:42 2011 +0000

    Fixed an isnull=False filtering edge-case. Fixes #15316.
    
    The bulk of this patch is due to some fine analysis from Aleksandra
    Sendecka.
    
    git-svn-id: http://code.djangoproject.com/svn/django/trunk@16656 bcc190cf-cafb-0310-a4f2-bffc1f526a37

:100644 100644 5635f2141d5ed22bfcc03df7fd5456a48f7f051e 38c7601579c5b366a5027358a3806b24baeaddfb M	AUTHORS
:040000 040000 33a384a63b8e8d7e3b325640339f13a14babc315 fe57cf0e7bc932f15d181d3a186bffe453d6bb2a M	django
:040000 040000 afb2570e431195108c3d8b0b634e4b30eac42168 e1213b550270d389ffdca827116642b036855a6b M	tests

It's worth noting that this is effectively an exacerbation of https://code.djangoproject.com/ticket/10790 . I began the discussion on that ticket and a patch is starting to take shape, but it looks too drastic to be a 1.4 merge candidate. I wanted to stand up this related ticket so that a more surgical fix to 1.4 could be considered, if reasonable.

Anecdotally, I'm seeing queries in our system where this change added 3-6 new joins, making an upgrade to 1.4 a complete non-starter for us unless there's a workaround that I haven't turned up yet.

Change History (9)

comment:1 by Anssi Kääriäinen, 12 years ago

Triage Stage: UnreviewedAccepted

I am accepting this as a regression in 1.4. To get this fixed in 1.4 will need a simple patch. I have hopes that 1.5 will contain a more thorough setup_joins() refactoring from #10790, but as said, that will not help this ticket as the changes are way too large to backpatch.

comment:2 by famousactress@…, 12 years ago

Awesome! Thanks much for the support on this ticket and #10790. It's definitely appreciated. I'm still not well enough versed in the ORM code to be much help putting a patch together in short order, but I'd be more than willing to write more/better unit tests that verify the regression/fix... at this point I'm plenty familiar with the queries regression tests :)

comment:3 by Anssi Kääriäinen, 12 years ago

I think you will need two thing here:

I think you should try to avoid changing existing return value of setup_joins() or adding new mandatory parameters to trim_joins(). Breaking these APIs in a minor release doesn't look nice, even if they are private APIs.

Alternatively you might also want to check if you could solve #15316 in a way that doesn't introduce the regression. It seems that handling nonnull_check in the way it is done currently will simply not work.

If you can test #10790 and see if it solves this regression for you, that would be very welcome.

comment:4 by Anssi Kääriäinen, 12 years ago

I have a pretty good understanding of what is going on. The trim joins logic should go as follows: if you have joins to forward direction (you are following a foreign key), and the value you want is the thing the foreign key points to (usually the PK of the foreign model), then you can trim the join. This will always be valid, as the value of the foreign key _must_ be the same it is one the referenced model, and there can only be one value on the other side (the other side must be unique). If the foreign key is nullable and contains a null, the join wouldn't produce anything anyways.

On the other hand if you are following a foreign key to the reverse direction (usually from some model's PK to the foreign key in the referenced model), then you can _never_ trim the join. The reason for this is that you can not know if there is anything on the other side, that is, even if you know the value the foreign key must have, you can't know if there is anything on the other side. As an example:

table a     table b
id 1        a_id 1
id 2

Now, if you make the assumption that if you know the value of id in table a, then you know the value of a_id in table b then you are wrong, as the id=2 value doesn't have any matching row on the other side.

Current Django code makes the assumption you can trim joins in the reverse case. This was blocked for the OneToOneKey case in #15316 by the nonnull_check, but I don't believe that to be the correct fix. Just block all trimming in the reverse case and you have your solution. The question is how do you pass the direction information from setup_joins to trim_joins with minimal (internal) API breakages. You might want to add a completely new variable to the Query class ("join_directions" or something like that), which tracks the join directions for each alias generated and nothing else. This way you need only internal changes to setup_joins, trim_joins, Query.__init__ and !Query.clone(). This solution isn't too beautiful, but I don't expect it to bite us. In any case it seems obvious code changes in this are will not be backportable, as the API is going to change a lot in 1.5.

I might be overcautious above. Another solution is to just add the direction to alias_map items and be done with it. This will break 3rd party code that uses alias_map. As this is totally private API one could claim that they had it coming all along... Still, it seems we can avoid this break somewhat easily, so why be evil...

You can get the directions from the patch in #10790, names_to_path().

If you still keep the left outer joins in place in trim_joins(), I believe you don't need any changes to !Query.combine(). In #10790 the left joins are trimmed, too, as I can't see any reason why they are non-trimmable.

The reverse join trimming issue isn't commonly seen, as it is somewhat rare to have reverse filters which are trimmable to begin with. The most common case is OneToOneKey, or doing something like A.objects.filter(b__a_id=1), but that usually doesn't make sense, instead you would just do A.objects.filter(id=1).

comment:5 by trbs, 12 years ago

Cc: trbs@… added

comment:6 by famousactress@…, 11 years ago

This regression from 1.4 still appears broken in the last 1.4 and 1.5 builds. Are there any plans to fix it? Will improvements in 1.6 resolve this issue? Our project is orphaned on 1.3 because this change causes a huge query spike and abysmal performance.

comment:7 by famousactress@…, 11 years ago

I should point out though, the problem looks fixed in 1.6. I confirmed that our project unit test is now failing in an awesome direction using the newest bits.. the join count for the little test I wrote to remind us to look out for this error is 2 joins in v1.3, 4 joins in 1.4 & 1.5, and 1 join in 1.6. Huge thanks to Anssi who's work on https://code.djangoproject.com/ticket/10790 I believe is responsible for the major improvements. This ought to mean that basically everyone with a non-trivial Django project just got a really significant performance boost. If a 1.4 or 1.5 patch is too hairy, I suppose we can jump past them to 1.6 when it's stable.

comment:8 by Anssi Kääriäinen <akaariai@…>, 11 years ago

Resolution: fixed
Status: newclosed

In 84909377f2f2eea07ab8bf398dafc69e2d736b50:

Fixed #18785 -- Added Test join trimming regression

The regression was caused by patch to ticket #15316 and was fixed by a
patch to #10790.

comment:9 by Anssi Kääriäinen, 11 years ago

Backpatching of all the ORM changes to 1.5 can't be done. Way too much risk, I am sure there are still regressions hiding in there. If a very targeted patch for this issue is written then that might be a candidate to 1.5 as a regression fix.

Note: See TracTickets for help on using tickets.
Back to Top