Opened 19 years ago

Closed 17 years ago

Last modified 17 years ago

#2076 closed defect (fixed)

order_by with related table does not work

Reported by: Malcolm Tredinnick Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: normal Keywords: qs-rf-fixed
Cc: gary.wilson@…, real.human@…, arthur.case@…, tom@…, mark@…, portland@…, floguy@…, django@…, remco@…, eliott@…, pythonmailing@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

This came up on IRC today: if you look at the order_by query that uses the related table Blogs in http://www.djangoproject.com/documentation/db_api/#order-by-fields, it will not actually work. The SQL we produce does not join the blogs_blog table to the blogs_entry table in the SQL query, although it does put in an "order by blogs_blog.name ASC" bit. We need to add in the extra table to the join as well, although this requires reordering the execution in django/db/models/query.py (in the _get_sql_clause() method).

A temporary hack for the moment is Entry.objects.select_related().order_by('blogs_blog.name', 'headline') but that is a bit clunky.

We need a test for this, too.

(assigning to myself for now; just noting this here in case I get hit by a bus.)

Attachments (7)

ticket-2076.diff (9.3 KB ) - added by ramiro <rm0 _at_ gmx.net> 18 years ago.
Patch originally posted by serbaut to #2210 with tests added
ticket-2076.2.diff (9.4 KB ) - added by Michael Radziej <mir@…> 18 years ago.
slightly corrected version of previous patch (missing backticks)
ticket-2076.3.diff (8.9 KB ) - added by mir@… 18 years ago.
corrected patch without changes in django/newforms/models.py
ticket-2076.4.diff (12.9 KB ) - added by Ramiro Morales <rm0 _at_ gmx.net> 18 years ago.
ticket-2076.5.diff (13.1 KB ) - added by Ramiro Morales <rm0 _at_ gmx.net> 18 years ago.
ticket-2076.6.diff (13.2 KB ) - added by Ramiro Morales <rm0 _at_ gmx.net> 18 years ago.
Same as ticket-2076.5.diff with corrections in the documentation and testcases comments
order.diff (859 bytes ) - added by Michael Radziej <mir@…> 18 years ago.

Download all attachments as: .zip

Change History (53)

comment:1 by oefe, 19 years ago

Apparently, the select_related() workaround doesn't work for ForeignKey with null=True.

comment:2 by anonymous, 19 years ago

Cc: gary.wilson@… added

comment:3 by Ramiro Morales, 19 years ago

See also #2210 where a fix is suggested

comment:4 by Russell Cloran <russell@…>, 19 years ago

The fix suggested will not fix this problem.

comment:5 by serbaut@…, 18 years ago

See patch in #2210.

comment:6 by mrmachine <real dot human at mrmachine dot net>, 18 years ago

Cc: real.human@… added

comment:7 by ramiro <rm0 _at_ gmx.net>, 18 years ago

#3002 was a duplicate of this one.

comment:8 by mir@…, 18 years ago

Resolution: duplicate
Status: newclosed

duplicate of #2210, which has a working patch.

by ramiro <rm0 _at_ gmx.net>, 18 years ago

Attachment: ticket-2076.diff added

Patch originally posted by serbaut to #2210 with tests added

comment:9 by ramiro <rm0 _at_ gmx.net>, 18 years ago

Just attached a patch for this issue, created by serbaut and originally posted to ticket #2210 but IMHO this is better suited, see http://groups.google.ca/group/django-users/msg/c98e1631ce4b7c8e for the reasoning behind this.

The patch itself is updated to r4455 and I've added some tests of the new funcionality implemented by it.

comment:10 by Michael Radziej <mir@…>, 18 years ago

Has patch: set
Keywords: reopen added
Triage Stage: UnreviewedReady for checkin

#2210 is actually a different issue, see this comment on #2210

Can somebody from core please reopen this bug?

I'm attaching a slightly modified patch, two backticks were missing in a docstring :-)

Thanks, serbaut for the original patch, and Ramiro Morales for the tests!

by Michael Radziej <mir@…>, 18 years ago

Attachment: ticket-2076.2.diff added

slightly corrected version of previous patch (missing backticks)

comment:11 by ramiro <rm0 _at_ gmx.net>, 18 years ago

I will repeat/expand here some thing I've noted when implementing the tests so Django developers can take them in account when they decide if/when apply this patch:

  • The patch allows one to use 'relationship_name__related_model_field_name' ordering specification in the order_by() QuerySet method but not in the Meta.ordering attribute.
  • When you use a bare 'relationship_name' ordering specification the patch orders things by the related model PK and not by its Meta.ordering attribute. By examining the Admin app it seems that at least in that app the intent is to do the ordering by Meta.ordering if that exists and fallback to the PK otherwise. I don't know if this may be considered an inconsistency.
  • I have tested only ForeignKey relationships.
  • I have not tested if the ordering specification can be used by spanning through reverse relationships.
  • The patch does not seem to preserve backwards compatibility with the '[-]appname_tablename.fieldname' order_by specification notation suggested by the Django documentation (http://www.djangoproject.com/documentation/db_api/#order-by-fields). Would this be worth noting if the patch gets applied?.

comment:12 by Jacob, 18 years ago

Resolution: duplicate
Status: closedreopened

comment:13 by Michael Radziej <mir@…>, 18 years ago

Keywords: reopen removed

comment:14 by Michael Radziej <mir@…>, 18 years ago

#3476 marked as duplicate.

comment:15 by Malcolm Tredinnick, 18 years ago

Status: reopenednew
Triage Stage: Ready for checkinAccepted

This is really great work. Thanks to everybody's who's pitched in so far.

Not quite "ready for checkin" yet because of Ramiro's detective work when constructing the tests. A few things I would like to see working before this goes in:

  1. Should work with the !Meta inner class as well, since that provides default ordering for a model and it's going to be the source of a lot of complaints (from me, no less!) if you can only do something manually and not automatically.
  2. When just specifying a relation field as the ordering attribute, we should use the related model's "natural" ordering (that specified in Meta) before primary key ordering. Again, it's the intuitive behaviour when selecting a model on its own, so making it consistent across the board is worthwhile.
  3. Check it works with all relations (not just ForeignKeys).
  4. Let's work out what happens with backwards relations (and also spanning across multiple models: does model1__model2__some_field work?)

I'm not too worried about the backwards-compat issue, since that's a bit of an ugly hack anyway -- it's the only place you are ever required to know the database table names (or even care that the database exists) in the "non-direct-SQL" cases of a QuerySet. Providing we can get this fixed before 1.0, I'm willing to defend the case for breaking it in a small way.

The current patch is a great start and looks like the right way to approach the problem. So if somebody feels enthusiastic and wants to do some work on this, these are the things I'd like to see done before it can be checked in.

comment:16 by Malcolm Tredinnick, 18 years ago

Patch needs improvement: set

comment:17 by arthur.case@…, 18 years ago

Cc: arthur.case@… added

comment:18 by Thomas Steinacher <tom@…>, 18 years ago

Cc: tom@… added

in reply to:  10 ; comment:19 by ramiro <rm0 _at_ gmx.net>, 18 years ago

Replying to Michael Radziej <mir@noris.de>:

[...]

I'm attaching a slightly modified patch, two backticks were missing in a docstring :-)

it's me or an unrelated modification to django/newforms/models.py slipped in your ticket-2076.2.diff?.

in reply to:  19 comment:20 by mir@…, 18 years ago

Replying to ramiro <rm0 _at_ gmx.net>:

it's me or an unrelated modification to django/newforms/models.py slipped in your ticket-2076.2.diff?.

You're right. It probably crept in from a different ticket when I forgot to clean up.

by mir@…, 18 years ago

Attachment: ticket-2076.3.diff added

corrected patch without changes in django/newforms/models.py

comment:21 by Michael Radziej <mir@…>, 18 years ago

Note: #2210 is an almost-duplicate and has been closed in favor of this ticket (it's a corner case). It should be checked separately that the solution for #2076 also fixes #2210.

comment:22 by Lau Bech Lauritzen <lau__@…>, 18 years ago

Just stubbed my toe on this bug. It would probably be a good idea to remove the information in the documentation or change it to the workaround.

in reply to:  15 comment:23 by Ramiro Morales <rm0 _at_ gmx.net>, 18 years ago

Replying to mtredinnick:

Here is another iteration of the patch, not there yet because I don't have the required Django fu:

  1. Should work with the !Meta inner class as well, since that provides default ordering for a model and it's going to be the source of a lot of complaints (from me, no less!) if you can only do something manually and not automatically.

This is working now, it wasn't a problem with the patch per se but a validation made in django/code/management.py:get_validation_errors() was triggering. I modified it to skip the validation just as it already does with the tablename.field form.

  1. When just specifying a relation field as the ordering attribute, we should use the related model's "natural" ordering (that specified in Meta) before primary key ordering. Again, it's the intuitive behaviour when selecting a model on its own, so making it consistent across the board is worthwhile.

This is the one I haven't the skills to implement. serbaut@gmail.com implemented it by calling the lookup_inner function from QuerySet._get_sql_clause() but I don know what other internal function should be used to take in account a recursive scenario like this: The example at http://www.djangoproject.com/documentation/db_api/ with the Blog model having a relname relationship to a 4th model X and a Meta.ordering = (relname,) and trying to order the Entry model instances by its blog relationship field.

  1. Check it works with all relations (not just ForeignKeys).
  2. Let's work out what happens with backwards relations...

I've added (although a bit contrived) test cases for both ManyToMany and reverse ForeignKey and ManyToMany relationships . Please verify the coverage of these tests.

...(and also spanning across multiple models: does model1__model2__some_field work?)

There was already a test case for this.

I'm not too worried about the backwards-compat issue, since that's a bit of an ugly hack anyway -- it's the only place you are ever required to know the database table names (or even care that the database exists) in the "non-direct-SQL" cases of a QuerySet. Providing we can get this fixed before 1.0, I'm willing to defend the case for breaking it in a small way.

Well, there is something strange about this because as I see the original patch intended to maintain that backwards compatibility but it didn't achieve it, so it's a case of a broken attempt to maintain compatibility with a broken (as per your original report) but documented feature.

For now I've just eliminated a obvious dead code path but I didn't delete all the related code. I will attach another iteration of the patch where the tablename.filename is rejected in the validation and all traces of its support is eliminated from QuerySet._get_sql_clause().

Also, now db-api.txt is modified to describe the feature, please check and expand because my English is not very good (the same applies for the
tests/modeltests/ordering/models.py mods).

Test and enhance so it can be included in 0.96 :)

by Ramiro Morales <rm0 _at_ gmx.net>, 18 years ago

Attachment: ticket-2076.4.diff added

comment:24 by Ramiro Morales <rm0 _at_ gmx.net>, 18 years ago

Replying to Ramiro Morales <rm0 _at_ gmx.net>:

[...] I will attach another iteration of the patch where the tablename.filename is rejected in the validation and all traces of its support is eliminated from QuerySet._get_sql_clause().

Done, see attached ticket-2076.5.diff

This is the one I haven't the skills to implement. serbaut@gmail.com implemented it by calling the lookup_inner function from QuerySet._get_sql_clause() but I don know what other internal function should be used to take in account a recursive scenario like this: The example at http://www.djangoproject.com/documentation/db_api/ with the Blog model having a relname relationship to a 4th model X and a Meta.ordering = (relname,) and trying to order the Entry model instances by its blog relationship field.

... or this better (worse?) scenario:

  1. Check it works with all relations (not just ForeignKeys).
  2. Let's work out what happens with backwards relations...

I've added (although a bit contrived) test cases for both ManyToMany and reverse ForeignKey and ManyToMany relationships . Please verify the coverage of these tests.

Almost forgot to add this note: When creating the doctests for ManyToManyField relationships I was getting
duplicate objects in the order_by() query results so I needed to insert distinct() calls (see the last three tests), this may be sign of an issue/shortcomming with the patch when interacting with that kind of relationships. I can't compare this with any previous pre-patch behavior because the select_related() workaround doesn't work in the M2M case.

by Ramiro Morales <rm0 _at_ gmx.net>, 18 years ago

Attachment: ticket-2076.5.diff added

comment:25 by anonymous, 18 years ago

Cc: mark@… added; gary.wilson@… real.human@… arthur.case@… tom@… removed

comment:26 by anonymous, 18 years ago

Cc: gary.wilson@… real.human@… arthur.case@… tom@… added

by Ramiro Morales <rm0 _at_ gmx.net>, 18 years ago

Attachment: ticket-2076.6.diff added

Same as ticket-2076.5.diff with corrections in the documentation and testcases comments

comment:27 by anonymous, 18 years ago

There's one problem.

class A(models.Model):
  name = models.CharField()

class B(models.Model):
  [...]
  a = ForeignKey(A, null=True)

The following query will use an INNER JOIN to A:

B.objects.all().order_by(a__name)

This means that all objects B with B.a==None will be missing. So, this should use an OUTER JOIN. I'm not sure whether this qualifies the patch as wrong, since the ORM fails to switch to OUTER joins in similar circumstances with, too.

Michael

comment:28 by Michael Radziej <mir@…>, 18 years ago

The supplied testcases do not work with a postgresql database, since it requires that there is an order-by clause that matches the distinct clause. I get these error messages:

======================================================================
FAIL: Doctest: modeltests.ordering.models.__test__.API_TESTS
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mir/lib/python/django/test/doctest.py", line 2156, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for modeltests.ordering.models.__test__.API_TESTS
  File "/home/mir/src/django/trunk/tests/modeltests/ordering/models.py", line unknown line number, in API_TESTS

----------------------------------------------------------------------
File "/home/mir/src/django/trunk/tests/modeltests/ordering/models.py", line ?, in modeltests.ordering.models.__test__.API_TESTS
Failed example:
    Reporter.objects.all().distinct()
Exception raised:
    Traceback (most recent call last):
      File "/home/mir/lib/python/django/test/doctest.py", line 1243, in __run
        compileflags, 1) in test.globs
      File "<doctest modeltests.ordering.models.__test__.API_TESTS[38]>", line 1, in ?
        Reporter.objects.all().distinct()
      File "/home/mir/lib/python/django/db/models/query.py", line 104, in __repr__
        return repr(self._get_data())
      File "/home/mir/lib/python/django/db/models/query.py", line 505, in _get_data
        self._result_cache = list(self.iterator())
      File "/home/mir/lib/python/django/db/models/query.py", line 189, in iterator
        cursor.execute("SELECT " + (self._distinct and "DISTINCT " or "") + ",".join(select) + sql, params)
      File "/home/mir/lib/python/django/db/backends/postgresql/base.py", line 43, in execute
        return self.cursor.execute(sql, [smart_basestring(p, self.charset) for p in params])
    ProgrammingError: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list

    SELECT DISTINCT "ordering_reporter"."id","ordering_reporter"."name" FROM "ordering_reporter" LEFT OUTER JOIN "ordering_reporter_artic
les" AS "m2m_ordering_reporter__articles" ON "ordering_reporter"."id" = "m2m_ordering_reporter__articles"."reporter_id" INNER JOIN "order
ing_article" AS "ordering_reporter__articles" ON "m2m_ordering_reporter__articles"."article_id" = "ordering_reporter__articles"."id" ORDE
R BY "ordering_reporter__articles"."headline" ASC

(and more for the other distinct queries). Unfortunately, the tests cannot be easily rephrased, since they really require distinct.

The postgresql docs state:

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

Though, the patch works great for me ;-)

comment:29 by Michael Radziej <mir@…>, 18 years ago

With postgresql, test 'reserved_names' also fails. There's a strange line in it:

>>> Thing.objects.order_by('select.when')

The model is:

class Thing(models.Model):
    when = models.CharField(maxlength=1, primary_key=True)
    join = models.CharField(maxlength=1)
    like = models.CharField(maxlength=1)
    drop = models.CharField(maxlength=1)
    alter = models.CharField(maxlength=1)
    having = models.CharField(maxlength=1)
    where = models.DateField(maxlength=1)
    has_hyphen = models.CharField(maxlength=1, db_column='has-hyphen')

    class Meta:
       db_table = 'select'

So, select.when does not make sense, but it used to work since the database table was called 'select'. IMHO, this is not a valid use. But the test case needs to be removed and the minor incompatibility should be noted.

comment:30 by Michael Radziej <mir@…>, 18 years ago

Another test failure. The following test gives random results:

EditorResponse.objects.order_by('reader_letter__reader_name')

The appended patch fixes this test by requesting a second level ordering.

by Michael Radziej <mir@…>, 18 years ago

Attachment: order.diff added

comment:31 by anonymous, 18 years ago

I'm a Django newbie and innocently stumbled into the recursive case that Ramiro mentioned:

class A(models.Model):
    ...

class B(models.Model):
    a = models.ForeignKey(A)
    ...
    class Meta:
        ordering = ('a',)

class C(models.Model):
    b = models.ForeignKey(B)
    ...
    class Meta:
        ordering = ('b',)

In svn trunk, trying to list Cs in the admin interface gave me a MySQL error that there was no field app_b.a

Because A just orders by the PK anyway, I was able to work around it by specifying 'a_id' in for B's ordering (though this raises an error in model validation that B has no 'a_id' field):

class B(models.Model):
    a = models.ForeignKey(A)
    ...
    class Meta:
        ordering = ('a_id',)

Just for kicks, I applied ticket-2076.6 patch, and now it won't work either way. The original code errors out looking for app_c.app_b.a and the hacked code looks for app_c.app_b.a_id

I know you're already aware that this is a problem with the patch. Just chiming in to note that this two-layer FK relationship is quite natural and easy to write, even for a beginner, so I would think that it needs to Just Work in 1.0.

comment:32 by Wesley Fok <portland@…>, 18 years ago

Cc: portland@… added

comment:33 by anonymous, 18 years ago

Cc: floguy@… added

comment:34 by simonbun <simonbun@…>, 18 years ago

Cc: django@… added

comment:35 by anonymous, 18 years ago

Cc: remco@… added

comment:36 by anonymous, 18 years ago

Cc: eliott@… added

comment:37 by Malcolm Tredinnick, 17 years ago

Keywords: qs-rf added

comment:38 by Marek Kubica <pythonmailing@…>, 17 years ago

Cc: pythonmailing@… added

comment:39 by miracle2k, 17 years ago

Not sure if this has been mentioned already, but this patch breaks functionality to refer to columns added by extra().

in reply to:  39 comment:40 by miracle2k, 17 years ago

Replying to miracle2k:

Not sure if this has been mentioned already, but this patch breaks functionality to refer to columns added by extra().


To clarify: This is because the call to lookup_inner() in line 495 throws a TypeError when it encounters the new column. I fixed this with a try-except-pass for me, but that probably means that I can't use both (extra-columns and related tables) at the same time.

comment:41 by davep@…, 17 years ago

FWIW I just tripped across what I think is this bug also. Briefly, I have a model a bit like this...

class DayLocationHit(models.Model):
    location=models.ForeignKey(Location)
    date=models.DateField()
    hits=models.IntegerField(default=0)
    
    class Admin:
        pass
    class Meta:
        ordering=('location','date') 

And can't use the foreign key as the "major" sort criteria in the admin interface. The exception is "missing FROM-clause entry for table "keywordsites_location" LINE 1: ...ques" FROM "keywordsites_daylocationhit" ORDER BY "keywordsi... " on Postgres.

comment:42 by Malcolm Tredinnick, 17 years ago

(In [6510]) queryset-refactor: Fixed a large bag of order_by() problems.

This also picked up a small bug in some twisted select_related() handling.

Introduces a new syntax for cross-model ordering: foobarbaz, using field
names, instead of a strange combination of table names and field names. This
might turn out to be backwards compatible (although the old syntax leads to
bugs and is not to be recommended).

Still to come: fixes for extra() handling, since the new syntax can't handle
that and doc updates.

Things are starting to get a bit slow here, so we might eventually have to
remove ordering by many-many and other multiple-result fields, since they don't
make a lot of sense in any case. For now, it's legal.

Refs #2076, #2874, #3002 (although the admin bit doesn't work yet).

comment:43 by Malcolm Tredinnick, 17 years ago

(In [6511]) queryset-refactor: Added an order_by parameter to extra(). Refs #2076.

comment:44 by Malcolm Tredinnick, 17 years ago

Keywords: qs-rf-fixed added; qs-rf removed

comment:45 by Malcolm Tredinnick, 17 years ago

Resolution: fixed
Status: newclosed

(In [7477]) Merged the queryset-refactor branch into trunk.

This is a big internal change, but mostly backwards compatible with existing
code. Also adds a couple of new features.

Fixed #245, #1050, #1656, #1801, #2076, #2091, #2150, #2253, #2306, #2400, #2430, #2482, #2496, #2676, #2737, #2874, #2902, #2939, #3037, #3141, #3288, #3440, #3592, #3739, #4088, #4260, #4289, #4306, #4358, #4464, #4510, #4858, #5012, #5020, #5261, #5295, #5321, #5324, #5325, #5555, #5707, #5796, #5817, #5987, #6018, #6074, #6088, #6154, #6177, #6180, #6203, #6658

comment:46 by aaron, 17 years ago

This still seems broken to me (even after qs refactor, which is awesome in every other way, thanks Malcom!) when the related field may be null. For example:

# same as in modeltests.many_to_many_null.models
class Reporter(models.Model):
    name = models.CharField(max_length=30)

    def __unicode__(self):
        return self.name

class Article(models.Model):
    headline = models.CharField(max_length=100)
    reporter = models.ForeignKey(Reporter, null=True)

    class Meta:
        ordering = ('headline',)

    def __unicode__(self):
        return self.headline
"""
# Create a Reporter.
>>> r = Reporter(name='John Smith')
>>> r.save()

# Create an Article with no Reporter by passing "reporter=None".
>>> a = Article(headline="Ghost article", reporter=None)
>>> a.save()
>>> print a.reporter
None

>>> a = Article.objects.all()
>>> a
[<Article: Ghost article>]

>>> a = Article.objects.all().order_by('reporter__name')
>>> a
[<Article: Ghost article>]
"""

The last test there fails. But maybe I don't understand what the semantics should be here. It looks like the last test uses an inner join with reporter, when a left outer join might be more appropriate.

The same thing happens if you use 'reporter__name' in the default ordering in the Meta class of the Article model as well.

Note: See TracTickets for help on using tickets.
Back to Top