Code

Opened 8 years ago

Closed 6 years ago

Last modified 6 years ago

#2076 closed defect (fixed)

order_by with related table does not work

Reported by: mtredinnick Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: normal Keywords: qs-rf-fixed
Cc: gary.wilson@…, real.human@…, arthur.case@…, tom@…, mark@…, portland@…, floguy@…, django@…, remco@…, eliott@…, pythonmailing@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: UI/UX:

Description

This came up on IRC today: if you look at the order_by query that uses the related table Blogs in http://www.djangoproject.com/documentation/db_api/#order-by-fields, it will not actually work. The SQL we produce does not join the blogs_blog table to the blogs_entry table in the SQL query, although it does put in an "order by blogs_blog.name ASC" bit. We need to add in the extra table to the join as well, although this requires reordering the execution in django/db/models/query.py (in the _get_sql_clause() method).

A temporary hack for the moment is Entry.objects.select_related().order_by('blogs_blog.name', 'headline') but that is a bit clunky.

We need a test for this, too.

(assigning to myself for now; just noting this here in case I get hit by a bus.)

Attachments (7)

ticket-2076.diff (9.3 KB) - added by ramiro <rm0 _at_ gmx.net> 7 years ago.
Patch originally posted by serbaut to #2210 with tests added
ticket-2076.2.diff (9.4 KB) - added by Michael Radziej <mir@…> 7 years ago.
slightly corrected version of previous patch (missing backticks)
ticket-2076.3.diff (8.9 KB) - added by mir@… 7 years ago.
corrected patch without changes in django/newforms/models.py
ticket-2076.4.diff (12.9 KB) - added by Ramiro Morales <rm0 _at_ gmx.net> 7 years ago.
ticket-2076.5.diff (13.1 KB) - added by Ramiro Morales <rm0 _at_ gmx.net> 7 years ago.
ticket-2076.6.diff (13.2 KB) - added by Ramiro Morales <rm0 _at_ gmx.net> 7 years ago.
Same as ticket-2076.5.diff with corrections in the documentation and testcases comments
order.diff (859 bytes) - added by Michael Radziej <mir@…> 7 years ago.

Download all attachments as: .zip

Change History (53)

comment:1 Changed 8 years ago by oefe

Apparently, the select_related() workaround doesn't work for ForeignKey with null=True.

comment:2 Changed 8 years ago by anonymous

  • Cc gary.wilson@… added

comment:3 Changed 8 years ago by ramiro

See also #2210 where a fix is suggested

comment:4 Changed 8 years ago by Russell Cloran <russell@…>

The fix suggested will not fix this problem.

comment:5 Changed 8 years ago by serbaut@…

See patch in #2210.

comment:6 Changed 7 years ago by mrmachine <real dot human at mrmachine dot net>

  • Cc real.human@… added

comment:7 Changed 7 years ago by ramiro <rm0 _at_ gmx.net>

#3002 was a duplicate of this one.

comment:8 Changed 7 years ago by mir@…

  • Resolution set to duplicate
  • Status changed from new to closed

duplicate of #2210, which has a working patch.

Changed 7 years ago by ramiro <rm0 _at_ gmx.net>

Patch originally posted by serbaut to #2210 with tests added

comment:9 Changed 7 years ago by ramiro <rm0 _at_ gmx.net>

Just attached a patch for this issue, created by serbaut and originally posted to ticket #2210 but IMHO this is better suited, see http://groups.google.ca/group/django-users/msg/c98e1631ce4b7c8e for the reasoning behind this.

The patch itself is updated to r4455 and I've added some tests of the new funcionality implemented by it.

comment:10 follow-up: Changed 7 years ago by Michael Radziej <mir@…>

  • Has patch set
  • Keywords reopen added
  • Triage Stage changed from Unreviewed to Ready for checkin

#2210 is actually a different issue, see this comment on #2210

Can somebody from core please reopen this bug?

I'm attaching a slightly modified patch, two backticks were missing in a docstring :-)

Thanks, serbaut for the original patch, and Ramiro Morales for the tests!

Changed 7 years ago by Michael Radziej <mir@…>

slightly corrected version of previous patch (missing backticks)

comment:11 Changed 7 years ago by ramiro <rm0 _at_ gmx.net>

I will repeat/expand here some thing I've noted when implementing the tests so Django developers can take them in account when they decide if/when apply this patch:

  • The patch allows one to use 'relationship_name__related_model_field_name' ordering specification in the order_by() QuerySet method but not in the Meta.ordering attribute.
  • When you use a bare 'relationship_name' ordering specification the patch orders things by the related model PK and not by its Meta.ordering attribute. By examining the Admin app it seems that at least in that app the intent is to do the ordering by Meta.ordering if that exists and fallback to the PK otherwise. I don't know if this may be considered an inconsistency.
  • I have tested only ForeignKey relationships.
  • I have not tested if the ordering specification can be used by spanning through reverse relationships.
  • The patch does not seem to preserve backwards compatibility with the '[-]appname_tablename.fieldname' order_by specification notation suggested by the Django documentation (http://www.djangoproject.com/documentation/db_api/#order-by-fields). Would this be worth noting if the patch gets applied?.

comment:12 Changed 7 years ago by jacob

  • Resolution duplicate deleted
  • Status changed from closed to reopened

comment:13 Changed 7 years ago by Michael Radziej <mir@…>

  • Keywords reopen removed

comment:14 Changed 7 years ago by Michael Radziej <mir@…>

#3476 marked as duplicate.

comment:15 follow-up: Changed 7 years ago by mtredinnick

  • Status changed from reopened to new
  • Triage Stage changed from Ready for checkin to Accepted

This is really great work. Thanks to everybody's who's pitched in so far.

Not quite "ready for checkin" yet because of Ramiro's detective work when constructing the tests. A few things I would like to see working before this goes in:

  1. Should work with the !Meta inner class as well, since that provides default ordering for a model and it's going to be the source of a lot of complaints (from me, no less!) if you can only do something manually and not automatically.
  2. When just specifying a relation field as the ordering attribute, we should use the related model's "natural" ordering (that specified in Meta) before primary key ordering. Again, it's the intuitive behaviour when selecting a model on its own, so making it consistent across the board is worthwhile.
  3. Check it works with all relations (not just ForeignKeys).
  4. Let's work out what happens with backwards relations (and also spanning across multiple models: does model1__model2__some_field work?)

I'm not too worried about the backwards-compat issue, since that's a bit of an ugly hack anyway -- it's the only place you are ever required to know the database table names (or even care that the database exists) in the "non-direct-SQL" cases of a QuerySet. Providing we can get this fixed before 1.0, I'm willing to defend the case for breaking it in a small way.

The current patch is a great start and looks like the right way to approach the problem. So if somebody feels enthusiastic and wants to do some work on this, these are the things I'd like to see done before it can be checked in.

comment:16 Changed 7 years ago by mtredinnick

  • Patch needs improvement set

comment:17 Changed 7 years ago by arthur.case@…

  • Cc arthur.case@… added

comment:18 Changed 7 years ago by Thomas Steinacher <tom@…>

  • Cc tom@… added

comment:19 in reply to: ↑ 10 ; follow-up: Changed 7 years ago by ramiro <rm0 _at_ gmx.net>

Replying to Michael Radziej <mir@noris.de>:

[...]

I'm attaching a slightly modified patch, two backticks were missing in a docstring :-)

it's me or an unrelated modification to django/newforms/models.py slipped in your ticket-2076.2.diff?.

comment:20 in reply to: ↑ 19 Changed 7 years ago by mir@…

Replying to ramiro <rm0 _at_ gmx.net>:

it's me or an unrelated modification to django/newforms/models.py slipped in your ticket-2076.2.diff?.

You're right. It probably crept in from a different ticket when I forgot to clean up.

Changed 7 years ago by mir@…

corrected patch without changes in django/newforms/models.py

comment:21 Changed 7 years ago by Michael Radziej <mir@…>

Note: #2210 is an almost-duplicate and has been closed in favor of this ticket (it's a corner case). It should be checked separately that the solution for #2076 also fixes #2210.

comment:22 Changed 7 years ago by Lau Bech Lauritzen <lau__@…>

Just stubbed my toe on this bug. It would probably be a good idea to remove the information in the documentation or change it to the workaround.

comment:23 in reply to: ↑ 15 Changed 7 years ago by Ramiro Morales <rm0 _at_ gmx.net>

Replying to mtredinnick:

Here is another iteration of the patch, not there yet because I don't have the required Django fu:

  1. Should work with the !Meta inner class as well, since that provides default ordering for a model and it's going to be the source of a lot of complaints (from me, no less!) if you can only do something manually and not automatically.

This is working now, it wasn't a problem with the patch per se but a validation made in django/code/management.py:get_validation_errors() was triggering. I modified it to skip the validation just as it already does with the tablename.field form.

  1. When just specifying a relation field as the ordering attribute, we should use the related model's "natural" ordering (that specified in Meta) before primary key ordering. Again, it's the intuitive behaviour when selecting a model on its own, so making it consistent across the board is worthwhile.

This is the one I haven't the skills to implement. serbaut@gmail.com implemented it by calling the lookup_inner function from QuerySet._get_sql_clause() but I don know what other internal function should be used to take in account a recursive scenario like this: The example at http://www.djangoproject.com/documentation/db_api/ with the Blog model having a relname relationship to a 4th model X and a Meta.ordering = (relname,) and trying to order the Entry model instances by its blog relationship field.

  1. Check it works with all relations (not just ForeignKeys).
  2. Let's work out what happens with backwards relations...

I've added (although a bit contrived) test cases for both ManyToMany and reverse ForeignKey and ManyToMany relationships . Please verify the coverage of these tests.

...(and also spanning across multiple models: does model1__model2__some_field work?)

There was already a test case for this.

I'm not too worried about the backwards-compat issue, since that's a bit of an ugly hack anyway -- it's the only place you are ever required to know the database table names (or even care that the database exists) in the "non-direct-SQL" cases of a QuerySet. Providing we can get this fixed before 1.0, I'm willing to defend the case for breaking it in a small way.

Well, there is something strange about this because as I see the original patch intended to maintain that backwards compatibility but it didn't achieve it, so it's a case of a broken attempt to maintain compatibility with a broken (as per your original report) but documented feature.

For now I've just eliminated a obvious dead code path but I didn't delete all the related code. I will attach another iteration of the patch where the tablename.filename is rejected in the validation and all traces of its support is eliminated from QuerySet._get_sql_clause().

Also, now db-api.txt is modified to describe the feature, please check and expand because my English is not very good (the same applies for the
tests/modeltests/ordering/models.py mods).

Test and enhance so it can be included in 0.96 :)

Changed 7 years ago by Ramiro Morales <rm0 _at_ gmx.net>

comment:24 Changed 7 years ago by Ramiro Morales <rm0 _at_ gmx.net>

Replying to Ramiro Morales <rm0 _at_ gmx.net>:

[...] I will attach another iteration of the patch where the tablename.filename is rejected in the validation and all traces of its support is eliminated from QuerySet._get_sql_clause().

Done, see attached ticket-2076.5.diff

This is the one I haven't the skills to implement. serbaut@gmail.com implemented it by calling the lookup_inner function from QuerySet._get_sql_clause() but I don know what other internal function should be used to take in account a recursive scenario like this: The example at http://www.djangoproject.com/documentation/db_api/ with the Blog model having a relname relationship to a 4th model X and a Meta.ordering = (relname,) and trying to order the Entry model instances by its blog relationship field.

... or this better (worse?) scenario:

  1. Check it works with all relations (not just ForeignKeys).
  2. Let's work out what happens with backwards relations...

I've added (although a bit contrived) test cases for both ManyToMany and reverse ForeignKey and ManyToMany relationships . Please verify the coverage of these tests.

Almost forgot to add this note: When creating the doctests for ManyToManyField relationships I was getting
duplicate objects in the order_by() query results so I needed to insert distinct() calls (see the last three tests), this may be sign of an issue/shortcomming with the patch when interacting with that kind of relationships. I can't compare this with any previous pre-patch behavior because the select_related() workaround doesn't work in the M2M case.

Changed 7 years ago by Ramiro Morales <rm0 _at_ gmx.net>

comment:25 Changed 7 years ago by anonymous

  • Cc mark@… added; gary.wilson@…, real.human@…, arthur.case@…, tom@… removed

comment:26 Changed 7 years ago by anonymous

  • Cc gary.wilson@…, real.human@…, arthur.case@…, tom@… added

Changed 7 years ago by Ramiro Morales <rm0 _at_ gmx.net>

Same as ticket-2076.5.diff with corrections in the documentation and testcases comments

comment:27 Changed 7 years ago by anonymous

There's one problem.

class A(models.Model):
  name = models.CharField()

class B(models.Model):
  [...]
  a = ForeignKey(A, null=True)

The following query will use an INNER JOIN to A:

B.objects.all().order_by(a__name)

This means that all objects B with B.a==None will be missing. So, this should use an OUTER JOIN. I'm not sure whether this qualifies the patch as wrong, since the ORM fails to switch to OUTER joins in similar circumstances with, too.

Michael

comment:28 Changed 7 years ago by Michael Radziej <mir@…>

The supplied testcases do not work with a postgresql database, since it requires that there is an order-by clause that matches the distinct clause. I get these error messages:

======================================================================
FAIL: Doctest: modeltests.ordering.models.__test__.API_TESTS
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mir/lib/python/django/test/doctest.py", line 2156, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for modeltests.ordering.models.__test__.API_TESTS
  File "/home/mir/src/django/trunk/tests/modeltests/ordering/models.py", line unknown line number, in API_TESTS

----------------------------------------------------------------------
File "/home/mir/src/django/trunk/tests/modeltests/ordering/models.py", line ?, in modeltests.ordering.models.__test__.API_TESTS
Failed example:
    Reporter.objects.all().distinct()
Exception raised:
    Traceback (most recent call last):
      File "/home/mir/lib/python/django/test/doctest.py", line 1243, in __run
        compileflags, 1) in test.globs
      File "<doctest modeltests.ordering.models.__test__.API_TESTS[38]>", line 1, in ?
        Reporter.objects.all().distinct()
      File "/home/mir/lib/python/django/db/models/query.py", line 104, in __repr__
        return repr(self._get_data())
      File "/home/mir/lib/python/django/db/models/query.py", line 505, in _get_data
        self._result_cache = list(self.iterator())
      File "/home/mir/lib/python/django/db/models/query.py", line 189, in iterator
        cursor.execute("SELECT " + (self._distinct and "DISTINCT " or "") + ",".join(select) + sql, params)
      File "/home/mir/lib/python/django/db/backends/postgresql/base.py", line 43, in execute
        return self.cursor.execute(sql, [smart_basestring(p, self.charset) for p in params])
    ProgrammingError: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list

    SELECT DISTINCT "ordering_reporter"."id","ordering_reporter"."name" FROM "ordering_reporter" LEFT OUTER JOIN "ordering_reporter_artic
les" AS "m2m_ordering_reporter__articles" ON "ordering_reporter"."id" = "m2m_ordering_reporter__articles"."reporter_id" INNER JOIN "order
ing_article" AS "ordering_reporter__articles" ON "m2m_ordering_reporter__articles"."article_id" = "ordering_reporter__articles"."id" ORDE
R BY "ordering_reporter__articles"."headline" ASC

(and more for the other distinct queries). Unfortunately, the tests cannot be easily rephrased, since they really require distinct.

The postgresql docs state:

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

Though, the patch works great for me ;-)

comment:29 Changed 7 years ago by Michael Radziej <mir@…>

With postgresql, test 'reserved_names' also fails. There's a strange line in it:

>>> Thing.objects.order_by('select.when')

The model is:

class Thing(models.Model):
    when = models.CharField(maxlength=1, primary_key=True)
    join = models.CharField(maxlength=1)
    like = models.CharField(maxlength=1)
    drop = models.CharField(maxlength=1)
    alter = models.CharField(maxlength=1)
    having = models.CharField(maxlength=1)
    where = models.DateField(maxlength=1)
    has_hyphen = models.CharField(maxlength=1, db_column='has-hyphen')

    class Meta:
       db_table = 'select'

So, select.when does not make sense, but it used to work since the database table was called 'select'. IMHO, this is not a valid use. But the test case needs to be removed and the minor incompatibility should be noted.

comment:30 Changed 7 years ago by Michael Radziej <mir@…>

Another test failure. The following test gives random results:

EditorResponse.objects.order_by('reader_letter__reader_name')

The appended patch fixes this test by requesting a second level ordering.

Changed 7 years ago by Michael Radziej <mir@…>

comment:31 Changed 7 years ago by anonymous

I'm a Django newbie and innocently stumbled into the recursive case that Ramiro mentioned:

class A(models.Model):
    ...

class B(models.Model):
    a = models.ForeignKey(A)
    ...
    class Meta:
        ordering = ('a',)

class C(models.Model):
    b = models.ForeignKey(B)
    ...
    class Meta:
        ordering = ('b',)

In svn trunk, trying to list Cs in the admin interface gave me a MySQL error that there was no field app_b.a

Because A just orders by the PK anyway, I was able to work around it by specifying 'a_id' in for B's ordering (though this raises an error in model validation that B has no 'a_id' field):

class B(models.Model):
    a = models.ForeignKey(A)
    ...
    class Meta:
        ordering = ('a_id',)

Just for kicks, I applied ticket-2076.6 patch, and now it won't work either way. The original code errors out looking for app_c.app_b.a and the hacked code looks for app_c.app_b.a_id

I know you're already aware that this is a problem with the patch. Just chiming in to note that this two-layer FK relationship is quite natural and easy to write, even for a beginner, so I would think that it needs to Just Work in 1.0.

comment:32 Changed 7 years ago by Wesley Fok <portland@…>

  • Cc portland@… added

comment:33 Changed 7 years ago by anonymous

  • Cc floguy@… added

comment:34 Changed 7 years ago by simonbun <simonbun@…>

  • Cc django@… added

comment:35 Changed 7 years ago by anonymous

  • Cc remco@… added

comment:36 Changed 7 years ago by anonymous

  • Cc eliott@… added

comment:37 Changed 7 years ago by mtredinnick

  • Keywords qs-rf added

comment:38 Changed 7 years ago by Marek Kubica <pythonmailing@…>

  • Cc pythonmailing@… added

comment:39 follow-up: Changed 7 years ago by miracle2k

Not sure if this has been mentioned already, but this patch breaks functionality to refer to columns added by extra().

comment:40 in reply to: ↑ 39 Changed 7 years ago by miracle2k

Replying to miracle2k:

Not sure if this has been mentioned already, but this patch breaks functionality to refer to columns added by extra().


To clarify: This is because the call to lookup_inner() in line 495 throws a TypeError when it encounters the new column. I fixed this with a try-except-pass for me, but that probably means that I can't use both (extra-columns and related tables) at the same time.

comment:41 Changed 7 years ago by davep@…

FWIW I just tripped across what I think is this bug also. Briefly, I have a model a bit like this...

class DayLocationHit(models.Model):
    location=models.ForeignKey(Location)
    date=models.DateField()
    hits=models.IntegerField(default=0)
    
    class Admin:
        pass
    class Meta:
        ordering=('location','date') 

And can't use the foreign key as the "major" sort criteria in the admin interface. The exception is "missing FROM-clause entry for table "keywordsites_location" LINE 1: ...ques" FROM "keywordsites_daylocationhit" ORDER BY "keywordsi... " on Postgres.

comment:42 Changed 7 years ago by mtredinnick

(In [6510]) queryset-refactor: Fixed a large bag of order_by() problems.

This also picked up a small bug in some twisted select_related() handling.

Introduces a new syntax for cross-model ordering: foobarbaz, using field
names, instead of a strange combination of table names and field names. This
might turn out to be backwards compatible (although the old syntax leads to
bugs and is not to be recommended).

Still to come: fixes for extra() handling, since the new syntax can't handle
that and doc updates.

Things are starting to get a bit slow here, so we might eventually have to
remove ordering by many-many and other multiple-result fields, since they don't
make a lot of sense in any case. For now, it's legal.

Refs #2076, #2874, #3002 (although the admin bit doesn't work yet).

comment:43 Changed 7 years ago by mtredinnick

(In [6511]) queryset-refactor: Added an order_by parameter to extra(). Refs #2076.

comment:44 Changed 7 years ago by mtredinnick

  • Keywords qs-rf-fixed added; qs-rf removed

comment:45 Changed 6 years ago by mtredinnick

  • Resolution set to fixed
  • Status changed from new to closed

(In [7477]) Merged the queryset-refactor branch into trunk.

This is a big internal change, but mostly backwards compatible with existing
code. Also adds a couple of new features.

Fixed #245, #1050, #1656, #1801, #2076, #2091, #2150, #2253, #2306, #2400, #2430, #2482, #2496, #2676, #2737, #2874, #2902, #2939, #3037, #3141, #3288, #3440, #3592, #3739, #4088, #4260, #4289, #4306, #4358, #4464, #4510, #4858, #5012, #5020, #5261, #5295, #5321, #5324, #5325, #5555, #5707, #5796, #5817, #5987, #6018, #6074, #6088, #6154, #6177, #6180, #6203, #6658

comment:46 Changed 6 years ago by aaron

This still seems broken to me (even after qs refactor, which is awesome in every other way, thanks Malcom!) when the related field may be null. For example:

# same as in modeltests.many_to_many_null.models
class Reporter(models.Model):
    name = models.CharField(max_length=30)

    def __unicode__(self):
        return self.name

class Article(models.Model):
    headline = models.CharField(max_length=100)
    reporter = models.ForeignKey(Reporter, null=True)

    class Meta:
        ordering = ('headline',)

    def __unicode__(self):
        return self.headline
"""
# Create a Reporter.
>>> r = Reporter(name='John Smith')
>>> r.save()

# Create an Article with no Reporter by passing "reporter=None".
>>> a = Article(headline="Ghost article", reporter=None)
>>> a.save()
>>> print a.reporter
None

>>> a = Article.objects.all()
>>> a
[<Article: Ghost article>]

>>> a = Article.objects.all().order_by('reporter__name')
>>> a
[<Article: Ghost article>]
"""

The last test there fails. But maybe I don't understand what the semantics should be here. It looks like the last test uses an inner join with reporter, when a left outer join might be more appropriate.

The same thing happens if you use 'reporter__name' in the default ordering in the Meta class of the Article model as well.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.