#2076 closed defect (fixed)
order_by with related table does not work
Reported by: | Malcolm Tredinnick | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | normal | Keywords: | qs-rf-fixed |
Cc: | gary.wilson@…, real.human@…, arthur.case@…, tom@…, mark@…, portland@…, floguy@…, django@…, remco@…, eliott@…, pythonmailing@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
This came up on IRC today: if you look at the order_by query that uses the related table Blogs in http://www.djangoproject.com/documentation/db_api/#order-by-fields, it will not actually work. The SQL we produce does not join the blogs_blog table to the blogs_entry table in the SQL query, although it does put in an "order by blogs_blog.name ASC" bit. We need to add in the extra table to the join as well, although this requires reordering the execution in django/db/models/query.py (in the _get_sql_clause() method).
A temporary hack for the moment is Entry.objects.select_related().order_by('blogs_blog.name', 'headline')
but that is a bit clunky.
We need a test for this, too.
(assigning to myself for now; just noting this here in case I get hit by a bus.)
Attachments (7)
Change History (53)
comment:1 by , 18 years ago
comment:2 by , 18 years ago
Cc: | added |
---|
comment:6 by , 18 years ago
Cc: | added |
---|
comment:8 by , 18 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
duplicate of #2210, which has a working patch.
by , 18 years ago
Attachment: | ticket-2076.diff added |
---|
Patch originally posted by serbaut to #2210 with tests added
comment:9 by , 18 years ago
Just attached a patch for this issue, created by serbaut and originally posted to ticket #2210 but IMHO this is better suited, see http://groups.google.ca/group/django-users/msg/c98e1631ce4b7c8e for the reasoning behind this.
The patch itself is updated to r4455 and I've added some tests of the new funcionality implemented by it.
follow-up: 19 comment:10 by , 18 years ago
Has patch: | set |
---|---|
Keywords: | reopen added |
Triage Stage: | Unreviewed → Ready for checkin |
#2210 is actually a different issue, see this comment on #2210
Can somebody from core please reopen this bug?
I'm attaching a slightly modified patch, two backticks were missing in a docstring :-)
Thanks, serbaut for the original patch, and Ramiro Morales for the tests!
by , 18 years ago
Attachment: | ticket-2076.2.diff added |
---|
slightly corrected version of previous patch (missing backticks)
comment:11 by , 18 years ago
I will repeat/expand here some thing I've noted when implementing the tests so Django developers can take them in account when they decide if/when apply this patch:
- The patch allows one to use
'relationship_name__related_model_field_name'
ordering specification in theorder_by()
QuerySet method but not in theMeta.ordering
attribute. - When you use a bare
'relationship_name'
ordering specification the patch orders things by the related model PK and not by itsMeta.ordering
attribute. By examining the Admin app it seems that at least in that app the intent is to do the ordering byMeta.ordering
if that exists and fallback to the PK otherwise. I don't know if this may be considered an inconsistency. - I have tested only ForeignKey relationships.
- I have not tested if the ordering specification can be used by spanning through reverse relationships.
- The patch does not seem to preserve backwards compatibility with the
'[-]appname_tablename.fieldname' order_by
specification notation suggested by the Django documentation (http://www.djangoproject.com/documentation/db_api/#order-by-fields). Would this be worth noting if the patch gets applied?.
comment:12 by , 18 years ago
Resolution: | duplicate |
---|---|
Status: | closed → reopened |
comment:13 by , 18 years ago
Keywords: | reopen removed |
---|
follow-up: 23 comment:15 by , 18 years ago
Status: | reopened → new |
---|---|
Triage Stage: | Ready for checkin → Accepted |
This is really great work. Thanks to everybody's who's pitched in so far.
Not quite "ready for checkin" yet because of Ramiro's detective work when constructing the tests. A few things I would like to see working before this goes in:
- Should work with the !Meta inner class as well, since that provides default ordering for a model and it's going to be the source of a lot of complaints (from me, no less!) if you can only do something manually and not automatically.
- When just specifying a relation field as the ordering attribute, we should use the related model's "natural" ordering (that specified in Meta) before primary key ordering. Again, it's the intuitive behaviour when selecting a model on its own, so making it consistent across the board is worthwhile.
- Check it works with all relations (not just ForeignKeys).
- Let's work out what happens with backwards relations (and also spanning across multiple models: does
model1__model2__some_field
work?)
I'm not too worried about the backwards-compat issue, since that's a bit of an ugly hack anyway -- it's the only place you are ever required to know the database table names (or even care that the database exists) in the "non-direct-SQL" cases of a QuerySet. Providing we can get this fixed before 1.0, I'm willing to defend the case for breaking it in a small way.
The current patch is a great start and looks like the right way to approach the problem. So if somebody feels enthusiastic and wants to do some work on this, these are the things I'd like to see done before it can be checked in.
comment:16 by , 18 years ago
Patch needs improvement: | set |
---|
comment:17 by , 18 years ago
Cc: | added |
---|
comment:18 by , 18 years ago
Cc: | added |
---|
follow-up: 20 comment:19 by , 18 years ago
Replying to Michael Radziej <mir@noris.de>:
[...]
I'm attaching a slightly modified patch, two backticks were missing in a docstring :-)
it's me or an unrelated modification to django/newforms/models.py
slipped in your ticket-2076.2.diff
?.
comment:20 by , 18 years ago
Replying to ramiro <rm0 _at_ gmx.net>:
it's me or an unrelated modification to
django/newforms/models.py
slipped in yourticket-2076.2.diff
?.
You're right. It probably crept in from a different ticket when I forgot to clean up.
by , 18 years ago
Attachment: | ticket-2076.3.diff added |
---|
corrected patch without changes in django/newforms/models.py
comment:21 by , 18 years ago
comment:22 by , 18 years ago
Just stubbed my toe on this bug. It would probably be a good idea to remove the information in the documentation or change it to the workaround.
comment:23 by , 18 years ago
Replying to mtredinnick:
Here is another iteration of the patch, not there yet because I don't have the required Django fu:
- Should work with the !Meta inner class as well, since that provides default ordering for a model and it's going to be the source of a lot of complaints (from me, no less!) if you can only do something manually and not automatically.
This is working now, it wasn't a problem with the patch per se but a validation made in django/code/management.py:get_validation_errors()
was triggering. I modified it to skip the validation just as it already does with the tablename.field
form.
- When just specifying a relation field as the ordering attribute, we should use the related model's "natural" ordering (that specified in Meta) before primary key ordering. Again, it's the intuitive behaviour when selecting a model on its own, so making it consistent across the board is worthwhile.
This is the one I haven't the skills to implement. serbaut@gmail.com
implemented it by calling the lookup_inner
function from QuerySet._get_sql_clause()
but I don know what other internal function should be used to take in account a recursive scenario like this: The example at http://www.djangoproject.com/documentation/db_api/ with the Blog
model having a relname
relationship to a 4th model X and a Meta.ordering = (relname,)
and trying to order the Entry
model instances by its blog
relationship field.
- Check it works with all relations (not just ForeignKeys).
- Let's work out what happens with backwards relations...
I've added (although a bit contrived) test cases for both ManyToMany
and reverse ForeignKey
and ManyToMany
relationships . Please verify the coverage of these tests.
...(and also spanning across multiple models: does
model1__model2__some_field
work?)
There was already a test case for this.
I'm not too worried about the backwards-compat issue, since that's a bit of an ugly hack anyway -- it's the only place you are ever required to know the database table names (or even care that the database exists) in the "non-direct-SQL" cases of a QuerySet. Providing we can get this fixed before 1.0, I'm willing to defend the case for breaking it in a small way.
Well, there is something strange about this because as I see the original patch intended to maintain that backwards compatibility but it didn't achieve it, so it's a case of a broken attempt to maintain compatibility with a broken (as per your original report) but documented feature.
For now I've just eliminated a obvious dead code path but I didn't delete all the related code. I will attach another iteration of the patch where the tablename.filename
is rejected in the validation and all traces of its support is eliminated from QuerySet._get_sql_clause()
.
Also, now db-api.txt
is modified to describe the feature, please check and expand because my English is not very good (the same applies for the
tests/modeltests/ordering/models.py
mods).
Test and enhance so it can be included in 0.96 :)
by , 18 years ago
Attachment: | ticket-2076.4.diff added |
---|
comment:24 by , 18 years ago
Replying to Ramiro Morales <rm0 _at_ gmx.net>:
[...] I will attach another iteration of the patch where the
tablename.filename
is rejected in the validation and all traces of its support is eliminated fromQuerySet._get_sql_clause()
.
Done, see attached ticket-2076.5.diff
This is the one I haven't the skills to implement.
serbaut@gmail.com
implemented it by calling thelookup_inner
function fromQuerySet._get_sql_clause()
but I don know what other internal function should be used to take in account a recursive scenario like this: The example at http://www.djangoproject.com/documentation/db_api/ with theBlog
model having arelname
relationship to a 4th model X and aMeta.ordering = (relname,)
and trying to order theEntry
model instances by itsblog
relationship field.
... or this better (worse?) scenario:
- The example at http://www.djangoproject.com/documentation/db_api/
Blog
model has arelname
relationship to a 4th. model X- its has
Meta.ordering = ('relname__a_field_from_model_X',)
- We try to order the
Entry
model instances by itsblog
relationship field.
- Check it works with all relations (not just ForeignKeys).
- Let's work out what happens with backwards relations...
I've added (although a bit contrived) test cases for both
ManyToMany
and reverseForeignKey
andManyToMany
relationships . Please verify the coverage of these tests.
Almost forgot to add this note: When creating the doctests for ManyToManyField
relationships I was getting
duplicate objects in the order_by()
query results so I needed to insert distinct()
calls (see the last three tests), this may be sign of an issue/shortcomming with the patch when interacting with that kind of relationships. I can't compare this with any previous pre-patch behavior because the select_related()
workaround doesn't work in the M2M case.
by , 18 years ago
Attachment: | ticket-2076.5.diff added |
---|
comment:25 by , 18 years ago
Cc: | added; removed |
---|
comment:26 by , 18 years ago
Cc: | added |
---|
by , 18 years ago
Attachment: | ticket-2076.6.diff added |
---|
Same as ticket-2076.5.diff with corrections in the documentation and testcases comments
comment:27 by , 18 years ago
There's one problem.
class A(models.Model): name = models.CharField() class B(models.Model): [...] a = ForeignKey(A, null=True)
The following query will use an INNER JOIN
to A:
B.objects.all().order_by(a__name)
This means that all objects B with B.a==None will be missing. So, this should use an OUTER JOIN
. I'm not sure whether this qualifies the patch as wrong, since the ORM fails to switch to OUTER joins in similar circumstances with, too.
Michael
comment:28 by , 18 years ago
The supplied testcases do not work with a postgresql database, since it requires that there is an order-by clause that matches the distinct clause. I get these error messages:
====================================================================== FAIL: Doctest: modeltests.ordering.models.__test__.API_TESTS ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/mir/lib/python/django/test/doctest.py", line 2156, in runTest raise self.failureException(self.format_failure(new.getvalue())) AssertionError: Failed doctest test for modeltests.ordering.models.__test__.API_TESTS File "/home/mir/src/django/trunk/tests/modeltests/ordering/models.py", line unknown line number, in API_TESTS ---------------------------------------------------------------------- File "/home/mir/src/django/trunk/tests/modeltests/ordering/models.py", line ?, in modeltests.ordering.models.__test__.API_TESTS Failed example: Reporter.objects.all().distinct() Exception raised: Traceback (most recent call last): File "/home/mir/lib/python/django/test/doctest.py", line 1243, in __run compileflags, 1) in test.globs File "<doctest modeltests.ordering.models.__test__.API_TESTS[38]>", line 1, in ? Reporter.objects.all().distinct() File "/home/mir/lib/python/django/db/models/query.py", line 104, in __repr__ return repr(self._get_data()) File "/home/mir/lib/python/django/db/models/query.py", line 505, in _get_data self._result_cache = list(self.iterator()) File "/home/mir/lib/python/django/db/models/query.py", line 189, in iterator cursor.execute("SELECT " + (self._distinct and "DISTINCT " or "") + ",".join(select) + sql, params) File "/home/mir/lib/python/django/db/backends/postgresql/base.py", line 43, in execute return self.cursor.execute(sql, [smart_basestring(p, self.charset) for p in params]) ProgrammingError: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list SELECT DISTINCT "ordering_reporter"."id","ordering_reporter"."name" FROM "ordering_reporter" LEFT OUTER JOIN "ordering_reporter_artic les" AS "m2m_ordering_reporter__articles" ON "ordering_reporter"."id" = "m2m_ordering_reporter__articles"."reporter_id" INNER JOIN "order ing_article" AS "ordering_reporter__articles" ON "m2m_ordering_reporter__articles"."article_id" = "ordering_reporter__articles"."id" ORDE R BY "ordering_reporter__articles"."headline" ASC
(and more for the other distinct queries). Unfortunately, the tests cannot be easily rephrased, since they really require distinct
.
The postgresql docs state:
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).
Though, the patch works great for me ;-)
comment:29 by , 18 years ago
With postgresql, test 'reserved_names' also fails. There's a strange line in it:
>>> Thing.objects.order_by('select.when')
The model is:
class Thing(models.Model): when = models.CharField(maxlength=1, primary_key=True) join = models.CharField(maxlength=1) like = models.CharField(maxlength=1) drop = models.CharField(maxlength=1) alter = models.CharField(maxlength=1) having = models.CharField(maxlength=1) where = models.DateField(maxlength=1) has_hyphen = models.CharField(maxlength=1, db_column='has-hyphen') class Meta: db_table = 'select'
So, select.when
does not make sense, but it used to work since the database table was called 'select'. IMHO, this is not a valid use. But the test case needs to be removed and the minor incompatibility should be noted.
comment:30 by , 18 years ago
Another test failure. The following test gives random results:
EditorResponse.objects.order_by('reader_letter__reader_name')
The appended patch fixes this test by requesting a second level ordering.
by , 18 years ago
Attachment: | order.diff added |
---|
comment:31 by , 18 years ago
I'm a Django newbie and innocently stumbled into the recursive case that Ramiro mentioned:
class A(models.Model): ... class B(models.Model): a = models.ForeignKey(A) ... class Meta: ordering = ('a',) class C(models.Model): b = models.ForeignKey(B) ... class Meta: ordering = ('b',)
In svn trunk, trying to list Cs in the admin interface gave me a MySQL error that there was no field app_b.a
Because A just orders by the PK anyway, I was able to work around it by specifying 'a_id' in for B's ordering (though this raises an error in model validation that B has no 'a_id' field):
class B(models.Model): a = models.ForeignKey(A) ... class Meta: ordering = ('a_id',)
Just for kicks, I applied ticket-2076.6 patch, and now it won't work either way. The original code errors out looking for app_c.app_b.a and the hacked code looks for app_c.app_b.a_id
I know you're already aware that this is a problem with the patch. Just chiming in to note that this two-layer FK relationship is quite natural and easy to write, even for a beginner, so I would think that it needs to Just Work in 1.0.
comment:32 by , 17 years ago
Cc: | added |
---|
comment:33 by , 17 years ago
Cc: | added |
---|
comment:34 by , 17 years ago
Cc: | added |
---|
comment:35 by , 17 years ago
Cc: | added |
---|
comment:36 by , 17 years ago
Cc: | added |
---|
comment:37 by , 17 years ago
Keywords: | qs-rf added |
---|
comment:38 by , 17 years ago
Cc: | added |
---|
follow-up: 40 comment:39 by , 17 years ago
Not sure if this has been mentioned already, but this patch breaks functionality to refer to columns added by extra().
comment:40 by , 17 years ago
Replying to miracle2k:
Not sure if this has been mentioned already, but this patch breaks functionality to refer to columns added by extra().
To clarify: This is because the call to lookup_inner() in line 495 throws a TypeError when it encounters the new column. I fixed this with a try-except-pass for me, but that probably means that I can't use both (extra-columns and related tables) at the same time.
comment:41 by , 17 years ago
FWIW I just tripped across what I think is this bug also. Briefly, I have a model a bit like this...
class DayLocationHit(models.Model): location=models.ForeignKey(Location) date=models.DateField() hits=models.IntegerField(default=0) class Admin: pass class Meta: ordering=('location','date')
And can't use the foreign key as the "major" sort criteria in the admin interface. The exception is "missing FROM-clause entry for table "keywordsites_location" LINE 1: ...ques" FROM "keywordsites_daylocationhit" ORDER BY "keywordsi... " on Postgres.
comment:42 by , 17 years ago
(In [6510]) queryset-refactor: Fixed a large bag of order_by() problems.
This also picked up a small bug in some twisted select_related() handling.
Introduces a new syntax for cross-model ordering: foobarbaz, using field
names, instead of a strange combination of table names and field names. This
might turn out to be backwards compatible (although the old syntax leads to
bugs and is not to be recommended).
Still to come: fixes for extra() handling, since the new syntax can't handle
that and doc updates.
Things are starting to get a bit slow here, so we might eventually have to
remove ordering by many-many and other multiple-result fields, since they don't
make a lot of sense in any case. For now, it's legal.
Refs #2076, #2874, #3002 (although the admin bit doesn't work yet).
comment:43 by , 17 years ago
comment:44 by , 17 years ago
Keywords: | qs-rf-fixed added; qs-rf removed |
---|
comment:45 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
(In [7477]) Merged the queryset-refactor branch into trunk.
This is a big internal change, but mostly backwards compatible with existing
code. Also adds a couple of new features.
Fixed #245, #1050, #1656, #1801, #2076, #2091, #2150, #2253, #2306, #2400, #2430, #2482, #2496, #2676, #2737, #2874, #2902, #2939, #3037, #3141, #3288, #3440, #3592, #3739, #4088, #4260, #4289, #4306, #4358, #4464, #4510, #4858, #5012, #5020, #5261, #5295, #5321, #5324, #5325, #5555, #5707, #5796, #5817, #5987, #6018, #6074, #6088, #6154, #6177, #6180, #6203, #6658
comment:46 by , 17 years ago
This still seems broken to me (even after qs refactor, which is awesome in every other way, thanks Malcom!) when the related field may be null. For example:
# same as in modeltests.many_to_many_null.models class Reporter(models.Model): name = models.CharField(max_length=30) def __unicode__(self): return self.name class Article(models.Model): headline = models.CharField(max_length=100) reporter = models.ForeignKey(Reporter, null=True) class Meta: ordering = ('headline',) def __unicode__(self): return self.headline
""" # Create a Reporter. >>> r = Reporter(name='John Smith') >>> r.save() # Create an Article with no Reporter by passing "reporter=None". >>> a = Article(headline="Ghost article", reporter=None) >>> a.save() >>> print a.reporter None >>> a = Article.objects.all() >>> a [<Article: Ghost article>] >>> a = Article.objects.all().order_by('reporter__name') >>> a [<Article: Ghost article>] """
The last test there fails. But maybe I don't understand what the semantics should be here. It looks like the last test uses an inner join with reporter, when a left outer join might be more appropriate.
The same thing happens if you use 'reporter__name
' in the default ordering in the Meta class of the Article model as well.
Apparently, the select_related() workaround doesn't work for ForeignKey with null=True.