#9041 closed (invalid)
invalid count result when paired with select_related
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.0 |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
When using a select_related, I sometimes get invalid results if the select_related method reject some data because they are not valid.
If, for example, I have a model that requires a reference to another object and that reference is, for some reason, None, these entries
will be rejected by the select_related.
The problem is that the Paginator object uses the count method to tell the user how many items are found, but when we try to use these items, we only get an empty page.
>>> a = Session.objects.filter(Key__value = "www").select_related() >>> a.count() 29L >>> len(a) 3
>>> from django.core.paginator import Paginator >>> from metrix2.manager.models import Session >>> q = Session.objects.filter(Key__string = "2PSpuaRIZGNj").select_related() >>> p = Paginator(q, 15) >>> p.count 29L >>> p.num_pages 2 >>> p.page(1).object_list [<Session: xxx>, <Session: yyy>, <Session: zzz>] >>> p.page(2).object_list []
Change History (7)
comment:1 by , 16 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|
comment:2 by , 16 years ago
milestone: | → 1.1 |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:3 by , 16 years ago
comment:4 by , 16 years ago
I also need to know the other models that Session
links to: in other words, what else should be being pulled in by the select_related()
call.
comment:5 by , 16 years ago
As I said in my first message, this happens only when the data in the db is not valid for django (when integrating legacy data, for example, or when the django model has changed). For example, if we have the following model
from django.db import models class ForeignModel(models.Model): pass class Key(models.Model): value = models.CharField(blank = False, max_length = 12) foreignmodel = models.ForeignKey(ForeignModel, blank = True, null = True) class Session(models.Model): key = models.ForeignKey(Key)
If we add some data (and some where the foreignmodel value is null), and we then change the foreignmodel field to require a value, this triggers the bug:
foreignmodel = models.ForeignKey(ForeignModel)
Of course, the data is then invalid and the database does not reflect the django model. The QuerySet count() method is using "select count(field)" for obvious speed reasons, but do not seem to perform the select_related.
For example, the current implementation of:
>>> Session.objects.filter(key__value = "1").select_related().count() 15
Seems to be translated to the following SQL code:
> SELECT COUNT(S.id) FROM example_session S, example_key K WHERE S.key_id = K.id AND K.value = "1"; 15
It should probably be modified to the following SQL code:
> SELECT COUNT(S.id) FROM example_session S, example_key K, example_foreignmodel F WHERE S.key_id = K.id AND K.foreignmodel_id = F.id AND K.value = "1"; 7
This would return the same data as the following django code:
>>> len(Session.objects.filter(key__value = "1").select_related()) 7
comment:6 by , 16 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
This example seems to be based on some misunderstanding, possibly in how select_related()
works (since you're telling Django to use it in an inappropriate situation).
Pulling in the extra table in your "preferred" SQL should have precisely zero effect for the count query you asked for. None of the fields in that table are being used, so Django (correctly) ignores the select_related()
piece, which is just an optimisation for when we are pulling back data from that table. You are trying to use select_related()
as some way to filter out rows by only counting rows that have a matching row in the corresponding related table (via the inner join on the tables).
This isn't a question of Django needing to know about all the columns in the related table or not. It simply doesn't care, since it makes no difference to the final result (counting the id values in the "S") table. You aren't asking for a count where values in the related table exist, you are asking simply for a count (you can do the former query, but you have to explicitly say so).
You have told Django that every instance of Key
has a corresponding entry in ForeignKey
and, yet, it appears that that isn't actually correct, which is why the count and the select_related()
are returning different results. You can omit columns in the database declaration, but you can't lie about the type of join required to pull in related columns.
Django only knows that you said an inner join would suffice for selecting all Key
entries when connecting to ForeignKey
, since every row has a corresponding value in the other table. This turns out not to be correct, which is why you're seeing what you're seeing.
Change your model definition to include null=True
on the related field and the results will make sense.
There's no Django bug here.
This is difficult to diagnose, since the problem is highly dependent on the original models. In particular, which relations are nullable (have
null=True
). Can you provide a description of the models you're using. I don't need all the gory details, but at least the outline ofSession
,Key
and how they're linked together.