#15200 closed (wontfix)
ManyToManyField combined with "|" leads to trouble
Reported by: | Klaas van Schelven | Owned by: | nobody |
---|---|---|---|
Component: | Uncategorized | Version: | 1.2 |
Severity: | Keywords: | ||
Cc: | Klaas van Schelven | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
ManyToManyField combined with "|" leads to trouble.
Specifically, the outer join that appears to be constructed for manytomany field lookups is not well cleaned up with where statements once an "or" ("|") is used.
$ cat theapp/models.py from django.db import models class AnObject(models.Model): nr = models.IntegerField() class HasRef(models.Model): obj = models.ManyToManyField(AnObject) field = models.IntegerField()
>>> from theapp.models import * >>> one = AnObject.objects.create(nr=1) >>> two = AnObject.objects.create(nr=2) >>> reffer = HasRef.objects.create(field=1) >>> reffer.obj.add(one) >>> reffer.obj.add(two) >>> HasRef.objects.all().filter(field=1) [<HasRef: HasRef object>] >>> HasRef.objects.all().filter(obj=one) [<HasRef: HasRef object>] >>> HasRef.objects.all().filter(models.Q(field=1) | models.Q(obj=one)) [<HasRef: HasRef object>, <HasRef: HasRef object>] >>> [o.pk for o in HasRef.objects.all().filter(models.Q(field=1) | models.Q(obj=one))] [1L, 1L]
I don't have the time right now to come up with tests proper, or a patch. The problem is real though...
My workaround:
>>> HasRef.objects.all().filter(models.Q(field=1) | models.Q(obj=one)).annotate(models.Count("id"))
The generated query (without workaround)
>>> print HasRef.objects.all().filter(models.Q(field=1) | models.Q(obj=AnObject.objects.get(nr=1))).query SELECT `theapp_hasref`.`id`, `theapp_hasref`.`field` FROM `theapp_hasref` LEFT OUTER JOIN `theapp_hasref_obj` ON (`theapp_hasref`.`id` = `theapp_hasref_obj`.`hasref_id`) WHERE (`theapp_hasref`.`field` = 1 OR `theapp_hasref_obj`.`anobject_id` = 1 )
Change History (5)
comment:1 by , 14 years ago
Cc: | added |
---|
comment:2 by , 14 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:3 by , 14 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Yes I'm indeed referring to the fact that there's a duplicate in the results. IMHO "abstraction leakage" is a bug and though it may be overcome with distinct() I think this is a little more fundamental.
comment:4 by , 14 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
Yes, but the issue is that it's a *documented* leaky abstraction, and one that is very difficult to avoid without imposing a large computational overhead -- which is why the leaky abstraction exists.
Trust me -- we've thought about this. That's why it's a documented leaky abstraction. If you want to advocate for a different, less leaky interpretation, feel free to raise the issue on django dev. But be warned: you're not going to get much attention unless you come armed with an actual solution in the form of working code, or, at the very least, a general SQL solution to the problem.
comment:5 by , 14 years ago
Hmmm... I guess you're talking about the following snippet:
By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don't introduce the possibility of duplicate result rows. However, if your query spans multiple tables, it's possible to get duplicate results when a QuerySet is evaluated. That's when you'd use distinct().
IMHO this kind of leakiness is the worst kind (it happens only very rarely, and may happen intermittently depending on your data). Anyway I'm letting it rest for now.
Are you referring to the fact that there is a duplicate in the results? If this is the case, then this is a case of abstraction leakage, overcome with the use of distinct()