#24289 closed Bug (fixed)
Is usage of many_to_one and one_to_many terms confusing for relation flags?
Reported by: | Anssi Kääriäinen | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.8alpha1 |
Severity: | Release blocker | Keywords: | |
Cc: | Loic Bistuer, pirosb3, cmawebsite@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
EDIT: we haven't mixed the terms up, I did that myself. The docs might need an update though.
It seems we have mixed up the naming of many_to_one and one_to_many. For example these references point out that ForeignKey is many_to_one: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_8.2.0/com.ibm.db2.udb.doc/admin/c0004733.htm, https://docs.djangoproject.com/en/1.7/topics/db/examples/many_to_one/.
But in Django, ForeignKey is one_to_many.
I'm not going to mark this as accepted, as I always get confused with many-to-one and one-to-many. So, please do triage this carefully!
If at all possible I think we should use something that is less easy to get mixed up. I never recall how these terms should be interpreted, and I guess I am not alone here. Some ideas:
- Just is_multivalued flag: Check if field is one to many: "not field.is_multivalued and field.rel.is_multivalued"
- local_multivalued and remote_multivalued flags
My experience from ORM work is that the interesting thing to check is if the relation can have multiple values on the remote side. The local multi-valuedness isn't usually that interesting (although it might be in other contexts).
Change History (24)
comment:1 by , 10 years ago
Description: | modified (diff) |
---|---|
Resolution: | → invalid |
Status: | new → closed |
Summary: | ForeignKey is many_to_one, not one_to_many → Is usage of many_to_one and one_to_many terms confusing for relation flags? |
comment:2 by , 10 years ago
For the record - This came up during the meta refactor work as well. I agree it's confusing. But the thing is, *either* way is going to confuse someone.
As currently defined, ForeignKeys are One-to-many, based on the interpretation that there is "one" value on the side that defines them, and "many" on the other side. However, it's easy to look at it the other way, and say that a ForeignKey is defining the fact that there where it is defined, it's going to point to "many" objects, and on the other side, it will point to "one" object.
It's worth noting that this same confusion exists in various Entity-Relationship diagram notations - the side on which you put the visual "many" varies (c.f. Martin/Crow foot and IDEF1X notation).
So, for me, this is a bikeshed; however, it's a bikeshed with a precedent that goes back 10 years, so I'd be against changing it.
comment:3 by , 10 years ago
Cc: | added |
---|
I understand that I'm probably wrong because this seems logical to a few people, but I just can't get it for some reason.
ForeignKeys are One-to-many, based on the interpretation that there is "one" value on the side that defines them, and "many" on the other side.
However, it's easy to look at it the other way, and say that a ForeignKey is defining the fact that there where it is defined, it's going to point to "many" objects, and on the other side, it will point to "one" object.
Both statements read the same to me, the first one says "one" value on the side that defines them
the other one says where it is defined, it's going to point to "many" objects
(i.e. if the side it points to is the "many" side, then it is itself the "one" side).
From my perspective:
class Employee(Model): company = ForeignKey(to=Company) # many-to-one >>> Employee.objects.first().company <Company: 1> # one-to-many >>> Company.objects.first().employee_set [<Employee: 1>, <Employee: 2>, <Employee: 3>]
Employee
where it is defined points to "exactly one object" (Company
), the reverse field however Company.employee_set
points to "many objects" (all the employees).
comment:4 by , 10 years ago
Or to give another example:
>>> Employee._meta.get_field('company).to <class 'app.models.Company'> >>> Employee._meta.get_field('company).one_to_many True
Therefore I'd expect many companies when I do employee.company
.
Technically to
lives on Field.rel
, and not on Field.to
but this is likely to go away when we introduce reverse fields. rel
is ambiguous because since the meta refactor it means both "the relation" and "the reverse field".
comment:5 by , 10 years ago
Resolution: | invalid |
---|---|
Status: | closed → new |
Triage Stage: | Unreviewed → Accepted |
I am reopening this because we have, at the very least, an inconsistency within Django itself. The documentation at https://docs.djangoproject.com/en/dev/topics/db/examples/many_to_one/ says that ForeignKey
defines a many-to-one relationship, but an actual ForeignKey
claims to be one-to-many in the field flags. I suppose this inconsistency could be justified by saying "yes, a ForeignKey -defines- a many-to-one in the opposite direction", but I think that's a pretty stretched justification.
Russell, I don't understand your claim that we have "a precedent that goes back 10 years" here. We're discussing the one_to_many
and many_to_one
field flags, which are new in Django 1.8. git grep one_to_many
returns zero results in the stable/1.7.x
branch. If we have any 10-year precedent here, it's in the documentation I linked, which uses the opposite terminology from what we've now introduced in the field flags.
It does seem to me that usage is mixed: the IBM docs at http://www-01.ibm.com/support/knowledgecenter/SSEPGG_8.2.0/com.ibm.db2.udb.doc/admin/c0004733.htm do say that "the relationship between employees (single-valued) and departments (multi-valued) is a one-to-many relationship" in a case where Employee has an FK to Department. That would be consistent with our current flags (but not with our docs).
But my searching seems to indicate that the opposite usage is much more common in practice:
- http://en.wikipedia.org/wiki/Foreign_key says that "the relationship between the two tables is called a one to many relationship between the referenced table and the referencing table" (note that the table with the FK is the "referencing" table, so this is saying that a forward FK is many-to-one and the reverse is one-to-many.)
- http://www.databaseprimer.com/pages/relationship_1tox/ says that "In a one-to-many relationship between Table A and Table B, each row in Table A is linked to 0, 1 or many rows in Table B." Again, this would indicate that an FK is many-to-one.
- The accepted answer at http://stackoverflow.com/questions/16119531/hibernate-jpa-manytoone-vs-onetomany indicates that in Hibernate, an FK is
ManyToOne
and the reverse relationship isOneToMany
.
On the whole, it seems to me that we should go with the more common (and, IMO, more intuitive) usage, which is also what's been in our docs for quite a long time. This would imply reversing the current field flags, so that a ForeignKey
becomes many_to_one
. But I certainly can't point to conclusive evidence that this is correct.
comment:6 by , 10 years ago
The other option, as akaariai suggested, would be to abandon this apparently ambiguous terminology and use something else instead, like remote_multivalued
and local_multivalued
or something. The downside of this is that ManyToMany
and OneToOne
are both firmly embedded in Django terminology, so it makes some sense to continue the pattern and have many_to_one
and one_to_many
cardinality flags.
comment:7 by , 10 years ago
I agree with comment 5. For me ForeignKey is ManyToOne. Many instances of the model that has the FK can point to one instance of the target model.
comment:8 by , 10 years ago
Severity: | Normal → Release blocker |
---|
Bumping to Release Blocker.
Even if we decide on the status quo, we need to make a decision before 1.8 is released.
comment:9 by , 10 years ago
It looks like we have myself, Loic, and Aymeric in favor of swapping the current usage of many_to_one
and one_to_many
field flags. Anssi or Russell, do you have objections to moving ahead with that change?
comment:10 by , 10 years ago
I'm more in favour of using different terminology, though I don't have any suggestions on what might be 'better'.
I had justified the status quo somehow when we discussed this on IRC awhile back, but I can't justify it anymore. The proposed change makes sense.
ForeignKey = many_to_one
ReverseRel = one_to_many
comment:11 by , 10 years ago
Cc: | added |
---|
Also adding PirosB3 to cc in case he has additional thoughts on this, since he wrote the code.
comment:12 by , 10 years ago
Cc: | added |
---|
I believe the precedent that Russ is talking about is ManyToOneRel
(ie author.book_set
), which I also think should be a one_to_many.
ForeignKeys are One-to-many, based on the interpretation that there is "one" value on the side that defines them, and "many" on the other side.
The UI for this is a dropdown where for a book, you choose (one) of "many" (possible) authors. However, once the value is chosen and the relationship is set, there's only one author.
follow-up: 16 comment:13 by , 10 years ago
I don't think reversing the many_to_one and one_to_many definitions will help. The terms are confusing, no matter which way they are used, and I believe the terms are used the correct way (as in, the most often seen definition)
I see two ways forward:
- Keep the flags as is. Maybe add a bit more documentation somewhere that tell explicitly what these flags mean in Django.
- Use some other flags for relations. A possibility is local_multiple and remote_multiple flags. These flags would tell if a single object can have multiple objects. remote_multiple tells that the local object can have multiple objects on the remote side. local_multiple is the opposite: can a single remote object have multiple objects on the local side. Still a bit confusing, but I think these are easier to recall correctly.
A bonus of the second approach is that I believe the local_multiple and remote_multiple properties are what we are most often interested about: we can select_related if the relation is remote_multiple=False. We need to use subqueries with .exclude() when the relation is remote_multiple=True. The widget in forms is SelectMultiple when remote_multiple=True.
We could also use just a has_many flag, but then it's hard to say if an ArrayField should have has_many set. The remote_multiple is better, as it is explicitly about relations.
We could even have both n_to_n flags, and the local_multiple and remote_multiple flags. We could use the checks framework to warn if the flags aren't defined in a consistent way.
comment:14 by , 10 years ago
From my understanding historically Field.rel
meant "relation", not "related field". This only changed with #21414 where we replaced RelatedObject
by *Rel
objects and the meta refactor that made these show up in get_fields()
.
So if anything I find that isinstance(ForeignKey.rel, ManyToOneRel) == True
is consistent with ForeignKey.many_to_one == True
.
Also I wouldn't get to hung up on the naming of *Rel
or *Descriptor
objects, naming of things in related.py
is extremely confusing and inconsistent.
I'm really +1 on swapping them, the "to" from ForeignKey(to=Company)
should mean the same thing as the "to" from "ForeignKey.many_to_one" IMO.
I really don't like has_many
(it was initially part of the meta refactor) specifically because of fields like ArrayField
.
comment:15 by , 10 years ago
Using something like "remote_many" is also fine with me.
One thing we many need to do it try going through the codebase and changing some isinstance() check to use these flags. I know the admin for example, needs to be able to predict how things will cascade when deleting an object, which a "field.one_to_one" flag is not a good enough on its own for that.
comment:16 by , 10 years ago
Hi Anssi,
Replying to akaariai:
I don't think reversing the many_to_one and one_to_many definitions will help. The terms are confusing, no matter which way they are used,
I think either way we use them, most people will simply adapt their code to Django's and it will be fine. But I still think we should use them as correctly and consistently as we can, and in the way that is least likely to result in confusion.
and I believe the terms are used the correct way (as in, the most often seen definition)
I am curious if you have evidence to support this, because I have concluded the opposite, based on my searches, as documented in comment 5.
The strongest evidence I have is that Hibernate uses a @ManyToOne
annotation to annotate the equivalent of a Django ForeignKey, and a @OneToMany
annotation to annotate the reverse relationship. Hibernate is a very well-known and widely-used Java ORM -- do you think that their usage is wrong/uncommon?
I see two ways forward:
- Keep the flags as is. Maybe add a bit more documentation somewhere that tell explicitly what these flags mean in Django.
I have not seen any good argument yet to stay with the status quo, instead of flipping them to match the usage in e.g. Hibernate.
- Use some other flags for relations. A possibility is local_multiple and remote_multiple flags. These flags would tell if a single object can have multiple objects. remote_multiple tells that the local object can have multiple objects on the remote side. local_multiple is the opposite: can a single remote object have multiple objects on the local side. Still a bit confusing, but I think these are easier to recall correctly.
Possibly. My only problem with this is that since we have ManyToManyField
and OneToOneField
(and those names are certainly not changing), it makes a lot of sense to keep the many_to_many
and one_to_one
flags. And once we have those two, it seems like a natural extension to also have many_to_one
and one_to_many
, rather than introducing an entirely new terminology.
A bonus of the second approach is that I believe the local_multiple and remote_multiple properties are what we are most often interested about: we can select_related if the relation is remote_multiple=False. We need to use subqueries with .exclude() when the relation is remote_multiple=True. The widget in forms is SelectMultiple when remote_multiple=True.
This is a good point.
We could also use just a has_many flag, but then it's hard to say if an ArrayField should have has_many set. The remote_multiple is better, as it is explicitly about relations.
We could even have both n_to_n flags, and the local_multiple and remote_multiple flags. We could use the checks framework to warn if the flags aren't defined in a consistent way.
I like this idea in theory, though I would like to see a patch for this to verify that local_multiple
and remote_multiple
are in fact as useful internally as it seems they should be.
I think if we introduce local_multiple
and remote_multiple
with the semantics you suggest, it makes it even more important that we swap the current usage of many_to_one
and one_to_many
. It just seems entirely wrong to have a field labeled one_to_many
with local_multiple=True
and remote_multiple=False
, which is what the status quo would mean.
comment:18 by , 10 years ago
It's worth noting that SQLAlchemy also uses both the "many to one" and "one to many" terminologies.
It's well documented http://docs.sqlalchemy.org/en/rel_0_9/orm/basic_relationships.html#basic-relationship-patterns
The relation accessor on the model with the FK column is a many-to-one, the relation accessor on the other model is a one-to-many, so our status quo is effectively reverse to SQLAlchemy.
That both Hibernate and SQLAlchemy use the "many to one" and "one to many" terminologies comforts me in the idea that these aren't too confusing provided they are used in a logical way and properly documented. I don't think we should scrap them on the premise that these will always confuse people, I see value in being consistent with other major ORMs.
follow-up: 21 comment:19 by , 10 years ago
Thanks for tracking that down, Loic. In my mind that settles the question definitively - unless we remove these flags entirely, we must make them consistent with the usage of this terminology in other major ORMs.
I say we move forward on this ticket before the beta release by swapping the current usage of many_to_one
and one_to_many
in both 1.8 and master, and not introducing any new flags at the moment (though if someone is motivated to work on that for 1.9, I'm open to it.)
Loic, I won't have time to do the PR today - any chance you might?
comment:20 by , 10 years ago
In that case, given that other major ORMs use the completely opposite naming, I'm +1 on adopting to their naming.
comment:21 by , 10 years ago
Loic, I won't have time to do the PR today - any chance you might?
https://github.com/django/django/pull/4127 let's see what Jenkins has to say.
comment:22 by , 10 years ago
Has patch: | set |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Patch looks good to me; smaller than I expected. I guess we aren't using these flags that many places yet, as collinanderson pointed out.
Search-and-replace FTW :-)
comment:23 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
It seems I did the mixup based on the first sentence of Django's docs (which I guess is technically correct, you use ForeignKey to add the automatically created many-to-one relation), and table 3 of IBM's docs, the first line in there seems to be many-to-one, although the meaning of the header is that the relationships are many-to-one, but the direction the terms appear in the table can actually be one-to-many (nice!).
I'll mark this as invalid. I still do think the terms are confusing.