Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#24289 closed Bug (fixed)

Is usage of many_to_one and one_to_many terms confusing for relation flags?

Reported by: Anssi Kääriäinen Owned by: nobody
Component: Database layer (models, ORM) Version: 1.8alpha1
Severity: Release blocker Keywords:
Cc: Loic Bistuer, pirosb3, cmawebsite@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Anssi Kääriäinen)

EDIT: we haven't mixed the terms up, I did that myself. The docs might need an update though.

It seems we have mixed up the naming of many_to_one and one_to_many. For example these references point out that ForeignKey is many_to_one: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_8.2.0/com.ibm.db2.udb.doc/admin/c0004733.htm, https://docs.djangoproject.com/en/1.7/topics/db/examples/many_to_one/.

But in Django, ForeignKey is one_to_many.

I'm not going to mark this as accepted, as I always get confused with many-to-one and one-to-many. So, please do triage this carefully!

If at all possible I think we should use something that is less easy to get mixed up. I never recall how these terms should be interpreted, and I guess I am not alone here. Some ideas:

  • Just is_multivalued flag: Check if field is one to many: "not field.is_multivalued and field.rel.is_multivalued"
  • local_multivalued and remote_multivalued flags

My experience from ORM work is that the interesting thing to check is if the relation can have multiple values on the remote side. The local multi-valuedness isn't usually that interesting (although it might be in other contexts).

Change History (24)

comment:1 by Anssi Kääriäinen, 9 years ago

Description: modified (diff)
Resolution: invalid
Status: newclosed
Summary: ForeignKey is many_to_one, not one_to_manyIs usage of many_to_one and one_to_many terms confusing for relation flags?

It seems I did the mixup based on the first sentence of Django's docs (which I guess is technically correct, you use ForeignKey to add the automatically created many-to-one relation), and table 3 of IBM's docs, the first line in there seems to be many-to-one, although the meaning of the header is that the relationships are many-to-one, but the direction the terms appear in the table can actually be one-to-many (nice!).

I'll mark this as invalid. I still do think the terms are confusing.

comment:2 by Russell Keith-Magee, 9 years ago

For the record - This came up during the meta refactor work as well. I agree it's confusing. But the thing is, *either* way is going to confuse someone.

As currently defined, ForeignKeys are One-to-many, based on the interpretation that there is "one" value on the side that defines them, and "many" on the other side. However, it's easy to look at it the other way, and say that a ForeignKey is defining the fact that there where it is defined, it's going to point to "many" objects, and on the other side, it will point to "one" object.

It's worth noting that this same confusion exists in various Entity-Relationship diagram notations - the side on which you put the visual "many" varies (c.f. Martin/Crow foot and IDEF1X notation).

So, for me, this is a bikeshed; however, it's a bikeshed with a precedent that goes back 10 years, so I'd be against changing it.

comment:3 by Loic Bistuer, 9 years ago

Cc: Loic Bistuer added

I understand that I'm probably wrong because this seems logical to a few people, but I just can't get it for some reason.

ForeignKeys are One-to-many, based on the interpretation that there is "one" value on the side that defines them, and "many" on the other side.

However, it's easy to look at it the other way, and say that a ForeignKey is defining the fact that there where it is defined, it's going to point to "many" objects, and on the other side, it will point to "one" object.

Both statements read the same to me, the first one says "one" value on the side that defines them the other one says where it is defined, it's going to point to "many" objects (i.e. if the side it points to is the "many" side, then it is itself the "one" side).

From my perspective:

class Employee(Model):
    company = ForeignKey(to=Company)

# many-to-one
>>> Employee.objects.first().company
<Company: 1>

# one-to-many
>>> Company.objects.first().employee_set
[<Employee: 1>, <Employee: 2>, <Employee: 3>]

Employee where it is defined points to "exactly one object" (Company), the reverse field however Company.employee_set points to "many objects" (all the employees).

comment:4 by Loic Bistuer, 9 years ago

Or to give another example:

>>> Employee._meta.get_field('company).to
<class 'app.models.Company'>
>>> Employee._meta.get_field('company).one_to_many
True

Therefore I'd expect many companies when I do employee.company.

Technically to lives on Field.rel, and not on Field.to but this is likely to go away when we introduce reverse fields. rel is ambiguous because since the meta refactor it means both "the relation" and "the reverse field".

comment:5 by Carl Meyer, 9 years ago

Resolution: invalid
Status: closednew
Triage Stage: UnreviewedAccepted

I am reopening this because we have, at the very least, an inconsistency within Django itself. The documentation at https://docs.djangoproject.com/en/dev/topics/db/examples/many_to_one/ says that ForeignKey defines a many-to-one relationship, but an actual ForeignKey claims to be one-to-many in the field flags. I suppose this inconsistency could be justified by saying "yes, a ForeignKey -defines- a many-to-one in the opposite direction", but I think that's a pretty stretched justification.

Russell, I don't understand your claim that we have "a precedent that goes back 10 years" here. We're discussing the one_to_many and many_to_one field flags, which are new in Django 1.8. git grep one_to_many returns zero results in the stable/1.7.x branch. If we have any 10-year precedent here, it's in the documentation I linked, which uses the opposite terminology from what we've now introduced in the field flags.

It does seem to me that usage is mixed: the IBM docs at http://www-01.ibm.com/support/knowledgecenter/SSEPGG_8.2.0/com.ibm.db2.udb.doc/admin/c0004733.htm do say that "the relationship between employees (single-valued) and departments (multi-valued) is a one-to-many relationship" in a case where Employee has an FK to Department. That would be consistent with our current flags (but not with our docs).

But my searching seems to indicate that the opposite usage is much more common in practice:

  • http://en.wikipedia.org/wiki/Foreign_key says that "the relationship between the two tables is called a one to many relationship between the referenced table and the referencing table" (note that the table with the FK is the "referencing" table, so this is saying that a forward FK is many-to-one and the reverse is one-to-many.)

On the whole, it seems to me that we should go with the more common (and, IMO, more intuitive) usage, which is also what's been in our docs for quite a long time. This would imply reversing the current field flags, so that a ForeignKey becomes many_to_one. But I certainly can't point to conclusive evidence that this is correct.

comment:6 by Carl Meyer, 9 years ago

The other option, as akaariai suggested, would be to abandon this apparently ambiguous terminology and use something else instead, like remote_multivalued and local_multivalued or something. The downside of this is that ManyToMany and OneToOne are both firmly embedded in Django terminology, so it makes some sense to continue the pattern and have many_to_one and one_to_many cardinality flags.

comment:7 by Aymeric Augustin, 9 years ago

I agree with comment 5. For me ForeignKey is ManyToOne. Many instances of the model that has the FK can point to one instance of the target model.

comment:8 by Loic Bistuer, 9 years ago

Severity: NormalRelease blocker

Bumping to Release Blocker.

Even if we decide on the status quo, we need to make a decision before 1.8 is released.

comment:9 by Carl Meyer, 9 years ago

It looks like we have myself, Loic, and Aymeric in favor of swapping the current usage of many_to_one and one_to_many field flags. Anssi or Russell, do you have objections to moving ahead with that change?

comment:10 by Josh Smeaton, 9 years ago

I'm more in favour of using different terminology, though I don't have any suggestions on what might be 'better'.

I had justified the status quo somehow when we discussed this on IRC awhile back, but I can't justify it anymore. The proposed change makes sense.

ForeignKey = many_to_one
ReverseRel = one_to_many

comment:11 by Carl Meyer, 9 years ago

Cc: pirosb3 added

Also adding PirosB3 to cc in case he has additional thoughts on this, since he wrote the code.

comment:12 by Collin Anderson, 9 years ago

Cc: cmawebsite@… added

I believe the precedent that Russ is talking about is ManyToOneRel (ie author.book_set), which I also think should be a one_to_many.

ForeignKeys are One-to-many, based on the interpretation that there is "one" value on the side that defines them, and "many" on the other side.

The UI for this is a dropdown where for a book, you choose (one) of "many" (possible) authors. However, once the value is chosen and the relationship is set, there's only one author.

comment:13 by Anssi Kääriäinen, 9 years ago

I don't think reversing the many_to_one and one_to_many definitions will help. The terms are confusing, no matter which way they are used, and I believe the terms are used the correct way (as in, the most often seen definition)

I see two ways forward:

  • Keep the flags as is. Maybe add a bit more documentation somewhere that tell explicitly what these flags mean in Django.
  • Use some other flags for relations. A possibility is local_multiple and remote_multiple flags. These flags would tell if a single object can have multiple objects. remote_multiple tells that the local object can have multiple objects on the remote side. local_multiple is the opposite: can a single remote object have multiple objects on the local side. Still a bit confusing, but I think these are easier to recall correctly.

A bonus of the second approach is that I believe the local_multiple and remote_multiple properties are what we are most often interested about: we can select_related if the relation is remote_multiple=False. We need to use subqueries with .exclude() when the relation is remote_multiple=True. The widget in forms is SelectMultiple when remote_multiple=True.

We could also use just a has_many flag, but then it's hard to say if an ArrayField should have has_many set. The remote_multiple is better, as it is explicitly about relations.

We could even have both n_to_n flags, and the local_multiple and remote_multiple flags. We could use the checks framework to warn if the flags aren't defined in a consistent way.

comment:14 by Loic Bistuer, 9 years ago

From my understanding historically Field.rel meant "relation", not "related field". This only changed with #21414 where we replaced RelatedObject by *Rel objects and the meta refactor that made these show up in get_fields().

So if anything I find that isinstance(ForeignKey.rel, ManyToOneRel) == True is consistent with ForeignKey.many_to_one == True.

Also I wouldn't get too hung up on the naming of *Rel or *Descriptor objects, naming of things in related.py is extremely confusing and inconsistent.

I'm really +1 on swapping them, the "to" from ForeignKey(to=Company) should mean the same thing as the "to" from "ForeignKey.many_to_one" IMO.

I really don't like has_many (it was initially part of the meta refactor) specifically because of fields like ArrayField.

Last edited 9 years ago by Loic Bistuer (previous) (diff)

comment:15 by Collin Anderson, 9 years ago

Using something like "remote_many" is also fine with me.

One thing we many need to do it try going through the codebase and changing some isinstance() check to use these flags. I know the admin for example, needs to be able to predict how things will cascade when deleting an object, which a "field.one_to_one" flag is not a good enough on its own for that.

in reply to:  13 comment:16 by Carl Meyer, 9 years ago

Hi Anssi,

Replying to akaariai:

I don't think reversing the many_to_one and one_to_many definitions will help. The terms are confusing, no matter which way they are used,

I think either way we use them, most people will simply adapt their code to Django's and it will be fine. But I still think we should use them as correctly and consistently as we can, and in the way that is least likely to result in confusion.

and I believe the terms are used the correct way (as in, the most often seen definition)

I am curious if you have evidence to support this, because I have concluded the opposite, based on my searches, as documented in comment 5.

The strongest evidence I have is that Hibernate uses a @ManyToOne annotation to annotate the equivalent of a Django ForeignKey, and a @OneToMany annotation to annotate the reverse relationship. Hibernate is a very well-known and widely-used Java ORM -- do you think that their usage is wrong/uncommon?

I see two ways forward:

  • Keep the flags as is. Maybe add a bit more documentation somewhere that tell explicitly what these flags mean in Django.

I have not seen any good argument yet to stay with the status quo, instead of flipping them to match the usage in e.g. Hibernate.

  • Use some other flags for relations. A possibility is local_multiple and remote_multiple flags. These flags would tell if a single object can have multiple objects. remote_multiple tells that the local object can have multiple objects on the remote side. local_multiple is the opposite: can a single remote object have multiple objects on the local side. Still a bit confusing, but I think these are easier to recall correctly.

Possibly. My only problem with this is that since we have ManyToManyField and OneToOneField (and those names are certainly not changing), it makes a lot of sense to keep the many_to_many and one_to_one flags. And once we have those two, it seems like a natural extension to also have many_to_one and one_to_many, rather than introducing an entirely new terminology.

A bonus of the second approach is that I believe the local_multiple and remote_multiple properties are what we are most often interested about: we can select_related if the relation is remote_multiple=False. We need to use subqueries with .exclude() when the relation is remote_multiple=True. The widget in forms is SelectMultiple when remote_multiple=True.

This is a good point.

We could also use just a has_many flag, but then it's hard to say if an ArrayField should have has_many set. The remote_multiple is better, as it is explicitly about relations.

We could even have both n_to_n flags, and the local_multiple and remote_multiple flags. We could use the checks framework to warn if the flags aren't defined in a consistent way.

I like this idea in theory, though I would like to see a patch for this to verify that local_multiple and remote_multiple are in fact as useful internally as it seems they should be.

I think if we introduce local_multiple and remote_multiple with the semantics you suggest, it makes it even more important that we swap the current usage of many_to_one and one_to_many. It just seems entirely wrong to have a field labeled one_to_many with local_multiple=True and remote_multiple=False, which is what the status quo would mean.

comment:17 by Collin Anderson, 9 years ago

idea: one_to_one, many_to_many, foreign_key, and one_to_many.

comment:18 by Loic Bistuer, 9 years ago

It's worth noting that SQLAlchemy also uses both the "many to one" and "one to many" terminologies.

It's well documented http://docs.sqlalchemy.org/en/rel_0_9/orm/basic_relationships.html#basic-relationship-patterns

The relation accessor on the model with the FK column is a many-to-one, the relation accessor on the other model is a one-to-many, so our status quo is effectively reverse to SQLAlchemy.

That both Hibernate and SQLAlchemy use the "many to one" and "one to many" terminologies comforts me in the idea that these aren't too confusing provided they are used in a logical way and properly documented. I don't think we should scrap them on the premise that these will always confuse people, I see value in being consistent with other major ORMs.

comment:19 by Carl Meyer, 9 years ago

Thanks for tracking that down, Loic. In my mind that settles the question definitively - unless we remove these flags entirely, we must make them consistent with the usage of this terminology in other major ORMs.

I say we move forward on this ticket before the beta release by swapping the current usage of many_to_one and one_to_many in both 1.8 and master, and not introducing any new flags at the moment (though if someone is motivated to work on that for 1.9, I'm open to it.)

Loic, I won't have time to do the PR today - any chance you might?

comment:20 by Markus Holtermann, 9 years ago

In that case, given that other major ORMs use the completely opposite naming, I'm +1 on adopting to their naming.

in reply to:  19 comment:21 by Loic Bistuer, 9 years ago

Loic, I won't have time to do the PR today - any chance you might?

https://github.com/django/django/pull/4127 let's see what Jenkins has to say.

comment:22 by Carl Meyer, 9 years ago

Has patch: set
Triage Stage: AcceptedReady for checkin

Patch looks good to me; smaller than I expected. I guess we aren't using these flags that many places yet, as collinanderson pointed out.

Search-and-replace FTW :-)

comment:23 by Loic Bistuer <loic.bistuer@…>, 9 years ago

Resolution: fixed
Status: newclosed

In 18c0aaa9123579375294fcc4a8ee7e3530176b88:

Fixed #24289 -- Reversed usage of Field.many_to_one and one_to_many.

Thanks Carl Meyer and Tim Graham for the reviews and to all involved
in the discussion.

comment:24 by Loic Bistuer <loic.bistuer@…>, 9 years ago

In 20b621eb3c3dd947a4fe77a2a11264b9072ef25c:

[1.8.x] Fixed #24289 -- Reversed usage of Field.many_to_one and one_to_many.

Thanks Carl Meyer and Tim Graham for the reviews and to all involved
in the discussion.

Backport of 18c0aaa9123579375294fcc4a8ee7e3530176b88 from master

Note: See TracTickets for help on using tickets.
Back to Top