#6095 closed (fixed)
Add the ability to manually create M2M intermediary models
Reported by: | Jacob | Owned by: | floguy |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
Consider this code:
class Person(models.Model): # ... class Group(models.Model): #... members = models.ManyToManyField(Item, through="Membership") class Membership(models.Model): person = models.ForeignKey(Person) group = models.ForeignKey(Group) date_joined = models.DateTimeField() ########## >>> from myapp.models import Person >>> p = Person.objects.get(...) >>> p.groups.all() [<some list of groups>]
This should work.
That is, I should be able to manually provide the model to be used by the M2M intermediary table (which is currently hard-coded). Currently if you leave out the "through" bit this is valid Django code, but you won't be able use the convienent M2M methods -- you'd have to do something like [m.group for m in person.membership_set.all()]
instead of person.groups.all()
.
This fixes a lot of complaints about the M2M system (#785 is one of them, but there are many others).
Attachments (27)
Change History (111)
comment:1 by , 17 years ago
comment:2 by , 17 years ago
Cc: | added |
---|
comment:3 by , 17 years ago
It should be possible to implement this independent of qs-refactor. The changes required are all in the handling of model definitions and the construction of related object descriptors. While these descriptors do utilize the query syntax, they don't currently require access to the queryset internals, so they should be isolated from any qs-refactor changes.
Note for those planning to attempt work on this ticket - you may want to check some historical discussion on the topic, especially with regards to the handling of the descriptors associated with the m2m relationship.
comment:5 by , 17 years ago
Let me try again (and sorry for not previewing before I posted):
Replying to jacob:
> > class Person(models.Model): > # ... > > class Group(models.Model): > #... > > members = models.ManyToManyField(Item, through="Membership") > > class Membership(models.Model): > person = models.ForeignKey(Person) > group = models.ForeignKey(Group) > date_joined = models.DateTimeField() > >
Should that read as follows?
members = models.ManyToManyField(
Person, through="Membership")
comment:6 by , 17 years ago
First of all, this idea really rocks. I was going to take a stab at implementing it, but fist wanted to flesh out exactly what are the implications of the proposed syntax. I.E. What is explicitly defined, what is inferred, etc. So I created some doctests which I'll upload now. Could a core dev let me know if I have the right idea, or if I'm totally off base with these tests?
The tests basically cover three cases:
1) Explicitly defining a m2m field on one of the models. Person, Group with m2m field, Membership. I assume that accessing group from person is p.group_set...etc.
2) Explicitly defining a m2m field on both models. So Person with m2m field, Group with m2m field, and Membership.
3) Inferring that group and person are related because there exists a model with a foreign key to both. Is this desired behavior? If so, do we want some level of specificity regarding the m2m model (I'm thinking a Meta property like many_to_many = ('person', 'group') )?
comment:7 by , 17 years ago
floguy: awesome you'd like to work on this. Sorry about the terseness of the ticket; I'd written it mostly as notes for when I started working on this myself, but I'll happily watch over your shoulder if you'd like to work on it.
And yes, the example should read Person
, not Item
-- that leaked in from an earlier, worse example.
Looking through your examples, it seems you've got the idea down. A couple of points:
- I'm not sure you need to be able to specify the M2M on both objects (as in your *TwoWays examples) -- that doesn't match how M2M works currently. I suppose it doesn't hurt, but I think I'd prefer to be minimally invasive and keep the semantics of M2M as similar as possible. We've already got
related_name
for the "other side" of the relationship. - You've got a typo in that example, BTW -- look at
PersonExplicitTwoWays.groups
. - I'm -1 on the implicit option. There's a difference between a model with a relation to two others and a M2M intermediary table. There's no need to have multiple ways of spelling the same relationship -- this isn't Perl, after all :)
So if you want to press forward, do so with just the one-way version. We can look at the two-way/implicit versions later if you're still hot to add 'em.
by , 17 years ago
Attachment: | 6095-alpha-01.diff added |
---|
Rudimentary patch to get the base functionality working. Tests were first, then implementation.
comment:9 by , 17 years ago
I've gotten a start on getting the model definitions working and modifying the descriptors to work as well. There are probably more elegant ways of going about it, but after trying several of them, this was by far the least invasive.
If this approach is deemed OK, I'll go ahead and continue on with adding error propagation (currently if you have extra fields on the intermediary table which don't have a default value, the db will complain). After that, I can tackle figuring out how to make the query field lookups work.
comment:10 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | 6095-alpha-02.diff added |
---|
Updated patch to handle custom related_name and custom db_column on manually created foreign keys.
follow-up: 13 comment:11 by , 17 years ago
Has patch: | set |
---|---|
Needs documentation: | set |
Owner: | changed from | to
Patch needs improvement: | set |
Status: | new → assigned |
Since I haven't gotten any comments from the other devs, I'll assume to just forge on ahead. I've added tests and code to handle related_name and db_column parameters. At this point, I'd consider the model declaration and m2m descriptors portion of this patch as usable. What's left to do is determine whether these objects are queryable or if patches are needed there. Then, on to documentation!
comment:12 by , 17 years ago
Cc: | added |
---|
follow-up: 14 comment:13 by , 17 years ago
Replying to floguy:
Since I haven't gotten any comments from the other devs, I'll assume to just forge on ahead. I've added tests and code to handle related_name and db_column parameters. At this point, I'd consider the model declaration and m2m descriptors portion of this patch as usable. What's left to do is determine whether these objects are queryable or if patches are needed there. Then, on to documentation!
I haven't done a full tear down of your code, but it looks good so far, and the the tests cases appear to be a good starting point.
The biggest problem I can see is that you haven't tackled querying yet - in particular, any querying of the intermediate model i.e., queries on membership.date_joined, especially starting from the Person or Group models. This is the messy part that will require a lot of work. However, I'm eager to see what you can do.
follow-up: 15 comment:14 by , 17 years ago
Replying to russellm:
Replying to floguy:
Since I haven't gotten any comments from the other devs, I'll assume to just forge on ahead. I've added tests and code to handle related_name and db_column parameters. At this point, I'd consider the model declaration and m2m descriptors portion of this patch as usable. What's left to do is determine whether these objects are queryable or if patches are needed there. Then, on to documentation!
I haven't done a full tear down of your code, but it looks good so far, and the the tests cases appear to be a good starting point.
The biggest problem I can see is that you haven't tackled querying yet - in particular, any querying of the intermediate model i.e., queries on membership.date_joined, especially starting from the Person or Group models. This is the messy part that will require a lot of work. However, I'm eager to see what you can do.
Ok so maybe I'm completely crazy, and please correct me if I'm wrong, but I think that due to the way querysets resolve keywords into fields, this is a non-issue.
Before making any modifications to the queryset code, I wanted to first see what was possible with the current API (both because I'm completely unfamiliar with the queryset internals, and because I know that those internals are about to change). A new patch will be uploaded with updated tests detailing how to filter through properties on the intermediary model. The relevant tests are just below tests/modeltests/m2m_manual/models.py line 93 (###QUERY TESTS###). Again, please let me know if I'm missing something here.
(Also added in this patch is an assertion error if someone tries to specify both 'db_table' and 'through' kwargs to a ManyToManyField. Changes to db_table should instead be done in the Meta class of the intermediary model.)
by , 17 years ago
Attachment: | 6095-alpha-03.diff added |
---|
follow-up: 16 comment:15 by , 17 years ago
Replying to floguy:
Ok so maybe I'm completely crazy, and please correct me if I'm wrong, but I think that due to the way querysets resolve keywords into fields, this is a non-issue.
That may be the case, but the last patch didn't contain any query tests so this point wasn't obvious. Even if it were obvious, regression tests would still be important so that any future refactoring didn't accidentally remove functionality that was required.
Before making any modifications to the queryset code, I wanted to first see what was possible with the current API (both because I'm completely unfamiliar with the queryset internals, and because I know that those internals are about to change).
A valid concern. We may need to put this partially on hold until QS-RF lands (or at least coordinate this work with QS-RF)
The relevant tests are just below tests/modeltests/m2m_manual/models.py line 93 (###QUERY TESTS###). Again, please let me know if I'm missing something here.
I can see at least two big queries (which are pretty much the original use cases) that aren't covered by these tests:
1) Get me all the groups that Jane joined after 1 Jan 2005.
2) Get me all the people that have joined 'Rock' since 1 Jan 2005.
Again - these may already be possible, but the tests need to prove that they are covered.
There are also some model validation issues that needs to be addressed:
1) Consider the following (degenerate case) model:
class Person(Model): name = CharField() membership = IntegerField() class Group(Model): name = CharField() membership = IntegerField() members = ManyToMany(Person, through='Membership') class Membership(Model): person = ForeignKey(Person) group = ForeignKey(Group) joined = DateField()
There is a name clash over the 'membership' query - how does 'Group.objects.filter(membership__name='foo') determine that membership__ refers to the relation model, not the local field? This should be raised as a model validation issue.
2) There is a need for initial validation that a model used in a 'through' clause contains 2 foreign keys, one of which is the model in which is using the 'through' clause. i.e., I should be able to point to:
class Membership(Model): person = ForeignKey(Person)
in a through clause.
3) I haven't thought this one through fully, but what is the significance of a model used in a 'through' clause that contains:
class Membership(Model): person = ForeignKey(Person) group = ForeignKey(Group) other = ForeignKey(Other)
?
comment:16 by , 17 years ago
Replying to russellm:
Replying to floguy:
Ok so maybe I'm completely crazy, and please correct me if I'm wrong, but I think that due to the way querysets resolve keywords into fields, this is a non-issue.
That may be the case, but the last patch didn't contain any query tests so this point wasn't obvious. Even if it were obvious, regression tests would still be important so that any future refactoring didn't accidentally remove functionality that was required.
Before making any modifications to the queryset code, I wanted to first see what was possible with the current API (both because I'm completely unfamiliar with the queryset internals, and because I know that those internals are about to change).
A valid concern. We may need to put this partially on hold until QS-RF lands (or at least coordinate this work with QS-RF)
The relevant tests are just below tests/modeltests/m2m_manual/models.py line 93 (###QUERY TESTS###). Again, please let me know if I'm missing something here.
I can see at least two big queries (which are pretty much the original use cases) that aren't covered by these tests:
1) Get me all the groups that Jane joined after 1 Jan 2005.
2) Get me all the people that have joined 'Rock' since 1 Jan 2005.
Again - these may already be possible, but the tests need to prove that they are covered.
There are also some model validation issues that needs to be addressed:
1) Consider the following (degenerate case) model:
class Person(Model): name = CharField() membership = IntegerField() class Group(Model): name = CharField() membership = IntegerField() members = ManyToMany(Person, through='Membership') class Membership(Model): person = ForeignKey(Person) group = ForeignKey(Group) joined = DateField()There is a name clash over the 'membership' query - how does 'Group.objects.filter(membership__name='foo') determine that membership__ refers to the relation model, not the local field? This should be raised as a model validation issue.
2) There is a need for initial validation that a model used in a 'through' clause contains 2 foreign keys, one of which is the model in which is using the 'through' clause. i.e., I should be able to point to:
class Membership(Model): person = ForeignKey(Person)in a through clause.
3) I haven't thought this one through fully, but what is the significance of a model used in a 'through' clause that contains:
class Membership(Model): person = ForeignKey(Person) group = ForeignKey(Group) other = ForeignKey(Other)?
Great points! I really appreciate the feedback and I'll think about it overnight. Tomorrow I'll get back to you with code, validation, and test cases.
by , 17 years ago
Attachment: | 6095-alpha-04.diff added |
---|
comment:17 by , 17 years ago
Regarding your two use cases:
I have added queries into the test suite which address both of your test cases. The only thing that irks me is that going backwards to get Person objects gives non-distinct results. This is remedied by chaining a .distinct(), but still, it's not desirable.
Regarding your model validation issues:
Case 1)
I tried out this case, and got the following model validation error:
Error: One or more models did not validate: m2m_manual.membership: Reverse query name for field 'person' clashes with field 'Person.membership'. Add a related_name argument to the definition for 'person'. m2m_manual.membership: Reverse query name for field 'group' clashes with field 'Group.membership'. Add a related_name argument to the definition for 'group'.
Case 2)
I've added checks for this in django/core/management/validation.py and tests in modeltests/invalid_models. Actually, I ran into an annoying bug that wouldn't let me run the invalid_models tests, which I reported on and fixed in ticket:6168.
Case 3)
I've added a third foreign key to the custom members model test, which still passes. I can say that due to the way I've written the code, this should not pose a problem.
Case 4)
Now going through these cases, I've thought of another case. With regular m2m fields, you can specify two identical relations, as spelled out in http://www.djangoproject.com/documentation/models/m2m_recursive/. With an intermediary model, however, this stops making sense. Instead of having two relations between two models, you could attach a property to the intermediary table to specify that data. I've added a validation check for this case as well.
comment:18 by , 17 years ago
Oops! The example that I meant to point out for Case 4 was actually: http://www.djangoproject.com/documentation/models/m2m_multiple/
comment:19 by , 17 years ago
I thought of and implemented a new model validation feature: Making sure that the extra fields on the intermediary model have either null=True or default=something. If not, it will raise a model validation error. Attaching new patch now.
by , 17 years ago
Attachment: | 6095-alpha-05.diff added |
---|
Added extra model validation checks and tests.
comment:20 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Added some (fairly extensive) documentation for this ticket. Please review it to make sure it's up to snuff. I'm not as good of a writer as I am a coder :)
comment:21 by , 17 years ago
At this point this ticket has a couple of iterations of the patch itself, a fairly extensive suite of tests, and now documentation. I'm not sure what the protocol is exactly for going forward now. I think it's going to need some people to try it out soon, so if there's no objections, I'll soon bring it up in django-dev and call for testing/comments there.
comment:22 by , 17 years ago
Cc: | added |
---|
follow-up: 25 comment:23 by , 17 years ago
Shouldn't all results be distinct() ? The second-last test in alpha-05 seemed like an odd test to fail.
comment:24 by , 17 years ago
Cc: | added |
---|
follow-up: 27 comment:25 by , 17 years ago
Replying to oggie_rob:
Shouldn't all results be distinct() ? The second-last test in alpha-05 seemed like an odd test to fail.
Yeah, it is a little weird. I'm not really sure if it's a bug or not though, since it's existing functionality, and the QuerySet documentation reads:
"However, if your query spans multiple tables, it’s possible to get duplicate results when a QuerySet is evaluated. That’s when you’d use distinct()."
comment:26 by , 17 years ago
Cc: | added |
---|
comment:27 by , 17 years ago
Replying to floguy:
Replying to oggie_rob:
Shouldn't all results be distinct() ? The second-last test in alpha-05 seemed like an odd test to fail.
Yeah, it is a little weird. I'm not really sure if it's a bug or not though, since it's existing functionality, and the QuerySet documentation reads:
"However, if your query spans multiple tables, it’s possible to get duplicate results when a QuerySet is evaluated. That’s when you’d use distinct()."
From what I can tell, this isn't a test failure. Historically, m2m relations have always contained (at most) 1 link for each pair of models in the relationship, so this issue never arose. This patch introduces the possibility of multiple links between any given pair. As a result, you can get multiple returns of the same object. Returning multiples would be consistent with the way the rest of Django handles cross-table relationships.
comment:28 by , 17 years ago
Cc: | added |
---|
comment:29 by , 17 years ago
Cc: | added |
---|
comment:30 by , 17 years ago
Cc: | added |
---|
comment:31 by , 17 years ago
Ok; I've finally had a chance to take a closer look at this patch, so here's some feedback -
- I'm really not a fan of all the memoizing going on here. Global caches are a bit of a dangerous hole to travel down if we can avoid it. The existing model construction and m2m framework contains all sorts of factories for model construction and the construction of the M2MManager and descriptors - is there any particular reason that this framework can't be used for constructing m2m-through tables/managers/descriptors?
- I can see the reason that you are imposing the 'extra fields must be null=True or have default' limitation - without it, you can't use the normal infrastructure to assign relations. However, the consequence of this is that you won't be able to rely upon the existence of any 'extra' data on an m2m relation. This seems like a fairly big restriction to place upon projects. Are there no other alternatives here? I would have thought it would be possible to modify the add method to something like
def add(extra=None, *objects):
, where extra is a dictionary of initial data for the relation.
- Your test doesn't test for a problem case that I identified in an earlier discussion. In particular, consider the case where you have *2* m2m relations through 'Membership'. In this case, the query:
>>> Group.objects.filter(membership__invite_reason = "She was just so awesome.")
will be ambiguous. The query attribute needs some other qualifier - for example, using the name of the m2m attribute:
>>> Group.objects.filter(members_membership__invite_reason = "She was just so awesome.")
- Line 125 - the comment doesn't describe the actual query (the comment describes a much more interesting query)
- A more meaningful comment for line 129 would be helpful. The fact that at least one person has raised this as a potential bug query means that it needs to be clarified. Also, this is a model test, so it is in part documentation. The example should be self explanatory - rather than just printing the list of people, try printing the person _and_ the extra data that makes the entries unique.
by , 17 years ago
Attachment: | 6095-beta-01.diff added |
---|
comment:32 by , 17 years ago
- Actually, I am using those facilities. The memoizing was a premature optimization on top of those. I wasn't thinking about the fact that it's curried later like the other attributes, so each of those two memoized functions are only called once, rendering the memoizing pointless. No matter, it's fixed now!
- I agree that there needs to be some easier way of accessing those properties. However, I'm not quite happy about the 'extra' idea, as I pointed out on django-dev here. To expand on my post there, I can see 3 options, none of which I'm 100% happy with (please weigh in and/or suggest alternatives):
- Your suggestion: add an 'extra' dictionary keyword which will automatically set the contents of the dict as properties on the relation of all of the input objects. I don't like the fact that one 'extra' should apply to many objects...a more granular control would be preferable IMHO.
- Adding a new method like add_extra(object, extra), which can only take in one object and requires an extra dict with the same semantics as previously described. I don't love this because we already have an add method, and there would be lots of overlapping functionality between the two.
- Modifying add(*args) to check first to see if an arg is actually a dictionary like {'object' : obj, 'extra' : {'prop1' : val1, 'prop2' : 'val2'}}. I don't like the verbosity of this method, although it does preserve backwards compatibility and DRY.
- Sorry, I'm not sure that I understand this. Are you talking about an intermediary model with three FKs, and two m2m relationships spanning three models which go 'through' it? If so, I don't see the ambiguity, although it's an interesting idea which may very well warrant a test case. If that's not what you meant, please elaborate. I'm sure there's ambiguity there, but that I'm missing the point somehow.
- Fixed in the new patch. Great feedback, thanks!
- Fixed in the new patch. Great feedback, thanks!
comment:33 by , 17 years ago
Regarding 2: I don't buy your 'lack of granular control' argument - you can always use multiple calls to add(), one for each set of grouping data. This option should also be backwards compatible - any existing calls to add will assume extra=None, which will be correct for m2m relations without a through model.
I'm not a fan of add_extra - that looks like needlessly complicating the m2m interface. In the case of an m2m relation with a through model, there won't be a need for add(); in the case of an m2m relation without a through model, there won't be a need for add_extra(). Providing both methods seems like a needless complication.
I'm not convinced that the add(*args) approach you are describing is much of an improvement, either - as far as I can make out, it should end up functionally identical to option 1, except that you won't be able to guarantee which argument is the dictionary providing relationship data.
There is at least 1 other option - disable the add method completely if you are using a through argument. Essentially, this sidesteps the problem; you will still be able to query neatly, but you will only be able to create new objects by creating instances of the through class.
Regardless of the option chosen, the same approach should also be used on the create method.
There will also be a need to disable the assignment method on the descriptor (set), since:
mygroup.person_set = [a,b,c,d]
will no longer be possible (which is not covered by your test cases).
There is an analogous problem with remove - since you can now have multiple relations between objects, a simple remove method is no longer an option (again - no tests here). I can't think of an obvious way that clear will be affected, but a test wouldn't hurt.
The admin interface will also be affected - the default admin widget for m2m won't be able to handle m2m relations with a through class.
The more I think about it, the more moving to a 'no assignment' approach sounds like a good idea :-)
Regarding 3: To demonstrate the problem, modify the Group model in your test case slightly:
class Group(Model): name = CharField() club_membership = ManyToManyField(Person, through=Membership) team_membership = ManyToManyField(Person, through=Membership)
If you then issue queries over the membership class:
Group.objects.filter(membership__invite_reason="she's cool") Person.objects.filter(membership__invite_reason="she's cool")
Using the new Group, this is ambiguous due to a name clash on 'membership'.
Looking closer, you are accounting for this problem - but you handle it by rejecting it as a validation error. I'm not sure this is the right approach. There are any number of reasons that you might want to have 2 M2M relations with the same associated data class (or, to put it a more convincing way - Other than the problem it introduces into the query language, I can't think of an reason that it _shouldn't_ be allowed)
I would suggest a better approach would be to allow a 'through_name' argument, which would allow you to customize the query term. This would be analogous to the 'related_name' argument that is used to resolve reverse M2M and FK descriptors. through_name would be optional, defaulting to the object name for the through class; it would only be required to resolve ambiguities like this one.
For example, the Group model would become:
class Group(Model): name = CharField() club_membership = ManyToManyField(Person, through=Membership, through_name='clubber') team_membership = ManyToManyField(Person, through=Membership, through_name='teamer') ... Group.objects.filter(clubber__invite_reason="she's cool") Group.objects.filter(teamer__invite_reason="she's cool") Person.objects.filter(clubber__invite_reason="she's cool")
Also - no problem adding your name to AUTHORS, but could you put it in alphabetical order, like all the other names :-)
comment:34 by , 17 years ago
Firstly, awesome catch on all of those missing tests--I'll implement all of them. Actually, I wasn't even aware that assignment was available on the descriptor until you wrote that and I looked at the code!
After some thought I agree with you about adding extra=None being the best of the options. But there does seem to be a lot of side effects. At this point I'd argue for keeping this particular patch as incremental a change as possible and simply require null=True or default, and another patch can be opened later requesting that the syntax for assigning extra intermediary properties be easier. Honestly I'm about a +0 on what I just said, and could easily go both ways.
Regarding through_name: I definitely agree. I've been avoiding touching the QuerySet stuff from the beginning, but your point about multiple relationships through one intermediary model makes clear that some changes need to be made. As I'm not familiar with the QuerySet code at all, I'll need to walk through it before making any big changes.
Finally, regarding AUTHORS: my bad! I just threw my name at the end there right before submitting, and it'll be in the right place for the next patch.
Russ, I'd like to thank you for continuing to oversee this patch. You're probably working on it about as much as I am and you're not losing patience with someone who's bitten off a bit more than he could chew :) This weekend is a bit busy for me, but as soon as I get the chance I'll sit down and work with the QS code to see what can be done.
comment:35 by , 17 years ago
I've been spending some time today walking query.py and thinking about this patch and the ideas brought up by Russ. I've got a couple of questions/thoughts:
- Regarding having 2 M2M relations with the same associated data class: While going through and determining how to correctly resolve the intermediary model (specified by through_name) as a field on the related models, all of a sudden I hit a block. How will the database layer differentiate between the two sets? This sort of double-m2m-relation worked without going through a model, because it could simply create two tables to differentiate the data. An intermediary model, however, defines its own table in its inner meta class. One way to describe the same data would be to add some BooleanFields on to the Membership model like is_club_member and is_team_member or something similar. That way you don't usurp the power of the intermediary model to define its own db_table and you still get to represent your data. This is just one suggestion. What do you think about this?
- I've thought more about shortcuts to field assignment on the intermediary model, and I remain convinced that requiring null=True or a default makes the most sense for now, but there's no reason that we shouldn't implement the extra=None kwarg on add and create to make things easier. If there's no objection, that's the route that I'll implement.
Thanks for any comments/suggestions that anyone may have about these things.
comment:36 by , 17 years ago
Re 1: I may have to retract my original comment on this one. Your analysis that the 'non-through' version of m2m would create two independent tables is correct, and points out a flaw in my original reasoning; essentially, you have provided the compelling reason to disallow this type of model construct. I suppose we could jump through all sorts of hoops to provide a workaround, but it would end up fairly ugly in implementation and interface, all to support an edge case. In retrospect, I think the validation approach will be fine.
Re 2: I'm not convinced that requiring null=True is the right approach here. Like I said before, setting null=True would be very convenient for setting up assignment, etc, but it would be very inconvenient for queries (or for applications using those queries) - even though you have an intermediate table with extra data fields, you won't be able to rely upon the existence of data in those fields because the null=True means that they could all be empty. Enforcing a default is just as burdensome, because there are many situations where no default is appropriate. These are pretty big constraints to place on models using m2m-intermediate tables.
This corner of the API is looking like a right mess, and we haven't even got into representing m2m-intermediates in the admin view. Based on what I've seen so far, I'd put my preferred option as disabling entirely the add/remove/assignment methods for m2m models that define a through clause. This would require that m2m-intermediate relations be created\destroyed\modified by manipulating the actual intermediate table, rather than providing a mechanism through the descriptor. i.e., rather than saying:
>>> g.members.add(p) # or some variant of syntax >>> print g.members.all() [<Person 1>]
you must call:
>>> m = Membership(person=p, group=g, date_joined=datetime.now()) >>> m.save() >>> print g.members.all() [<Person 1>]
This means that you are being explicit on the creation of complex m2m relationships, but you retain the convenience of the query syntax for traversing those relationships. It means that displaying these relationships in the admin interface is simple - you don't need a complex widget, you just need an admin block on the intermediate table.
If add(extra={...}) can be made to work in a backwards incompatible way we can look at including it as a convenience. However, I certainly wouldn't lose sleep if it were missing from the first iteration.
Note that none of this affects existing m2m implementations - the add/remove/clear/assignment options can continue to exist in those cases. Only the m2m-intermediate interface will be different, as is warranted by the complexities of the m2m-intermediate use case.
Regarding my oversight on this: I'm more than happy to provide whatever guidance I can. Sure, I spend some time reviewing code and responding to comments, but I get the benefit of a feature that I want without actually needing to write the code. As for biting off more than you can chew - I can't see any evidence of that. You've been doing some great work on a non-trivial problem. It's very difficult to grow impatient with that :-)
by , 17 years ago
Attachment: | 6095-beta-02.diff added |
---|
Updated patch which disallows .add() and .remove() and has some improved test cases.
by , 17 years ago
Attachment: | 6095-beta-03.diff added |
---|
Realized that there was some unneeded code since .add() and .remove() were disallowed when 'through' is specified.
comment:37 by , 17 years ago
Cc: | added |
---|
comment:38 by , 17 years ago
Cc: | removed |
---|
comment:39 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | docs.2.diff added |
---|
Updated the documentation to match the newest patch. Also tried to clarify some of the main points so that they were easier to understand.
comment:40 by , 17 years ago
Cc: | added |
---|
comment:41 by , 17 years ago
Cc: | added |
---|
by , 17 years ago
Attachment: | 6095-beta-04.diff added |
---|
Updated patch to [7104] and now this includes the init.py file that all of the other patches didn't.
follow-up: 45 comment:42 by , 17 years ago
Can someone please try out this new patch and let me know if it works for them? I'm using git now to make development on this ticket easier, and this patch applies cleanly for me but I'm not sure if it'll apply to anyone else.
comment:43 by , 17 years ago
Cc: | added |
---|
comment:44 by , 17 years ago
Cc: | added |
---|
comment:45 by , 17 years ago
Replying to floguy:
Can someone please try out this new patch and let me know if it works for them? I'm using git now to make development on this ticket easier, and this patch applies cleanly for me but I'm not sure if it'll apply to anyone else.
The __init__.py didn't roll out on my end, but it otherwise applies cleanly.
The patch is starting to look pretty good, though. A few more (mostly minor) comments:
- I'm not sure that NotImplementedError is the right approach for the add/remove methods; I'd be inclined to just remove them (or not define them in the first place). NotImplemented implies that we'll get around to it one day - but we wont, because they can't ever work. If the method is completely missing, it won't turn up in autocompletes or otherwise lead to the temptation or suggestion that it will work.
- The GenericRel special case in sql.py is a bit of a wart; I can see why you've appended a new condition to this check, but I'm wondering if we can't abstract this out - i.e., put an attribute on a many_to_many field that would disable database table creation?
- A couple of typos in error messages - non-exhaustive list: "manualy", "Cannot set values on a ManyToManyFields which specify a through model"
- get_reverse_rel_field() - firstly, related_name is an unused argument; secondly, you can make this a little simpler (and more robust) using _meta.get_all_related_objects() (and possibly suggests an improvement required in the meta class - get_related_object(name) and get_related_many_to_many_object(name)).
- Minor doctest note: the modeltests serve a dual purpose, both as a regression test, and a source of examples. As such, it's helpful if the doctest comments explain what the doctest is actually testing. The comments that are currently there are are all correct, but not particularly illuminating for a newcomer. For example, the query docstrings would be more helpful if they were of the form "Find all the groups that have a member named 'Bob'; "Queries also work in the reverse direction: find all the People in the group called "Rock".
- Admin integration. If you put the test case models into a project, and try to add a Group, you get m2m widgets for members, custom_members and nodefaultsnonulls, which then fail on save because add() is not implemented. This may be something that needs to be addressed in newforms-admin, rather than patching the existing admin app.
- Regarding the documentation; I'm not sure the PizzaTopping example really illuminates the problem (and solution) that clearly. IMHO, the Person/Group/Membership is a really natural example - why not stick with it?
- Again on the documentation: Most users are coming to the documentation with a problem that they want to have solved (How do I associate data with an m2m relation?). However, the 'flavour' of the docs is very much one of 'this is how intermediate tables are created and used'. While the implementation of the solution is important, and may well be illuminating, it distracts from the explanation of the problem you are solving.
That should give you plenty to work on :-)
On a side note - thanks for the nice words on This Week in Django :-).
by , 17 years ago
Attachment: | 6095-beta-05.diff added |
---|
Made add() and remove() dynamically added depending on intermediary model. Added an attribute on ManyToManyField which specifies whether table creation is necessary. Fixed a few typos in the error messages. Moved get_reverse_rel_field logic into django.db.models.options as get_related_object() and refactored code accordingly. More to come.
by , 17 years ago
Attachment: | 6095-beta-06.diff added |
---|
Added/reworded lots of the comments to the m2m_manual doctests (maybe these tests need a rename, hmmm), and fixed some issues with the .diff that prevented parts from showing up last time.
by , 17 years ago
Attachment: | docs.3.diff added |
---|
Tried to rewrite the docs to be more user-centric instead of django-centric.
by , 17 years ago
Attachment: | 6095-qsrf.diff added |
---|
Updated the patch to apply to queryset-refactor. Currently two of my tests which pass on Trunk do not pass on qs-rf, and I'm not exactly sure why. Will have to wait to see more documentation on qs-rf before I can resolve.
by , 17 years ago
Attachment: | 6095-nfadmin.diff added |
---|
Updated patch for newforms-admin, with the added bonus of fixing the integration problem with the admin. Now when a ManyToManyField specifies an intermediary model, it does not display the field in the admin.
by , 17 years ago
Attachment: | 6095-nfadmin.2.diff added |
---|
Made sure that create is not allowed. This was an oversight in the original patch, but now there are tests which verify that create should not be allowed on a ManyRelatedManager. Also removed new.instancemethod craziness because it was not necessary. All tests pass.
comment:46 by , 17 years ago
Cc: | added |
---|
comment:47 by , 17 years ago
Cc: | added |
---|
comment:48 by , 17 years ago
Cc: | added |
---|
comment:49 by , 17 years ago
I'd like to say that in terms of the admin interface I don't see any reason why one wouldn't just make a Tabular/StackedInline for the intermediary model and be set. I think this should be documented once newforms-admin gets some documentation.
comment:50 by , 17 years ago
Apologies for taking so long to review this. Life has been a bit hectic of late, but hopefully things are settling down and I'll get a chance to do some more regular reviews.
The patch is starting to look pretty good. At this point, I can only see one minor improvement to the existing code, regarding the 'creates_table' option. This option is a great addition, but its use needs to be more widespread - in particular, Generic relations should be using this option, rather than being a special case during table construction. This isn't literally part of adding the m2m intermediate feature, but if we're introducing the option, we should make sure it is being used everywhere that it can be used, not just where we need it right now.
Other than that, I'm happy with the programmatic API at this point.
Brian's issue with admin rendering also warrants some thought. Simply removing the widget for m2m-intermediate fields as you have done is certainly one approach, but it's the easy way out. Providing Tabular/Stacked Inlines would be better - although I can see that if there were a lot of related objects, this could get unwieldy. Even if such an interface wasn't enabled by default, it would be good to know that it is _possible_ to provide an admin rendering inside newforms-admin. This would be a useful test case for the extensibility interfaces that Brian, Joseph et al have been building.
I'm not completely opposed to the 'no widget by default' approach, but I'd be interested in hearing some community opinion before we settle on the idea.
by , 17 years ago
Attachment: | 6095-trunk-withdocs.rc1.diff added |
---|
Updated to latest trunk, with some syntactic improvements and taking into account latest suggestions.
comment:51 by , 17 years ago
Don't worry about taking a while to review, as I had no time to work on it either. Now with school and other stuff out of the way, I got a chance to look at the patch again.
It's now updated to the latest trunk, with the creates_table suggestion you had implemented. I guess I wasn't seeing the bigger picture before on that one.
Regarding the admin: Not knowing very much about newforms-admin, I talked to Brian Rosner today on IRC ( http://oebfare.com/logger/django-dev/2008/06/04/ at 23:37 ), and it looks like the mechanism for adding an intermediary model to the admin page of either side of the m2m relation is already in place, and is the same as adding any other model to the admin. It's probably possible to provide an implicit admin interface, but according to him it would be problematic to override that implicit interface.
I tend to think that requiring explicitness isn't such a bad thing, especially since by their very nature, m2m intermediary models are more complex than regular m2m tables and will probably each need a bit of customization. But of course I'll defer to your judgment on whether you think we should work to provide a best-guess implicit inline. If you do believe that we should do this, we'll also need to look into how to then disable the inline if the user wants.
by , 17 years ago
Attachment: | 6095-trunk-withdocs.rc2.diff added |
---|
Fixed bug with self-referential M2M fields not working with an intermediary model, and added tests for that problem.
comment:52 by , 17 years ago
Cc: | added |
---|
comment:53 by , 17 years ago
Cc: | added |
---|
comment:54 by , 17 years ago
Cc: | added |
---|
comment:55 by , 17 years ago
Cc: | added |
---|
comment:56 by , 17 years ago
Cc: | added |
---|
comment:57 by , 17 years ago
Cc: | added |
---|
comment:58 by , 17 years ago
Cc: | added |
---|
comment:60 by , 16 years ago
Cc: | added |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
comment:61 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
outch, sry didn't mean to close the ticket, don't know how this could happen.
comment:62 by , 16 years ago
Cc: | added |
---|
comment:63 by , 16 years ago
Cc: | added |
---|
comment:64 by , 16 years ago
If anyone cares, I for one would like to see an option to have a STACKED or TABULAR representation of this in the admin. I was toying with this patch tonight on a m2m model that we want to apply an order on and was a bit surprised it didn't show up that way, actually. I just kind of assumed.
comment:65 by , 16 years ago
It's getting late and I don't have time to test this fully, but I think I found a bug. The only major difference I can see in my case and the example is that one of my models is in another app separate from the 2 models. For example:
# In myproject/clip/models.py class Clip(models.Model): title = models.CharField(max_length=200) # In myproject/presentation/models.py from myproject.clip.models import Clip class Presentation(models.Model): title = models.CharField(max_length=200) clips = models.ManyToManyField(Clip, through='Sequence') class Sequence(models.Model): presentation = models.ForeignKey(Presentation) clip = models.ForeignKey(Clip) sequence = models.IntegerField(blank=True, null=True)
The error I'm getting is this:
Traceback: File "/Users/rob/sandbox/django/django_trunk/django/core/handlers/base.py" in get_response 86. response = callback(request, *callback_args, **callback_kwargs) File "/Users/rob/sandbox/django/django_trunk/django/contrib/admin/views/decorators.py" in _checklogin 62. return view_func(request, *args, **kwargs) File "/Users/rob/sandbox/django/django_trunk/django/contrib/admin/views/main.py" in change_stage 372. new_data = manipulator.flatten_data() File "/Users/rob/sandbox/django/django_trunk/django/db/models/fields/related.py" in flatten_data 856. instance_ids = [instance._get_pk_val() for instance in getattr(obj, self.name).all()] File "/Users/rob/sandbox/django/django_trunk/django/db/models/fields/related.py" in __get__ 543. target_col_name=qn(self.field.m2m_reverse_name()) File "/Users/rob/sandbox/django/django_trunk/django/db/models/fields/related.py" in _get_m2m_reverse_name 829. return related.field.get_attname_column()[1] Exception Type: AttributeError at /admin/presentation/presentation/1/ Exception Value: 'NoneType' object has no attribute 'field'
Inside the _get_m2m_reverse_name method is this if block.
if self.rel.through is not None: meta = related.parent_model._meta if self.parent == self.rel.to: related = meta.get_related_object(self.rel.through, self_ref = True) else: related = meta.get_related_object(self.rel.through) return related.field.get_attname_column()[1]
My self.rel.through is 'Sequence' which has a FK to Presentation (the same app) and an FK to Clip (different app) plus an integer that's the sequence used to order the clips of a presentation. The meta of the 2nd line if printed is 'clip.clip'. But if I get a Clip object and try to call c._meta.get_related_object('Sequence') on it, I get nothing back. If, however, I call the same on a Presentation object, I do get RelatedObject back.
I can supply more details if needed. Is this supposed to work across apps? Anything I'm doing wrong here?
comment:66 by , 16 years ago
Rob: This certainly looks like a bug to me! Seeing as get_related_object is a new addition to django/db/models/options.py, I'm almost certain that the bug lies in that function. Anyway, I'm in the middle of a cross-country move and until things settle down a bit for me (hopefully soon!) I won't be able to really sit down and debug this.
Regarding the STACKED or TABULAR inline, according to brosner it's possible to do this in NFA easily the same way any other inline is done in NFA. This is one big thing that I want to tackle in the California sprint, is investigating sensible defaults for inlines on these m2m intermediary models.
comment:67 by , 16 years ago
After some investigation, I discovered that the problem is indeed in get_related_object, and specifically its use of get_model. I'm looking into fixing it now, but this could be a bigger bug than I thought. The correct app_label is simply not available from the scope of that function as it stands right now, so it's a case of needing to find a way to extract that information at the correct time and make it available to that function.
by , 16 years ago
Attachment: | 6095-trunk-withdocs.rc3.diff added |
---|
Fixed cross-app problem. My solution is perhaps a bit hackish. Seems like the better solution would be to patch the add_lazy_relation function to work with intermediary models, but when I tried to do so I ran into non-obvious race conditions. This method works, is fairly transparent in how it works, and doesn't sacrificce performance. If anyone has feedback on better ways to do this, feel free to chime in.
comment:68 by , 16 years ago
Cc: | added |
---|
by , 16 years ago
Attachment: | 6095-r8068.2.diff added |
---|
Updated patch for r8068; some minor revisions
comment:69 by , 16 years ago
I've just uploaded an updated patch; I've cleaned up the docs a bit, and made a few minor tweaks to some of the internals. I'm getting ready to commit this, so the fine toothed comb has pulled up a couple of issues:
- I'm not completely happy with get_related_object(); every single time you want the name of the related m2m column, you have to iterate over all the related objects. This feels like there should be some caching going on - at the very least with the m2m column name, if not with the dictionary of RelatedObjects on the Options object.
- I've added a regression test for the case of linking to an external application. While I was at it, I added a regression test for defining through using instances, rather than strings (i.e., through=Membership rather than through='Membership'). This didn't work so well :-) so the test currently fails.
- What is the current status of admin support? It doesn't appear to be in the latest patch (in any form)
I have a couple of more little issues bouncing around in my head (mostly "isn't there an easier way to do X" type things) - if I can nail down exactly what is bugging me, i'll leave another comment.
comment:70 by , 16 years ago
Here is my stance on admin support for intermediary models. I think if we do some automatic admin work we are digging ourselves in a hole. I think it would be wrong to do any sort of automatic admin support when a ManyToManyField
has
through
specified. Reason being is that it prevents any ability to control the behavior/options on the inline. I would much prefer if it was documented as such and that the user create the inline themselves for the intermediary model.
comment:71 by , 16 years ago
With the new ModelAdmin functionality does there really need to be anything special done? I am thinking something like the following should work, assuming that the validation wouldn't complain about having the automatic generated column...? This will also let the user decide if they would like the m2m model to show up in admin.
class MyM2MModelAdmin(admin.ModelAdmin): raw_id_fields = ('my_m2m_column',) admin.site.register(MyM2MModel, MyM2MModelAdmin)
comment:72 by , 16 years ago
The point of this ticket to provide the ability to make a ManyToManyField
gain the ability to define the intermediary table/model. In theory dropping the field in to
raw_admin_fields
should Just Work (tm). If not then there is a legitimate bug there. However, what I am trying to get at is that when a
ManyToManyField
has
through
specified we shouldn't do anything special. Because it is likely the additional columns on the model are defined by the user and there is no widget to handle this other than an inline.
comment:73 by , 16 years ago
Actually now thinking about it, I can see where things can go wrong with raw_admin_fields
if
through
gets any special
editable
treatment. This was present in a patch Eric did by ignoring that field. I am not sure I see much value in allowing the field to work in a
filter_*
or on its own due to the intermediary model. Perhaps its just disallowed altogether and force the use of the inline (via documentation)?
by , 16 years ago
Attachment: | 6095-r8090.diff added |
---|
Revised patch, with cleaned up internals (especially w.r.t field lookup)
by , 16 years ago
Attachment: | 6095-r8090.2.diff added |
---|
Added a few more validation checks (along with corresponding invalid_model tests), fixed a typo in documentation, added a section in admin documentation on how to integrate m2m intermediary models inline with the admin.
by , 16 years ago
Attachment: | 6095-rc1.diff added |
---|
First (hopefully final) release candidate for m2m-intermediates
comment:74 by , 16 years ago
As discussed on #6778, it could make sense to refractor the ManyRelatedManager to use a generic intermediary model. This would also send appropriate signals when a relation is created, changed or deleted, which is a big issue right now.
comment:75 by , 16 years ago
Cc: | added |
---|
comment:76 by , 16 years ago
Cc: | added |
---|
comment:77 by , 16 years ago
Cc: | added |
---|
comment:78 by , 16 years ago
Cc: | added |
---|
comment:79 by , 16 years ago
Cc: | added |
---|
comment:80 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
comment:81 by , 16 years ago
Cc: | added; removed |
---|
comment:83 by , 16 years ago
Cc: | removed |
---|
comment:84 by , 16 years ago
Cc: | removed |
---|
comment:85 by , 16 years ago
Cc: | removed |
---|
comment:86 by , 16 years ago
Cc: | removed |
---|
floguy@…, lnielsen@…, waylan@…, mir@…, ludo@…, cephelo@…, msaelices@…, trevor@…, piranha@…, craig.ogg@…, odonian@…, thekook@…, martin@…, myers@…, yatiohi@…, erwin@…, tobias@…, micsco@…, jshaffer, dan90, rgl@…, smcoll@…, max@…, apollo13, boobsd@…, ristretto.rb@…, farhan@…, christian, flori@…, gabor@…, mmalone
Removed the above addresses as this ticket is closed and spam is attacking, if anyone has any issues add yourself back :).
Is this a qs-rf feature, or something that should be implemented separately?