Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#30828 closed Cleanup/optimization (fixed)

Document how to add ManyToMany relationships in bulk for different objects and for manually defined "through".

Reported by: David Foster
Owned by: David Foster
Component: Documentation
Version: dev
Severity: Normal
Keywords:
Cc: Patrick Cloke
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Given the following example model:

from django.db import models

class M1(models.Model):
    m2_set = models.ManyToManyField('M2')

It is already possible to associate one M1 with many M2s with a single DB query:

m1.m2_set.add(*m2s)

However it's more difficult to associate many M1s with many M2s, particularly if you want to skip associations that already exist:

# NOTE: Does NOT skip associations that already exist!
m1_and_m2_id_tuples = [(m1_id, m2_id), ...]
M1_M2 = M1.m2_set.through
M1_M2.objects.bulk_create([
    M1_M2(m1_id=m1_id, m2_id=m2_id)
    for (m1_id, m2_id) in
    m1_and_m2_id_tuples
])

I propose adding the following APIs to bulk-associate relationships:

M1.m2_set.add_pairs(*[(m1, m2), ...], assert_no_collisions=False)
# --- OR ---
M1.m2_set.add_pair_ids(*[(m1_id, m2_id), ...], assert_no_collisions=False)

I also propose to add the following paired APIs to bulk-disassociate relationships:

M1.m2_set.remove_pairs(*[(m1, m2), ...])
# --- OR ---
M1.m2_set.remove_pair_ids(*[(m1_id, m2_id), ...])

I have already written code for both of these cases and have been using it in production for a few years. It probably needs to be extended to support non-default database connections. Documentation and tests need to be added, of course.

Related thread on Django-developers: https://groups.google.com/forum/#!topic/django-developers/n8ZN5uuuM_Q

API docstrings, with further details:

def add_pairs(
        self: ManyToManyDescriptor,  # M1.m2_set
        m1_m2_tuples: 'List[Tuple[M1, M2]]',
        *, assert_no_collisions: bool=False) -> None:
    """
    Creates many (M1, M2) associations with O(1) database queries.
   
    If any requested associations already exist, then they will be left alone.
   
    If you assert that none of the requested associations already exist,
    you can pass assert_no_collisions=True to save 1 database query.
    """

def remove_pairs(
        self: ManyToManyDescriptor,  # M1.m2_set
        m1_m2_tuples: 'List[Tuple[M1, M2]]') -> None:
    """
    Deletes many (M1, M2) associations with O(1) database queries.
    """

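For illustration only, the behaviour described in these docstrings could be approximated on top of the existing through-model API as sketched below. The name add_pairs_sketch, the source_field/target_field parameters, and the "fetch a superset, then filter in Python" strategy are assumptions made for this sketch. It is not the author's patch and not a Django API. It takes ID pairs rather than model instances and relies only on filter(), values_list() and bulk_create() on the through model's default manager.

```python
def add_pairs_sketch(through, id_pairs, *, assert_no_collisions=False,
                     source_field="m1_id", target_field="m2_id"):
    """Hypothetical stand-in for the proposed add_pairs(); not a Django API.

    `through` is assumed to be a through model such as M1.m2_set.through.
    The default field names come from the ticket's M1/M2 example.
    """
    # De-duplicate the requested pairs while keeping their order.
    wanted = list(dict.fromkeys(tuple(pair) for pair in id_pairs))
    if not wanted:
        return []
    if not assert_no_collisions:
        # Query 1: fetch a superset of the rows that could collide, then
        # narrow it down to exact (source, target) matches in Python.
        existing = set(
            through.objects.filter(**{
                f"{source_field}__in": {s for s, _ in wanted},
                f"{target_field}__in": {t for _, t in wanted},
            }).values_list(source_field, target_field)
        )
        wanted = [pair for pair in wanted if pair not in existing]
    # Query 2 (the only query when assert_no_collisions=True).
    return through.objects.bulk_create(
        [through(**{source_field: s, target_field: t}) for s, t in wanted]
    )
```

Because the existence check and the insert are separate queries, a concurrent writer could still create a conflicting row between them, so this sketch is not a substitute for a database-level guarantee.
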
Change History (14)

comment:1 by Carlton Gibson, 5 years ago

Triage Stage: Unreviewed → Accepted

I'll accept this given the preliminary discussion on the mailing list. Let's see the patch. :)
Thanks David.

comment:2 by Patrick Cloke, 5 years ago

Cc: Patrick Cloke added

comment:3 by David Foster, 5 years ago

Draft documentation has been written. Implementation and tests are pending. Will post PR once I have all three.

comment:4 by David Foster, 5 years ago

Has patch: set

PR created.

comment:5 by Mariusz Felisiak, 5 years ago

However it's more difficult to associate many M1s with many M2s, ...

You can use add() on a reverse relationship, e.g.

m2.m1_set.add(*m1s)

Right?

comment:6 by David Foster, 5 years ago

Yes, you can bulk add on a reverse relationship to associate many items with one other item in reverse. But you can't do something like adding the relations {(a1, b1), (a1, b2), (a2, b3), (a4, b4)} all at once.

comment:7 by Simon Charette, 5 years ago

But you can't do something like adding the relations ... all at once.

.bulk_create(ignore_conflicts=True) works reasonably well for this purpose: it costs a little more boilerplate but is more readable. It also works for a manually defined through model that has fields without defaults.
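
A minimal sketch of that approach, assuming the auto-created through model from the ticket's M1/M2 example (M1.m2_set.through, with columns m1_id and m2_id). The helper name bulk_add_pairs and its parameters are hypothetical. bulk_create(ignore_conflicts=True) is the real queryset call and needs Django 2.2 or later.

```python
def bulk_add_pairs(through, id_pairs,
                   source_field="m1_id", target_field="m2_id"):
    """Insert many (source, target) rows into a many-to-many through table.

    `through` is assumed to be a through model such as M1.m2_set.through.
    The helper name and the default field names are illustrative only.
    """
    rows = [
        through(**{source_field: source_id, target_field: target_id})
        # dict.fromkeys drops duplicate pairs while keeping their order.
        for source_id, target_id in dict.fromkeys(tuple(p) for p in id_pairs)
    ]
    # ignore_conflicts=True asks the database to skip rows that would violate
    # a unique constraint, so associations that already exist are left alone.
    return through.objects.bulk_create(rows, ignore_conflicts=True)
```

It would be called as bulk_add_pairs(M1.m2_set.through, m1_and_m2_id_tuples). Two caveats apply: rows written this way do not trigger the m2m_changed signal, and with ignore_conflicts=True the returned instances do not have their primary keys set.
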

comment:8 by Mariusz Felisiak, 5 years ago

Component: Database layer (models, ORM) → Documentation
Summary: Add ManyToMany relationships in bulk → Document how to add ManyToMany relationships in bulk for different objects and for manually defined "through".
Type: New feature → Cleanup/optimization

Let's change this to a documentation issue. I think we should add a paragraph about adding ManyToManyField's relations with bulk_create() for manually defined through and for different objects (on LHS and RHS) to the "Insert in bulk" section (docs/topics/db/optimization.txt).

comment:9 by Mariusz Felisiak, 5 years ago

Needs documentation: set
Patch needs improvement: set

comment:10 by David Foster, 5 years ago

Needs documentation: unset
Patch needs improvement: unset

I have a new patch in PR beta that provides a documentation workaround. Comments requested.

The proposed documentation recommends that users inline a fair bit of boilerplate to perform bulk-associate and bulk-disassociate operations, which feels a bit verbose to me, so I suggest still considering an approach where we add dedicated methods for these operations.
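
To give a sense of the boilerplate involved on the disassociate side, here is a rough sketch under the same M1/M2 assumptions (through model M1.m2_set.through with columns m1_id and m2_id). The helper name bulk_remove_pairs and its parameters are hypothetical. It uses two queries: one to find the primary keys of the matching rows and one to delete them. An alternative is to OR together one Q(...) object per pair and call .delete() once.

```python
def bulk_remove_pairs(through, id_pairs,
                      source_field="m1_id", target_field="m2_id"):
    """Delete many (source, target) rows from a many-to-many through table.

    `through` is assumed to be a through model such as M1.m2_set.through.
    Returns the number of rows reported as deleted.
    """
    wanted = {tuple(pair) for pair in id_pairs}
    if not wanted:
        return 0
    # Query 1: fetch a superset of candidate rows together with their pks.
    candidates = through.objects.filter(**{
        f"{source_field}__in": {s for s, _ in wanted},
        f"{target_field}__in": {t for _, t in wanted},
    }).values_list("pk", source_field, target_field)
    # Keep only the rows whose (source, target) pair was actually requested.
    pks = [pk for pk, s, t in candidates if (s, t) in wanted]
    # Query 2: delete exactly those rows.
    deleted, _ = through.objects.filter(pk__in=pks).delete()
    return deleted
```

As with bulk_create() on the through model, deleting rows this way does not send m2m_changed.
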

comment:11 by Mariusz Felisiak, 4 years ago

Patch needs improvement: set

comment:12 by David Foster, 4 years ago

Patch needs improvement: unset

Patch revised and ready for another review.

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

Resolution: fixed
Status: assigned → closed

In 6a04e69e:

Fixed #30828 -- Added how to remove/insert many-to-many relations in bulk to the database optimization docs.

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In afde9730:

[2.2.x] Fixed #30828 -- Added how to remove/insert many-to-many relations in bulk to the database optimization docs.

Backport of 6a04e69e686cf417b469d7676f93c2e3a9c8d6a3 from master
