Opened 5 months ago

Last modified 3 months ago

#28692 new Cleanup/optimization

QuerySet.bulk_create() combine with select/prefetch_related()

Reported by: Дилян Палаузов Owned by: nobody
Component: Database layer (models, ORM) Version: 1.11
Severity: Normal Keywords: bulk_create select_related prefetch_related
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When several objects are created with Model.objects.bulk_create(), the caller might want to prefetch/JOIN some related models to the new objects, so that additional SELECTs are avoided.

for m in ModelA.objects.select_related('b').bulk_create([ModelA(...), ModelA(...), ModelA(...)]):
  print(m.b.description) # this line shall not cause a new SELECT on each iteration

Change History (6)

comment:1 Changed 4 months ago by Tomer Chachamu

https://docs.djangoproject.com/en/dev/internals/contributing/bugs-and-features/

First request the feature on the django-developers list, not in the ticket tracker. It’ll get read more closely if it’s on the mailing list. This is even more important for large-scale feature requests. We like to discuss any big changes to Django’s core on the mailing list before actually working on them.

Please send the feature request again to the mailing list. However, here's my opinion:

I don't think it will be possible to implement select_related as the behaviour of INSERT .. RETURNING .. varies a lot between databases. It would be possible to implement prefetch_related.

There are already two ways to reduce the number of queries. Is either good for you? One is:

from django.db.models import prefetch_related_objects
new_objects = [ModelA(..., b_id=3), ModelA(..., b_id=4), ModelA(..., b_id=5)]
ModelA.objects.bulk_create(new_objects)
prefetch_related_objects(new_objects, 'b')
for m in new_objects:
    print(m.b.description)

Another is setting the b attribute on your models, instead of b_id:

bs_by_id = ModelB.objects.in_bulk(relevant_b_ids)
new_objects = [ModelA(..., b=bs_by_id[3]), ModelA(..., b=bs_by_id[4]), ModelA(..., b=bs_by_id[5])]
ModelA.objects.bulk_create(new_objects)
for m in new_objects:
    print(m.b.description)
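
Both workarounds rest on the same idea: fetch all related rows in one query, keyed by id, and attach them to the new objects in Python. The following is a minimal sketch of that idea using only the standard library's sqlite3 module, not Django itself. The modelb table, the dictionaries standing in for ModelA instances, and the fetch_in_bulk() helper are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical table standing in for ModelB; this is not Django's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE modelb (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO modelb (id, description) VALUES (?, ?)",
    [(3, "three"), (4, "four"), (5, "five")],
)


def fetch_in_bulk(connection, ids):
    """Return {id: description} for the given ids using a single SELECT."""
    ids = list(ids)
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    rows = connection.execute(
        f"SELECT id, description FROM modelb WHERE id IN ({placeholders})", ids
    )
    return dict(rows)


# The new "ModelA" objects initially carry only the foreign key value.
new_objects = [{"b_id": 3}, {"b_id": 4}, {"b_id": 5}]

# One query for all related rows, then attach them in Python.
bs_by_id = fetch_in_bulk(conn, {obj["b_id"] for obj in new_objects})
for obj in new_objects:
    obj["b_description"] = bs_by_id[obj["b_id"]]

descriptions = [obj["b_description"] for obj in new_objects]
```

The number of queries stays at one regardless of how many new objects there are, which is the property both workarounds aim for.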

comment:2 Changed 4 months ago by Tomer Chachamu

Resolution: needsinfo
Status: new → closed

comment:3 Changed 4 months ago by Дилян Палаузов

Resolution: needsinfo
Status: closed → new

#28600 asks for adding .prefetch_related() support to RawQuerySet. I don't see why that ticket is clear while this one, which asks for prefetch_related() support in bulk_create(), is somehow different and hence unclear.

Although there are already two other ways to reduce the number of queries, prefetch_related() is more coherent, as this is what users are used to in other circumstances.

comment:4 Changed 3 months ago by Tomer Chachamu

I think it will be impossible to implement on backends other than PostgreSQL, as you would want to do SELECT modela.id, modelb.* FROM modela JOIN modelb ON modela.b_id = modelb.id WHERE modela.id IN (...). However, the value of (...), i.e. the primary keys of the newly inserted rows, is only available on PostgreSQL.

Last edited 3 months ago by Tomer Chachamu
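
To illustrate why the JOIN in comment 4 depends on knowing the new primary keys, here is a rough sketch using the standard library's sqlite3 module rather than Django. The table names mirror the query quoted in the comment. Inserting row by row to collect cursor.lastrowid is an assumption made purely for illustration: it is not how bulk_create() works, and it gives up the single-statement insert that bulk_create() is meant to provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE modelb (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE modela (
        id INTEGER PRIMARY KEY,
        b_id INTEGER REFERENCES modelb (id)
    );
    INSERT INTO modelb (id, description) VALUES (3, 'three'), (4, 'four');
    """
)

# Insert row by row so that cursor.lastrowid yields each new primary key.
# executemany() would not expose per-row ids.
new_ids = []
for b_id in (3, 4, 3):
    cursor = conn.execute("INSERT INTO modela (b_id) VALUES (?)", (b_id,))
    new_ids.append(cursor.lastrowid)

# With the ids known, one JOIN fetches the related data for all new rows.
placeholders = ", ".join("?" for _ in new_ids)
rows = conn.execute(
    "SELECT modela.id, modelb.description "
    "FROM modela JOIN modelb ON modela.b_id = modelb.id "
    f"WHERE modela.id IN ({placeholders}) ORDER BY modela.id",
    new_ids,
).fetchall()
```

Without the contents of new_ids there is nothing to put in the IN (...) clause, which is the obstacle the comment describes for backends that do not return the inserted keys.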

comment:5 Changed 3 months ago by Tim Graham

Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Cleanup/optimization

I guess a patch could be evaluated.

comment:6 Changed 3 months ago by Дилян Палаузов

I guess a patch will be provided for prefetch_related() once #28600 gets integrated, as stated in the fourth comment of that ticket.
