Opened 7 years ago

Closed 2 years ago

#28692 closed Cleanup/optimization (wontfix)

QuerySet.bulk_create() combine with select/prefetch_related()

Reported by: Дилян Палаузов
Owned by: nobody
Component: Database layer (models, ORM)
Version: 1.11
Severity: Normal
Keywords: bulk_create select_related prefetch_related
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

When several objects are created with Model.objects.bulk_create(), the caller might want to prefetch or JOIN some related models onto the new objects, so that additional SELECTs are avoided.

for m in ModelA.objects.select_related('b').bulk_create([ModelA(...), ModelA(...), ModelA(...)]):
  print(m.b.description) # this line shall not cause a new SELECT on each iteration

Change History (7)

comment:1 by Tomer Chachamu, 7 years ago

https://docs.djangoproject.com/en/dev/internals/contributing/bugs-and-features/

First request the feature on the django-developers list, not in the ticket tracker. It’ll get read more closely if it’s on the mailing list. This is even more important for large-scale feature requests. We like to discuss any big changes to Django’s core on the mailing list before actually working on them.

Please send the feature request again to the mailing list. However, here's my opinion:

I don't think it will be possible to implement select_related as the behaviour of INSERT .. RETURNING .. varies a lot between databases. It would be possible to implement prefetch_related.

There are already two ways to reduce the number of queries; would either of them work for you? One is:

from django.db.models import prefetch_related_objects
new_objects = [ModelA(..., b_id=3), ModelA(..., b_id=4), ModelA(..., b_id=5)]
ModelA.objects.bulk_create(new_objects)
prefetch_related_objects(new_objects, 'b')
for m in new_objects:
    print(m.b.description)

Another is setting the b attribute on your models, instead of b_id:

bs_by_id = ModelB.objects.in_bulk(relevant_b_ids)
new_objects = [ModelA(..., b=bs_by_id[3]), ModelA(..., b=bs_by_id[4]), ModelA(..., b=bs_by_id[5])]
ModelA.objects.bulk_create(new_objects)
for m in new_objects:
    print(m.b.description)
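
Both approaches rest on the same idea: resolve every related b row in one lookup and attach it to the new objects, rather than issuing one query per object. The following is a minimal, framework-free sketch of that pattern. The classes `A` and `B`, the `FAKE_B_TABLE` dict and the helpers `fetch_bs_in_bulk` and `attach_related` are hypothetical stand-ins for illustration only; they are not Django APIs, and this is not a description of how Django implements prefetch_related_objects internally.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class B:
    id: int
    description: str


@dataclass
class A:
    b_id: int
    b: Optional[B] = None  # filled in by attach_related()


# Hypothetical stand-in for the ModelB table.
FAKE_B_TABLE: Dict[int, B] = {
    3: B(3, "three"),
    4: B(4, "four"),
    5: B(5, "five"),
}

query_count = 0  # counts simulated round trips to the "database"


def fetch_bs_in_bulk(ids: Iterable[int]) -> Dict[int, B]:
    """Simulate a single SELECT ... WHERE id IN (...), keyed by id."""
    global query_count
    query_count += 1
    return {i: FAKE_B_TABLE[i] for i in set(ids) if i in FAKE_B_TABLE}


def attach_related(objs: List[A]) -> None:
    """Attach the related B to every A using one bulk lookup."""
    bs_by_id = fetch_bs_in_bulk(o.b_id for o in objs)
    for o in objs:
        o.b = bs_by_id.get(o.b_id)


new_objects = [A(b_id=3), A(b_id=4), A(b_id=5)]
attach_related(new_objects)
descriptions = [o.b.description for o in new_objects]
```

Note that only the foreign key values (b_id) are needed here, not the primary keys of the new objects, which is why this style of workaround does not depend on the database returning ids from the INSERT.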

comment:2 by Tomer Chachamu, 7 years ago

Resolution: needsinfo
Status: new → closed

comment:3 by Дилян Палаузов, 7 years ago

Resolution: needsinfo (removed)
Status: closed → new

#28600 asks for adding .prefetch_related() support to RawQuerySet. I don't see why that ticket is clear while this one, which asks for adding prefetch_related() support to bulk_create(), is somehow different and hence unclear.

Although there are already two other ways to reduce the number of queries, prefetch_related() is more coherent, as this is what users are used to in other circumstances.

comment:4 by Tomer Chachamu, 7 years ago

I think it will be impossible to implement on backends other than PostgreSQL. You would want to run SELECT modela.id, modelb.* FROM modela JOIN modelb ON modela.b_id = modelb.id WHERE modela.id IN (...); however, the values for (...), that is, the primary keys of the newly created rows, are only available on PostgreSQL.
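
To make the query above concrete, here is a rough sketch using Python's standard-library sqlite3 module. The table layout and sample rows are assumptions made up for illustration. The rows are inserted one at a time so that each new primary key can be read from `cursor.lastrowid`; this is not what bulk_create() does, and it is shown only so that the IN (...) list can be filled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE modelb (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE modela (
        id INTEGER PRIMARY KEY,
        b_id INTEGER REFERENCES modelb(id)
    );
    INSERT INTO modelb (id, description)
    VALUES (3, 'three'), (4, 'four'), (5, 'five');
    """
)

# Insert row by row so that each new primary key can be recorded.
# A real bulk INSERT sends many rows in one statement, which is why
# the generated keys are harder to get back.
new_ids = []
for b_id in (3, 4, 5):
    cur = conn.execute("INSERT INTO modela (b_id) VALUES (?)", (b_id,))
    new_ids.append(cur.lastrowid)

# One follow-up query joins the new rows to their related modelb rows.
placeholders = ", ".join("?" for _ in new_ids)
rows = conn.execute(
    "SELECT modela.id, modelb.id, modelb.description "
    "FROM modela JOIN modelb ON modela.b_id = modelb.id "
    f"WHERE modela.id IN ({placeholders}) ORDER BY modela.id",
    new_ids,
).fetchall()
conn.close()
```

The follow-up SELECT itself is portable; the difficulty described in this comment lies in obtaining `new_ids` after a single multi-row INSERT.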

Last edited 7 years ago by Tomer Chachamu

comment:5 by Tim Graham, 7 years ago

Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Cleanup/optimization

I guess a patch could be evaluated.

comment:6 by Дилян Палаузов, 7 years ago

I guess a patch will be provided for prefetch_related() once #28600 gets integrated, as stated in the fourth comment of that ticket.

comment:7 by Simon Charette, 2 years ago

Resolution: wontfix
Status: new → closed

No patch was provided, and I agree that it will be very hard to implement select_related in a backend-agnostic way. Now that prefetch_related_objects is a documented public API, it can be used to achieve the same results as prefetch_related would have.

Closing as wontfix for now; I wouldn't be opposed to reopening once we get better support for RETURNING handling at the sql.Query level, if a patch is provided.
