Opened 6 weeks ago
Last modified 8 days ago
#36526 assigned Cleanup/optimization
bulk_update uses more memory than expected — at Version 2
Reported by: | Anže Pečar | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 5.2 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I recently tried to update a large number of objects with:
things = list(Thing.objects.all())  # A large number of objects, e.g. > 1_000_000
Thing.objects.bulk_update(things, ["description"], batch_size=300)
The first line above fits into the available memory (~2GB in my case), but the second line caused a SIGTERM, even though I had an additional 2GB of available memory. This was a bit surprising as I wasn't expecting bulk_update to use this much memory since all the objects to update were already loaded.
My solution was:
from itertools import batched  # available in Python 3.12+
for batch in batched(things, 300):
    Thing.objects.bulk_update(batch, ["description"], batch_size=300)
In the first example, bulk_update used 2.8GB of memory; in the second, it used only 62MB.
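Note that itertools.batched only exists in Python 3.12 and later. On older versions, a small helper along the following lines could stand in for it. This is a minimal sketch, not part of Django; the local function name batched is chosen here only to mirror the standard-library one.

```python
from itertools import islice


def batched(iterable, n):
    """Yield successive tuples of at most n items.

    Hypothetical backport mirroring itertools.batched (Python 3.12+).
    """
    if n < 1:
        raise ValueError("n must be at least one")
    iterator = iter(iterable)
    # islice pulls at most n items per pass; an empty tuple ends the loop.
    while batch := tuple(islice(iterator, n)):
        yield batch
```

Because the helper is a generator, only one batch of references is materialised at a time, which is what keeps the workaround's extra memory small.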
A GitHub repository reproduces the problem and includes the memray results.
As the memray flamegraph shows, most of the memory in my example (2.1GB) is used to prepare the when statements for all the batches before any of them is executed. If we change this to generate the when statements only for the current batch, memory consumption should drop considerably. I'd be happy to contribute this patch unless there are concerns about adding more compute between update queries and making the transactions longer. Let me know :)
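The difference between the two strategies can be sketched without Django. The code below is an illustration only, not Django's actual implementation: chunks, build_when_clauses, update_eager and update_lazy are hypothetical names, and build_when_clauses is a stand-in for whatever per-batch expressions bulk_update prepares.

```python
from itertools import islice


def chunks(items, size):
    """Yield lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def build_when_clauses(batch):
    """Hypothetical stand-in for preparing one batch's when statements."""
    return [(obj["pk"], obj["description"]) for obj in batch]


def update_eager(objs, batch_size, execute):
    """Prepare every batch first, then execute: all payloads are alive at once."""
    prepared = [build_when_clauses(b) for b in chunks(objs, batch_size)]
    for payload in prepared:
        execute(payload)
    return len(prepared)


def update_lazy(objs, batch_size, execute):
    """Prepare each batch just before executing it: one payload at a time."""
    count = 0
    for batch in chunks(objs, batch_size):
        execute(build_when_clauses(batch))
        count += 1
    return count
```

Both functions issue the same updates in the same order. The lazy variant holds one prepared payload at a time (assuming execute does not retain it), so its peak memory would not grow with the number of batches, at the cost of doing the preparation work between queries.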
This might be related to https://code.djangoproject.com/ticket/31202, but I decided to open a new issue: I wouldn't mind waiting longer for bulk_update to complete, whereas the SIGTERM surprised me.
Change History (2)
comment:1 by , 6 weeks ago
Description: | modified (diff) |
---|
comment:2 by , 6 weeks ago
Description: | modified (diff) |
---|