Per row result for dumpdata
|Reported by:||Gwildor||Owned by:||nobody|
|Component:||Core (Management commands)||Version:||master|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
In response to ticket #22251, I'm opening this as a separate issue, as requested. The need for this option is explained there, but it basically comes down to memory consumption. This was addressed in #5423 and, judging by the results discussed in that ticket, improved drastically, but dumpdata still consumes a fair amount of memory and would benefit from further improvements. Besides that, in its current form, when the command stops unexpectedly, nothing is saved, so you are not left with even an incomplete file that you could use for development or testing while you run the command again.
In its current form, dumpdata returns one big JSON object, which loaddata has to read into memory and parse in full before it can start importing. If each row were written as its own JSON object, with one object per line, loaddata could read the file line by line (for example by iterating over the file object or calling Python's readline) and so reduce memory usage.
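To illustrate the idea, here is a minimal sketch using only Python's standard json module. It is not Django's serializer code; the function names dump_rows and load_rows and the shape of the example rows are assumptions made for this illustration.

```python
import io
import json


def dump_rows(rows, stream):
    """Write each row as its own JSON object on a single line.

    Hypothetical helper: rows is any iterable of JSON-serializable dicts.
    json.dumps escapes newlines inside strings, so one row never spans
    more than one line of output.
    """
    for row in rows:
        stream.write(json.dumps(row))
        stream.write("\n")


def load_rows(stream):
    """Yield rows one at a time by iterating over the file object.

    Only the current line is parsed, so the whole file never has to be
    held in memory at once.
    """
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


if __name__ == "__main__":
    rows = [
        {"model": "app.item", "pk": 1, "fields": {"name": "first"}},
        {"model": "app.item", "pk": 2, "fields": {"name": "second"}},
    ]
    buffer = io.StringIO()
    dump_rows(rows, buffer)
    buffer.seek(0)
    for row in load_rows(buffer):
        print(row["pk"], row["fields"]["name"])
```

Because every line is a complete JSON object, a dump that stops halfway still leaves a file whose completed lines can be loaded.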
Unfortunately, this feature is probably backwards incompatible, although it might be possible for the loaddata command to inspect the file and detect its structure before reading it. If that's not possible, I reckon it's best to add a new flag to enable this feature.
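One way such a check could work, as a rough sketch: look at the first non-whitespace character of the file. This assumes the existing format is a single JSON array (starting with "[") and the new format starts each line with "{". It also assumes the stream is seekable. The name iter_fixture is made up for this example and is not part of Django.

```python
import io
import json


def iter_fixture(stream):
    """Yield rows from either a single JSON array or one object per line.

    Hypothetical helper: the format is guessed from the first
    non-whitespace character, then the stream is rewound.
    """
    first = ""
    while True:
        ch = stream.read(1)
        if ch == "" or not ch.isspace():
            first = ch
            break
    stream.seek(0)

    if first == "[":
        # Existing format: the whole document has to be parsed at once.
        yield from json.load(stream)
    else:
        # Per-row format: parse one line at a time.
        for line in stream:
            line = line.strip()
            if line:
                yield json.loads(line)


if __name__ == "__main__":
    rows = [{"pk": 1}, {"pk": 2}]
    as_array = io.StringIO(json.dumps(rows))
    as_lines = io.StringIO("".join(json.dumps(r) + "\n" for r in rows))
    print(list(iter_fixture(as_array)) == list(iter_fixture(as_lines)))
```

A check like this would not work for non-seekable input such as a pipe, which is one argument for an explicit flag instead.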
Change History (4)
comment:1 Changed 12 months ago by aaugustin
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
- Triage Stage changed from Unreviewed to Someday/Maybe