Currently, I am working on parsing a list of JavaScript objects and upserting them to the database one by one using Node.js. The process typically looks like this:
// map over the list, parsing and upserting each item
return promise.map(list, function (item) {
  return parseItem(item).then(upsertSingleItemToDB);
}).then(function () { console.log('all finished!'); });
The challenge arises with large lists (around 3000 items): parsing all the items in parallel consumes too much memory. To address this, I added a concurrency limit through the promise library ('when/guard') so the process no longer runs out of memory.
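For reference, the guarded version looks roughly like this (just a sketch: the limit of 10 concurrent operations and the use of when.map are assumptions, not my exact code):

var when = require('when');
var guard = require('when/guard');

// Allow at most 10 parse/upsert operations in flight at any moment.
var guardedParseAndUpsert = guard(guard.n(10), function (item) {
  return parseItem(item).then(upsertSingleItemToDB);
});

return when.map(list, guardedParseAndUpsert)
  .then(function () { console.log('all finished!'); });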
However, I believe there is room for optimization in the database upsert process, especially since MongoDB offers a bulkWrite function. Given that parsing and bulk writing all items simultaneously is not feasible, I plan to divide the original object list into smaller sets. Each set will be parsed using promises in parallel, and the resulting array from each set will then undergo a promisified bulkWrite operation. This process will continue for the remaining sets of list items.
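In other words, something along these lines, where chunk and processChunk are hypothetical helpers, collection is a MongoDB collection handle, and upserting on _id is just an assumption about what my parsed documents contain:

// Split the full list into sets of `size` items.
function chunk(items, size) {
  var chunks = [];
  for (var i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Parse one set in parallel, then upsert all of its results with a single bulkWrite.
function processChunk(chunkItems) {
  return when.map(chunkItems, parseItem).then(function (parsedDocs) {
    var ops = parsedDocs.map(function (doc) {
      return { updateOne: { filter: { _id: doc._id }, update: { $set: doc }, upsert: true } };
    });
    return collection.bulkWrite(ops);
  });
}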
I am struggling to figure out how to structure these smaller sets of promises so that only one parseSomeItems-BulkUpsertThem set runs at a time. Perhaps something like Promise.all([set1Bulk, set2Bulk]) could work, where set1Bulk is itself an array of parser promises run in parallel. Any pseudo code would be greatly appreciated (I'm using 'when', if that makes a difference).
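The closest I have come is folding the sets into a single promise chain, so each parse-and-bulkWrite set only starts after the previous one has finished (again only a sketch, reusing the hypothetical chunk and processChunk helpers above, with a set size of 100 picked arbitrarily), though I am not sure this is the right approach:

function processAllChunks(list) {
  // Run the sets strictly one after another by chaining them onto a single promise.
  return chunk(list, 100).reduce(function (previous, chunkItems) {
    return previous.then(function () {
      return processChunk(chunkItems);
    });
  }, when.resolve())
  .then(function () { console.log('all finished!'); });
}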