Adding fields from one collection to another collection in MongoDB based on specific conditions for a considerable amount of data

Question

I encountered a situation where I constantly need to update a large number of collections.

Here are the collections:

coll1
{
  "identification_id" : String,
  "name" : String,
  "mobile_number" : Number,
  "location" : String,
  "user_properties" : [Mixed types],
  "profile_url" : String
}

coll2
{
  "identification_id" : String,
  "user_id" : String,
  "name" : String,
  "mobile_number" : Number,
  "location" : String,
  "user_properties" : String,
  "profile_url" : String,
  "qualified_user" : String,
  "user_interest_stage" : Number,
  "source" : String,
  "fb_id" : String,
  "comments" : String
}

updated coll1
{
  "identification_id" : String,
  "name" : String,
  "mobile_number" : Number,
  "location" : String,
  "user_properties" : String,
  "profile_url" : String,
  "qualified_user" : String,
  "user_interest_stage" : Number,
  "source" : String,
  "fb_id" : String,
  "comments" : String
}

For the collections coll1 and coll2, the following document insertion scenarios apply:

If a user from coll1 qualifies under certain scenarios, a record is created in coll2.
A record can also be created manually in coll2 from API information.
A coll2 record refers to its coll1 record through user_id.
A single record in coll1 may have multiple records in coll2.

Now, we are merging these collections into a single collection, which will be coll1. We have decided to update qualified visitors using the key 'qualified_user' and update the corresponding user fields in coll1.

I have written a script using Node.js and Mongoose that fetches documents from coll1, checks coll2 for a qualified_user, and updates coll1 according to the following scenarios:

If there is no qualified user, update the document with the default values of an unqualified user.
If there is one qualified user, copy the qualification fields from the coll2 document into the coll1 document.
If there are multiple qualified users, copy the first coll2 document into the coll1 document and create new coll1 documents for the rest.
After processing all documents from coll1, process the coll2 documents that were created from API information and create a new document in coll1 for each.
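
The first three scenarios can be sketched as a pure function that turns one coll1 document and its matching coll2 documents into a list of write operations. This is only an illustration, not the script from the question: the names planOperations, pickQualification and UNQUALIFIED_DEFAULTS are hypothetical, the default values are placeholders (the question does not state them), and the documents are assumed to be plain objects keyed by _id.

```javascript
// Hypothetical defaults for an unqualified user; the real values are not
// given in the question, so these are placeholders.
const UNQUALIFIED_DEFAULTS = {
  qualified_user: 'no',
  user_interest_stage: 0,
  source: '',
  fb_id: '',
  comments: '',
};

// Fields copied from a coll2 document into coll1 (taken from the merged schema).
const QUALIFICATION_FIELDS = Object.keys(UNQUALIFIED_DEFAULTS);

// Return a new object holding only the qualification fields present on `doc`.
function pickQualification(doc) {
  const picked = {};
  for (const field of QUALIFICATION_FIELDS) {
    if (doc[field] !== undefined) picked[field] = doc[field];
  }
  return picked;
}

// Build bulkWrite-style operations for one coll1 document (`visitor`) and the
// coll2 documents already matched to it (`qualifiedDocs`).
function planOperations(visitor, qualifiedDocs) {
  const filter = { _id: visitor._id };

  // Scenario 1: no qualified user, so write the defaults.
  if (qualifiedDocs.length === 0) {
    return [{ updateOne: { filter, update: { $set: { ...UNQUALIFIED_DEFAULTS } } } }];
  }

  // Scenarios 2 and 3: the first coll2 document updates the coll1 document.
  const [first, ...rest] = qualifiedDocs;
  const ops = [{ updateOne: { filter, update: { $set: pickQualification(first) } } }];

  // Scenario 3: every further coll2 document becomes a new coll1 document.
  for (const extra of rest) {
    // Drop _id and user_id so the new coll1 document gets a fresh _id.
    const { _id, user_id, ...document } = extra;
    ops.push({ insertOne: { document } });
  }
  return ops;
}
```

Because the function only builds operations and does not touch the database, the operations for many coll1 documents can be collected and sent in one batched write instead of one round trip per document.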

However, when running this script, I encountered the following error:

<--- JS stacktrace --->

==== JS stack trace =========================================

With a large number of documents in coll1, the processing time was significant. I used skip and limit to process all the documents, but it took 1 hour to complete. Is there a more efficient way to handle these types of database updates for a large number of collections?

javascript mongodb mongoose nodejs-server

Answer 1

Loading too many documents into memory at once can exhaust the memory available to your script.

There are two straightforward options:

Use a MongoDB cursor to process the results sequentially instead of retrieving them all at once.
Pass the --max-old-space-size flag when running your script to raise the V8 heap limit yourself (the value is in megabytes), like this:
```
node --max-old-space-size=4096 script.js
```

Neither approach is without flaws, though: your data volume is likely to grow over time, and at some point these methods will no longer be enough. I would recommend reassessing your data structure. MongoDB is schema-less and does not cope well with the same data duplicated across collections. It is better to store all the data in a single collection and update specific fields when specific conditions are met.
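
The cursor option could look roughly like the sketch below. The helper processInBatches is a hypothetical name, not part of Mongoose; it accepts any async iterable, which is what a Mongoose query cursor provides, and keeps at most one batch in memory at a time. The model name Coll1 and the batch size of 500 in the usage comment are assumptions as well.

```javascript
// Consume any async iterable (for example a Mongoose query cursor) and pass
// the documents to `handleBatch` in groups of at most `batchSize`.
// Resolves to the total number of documents processed.
async function processInBatches(cursor, batchSize, handleBatch) {
  let batch = [];
  let total = 0;
  for await (const doc of cursor) {
    batch.push(doc);
    if (batch.length === batchSize) {
      await handleBatch(batch);
      total += batch.length;
      batch = [];
    }
  }
  // Flush the final, possibly smaller, batch.
  if (batch.length > 0) {
    await handleBatch(batch);
    total += batch.length;
  }
  return total;
}

// Possible usage with Mongoose (not executed here; Coll1 is a hypothetical model):
//
//   const cursor = Coll1.find().lean().cursor();
//   await processInBatches(cursor, 500, async (docs) => {
//     const ops = docs.map((doc) => ({
//       updateOne: { filter: { _id: doc._id }, update: { $set: { /* new fields */ } } },
//     }));
//     await Coll1.bulkWrite(ops);
//   });
```

Sending each batch through a single bulkWrite call also avoids one database round trip per document, which may help with the long run time mentioned in the question.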
