I have run into a situation where I need to update a large number of documents across several collections.
Here are the collections:
coll1
{
"identification_id" : String,
"name" : String,
"mobile_number" : Number,
"location" : String,
"user_properties" : [Mixed types],
"profile_url" : String
}
coll2
{
"identification_id" : String,
"user_id" : String,
"name" : String,
"mobile_number" : Number,
"location" : String,
"user_properties" : String,
"profile_url" : String,
"qualified_user" : String,
"user_interest_stage" : Number,
"source" : String,
"fb_id" : String,
"comments" : String
}
updated coll1
{
"identification_id" : String,
"name" : String,
"mobile_number" : Number,
"location" : String,
"user_properties" : String,
"profile_url" : String,
"qualified_user" : String,
"user_interest_stage" : Number,
"source" : String,
"fb_id" : String,
"comments" : String
}
Documents are inserted into coll1 and coll2 under the following scenarios:
- If a user from coll1 is qualified based on certain scenarios, a record is created in coll2.
- A new record can also be created manually in coll2 from API information.
- A coll2 record refers to its coll1 record through user_id.
- There may be multiple records in coll2 for a single record in coll1.
Now we are merging these collections into a single collection, which will be coll1. We have decided to mark qualified visitors with the key 'qualified_user' and to update the corresponding user fields in coll1.
I have written a script using Node.js and Mongoose that fetches documents from coll1, checks coll2 for a qualified_user, and updates coll1 based on the following scenarios:
- If there is no qualified user, update the coll1 document with the default values of an unqualified user.
- If there is one qualified user, copy the qualification fields from the coll2 document and update the coll1 document with them.
- If there are multiple qualified users, copy the first coll2 document and update the coll1 document with it. For each of the remaining coll2 documents, create a new document in coll1.
- After all documents from coll1 have been processed, process the coll2 documents that were qualified through APIs and create a new document in coll1 for each.
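The first three rules above can be sketched roughly as follows. This is a simplified illustration, not the actual script: the names buildOps, pickQualification, QUALIFICATION_FIELDS and UNQUALIFIED_DEFAULTS are hypothetical, the default values are placeholders, and matching the coll1 document by _id is an assumption. The function only builds operation objects and does not touch the database.

```javascript
// Hypothetical field list: the qualification fields that appear in coll2
// and in the updated coll1 schema.
const QUALIFICATION_FIELDS = [
  'qualified_user', 'user_interest_stage', 'source', 'fb_id', 'comments',
];

// Placeholder defaults for an unqualified user (the real values are assumptions).
const UNQUALIFIED_DEFAULTS = {
  qualified_user: 'no',
  user_interest_stage: 0,
  source: '',
  fb_id: '',
  comments: '',
};

// Copy only the qualification fields that are present on a coll2 document.
function pickQualification(coll2Doc) {
  const picked = {};
  for (const field of QUALIFICATION_FIELDS) {
    if (coll2Doc[field] !== undefined) picked[field] = coll2Doc[field];
  }
  return picked;
}

// Build bulkWrite-style operations for one coll1 document and the
// qualified coll2 documents that reference it.
function buildOps(coll1Doc, qualifiedDocs) {
  const filter = { _id: coll1Doc._id }; // assumption: coll1 is matched by _id

  // No qualified user: set the unqualified defaults.
  if (qualifiedDocs.length === 0) {
    return [{ updateOne: { filter, update: { $set: { ...UNQUALIFIED_DEFAULTS } } } }];
  }

  // First qualified document updates the existing coll1 document.
  const [first, ...rest] = qualifiedDocs;
  const ops = [{ updateOne: { filter, update: { $set: pickQualification(first) } } }];

  // Every further qualified document becomes a new coll1 document that
  // reuses the user's base fields but gets its own _id on insert.
  for (const extra of rest) {
    const { _id, ...base } = coll1Doc;
    ops.push({ insertOne: { document: { ...base, ...pickQualification(extra) } } });
  }
  return ops;
}
```

The returned objects use the updateOne / insertOne shape that Mongoose's Model.bulkWrite() accepts, so the operations for many coll1 documents could be collected and sent together rather than one write per document.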
However, when running this script, I encountered the following error:
<--- JS stacktrace --->
==== JS stack trace =========================================
Because coll1 contains a large number of documents, the processing time was significant. I used skip and limit to page through all the documents, but the run still took 1 hour to complete. Is there a more efficient way to handle this kind of database update across collections with a large number of documents?