What is the best way to eliminate duplicate entries in MongoDB with a specific condition?

{
"_id" : ObjectId("5d3acf79ea99ef80dca9bcca"), 
"memberId" : "123",
"generatedId" : "00000d2f-9922-457a-be23-731f5fefeb14",
"memberType" : "premium"
},

{
"_id" : ObjectId("5e01554cea99eff7f98d7eed"), 
"memberId" : "123",
"generatedId" : "34jkd2092sdlk02kl23kl2309k2309kr",
"memberType" : "premium"
}

I have a collection of 1 million documents in this format and want to remove duplicates based on the "memberId" field. Specifically, among documents that share a "memberId", I want to delete those whose "generatedId" does not contain a hyphen ("-"). In the example above, the second document should be removed because its "generatedId" has no hyphen. Any suggestions on how to achieve this would be appreciated.

Answer №1

The right approach depends on your data, in particular on whether every duplicated memberId has at least one document whose generatedId contains a hyphen.

One approach is to group your documents by memberId to identify duplicates, and then keep only the entries whose generatedId does not contain a hyphen ("-"). Those entries are the candidates for deletion.

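const result = await Collection.aggregate([
{
    // Group by memberId and remember every document's _id and generatedId
    $group: {
        _id: "$memberId",
        count: { $sum: 1 },
        docs: { $push: { _id: "$_id", generatedId: "$generatedId" } },
    },
},
// Keep only memberIds that occur more than once
{ $match: { count: { $gt: 1 } } },
// Return to one entry per document so each one can be tested individually
{ $unwind: "$docs" },
// Keep only the documents whose generatedId has no hyphen
{ $match: { "docs.generatedId": { $not: /-/ } } },
{
    $project: {
        _id: "$docs._id",
        memberId: "$_id",
        generatedId: "$docs.generatedId",
    },
},
]);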

You'll end up with a list of documents that are duplicated based on memberId and do not have hyphens in their generatedIds. These are the documents to delete from your database.

Important Note: Be cautious when deleting data. For some duplicated memberIds, none of the documents may have a hyphen in generatedId, so deleting every match would remove all records for that member. Always back up your data before making any significant changes.
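
To illustrate the caution above, here is a minimal in-memory sketch of the selection rule. The helper name selectIdsToDelete is hypothetical, and keeping the first document when a whole group lacks hyphens is an assumption, not something stated in the question. It is meant for checking the rule on a sample of documents, not for loading the full collection.

```javascript
// Hypothetical helper: returns the _ids that could be deleted.
// A document is selected only if its memberId is duplicated and its
// generatedId has no hyphen. Assumption: if no document in a group has a
// hyphen, the first one is kept so the member does not disappear entirely.
function selectIdsToDelete(docs) {
  // Bucket documents by memberId
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.memberId)) groups.set(doc.memberId, []);
    groups.get(doc.memberId).push(doc);
  }

  const idsToDelete = [];
  for (const group of groups.values()) {
    if (group.length < 2) continue; // memberId is not duplicated

    const withoutHyphen = group.filter(
      (d) => !String(d.generatedId).includes("-")
    );
    // Guard: never remove every document of a group
    const removable =
      withoutHyphen.length === group.length
        ? withoutHyphen.slice(1)
        : withoutHyphen;

    for (const d of removable) idsToDelete.push(d._id);
  }
  return idsToDelete;
}
```

The returned ids could then be passed to a delete call such as deleteMany({ _id: { $in: idsToDelete } }), ideally in batches and after a backup.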

Answer №2

db.collection.aggregate([
  // Find all records with a "-" in the generatedId field
  { "$match": { "generatedId": { "$regex": "[-]" } } },

  // Group them by memberId
  { "$group": { "_id": "$memberId" } }
])
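
The pipeline above only lists the memberIds that have at least one hyphenated generatedId; it does not delete anything. One possible follow-up, sketched under assumptions, is to turn that list into deleteMany filters that target documents with the same memberId but no hyphen. The helper name buildDeleteFilters and the batch size are hypothetical, chosen only to keep each query document small.

```javascript
// Hypothetical helper: builds one deleteMany filter per batch of memberIds.
// Each filter matches documents whose memberId is in the batch and whose
// generatedId has no hyphen. Note that $not with a regex also matches
// documents where generatedId is missing.
function buildDeleteFilters(memberIds, batchSize = 1000) {
  const filters = [];
  for (let i = 0; i < memberIds.length; i += batchSize) {
    filters.push({
      memberId: { $in: memberIds.slice(i, i + batchSize) },
      generatedId: { $not: /-/ },
    });
  }
  return filters;
}

// Possible usage in the mongo shell (not executed here):
//   const ids = db.collection.aggregate([...]).toArray().map((g) => g._id);
//   buildDeleteFilters(ids).forEach((f) => db.collection.deleteMany(f));
```

Because only memberIds with a hyphenated document are targeted, each such member keeps at least one record; members whose documents all lack a hyphen are left untouched.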
