Delete Entries in MongoDB Collection According to Unique User Pairs

I have a collection of messages stored in MongoDB and I need to keep only the latest 500 records for each pair of users. Users are identified by their sentBy and sentTo attributes.

/* 1 */
{
    "_id" : ObjectId("5f1c1b00c62e9b9aafbe1d6c"),
    "sentAt" : ISODate("2020-07-25T11:44:00.004Z"),
    "readAt" : ISODate("1970-01-01T00:00:00.000Z"),
    "msgBody" : "dummy text",
    "msgType" : "text",
    "sentBy" : ObjectId("54d6732319f899c704b21ef7"),
    "sentTo" : ObjectId("54d6732319f899c704b21ef5"),
}

// More document examples here...

/* and more... assume it's over 10,000 documents */

I'm thinking of an algorithm that involves:

  • Grouping by user pair using the OR operator, so that both directions of a conversation count as one pair
  • Sorting the records in descending order by timestamp
  • Limiting the result to 500 records per user pair
  • Extracting the array of _id values to be preserved
  • Using these IDs in a new query with .deleteMany() and a $nin condition

I've been struggling with this issue and would greatly appreciate any help or guidance. Thank you so much :)

Answer №1

Based on the size of the collection, two different approaches could be considered:

  1. If the collection is small enough to be grouped in full within a reasonable time, an approach close to your suggestion works:
db.collection.aggregate([
    {
        $sort: {
            sentAt: 1
        }
    },
    {
        $group: {
            _id: {
                $cond: [
                    {$gt: ["$sentBy", "$sentTo"]},
                    ["$sentBy", "$sentTo"],
                    ["$sentTo", "$sentBy"],
                ]
            },
            roots: {$push: "$$ROOT"}
        }
    },
    {
        $project: {
            roots: {$slice: ["$roots", -500]}
        }
    },
    {
        $unwind: "$roots"
    },
    {
        $replaceRoot: {
            newRoot: "$roots"
        }
    },
    {
        $out: "this_collection"
    }
])

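The $sort stage comes first because the pipeline relies on documents reaching the $group stage in sentAt order; nothing sorts the roots array after grouping. The $cond in the $group stage puts the two user ids in a fixed order, so messages from A to B and from B to A land in the same group, which plays the role of the OR you described. The $slice with -500 then keeps the last 500 entries of each group, which are the most recent ones. Instead of fetching the result and using deleteMany with $nin, the $out stage overwrites the target collection with the pipeline's output; replace "this_collection" with the name of your collection.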
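
The pair grouping and the negative $slice can be tried out on a small in-memory sample before running the pipeline against real data. The following is only a sketch in plain JavaScript: pairKey and keepLatestPerPair are hypothetical helper names, the ids are plain strings rather than ObjectId values, and limit is assumed to be a positive number.

```javascript
// Hypothetical in-memory model of the pipeline above. Ids are plain strings
// here, whereas the real collection stores ObjectId values.
function pairKey(sentBy, sentTo) {
    // Same idea as the $cond in the $group stage: put the two ids in a fixed
    // order so that A->B and B->A produce the same key.
    return sentBy > sentTo ? `${sentBy}|${sentTo}` : `${sentTo}|${sentBy}`;
}

function keepLatestPerPair(messages, limit) {
    const groups = new Map();
    // Sort ascending by sentAt first, as the $sort stage does.
    const sorted = [...messages].sort((a, b) => a.sentAt - b.sentAt);
    for (const msg of sorted) {
        const key = pairKey(msg.sentBy, msg.sentTo);
        if (!groups.has(key)) groups.set(key, []);
        groups.get(key).push(msg);
    }
    // slice(-limit) keeps the last `limit` items of each group,
    // like {$slice: ["$roots", -500]} (assumes limit > 0).
    return [...groups.values()].flatMap(group => group.slice(-limit));
}
```

Everything not returned by keepLatestPerPair corresponds to the documents that the $out stage would drop from the collection.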

  2. If the collection is too large for the previous approach, iterating user by user and following your initial suggestion would be more efficient. Here's a simple example:

// Only senders are listed: every pair contains at least one sender,
// so each pair is still reached.
let userIds = await db.collection.distinct("sentBy");

// Users already processed; their pairs are skipped in later iterations.
let done = [];
for (let i = 0; i < userIds.length; i++) {

    // Messages that involve the current user, excluding conversations
    // with users handled in an earlier iteration.
    let pairFilter = {
        $and: [
            {
                $or: [
                    {sentTo: userIds[i]},
                    {sentBy: userIds[i]}
                ]
            },
            {sentTo: {$nin: done}},
            {sentBy: {$nin: done}}
        ]
    };

    let matches = await db.collection.aggregate([
        {
            $match: pairFilter
        },
        {
            $sort: {
                sentAt: 1
            }
        },
        {
            $group: {
                // Group by the other user in the conversation.
                _id: {
                    $cond: [
                        {$eq: ["$sentBy", userIds[i]]},
                        "$sentTo",
                        "$sentBy"
                    ]
                },
                roots: {$push: "$$ROOT"}
            }
        },
        {
            $project: {
                // Keep the 500 most recent messages of each pair.
                roots: {$slice: ["$roots", -500]}
            }
        },
        {
            $unwind: "$roots"
        },
        {
            $group: {
                _id: null,
                keepers: {$push: "$roots._id"}
            }
        }
    ]).toArray();

    if (matches.length) {
        // Delete with the same filter, sparing the ids collected above.
        await db.collection.deleteMany({
            $and: [
                pairFilter,
                {_id: {$nin: matches[0].keepers}}
            ]
        });
    }

    done.push(userIds[i]);
}
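
After either approach has run, it may be worth checking that no pair still holds more than 500 messages. A minimal sketch, assuming the field names from the sample document; buildOversizedPairsPipeline is a hypothetical helper name, and it reuses the same $cond pair key as the first pipeline:

```javascript
// Hypothetical helper: builds a pipeline that lists user pairs still holding
// more than `limit` messages. Field names follow the sample document.
function buildOversizedPairsPipeline(limit) {
    return [
        {
            $group: {
                // Order the two ids so both directions share one group.
                _id: {
                    $cond: [
                        {$gt: ["$sentBy", "$sentTo"]},
                        ["$sentBy", "$sentTo"],
                        ["$sentTo", "$sentBy"]
                    ]
                },
                count: {$sum: 1}
            }
        },
        // Keep only the pairs that exceed the limit.
        {$match: {count: {$gt: limit}}}
    ];
}

// Usage, inside an async function and with the same db.collection handle as above:
// const leftovers = await db.collection.aggregate(buildOversizedPairsPipeline(500)).toArray();
// An empty array means every pair has at most 500 messages.
```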
