Combining data to form Nested Arrays

Question

I have a dataset of records stored in a database and I've been attempting to extract a complex set of information from these records.

Here are some sample records for reference:

{
    bookId : '135wfkjdbv',
    type : 'a',
    store : 'barnes & noble',
    shelf : 'A1'
}
{
    bookId : '13erjfn',
    type : 'b',
    store : 'barnes & noble',
    shelf : 'A2'
}

I'm looking to extract data that will provide, for each unique bookId, the count of records per shelf for each store where the book belongs to type 'a'.

Although I am aware that an aggregation query allows for various operations such as grouping and matching, I have not yet found a solution to this particular problem.

The desired output should look like this:

{
   bookId : '135wfkjdbv',
   stores : [
       {
           name : 'barnes & noble',
           shelves : [
                {
                     name : 'A1',
                     count : 12
                },
           ]
       },
       {
           name : 'books-a-million',
           shelves : [
                {
                     name : 'B3',
                     count : 8
                },
                {
                     name : 'D5',
                     count : 15
                },
           ]
       }  
   ]
}

javascript mongodb mongodb-query aggregation-framework

Answer 1

Understanding the process is not as challenging as it may seem at first glance. The aggregation "pipeline" operates by passing results from one stage to the next for processing, similar to how a Unix "pipe" functions:

ps -ef | grep mongo | tee out.txt

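This example uses three $group stages. The first stage performs the basic aggregation, and the following two "roll up" the arrays required in the output. A leading $match restricts the input to records of type 'a', as the question requires.

db.collection.aggregate([
    // Keep only records of type 'a', as the question requires
    { "$match": { "type": "a" } },
    { "$group": {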
        "_id": {
            "bookId": "$bookId",
            "store": "$store",
            "shelf": "$shelf"
        },
        "count": { "$sum": 1 }
    }},
    { "$group": {
        "_id": {
            "bookId": "$_id.bookId",
            "store": "$_id.store"
        },
        "shelves": { 
            "$push": {
                "name": "$_id.shelf",
                "count": "$count"
            }
        }
    }},
    { "$group": {
        "_id": "$_id.bookId",
        "stores": {
            "$push": {
                "name": "$_id.store",
                "shelves": "$shelves"
            }
        }
    }}
])

It may be tempting to use $project at the end to rename _id to bookId, but it's important to remember that _id serves as the primary key. Developing good habits from the start will prevent unnecessary complications and costs associated with such alterations.
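If the rename is needed anyway, the extra stage could look roughly like the sketch below. The variable name renameStage is only illustrative; the stage object would be appended as the last element of the pipeline array.

```javascript
// Optional final stage (a sketch): expose the grouping key as "bookId".
// "renameStage" is a hypothetical name used only for this example.
const renameStage = {
  $project: {
    _id: 0,          // suppress the default _id field in the output
    bookId: "$_id",  // copy the grouping key into a bookId field
    stores: 1        // keep the rolled-up stores array unchanged
  }
};

console.log(JSON.stringify(renameStage));
```

This adds one more pass over the grouped results, which is the cost the paragraph above refers to.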

The key to this operation is building the grouping details into the _id of each $group stage. The first stage groups on all three fields. Each later stage drops one field from the key and pushes that field, together with its counts, into an array. The first stage is equivalent to the SQL clause:

GROUP BY bookId, store, shelf

The second stage then folds shelves into their store, and the third folds stores into their bookId. Thinking of the pipeline as a series of such folds makes it straightforward to build results into hierarchical arrays.
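To see what each fold does without a database, the three steps can be imitated in plain JavaScript. This is only a sketch of the idea, not how MongoDB executes the pipeline: the sample records and the rollUp helper are made up for illustration, and unlike this code $group does not guarantee any output order.

```javascript
// In-memory imitation of the three roll-up steps (no MongoDB required).
// The records below are invented sample data.
const records = [
  { bookId: "135wfkjdbv", type: "a", store: "barnes & noble", shelf: "A1" },
  { bookId: "135wfkjdbv", type: "a", store: "barnes & noble", shelf: "A1" },
  { bookId: "135wfkjdbv", type: "a", store: "books-a-million", shelf: "B3" },
  { bookId: "13erjfn", type: "b", store: "barnes & noble", shelf: "A2" }
];

// Hypothetical helper that mirrors the three grouping steps.
function rollUp(docs) {
  // Step 1: count records per (bookId, store, shelf).
  const counts = new Map();
  for (const { bookId, store, shelf } of docs) {
    const key = JSON.stringify([bookId, store, shelf]);
    const entry = counts.get(key) || { bookId, store, shelf, count: 0 };
    entry.count += 1;
    counts.set(key, entry);
  }

  // Step 2: push each shelf count into its (bookId, store) group.
  const stores = new Map();
  for (const { bookId, store, shelf, count } of counts.values()) {
    const key = JSON.stringify([bookId, store]);
    const entry = stores.get(key) || { bookId, name: store, shelves: [] };
    entry.shelves.push({ name: shelf, count });
    stores.set(key, entry);
  }

  // Step 3: push each store into its bookId group.
  const books = new Map();
  for (const { bookId, name, shelves } of stores.values()) {
    const entry = books.get(bookId) || { _id: bookId, stores: [] };
    entry.stores.push({ name, shelves });
    books.set(bookId, entry);
  }
  return [...books.values()];
}

// Filter to type 'a' first, then roll up.
const result = rollUp(records.filter((r) => r.type === "a"));
console.log(JSON.stringify(result, null, 2));
```

Each step keys a Map on the fields that remain in the group key, which is the same role the compound _id plays in the pipeline.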
