I attempted to create a lookup function that processes an array containing object ids and timestamps of object y. While it worked smoothly using localfield
and foreignfield
, I encountered difficulties replicating the same outcome with pipeline.
(I used names like y
for simplicity)
Successful version:
$lookup: {
from: 'y',
localField: 'ys.object_id',
foreignField: '_id',
as: 'docs',
},
The structure of objects in the ys
array is as follows:
{
object_id: ObjectID(),
timestamp: Date(),
}
I aim to rewrite this expression to incorporate a pipeline, as I want to filter out some objects based on their timestamp attribute during the lookup process.
My attempt so far:
$lookup: {
from: 'y',
let: { ys: '$ys' },
pipeline: [
{
$match: { $expr: { $eq: ['$_id', '$$ys.object_id'] } },
},
],
as: 'docs',
},
Database size: 20.4GB
Complete Query:
const query = [
{
$match: { 'ys.timestamp': { $lte: date, $gt: previousMonth } },
},
{
$limit: 100,
},
{
$lookup: {
from: 'y',
let: { ys: '$ys' },
pipeline: [
{
$match: { $expr: { $in: ['$_id', '$$ys.object_id'] } },
},
{
$sort: { timestamp: -1 },
},
{
$limit: 1,
},
],
as: 'doc',
},
},
];
Unfortunately, the above solution does not seem to work properly; it gets stuck and fails to return any results (times out eventually).
Is there a correct way to convert the successful approach into one utilizing a pipeline?
IMPORTANT: I modified the query to target a specific element by ID before executing the lookup. While this method did succeed, it took around 20 seconds to complete. I suspect this delay is causing the timeout in my usual query. Can someone explain the performance disparity between the two methods and suggest a workaround?