This question was answered some time ago, and MongoDB's capabilities have improved considerably since then.
As mentioned in another response, MongoDB has supported sampling within the Aggregation Framework since version 3.2, so you can now write:
db.products.aggregate([{$sample: {size: 5}}]); // Select 5 documents
Or, to sample from a filtered subset:
db.products.aggregate([
  {$match: {category:"Electronic Devices"}}, // Filter the results
  {$sample: {size: 5}} // Select 5 documents
]);
However, the $sample operator comes with some caveats (as of Nov 6, 2017, when the latest version is 3.4). If all of the following conditions are met, $sample uses a pseudo-random cursor to select documents:
- $sample is the first stage of the pipeline
- N is less than 5% of the total documents in the collection
- The collection contains more than 100 documents
If any of the above conditions is NOT met, $sample performs a collection scan followed by a random sort to select N documents. That is the case in the second example above, where $match comes before $sample.
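The three conditions can be written down as a small predicate, which makes it easy to see why the $match pipeline takes the slower path. This is only a sketch of the rule as documented for 3.4: usesRandomCursor is a hypothetical helper, not a MongoDB API, and totalDocs is assumed to come from something like db.products.count().

```javascript
// Hypothetical helper (not part of MongoDB): mirrors the three documented
// conditions under which $sample avoids a full collection scan.
function usesRandomCursor(pipeline, totalDocs) {
  var first = pipeline[0];
  if (!first || !first.$sample) return false;      // $sample must be the first stage
  var n = first.$sample.size;
  return n < 0.05 * totalDocs && totalDocs > 100;  // N < 5% of docs, more than 100 docs
}

var sampleOnly = [{$sample: {size: 5}}];
var matchThenSample = [
  {$match: {category: "Electronic Devices"}},
  {$sample: {size: 5}}
];
```

With these definitions, usesRandomCursor(sampleOnly, 1000) is true, while the same call with matchThenSample is false because $sample is not the first stage.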
OUTDATED ANSWER
You could previously run:
db.products.find({category:"Electronic Devices"}).skip(Math.floor(Math.random()*YOUR_COLLECTION_SIZE))
However, the documents come back in their usual order rather than a random one (only the starting point is random), and you need two queries (one to count YOUR_COLLECTION_SIZE) unless you settle for an estimate of the size (around 100 records, 1000 records, 10000 records, etc.).
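A sketch of the two-query version, assuming the mongo shell: the count query supplies the size, and the offset is floored because skip() expects an integer. randomOffset is a hypothetical helper, and counting only the documents that match the filter (rather than the whole collection) is my assumption. The shell calls are shown as comments so the snippet also runs outside the shell.

```javascript
// Hypothetical helper: pick an integer offset in [0, count).
function randomOffset(count) {
  return Math.floor(Math.random() * count);
}

// Intended use in the mongo shell (two queries):
//   var query = {category: "Electronic Devices"};
//   var count = db.products.count(query);                        // query 1: size
//   db.products.find(query).skip(randomOffset(count)).limit(1);  // query 2: fetch
```

The limit(1) is added here to return a single document; raise it to get a run of consecutive documents starting at the random offset.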
Another approach would be to add a field holding a random number to every document and query on that number. The drawback is that the stored numbers don't change, so the same query returns the same results every time. To address this, you can vary limit, skip, and sort, or update the random numbers when fetching a record (which costs additional queries).
I don't know whether you're using Mongoose, Mongoid, or the Mongo driver for a specific language directly, so I'll stick to the mongo shell.
For instance, your product record might appear as follows:
{
_id: ObjectId("..."),
name: "Awesome Product",
category: "Electronic Devices",
}
I recommend adding a _random_sample field:
{
_id: ObjectId("..."),
name: "Awesome Product",
category: "Electronic Devices",
_random_sample: Math.random()
}
Then you could execute:
db.products.find({category:"Electronic Devices",_random_sample:{$gte:Math.random()}})
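One gap in this query is that it returns every document at or above the threshold, and nothing at all when the threshold exceeds every stored value. A common workaround, sketched here as an assumption rather than as part of the approach above, is to take the first document at or above the threshold and fall back to the other direction when there is none. pickOne is a hypothetical in-memory stand-in for the two shell queries shown in the comments.

```javascript
// Hypothetical in-memory stand-in for these two shell queries (r = Math.random()):
//   db.products.find({_random_sample: {$gte: r}}).sort({_random_sample: 1}).limit(1)
//   db.products.find({_random_sample: {$lt: r}}).sort({_random_sample: -1}).limit(1)  // fallback
function pickOne(docs, r) {
  // Sort a copy ascending by the stored random number.
  var sorted = docs.slice().sort(function (a, b) {
    return a._random_sample - b._random_sample;
  });
  for (var i = 0; i < sorted.length; i++) {
    if (sorted[i]._random_sample >= r) return sorted[i];  // first match for $gte
  }
  // No document at or above r: take the closest one below it.
  return sorted.length ? sorted[sorted.length - 1] : null;
}

var docs = [
  {name: "A", _random_sample: 0.2},
  {name: "B", _random_sample: 0.7}
];
```

In a real collection, an index on _random_sample (created with db.products.createIndex, for instance) would probably be needed to keep the sorted lookups cheap.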
Periodically, you could update the _random_sample field in documents like this:
var your_query = {} // Matches every document; this may impact performance with many records
your_query = {category: "Electronic Devices"} // Or restrict the update to one category
// Upsert = false, multi = true
db.products.update(your_query,{$set:{_random_sample:Math.random()}},false,true)
Alternatively, you could update all retrieved records, or only a few of them, after fetching them, depending on the number of records:
// records is the array of documents you just fetched
for(var i = 0; i < records.length; i++){
  var query = {_id: records[i]._id};
  // Upsert = false, multi = false
  db.products.update(query,{$set:{_random_sample:Math.random()}},false,false);
}
EDIT
Keep in mind that
db.products.update(your_query,{$set:{_random_sample:Math.random()}},false,true)
may not work as intended: the shell evaluates Math.random() once, so every product matching your query gets the same random number. The last approach works better (updating documents as you retrieve them).
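One way to give every matched document its own number in a single round trip is to build one update per document and send them together. This is a sketch under assumptions: buildRandomSampleOps is a hypothetical helper, and it relies on the shell's db.collection.bulkWrite(), which is available from MongoDB 3.2.

```javascript
// Hypothetical helper: one updateOne operation per _id,
// each carrying its own freshly generated random number.
function buildRandomSampleOps(ids) {
  return ids.map(function (id) {
    return {
      updateOne: {
        filter: {_id: id},
        update: {$set: {_random_sample: Math.random()}}
      }
    };
  });
}

// Intended use in the mongo shell:
//   var ids = db.products.find(your_query, {_id: 1}).map(function (d) { return d._id; });
//   if (ids.length) db.products.bulkWrite(buildRandomSampleOps(ids));
```

Because Math.random() is called inside the map callback, it runs once per document instead of once per update command.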