Couchbase views do not support sorting results in or after the reduce step, which makes it hard to retrieve a "Top 10" of something directly: view results are always sorted by key. The recommended approach is as follows:
- Query your view, which returns key-value pairs of the form tag_name - count_value, ordered by tag_name.
- Create a job that runs periodically (e.g. every N minutes), fetches the results from the first step, sorts them by count, and stores the sorted results under a separate key (e.g. "Top10Tags").
- In your application, query the key Top10Tags.
This method may help reduce network traffic, but keep in mind that the stored results can be out of date by up to one job interval. To optimize performance, consider running this job on the same server as Couchbase (e.g. as a small Node.js application), so that the periodic fetch and sort does not move the full result set across the network.
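The sorting step of such a job could look roughly like the sketch below. It assumes the view is queried with grouping, so that each row has the shape { key: tag_name, value: count }. The names topN, startTopTagsJob, fetchRows and storeTop are made up for illustration; fetchRows and storeTop stand in for whatever view query and key/value write your SDK provides and are not real SDK calls.

```javascript
// Sort grouped view rows by count (descending) and keep the first n.
// Each row is assumed to look like { key: "tag_name", value: count }.
function topN(rows, n) {
  return rows
    .slice() // copy, so the caller's array is not reordered
    .sort(function (a, b) {
      if (b.value !== a.value) return b.value - a.value;
      // Equal counts: fall back to tag name so the order is deterministic.
      return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
    })
    .slice(0, n);
}

// Hypothetical wiring. fetchRows(callback) and storeTop(key, value) are
// placeholders to be implemented with your SDK; they are assumptions.
function startTopTagsJob(fetchRows, storeTop, intervalMs) {
  return setInterval(function () {
    fetchRows(function (err, rows) {
      if (err) {
        console.error("view query failed:", err);
        return; // keep the previous Top10Tags value
      }
      storeTop("Top10Tags", topN(rows, 10));
    });
  }, intervalMs);
}
```

Keeping the sort in a pure function makes it easy to test without a running cluster; only the two placeholder callbacks touch the database.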
If you are using the _count reduce function, you do not need to emit a number as the value; emit null instead:
    function(doc, meta) {
      if (meta.type === "json" && doc.type === "log") {
        emit(doc.tag, null);
      }
    }
To handle documents that carry multiple tags, such as:
    {
      "type": "log",
      "tags": ["tag1", "tag2", "tag3"]
    }
Your map function should be updated as follows:
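    function(doc, meta) {
      // Skip log documents without a tags array; otherwise doc.tags.length would throw.
      if (meta.type === "json" && doc.type === "log" && Array.isArray(doc.tags)) {
        // Emit one row per tag so that _count tallies each tag separately.
        for (var i = 0; i < doc.tags.length; i++) {
          emit(doc.tags[i], null);
        }
      }
    }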
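To see what this map function produces together with _count, here is a small local simulation that can be run in Node.js without a cluster. It is only a sketch: countByKey is a made-up helper that imitates a grouped _count, emit is passed in as a parameter (in Couchbase it is a global), the Array.isArray guard is an added safety check, the sample documents are invented, and the plain string sort only approximates the view engine's key collation.

```javascript
// The multi-tag map function, with emit injected so it can run locally.
function mapTags(doc, meta, emit) {
  if (meta.type === "json" && doc.type === "log" && Array.isArray(doc.tags)) {
    for (var i = 0; i < doc.tags.length; i++) {
      emit(doc.tags[i], null);
    }
  }
}

// Hypothetical stand-in for the view engine: run mapFn over every document
// and count emitted keys, like _count does when the query groups by key.
function countByKey(mapFn, docs) {
  var counts = Object.create(null);
  docs.forEach(function (entry) {
    mapFn(entry.doc, entry.meta, function (key) {
      counts[key] = (counts[key] || 0) + 1;
    });
  });
  // Return rows ordered by key, in the { key, value } shape of view rows.
  return Object.keys(counts).sort().map(function (key) {
    return { key: key, value: counts[key] };
  });
}

// Invented sample data, for illustration only.
var sampleDocs = [
  { meta: { type: "json" }, doc: { type: "log", tags: ["tag1", "tag2", "tag3"] } },
  { meta: { type: "json" }, doc: { type: "log", tags: ["tag2"] } }
];

console.log(JSON.stringify(countByKey(mapTags, sampleDocs)));
// prints [{"key":"tag1","value":1},{"key":"tag2","value":2},{"key":"tag3","value":1}]
```

The rows come back ordered by tag name, not by count, which is why the separate sorting job is still needed.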
As for the Top10Tags list itself, consider storing it in a memcached bucket if you do not need it persisted to disk.