I'm developing an FAQ system with a large number of question-answer pairs. I want to group similar questions together, and I've been using the npm set-clustering package for this.
The package produces good matches based on token matching, but it requires me to specify the number of groups up front.
Ideally the grouping would be automatic, with the algorithm determining the appropriate number of groups itself (unsupervised learning).
If you know of any other package or platform that could help, please let me know.
Sample Questions:
What is the pricing of your product?
Can I speak to your representative?
Hi
Hi Friend
Hi, Good Morning
How much does the product cost?
Current Result: (When specifying '3' as the number of groups)
("Hi", "Hi Friend")
("What is the pricing of your product?", "How much does the product cost?")
("Can I speak to your representative?", "Hi, Good Morning")
Desired Grouping: (Without providing '3' as input)
("Hi", "Hi Friend", "Hi, Good Morning")
("What is the pricing of your product?", "How much does the product cost?")
("Can I speak to your representative?")
Current Code:
var cluster = require('set-clustering');

// resp holds the question rows fetched elsewhere (each with que and tags)
var articles = [];
for (let row of resp) {
    articles.push({
        title: row.que,
        tags: row.tags
    });
}

// Score two articles by the number of tags they share
function similarity(x, y) {
    var score = 0;
    x.tags.forEach(function(tx) {
        y.tags.forEach(function(ty) {
            if (tx === ty)
                score += 1;
        });
    });
    return score;
}

// Build the clusterer from the articles and the similarity function
var c = cluster(articles, similarity);

// I want the grouping to be done autonomously without specifying the number of groups
var groups = c.evenGroups(3);
var titles = groups.map(function(group) {
    return group.map(function(article) {
        return article.title;
    });
});
console.log(titles);
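To make the desired behavior concrete, here is a minimal sketch of one way the group count could fall out of the data instead of being passed in: link any two questions whose tag overlap reaches a minimum score, then take the connected components as groups. It does not use set-clustering at all. The `groupByThreshold` helper, its `minScore` parameter, and the tags attached to the sample questions are all hypothetical and only there for illustration.

```javascript
// Same tag-overlap score as in the code above.
function similarity(x, y) {
  let score = 0;
  x.tags.forEach((tx) => {
    y.tags.forEach((ty) => {
      if (tx === ty) score += 1;
    });
  });
  return score;
}

// Hypothetical helper (not part of set-clustering): links every pair of items
// whose score is at least minScore and returns the connected components.
function groupByThreshold(items, score, minScore) {
  const parent = items.map((_, i) => i);
  // Union-find lookup with path halving.
  const find = (i) => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]];
      i = parent[i];
    }
    return i;
  };
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (score(items[i], items[j]) >= minScore) {
        parent[find(i)] = find(j);
      }
    }
  }
  // Collect items by their root; the number of groups is whatever emerges.
  const groups = new Map();
  items.forEach((item, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(item);
  });
  return [...groups.values()];
}

// Assumed tags for the sample questions; the real tags come from the data.
const articles = [
  { title: 'What is the pricing of your product?', tags: ['pricing', 'product'] },
  { title: 'Can I speak to your representative?', tags: ['speak', 'representative'] },
  { title: 'Hi', tags: ['hi'] },
  { title: 'Hi Friend', tags: ['hi', 'friend'] },
  { title: 'Hi, Good Morning', tags: ['hi', 'good', 'morning'] },
  { title: 'How much does the product cost?', tags: ['product', 'cost'] },
];

const titles = groupByThreshold(articles, similarity, 1).map((group) =>
  group.map((article) => article.title)
);
console.log(titles);
```

With these assumed tags the sketch yields three groups without being told the count. The trade-off is that the threshold has to be tuned instead, and this kind of linking can chain loosely related questions into one group when they share common tags.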