I'm using Jason Davies's well-known word cloud library: https://github.com/jasondavies/d3-cloud
My code is based on a copy of this block: http://bl.ocks.org/blockspring/847a40e23f68d6d7e8b5
I want to limit the maximum number of words displayed in the word cloud. The library exposes settings for rotation, font size, the spiral method, and more, but it doesn't seem to have a direct option for capping the word count.
To save computation, I believe feeding cloud.js a subset of the original word_count would be more efficient. However, I'm not sure whether the word_count object is sorted by frequency before cloud.js processes it, since there are no apparent .sort calls.
If cloud.js does sort the word_count object by frequency (or tf-idf), I would have to wait until after the full list is generated to return the top k words, which means the entire text file has already been iterated over anyway.
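For what it's worth, here is a minimal sketch of the kind of pre-filtering I have in mind, done in plain JavaScript before anything is handed to cloud.js. The `word_count` object, the stop-word list, and the `topKWords` helper are all my own illustrative assumptions, not part of the library; the `{text, size}` output shape mirrors what the blockspring block feeds into the layout.

```javascript
// Hypothetical word_count object (word -> frequency), mirroring the
// shape used in the blockspring example. Values here are made up.
var word_count = { "data": 42, "cloud": 37, "word": 29, "the": 120, "d3": 18 };

// A small illustrative stop-word list so common words don't crowd out the top k.
var stopWords = { "the": true, "a": true, "an": true, "and": true, "of": true };

// Return the k most frequent non-stop-words as {text, size} objects,
// sorted in descending order of frequency.
function topKWords(counts, k) {
  return Object.keys(counts)
    .filter(function (w) { return !stopWords[w]; })
    .sort(function (a, b) { return counts[b] - counts[a]; }) // descending by count
    .slice(0, k)
    .map(function (w) { return { text: w, size: counts[w] }; });
}

var words = topKWords(word_count, 3);
// "the" is filtered out, so the top 3 are data (42), cloud (37), word (29).
```

The resulting `words` array could then be passed to the layout's `.words(...)` call in place of the full list, so cloud.js only ever places k words.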
Limiting the display to the top k most frequent words (say 20, excluding common words) might speed up the visualization, but it probably won't significantly change the underlying algorithm's performance.
Visually, larger font sizes correspond to higher frequencies, so choosing the top k words is the same as choosing the k largest font-size words.
If anyone experienced with this type of visualization can point me toward how to adjust the code to return only the top k words, I would greatly appreciate it.
Note: I initially posted this question on the GitHub repo but was redirected here as the more appropriate venue. I've tried to clarify and provide sufficient context, though I fear it may still be considered too vague for Stack Overflow. Thank you for understanding.
Appreciatively,