Having trouble accessing the loadTokenizer function in TensorFlow.js

Question

As a beginner with TensorFlow.js, I recently tried to tokenize a sentence using the Universal Sentence Encoder in JavaScript. See the GitHub reference for details.

$ npm install @tensorflow/tfjs @tensorflow-models/universal-sentence-encoder

After running this command, a package-lock.json file was generated, which I placed in the same directory as my index.html file, as shown below.

/*
  Folder
    |_index.html
    |_package-lock.json
    |_index.js
    |_index.css
*/

Within index.html:

<head>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/universal-sentence-encoder"></script>   
  <script src="index.js" defer></script> 
</head>

Contents of index.js:

function tokenizePad(text){
    text = use.loadTokenizer().then(tokenizer => {
        tokenizer.encode(text); 
    });
    return text;
}

text = "I enjoy my holiday very much."
var tokenized = tokenizePad(text); //error

The console displayed an error message as follows:

Uncaught TypeError: use.loadTokenizer is not a function

Is there a solution to this issue? Are there alternative methods to achieve the desired outcome of converting the string into an array of encoded values like [341, 4125, 8, 140, 31, 19, 54, ......] mentioned in the Github Reference link?

javascript npm package tokenize tensorflow.js

Answer №1

I faced a similar challenge and came up with the solution below:

import * as use from '@tensorflow-models/universal-sentence-encoder';

// In this approach, the object resolved by load() exposes the model and its tokenizer.
use.load().then(useObj => {
    const model = useObj.model;
    const tokenizer = useObj.tokenizer;

    const text = "I absolutely love going on vacation.";
    const tokenized = tokenizer.encode(text);

    console.log(tokenized); //[7933, 2222, 0, 109, 7933, 2222, 0, 154, 2174, 48, 7933, 2222, 0, 1272, 7933, 2222, 0, 645, 336, 944, 7933, 2222, 0, 5568, 7933, 2222, 0, 47, 1788, 6]
});

The output above appears to be character-level encoding rather than word-level encoding. If you find a way to get word-based encoding, please share it with me.
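
One further point about the tokenizePad function in the question: its .then callback never returns the encoded array, and the function returns the promise itself, so even once a tokenizer loads, tokenized would not hold the token IDs. Below is a minimal sketch of a version that returns a promise for the IDs. The loadTokenizer parameter and the fakeLoadTokenizer stub are hypothetical names introduced only so the snippet runs without the library. In real use you would pass a function that resolves to an actual tokenizer.

```javascript
// Returns a promise that resolves to the encoded token IDs.
// `loadTokenizer` is a hypothetical parameter: any function returning a
// promise for an object that has an encode(text) method.
async function tokenizePad(text, loadTokenizer) {
    const tokenizer = await loadTokenizer();
    return tokenizer.encode(text); // return the IDs instead of discarding them
}

// Stand-in tokenizer for illustration only (NOT the real USE vocabulary):
// it maps each whitespace-separated word to its length.
function fakeLoadTokenizer() {
    return Promise.resolve({
        encode: text => text.trim().split(/\s+/).map(word => word.length)
    });
}

// The caller has to wait for the promise as well.
tokenizePad("I enjoy my holiday very much.", fakeLoadTokenizer)
    .then(tokenized => console.log(tokenized));
```

The key change is that the value is returned from inside the asynchronous flow and the caller consumes it with .then (or await), rather than assigning the promise to a variable and treating it as the array.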

Answer №2

A cleaner approach is to import the package under its published name:

import * as USE from '@tensorflow-models/universal-sentence-encoder';

Then use it through USE. Note that this example produces sentence embeddings rather than token IDs:

// Load the model.
USE.load().then(model => {
  // Embed an array of sentences.
  const sentences = [
    'Greetings.',
    'How do you do?'
  ];
  model.embed(sentences).then(embeddings => {
    // The variable `embeddings` is a 2D tensor containing 512-dimensional embeddings for each sentence.
    // Therefore, in this scenario, `embeddings` has dimensions [2, 512].
    embeddings.print(true /* verbose */);
  });
});
