
Universal Sentence Encoder lite in TensorFlow.js
The Universal Sentence Encoder (USE; Cer et al., 2018) is a model that encodes text into 512-dimensional embeddings. These embeddings can then be used as inputs to natural language processing tasks such as sentiment classification and textual similarity analysis.
This module is a TensorFlow.js GraphModel converted from the USE lite module on TF Hub, a lightweight version of the original. The lite model is based on the Transformer architecture (Vaswani et al., 2017) and uses an 8k word-piece vocabulary.
In this demo we embed six sentences with USE and render their self-similarity scores in a matrix (redder means more similar).

The matrix shows that USE embeddings can be used to cluster sentences by similarity.
The six sentences are taken from the TensorFlow Hub USE lite colab.
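Such a matrix can be computed directly from the embeddings. Below is a minimal sketch, assuming `tf` and `use` are imported as shown in the installation section further down; the three sentences here are placeholders rather than the exact colab set:
// Sketch: compute an n x n self-similarity matrix for a list of sentences.
const tf = require('@tensorflow/tfjs');
const use = require('@tensorflow-models/universal-sentence-encoder');

// Placeholder sentences; substitute your own.
const sentences = [
  'I like my phone.',
  'Your cellphone looks great.',
  'How old are you?'
];

use.load()
  .then(model => model.embed(sentences))
  .then(embeddings => {
    // Dot products of every embedding pair in one matrix multiply:
    // [n, 512] x [512, n] -> [n, n]. Higher values mean more similar.
    const similarity = tf.matMul(embeddings, embeddings, false, true);
    similarity.print();
  });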
Universal Sentence Encoder for question answering
The Universal Sentence Encoder for question answering (USE QnA) is a model that encodes question and answer texts into 100-dimensional embeddings. The dot product of these embeddings measures how well an answer fits its question. The embeddings can also be used in other applications, such as text classification and clustering.
This module is a lightweight TensorFlow.js GraphModel. The model is based on the Transformer architecture (Vaswani et al., 2017) and uses an 8k SentencePiece vocabulary. It is trained on a variety of data sources, with the goal of learning text representations that are useful out of the box for retrieving an answer given a question.
In this demo we embed a question and three answers with the USE QnA model, and render their scores.

The scores show how well each answer fits the question.
Using yarn:
$ yarn add @tensorflow/tfjs @tensorflow-models/universal-sentence-encoder
Using npm:
$ npm install @tensorflow/tfjs @tensorflow-models/universal-sentence-encoder
To import it in your code:
require('@tensorflow/tfjs');
const use = require('@tensorflow-models/universal-sentence-encoder');
or as a standalone script tag:
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/universal-sentence-encoder"></script>
Then:
// Load the model.
use.load().then(model => {
  // Embed an array of sentences.
  const sentences = [
    'Hello.',
    'How are you?'
  ];
  model.embed(sentences).then(embeddings => {
    // `embeddings` is a 2D tensor consisting of the 512-dimensional embeddings for each sentence.
    // So in this example `embeddings` has the shape [2, 512].
    embeddings.print(true /* verbose */);
  });
});
load() accepts an optional configuration object where you can set custom modelUrl and/or vocabUrl strings (e.g. use.load({modelUrl: '', vocabUrl: ''})).
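For example, to load the model and vocabulary from locations you host yourself (the URLs below are placeholders, not real endpoints):
// Load the model graph and SentencePiece vocabulary from self-hosted files.
// Both URLs are placeholders; substitute wherever you actually serve them.
use.load({
  modelUrl: 'https://example.com/use/model.json',
  vocabUrl: 'https://example.com/use/vocab.json'
}).then(model => {
  // `model` is used exactly as in the examples above.
});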
To use the Tokenizer separately:
use.loadTokenizer().then(tokenizer => {
  tokenizer.encode('Hello, how are you?'); // [341, 4125, 8, 140, 31, 19, 54]
});
To use the QnA dual encoder:
// Load the model.
use.loadQnA().then(model => {
  // Embed a dictionary of a query and responses. The input to the embed
  // method needs to be in the following format:
  // {
  //   queries: string[];
  //   responses: Response[];
  // }
  // queries is an array of question strings.
  // responses is an array of objects with the following structure:
  // {
  //   response: string;
  //   context?: string;
  // }
  // context is optional; it provides the context string of the answer.
  const input = {
    queries: ['How are you feeling today?', 'What is the capital of China?'],
    responses: [
      'I\'m not feeling very well.',
      'Beijing is the capital of China.',
      'You have five fingers on your hand.'
    ]
  };
  const scores = [];
  const embeddings = model.embed(input);
  /*
   * The output of the embed method is an object with two keys:
   * {
   *   queryEmbedding: tf.Tensor;
   *   responseEmbedding: tf.Tensor;
   * }
   * queryEmbedding is a tensor containing embeddings for all queries.
   * responseEmbedding is a tensor containing embeddings for all answers.
   * You can call `arraySync()` to retrieve the values of the tensors.
   * In this example, embed_query[0] is the embedding for the query
   * 'How are you feeling today?', and embed_responses[0] is the embedding
   * for the answer 'I\'m not feeling very well.'
   */
  const embed_query = embeddings['queryEmbedding'].arraySync();
  const embed_responses = embeddings['responseEmbedding'].arraySync();
  // Compute the dot product of each query and response pair.
  for (let i = 0; i < input['queries'].length; i++) {
    for (let j = 0; j < input['responses'].length; j++) {
      scores.push(dotProduct(embed_query[i], embed_responses[j]));
    }
  }
});

// Calculate the dot product of two vector arrays.
const dotProduct = (xs, ys) => {
  const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;
  return xs.length === ys.length
      ? sum(zipWith((a, b) => a * b, xs, ys))
      : undefined;
};

// zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
const zipWith = (f, xs, ys) => {
  const ny = ys.length;
  return (xs.length <= ny ? xs : xs.slice(0, ny))
      .map((x, i) => f(x, ys[i]));
};