This is a Node.js client for the NLP Cloud API: https://docs.nlpcloud.io
NLP Cloud serves high-performance pre-trained models for NER, sentiment analysis, classification, summarization, text generation, question answering, machine translation, language detection, tokenization, POS tagging, and dependency parsing. It is ready for production, served through a REST API.
Pre-trained models are the spaCy models and some transformers-based models from Hugging Face. You can also deploy your own transformers-based models, or spaCy models.
If you face an issue, don't hesitate to raise it as a GitHub issue. Thanks!
Install via npm.
npm install nlpcloud --save
All objects returned by the library are Axios promises. In case of success, results are contained in response.data. In case of failure, you can retrieve the status code in err.response.status and the error message in err.response.data.detail.
Here is a full example that performs Named Entity Recognition (NER) using spaCy's en_core_web_lg model, with a fake token:
const NLPCloudClient = require('nlpcloud');

// Initialize the client with a model name and an API token.
const client = new NLPCloudClient('en_core_web_lg', '4eC39HqLyjWDarjtT1zdp7dc');

// Extract named entities from the text.
client.entities('John Doe is a Go Developer at Google')
  .then(function (response) {
    console.log(response.data);
  })
  .catch(function (err) {
    console.error(err.response.status);
    console.error(err.response.data.detail);
  });
And a full example that uses your own custom model 7894:
const NLPCloudClient = require('nlpcloud');

// Custom models are addressed as custom_model/<model id>.
const client = new NLPCloudClient('custom_model/7894', '4eC39HqLyjWDarjtT1zdp7dc');

client.entities('John Doe is a Go Developer at Google')
  .then(function (response) {
    console.log(response.data);
  })
  .catch(function (err) {
    console.error(err.response.status);
    console.error(err.response.data.detail);
  });
A JSON array of entities is returned. Here is what it could look like:
[
  {
    "end": 8,
    "start": 0,
    "text": "John Doe",
    "type": "PERSON"
  },
  {
    "end": 25,
    "start": 13,
    "text": "Go Developer",
    "type": "POSITION"
  },
  {
    "end": 35,
    "start": 30,
    "text": "Google",
    "type": "ORG"
  }
]
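For example, you could iterate over the returned entities like this (a minimal sketch based on the sample output above, reusing a client initialized as shown earlier):

// Log each entity using the fields shown in the sample output above.
client.entities('John Doe is a Go Developer at Google')
  .then(function (response) {
    for (const entity of response.data) {
      console.log(`${entity.type}: "${entity.text}" (chars ${entity.start}-${entity.end})`);
    }
  })
  .catch(function (err) {
    console.error(err.response.status);
    console.error(err.response.data.detail);
  });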
Pass the model you want to use and the NLP Cloud token to the client during initialization.

The model can either be a pre-trained model such as en_core_web_lg or bart-large-mnli, or one of your own custom transformers-based or spaCy models, addressed as custom_model/<model id> (e.g. custom_model/2568).

Your token can be retrieved from your NLP Cloud dashboard.
const NLPCloudClient = require('nlpcloud');

const client = new NLPCloudClient('<model>', '<your token>');
If you want to use a GPU, pass true as the third (gpu) argument:
const NLPCloudClient = require('nlpcloud');

// The third argument enables GPU usage.
const client = new NLPCloudClient('<model>', '<your token>', true);
Call the entities() method and pass the text you want to perform named entity recognition (NER) on:
client.entities("<Your block of text>")
Call the classification() method and pass the following arguments:

- The text you want to classify
- The candidate labels for your text, as an array of strings
- multiClass: whether the classification should be multi-class or not, as a boolean

client.classification("<Your block of text>", ["label 1", "label 2", "..."], true)
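As a fuller sketch, here is a classification call with promise handling. The sample text and labels are illustrative, and the multi-class boolean as the third argument follows the parameter list above:

const NLPCloudClient = require('nlpcloud');

// bart-large-mnli is one of the pre-trained models mentioned above;
// the token is a placeholder.
const client = new NLPCloudClient('bart-large-mnli', '<your token>');

client.classification(
  'My router stopped working after the latest firmware update.',
  ['hardware', 'billing', 'software'],
  true
)
  .then(function (response) {
    console.log(response.data);
  })
  .catch(function (err) {
    console.error(err.response.status);
    console.error(err.response.data.detail);
  });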
Call the generation() method and pass your input text plus the following optional arguments:

- minLength: the minimum number of tokens that the generated text should contain, as an integer. The size of the generated text should not exceed 256 tokens on a CPU plan and 1024 tokens on a GPU plan. If lengthNoInput is false, the size of the generated text is the difference between minLength and the length of your input text. If lengthNoInput is true, the size of the generated text simply is minLength. Defaults to 10.
- maxLength: the maximum number of tokens that the generated text should contain, as an integer. The size of the generated text should not exceed 256 tokens on a CPU plan and 1024 tokens on a GPU plan. If lengthNoInput is false, the size of the generated text is the difference between maxLength and the length of your input text. If lengthNoInput is true, the size of the generated text simply is maxLength. Defaults to 50.
- lengthNoInput: whether minLength and maxLength should not include the length of the input text, as a boolean. If false, minLength and maxLength include the length of the input text. If true, minLength and maxLength don't include the length of the input text. Defaults to false.
- endSequence: a specific token that should be the end of the generated sequence, as a string. For example it could be . or \n or ### or anything else below 10 characters.
- removeInput: whether you want to remove the input text from the result, as a boolean. Defaults to false.
- topK: the number of highest-probability vocabulary tokens to keep for top-k filtering, as an integer. Maximum 1000 tokens. Defaults to 0.
- topP: if set to a float < 1, only the most probable tokens with probabilities that add up to topP or higher are kept for generation. Should be between 0 and 1. Defaults to 0.7.
- temperature: the value used to modulate the next token probabilities, as a float. Should be between 0 and 1. Defaults to 1.
- repetitionPenalty: the parameter for repetition penalty, as a float. 1.0 means no penalty. Defaults to 1.0.
- lengthPenalty: exponential penalty to the length, as a float. 1.0 means no penalty. Set to a value < 1.0 in order to encourage the model to generate shorter sequences, or to a value > 1.0 in order to encourage the model to produce longer sequences. Defaults to 1.0.

client.generation("<Your input text>")
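For illustration, a call that sets a few of these options might look like the sketch below. It assumes the optional parameters are passed positionally after the input text, in the order listed above; verify this against the API reference:

// Assumption: positional options after the text, per the list above:
// minLength = 10, maxLength = 50, lengthNoInput = false, endSequence = '\n'.
client.generation('Once upon a time', 10, 50, false, '\n')
  .then(function (response) {
    console.log(response.data);
  })
  .catch(function (err) {
    console.error(err.response.status);
    console.error(err.response.data.detail);
  });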
Call the sentiment() method and pass the text you want to analyze the sentiment of:
client.sentiment("<Your block of text>")
Call the question() method and pass your context and your question:
client.question("<Your context>", "<Your question>")
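For example (a minimal sketch reusing a client initialized as shown above; the context and question strings are illustrative):

// Answer a question against a given context.
client.question(
  'NLP Cloud serves pre-trained and custom NLP models through a REST API.',
  'What does NLP Cloud serve?'
)
  .then(function (response) {
    console.log(response.data);
  })
  .catch(function (err) {
    console.error(err.response.status);
    console.error(err.response.data.detail);
  });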
Call the summarization() method and pass the text you want to summarize:
client.summarization("<Your text to summarize>")
Call the translation() method and pass the text you want to translate:
client.translation("<Your text to translate>")
Call the langdetection() method and pass the text you want to analyze in order to detect the languages:
client.langdetection("<The text you want to analyze>")
Call the tokens() method and pass the text you want to tokenize:
client.tokens("<Your block of text>")
Call the dependencies() method and pass the text you want to perform part-of-speech (POS) tagging + arcs on:
client.dependencies("<Your block of text>")
Call the sentenceDependencies() method and pass a block of text made up of several sentences you want to perform POS tagging + arcs on:
client.sentenceDependencies("<Your block of text>")
Call the libVersions() method to know the versions of the libraries used under the hood with the model (for example the PyTorch, TensorFlow, or spaCy version used):
client.libVersions()