corenlp
This library helps you build NodeJS/Web applications on top of a state-of-the-art Natural Language Processing toolkit: Stanford CoreNLP. It is compatible with the latest release of CoreNLP, 3.9.0.
This project is under active development, so please stay tuned for updates. More documentation and examples are coming.
Assuming that StanfordCoreNLPServer is running on http://localhost:9000:
import CoreNLP, { Properties, Pipeline } from 'corenlp';
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English'); // uses ConnectorServer by default
const sent = new CoreNLP.simple.Sentence('The little dog runs so fast.');
pipeline.annotate(sent)
  .then(sent => {
    console.log('parse', sent.parse());
    console.log(CoreNLP.util.Tree.fromSentence(sent).dump());
  })
  .catch(err => {
    console.log('err', err);
  });
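The same flow can also be written with async/await. The following is a minimal sketch, not part of the library: `parseSentence` is a hypothetical helper name, and it assumes only what the example above shows, namely that `pipeline.annotate()` returns a promise resolving to the annotated sentence and that the sentence exposes `parse()`.

```javascript
// Hypothetical helper (not a corenlp export): async/await version of the
// promise chain shown above. `pipeline` is a Pipeline instance and
// `sentence` a CoreNLP.simple.Sentence, both created as in that example.
async function parseSentence(pipeline, sentence) {
  try {
    const annotated = await pipeline.annotate(sentence);
    return annotated.parse(); // constituency parse as a string
  } catch (err) {
    console.log('err', err);
    return null; // signal failure to the caller instead of throwing
  }
}
```

Passing the pipeline in as an argument keeps the helper easy to exercise with a stub when no CoreNLP server is available.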
Read the full API documentation.
npm i --save corenlp
Via npm, run this command from your own project after having installed this library:
npm explore corenlp -- npm run corenlp:download
Once downloaded, you can start the server by running:
npm explore corenlp -- npm run corenlp:server
Alternatively, you can manually download the project from Stanford's CoreNLP download section at https://stanfordnlp.github.io/CoreNLP/download.html. Besides the full package, you may also want to download other language models (see that page for details).
For advanced projects, where you need to customize the library further, we highly recommend downloading StanfordCoreNLP from the original repository and compiling the source code with ant jar.
NOTE: Some functionality included in this library, for TokensRegex, Semgrex and Tregex, requires the latest version from that repository, which contains some fixes needed by this library that are not included in the latest stable release.
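There are two methods to connect your NodeJS application to Stanford CoreNLP: through StanfordCoreNLPServer (ConnectorServer) or through the command-line interface (ConnectorCli).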
# Run the server using all jars in the current directory (e.g., the CoreNLP home directory)
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
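If you prefer to launch the server from a Node script instead of a shell, the command above can be assembled as an argument list. This is a hedged sketch: `buildServerArgs` is a hypothetical helper, not something this library provides, and it only reproduces the flags shown in the command above.

```javascript
// Hypothetical helper: builds the argument list for the `java` command shown
// above. The defaults mirror that command; nothing here is a corenlp API.
function buildServerArgs({ port = 9000, timeout = 15000, memory = '4g', classPath = '*' } = {}) {
  return [
    `-mx${memory}`,          // JVM heap size
    '-cp', classPath,        // all jars in the working directory
    'edu.stanford.nlp.pipeline.StanfordCoreNLPServer',
    '-port', String(port),
    '-timeout', String(timeout),
  ];
}

console.log(buildServerArgs().join(' '));
// → -mx4g -cp * edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
```

The list could then be handed to `child_process.spawn('java', buildServerArgs(), { cwd: coreNlpHome })`, where `coreNlpHome` is an assumed variable holding the CoreNLP home directory. Because `spawn` does not go through a shell, the `*` reaches Java unexpanded, which is what the classpath wildcard needs.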
By default, CoreNLP connects via StanfordCoreNLPServer on port 9000. You can also set up the connection differently:
import CoreNLP, { Properties, Pipeline, ConnectorServer } from 'corenlp';
const connector = new ConnectorServer({ dsn: 'http://localhost:9000' });
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English', connector);
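For orientation, the CoreNLP server itself accepts a POST request whose body is the raw text and whose `properties` query parameter is a JSON object. The sketch below builds such a URL. `buildAnnotateUrl` is a hypothetical helper, and this is an illustration of the server's HTTP interface, not a description of how ConnectorServer is implemented internally.

```javascript
// Hypothetical helper: builds a request URL for a running CoreNLP server.
// `dsn` is the server address as used above; `properties` holds CoreNLP
// properties such as `annotators`. JSON output is requested by default.
function buildAnnotateUrl(dsn, properties) {
  const url = new URL(dsn);
  url.searchParams.set(
    'properties',
    JSON.stringify({ outputFormat: 'json', ...properties }),
  );
  return url.toString();
}

const url = buildAnnotateUrl('http://localhost:9000', {
  annotators: 'tokenize,ssplit,pos',
});
console.log(url);
```

With a server running, the text would then be sent as the request body, for example `fetch(url, { method: 'POST', body: 'The little dog runs so fast.' })` in an environment that provides `fetch`.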
By default, CoreNLP expects the StanfordCoreNLP package to be placed (unzipped) inside the path ${YOUR_NPM_PROJECT_ROOT}/corenlp/. You can also set up the CLI interface differently:
import CoreNLP, { Properties, Pipeline, ConnectorCli } from 'corenlp';
const connector = new ConnectorCli({
  classPath: 'corenlp/stanford-corenlp-full-2017-06-09/*', // specify the paths relative to your npm project root
  mainClass: 'edu.stanford.nlp.pipeline.StanfordCoreNLP', // optional
  props: 'StanfordCoreNLP-spanish.properties', // optional
});
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English', connector);
// ... include dependencies
// pos is listed before lemma because the lemma annotator depends on POS tags
const props = new Properties({ annotators: 'tokenize,ssplit,pos,lemma,ner' });
const pipeline = new Pipeline(props, 'English', connector);
const sent = new CoreNLP.simple.Sentence('Hello world');
pipeline.annotate(sent)
  .then(sent => {
    console.log(sent.words());
    console.log(sent.nerTags());
  })
  .catch(err => {
    console.log('err', err);
  });
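The order of the `annotators` list can matter, because CoreNLP annotators generally expect their prerequisites to appear earlier in the list. The sketch below checks a list before it is sent. Both `REQUIRES` and `findOrderProblems` are hypothetical names, not part of this library, and the dependency table is a small subset based on my reading of the CoreNLP annotator documentation, so treat it as an assumption to verify.

```javascript
// Assumed prerequisites for a few common annotators (subset, to be verified
// against the CoreNLP annotator documentation).
const REQUIRES = {
  ssplit: ['tokenize'],
  pos: ['tokenize', 'ssplit'],
  lemma: ['tokenize', 'ssplit', 'pos'],
  ner: ['tokenize', 'ssplit', 'pos', 'lemma'],
};

// Hypothetical helper: reports every annotator that appears before one of
// its prerequisites. Unknown annotators are treated as having none.
function findOrderProblems(annotators) {
  const seen = new Set();
  const problems = [];
  for (const name of annotators.split(',').map(s => s.trim())) {
    for (const dep of REQUIRES[name] || []) {
      if (!seen.has(dep)) problems.push(`${name} needs ${dep} earlier in the list`);
    }
    seen.add(name);
  }
  return problems;
}

console.log(findOrderProblems('tokenize,ssplit,pos,lemma,ner'));
// → []
console.log(findOrderProblems('tokenize,ssplit,lemma,pos,ner'));
// → [ 'lemma needs pos earlier in the list' ]
```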
// ... include dependencies
const props = new Properties();
props.setProperty('annotators', 'tokenize,ssplit,pos,lemma,ner,parse');
const pipeline = new Pipeline(props, 'Spanish');
const sent = new CoreNLP.simple.Sentence('Jorge quiere cinco empanadas de queso y carne.');
pipeline.annotate(sent)
  .then(sent => {
    console.log('parse', sent.parse()); // constituency parsing string representation
    const tree = CoreNLP.util.Tree.fromSentence(sent);
    tree.visitLeaves(node =>
      console.log(node.word(), node.pos(), node.token().ner()));
    console.log(tree.dump());
  })
  .catch(err => {
    console.log('err', err);
  });
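To make the "constituency parsing string representation" more concrete: CoreNLP parse strings use bracketed notation such as `(ROOT (NP (DT The) (NN dog)))`. The following standalone sketch reads that notation into a nested object and collects the leaf words. It is an illustration only: `parseBracketed` and `leaves` are hypothetical helpers, and this is not how `CoreNLP.util.Tree` is implemented.

```javascript
// Hypothetical helper: parses a bracketed constituency string into
// { label, children } nodes. Assumes well-formed, balanced input.
function parseBracketed(text) {
  const tokens = text.match(/\(|\)|[^\s()]+/g) || [];
  let i = 0;
  function readNode() {
    i += 1; // skip '('
    const node = { label: tokens[i], children: [] };
    i += 1;
    while (i < tokens.length && tokens[i] !== ')') {
      if (tokens[i] === '(') {
        node.children.push(readNode()); // nested constituent
      } else {
        node.children.push({ label: tokens[i], children: [] }); // word
        i += 1;
      }
    }
    i += 1; // skip ')'
    return node;
  }
  return tokens[0] === '(' ? readNode() : null;
}

// Collects the words (nodes without children) from left to right.
function leaves(node) {
  return node.children.length === 0 ? [node.label] : node.children.flatMap(leaves);
}

const tree = parseBracketed('(ROOT (NP (DT The) (NN dog)))');
console.log(tree.label);   // → ROOT
console.log(leaves(tree)); // → [ 'The', 'dog' ]
```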
// ... include dependencies
const props = new Properties();
props.setProperty('annotators', 'tokenize,ssplit,regexner,depparse');
const expression = new CoreNLP.simple.Expression(
  'John Snow eats snow.',
  '{ner:PERSON}=who <nsubj ({pos:VBZ}=action >dobj {}=what)');
const pipeline = new Pipeline(props, 'English');
pipeline.annotateSemgrex(expression, true) // similarly use pipeline.annotateTokensRegex / pipeline.annotateTregex
  .then(expression => expression.sentence(0).matches().map(match => {
    console.log('match', match.group('who'), match.group('action'), match.group('what'));
  }))
  .catch(err => {
    console.log('err', err);
  });
This library is isomorphic, which means that it also works in a browser. The API is exactly the same, and you can use it directly by loading it via a <script> tag, using AMD (RequireJS), or within your app bundle.
The browser-ready version of corenlp can be found at dist/index.browser.min.js once built (npm run build).
See the examples folder for more details.
Properties
Pipeline
Service
ConnectorServer                     # https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
ConnectorCli                        # https://stanfordnlp.github.io/CoreNLP/cmdline.html
CoreNLP
  simple                            # https://stanfordnlp.github.io/CoreNLP/simple.html
    Annotable
    Annotator
    Document
    Sentence
    Token
    annotator                       # https://stanfordnlp.github.io/CoreNLP/annotators.html
      TokenizerAnnotator            # https://stanfordnlp.github.io/CoreNLP/tokenize.html
      WordsToSentenceAnnotator      # https://stanfordnlp.github.io/CoreNLP/ssplit.html
      POSTaggerAnnotator            # https://stanfordnlp.github.io/CoreNLP/pos.html
      MorphaAnnotator               # https://stanfordnlp.github.io/CoreNLP/lemma.html
      NERClassifierCombiner         # https://stanfordnlp.github.io/CoreNLP/ner.html
      ParserAnnotator               # https://stanfordnlp.github.io/CoreNLP/parse.html
      DependencyParseAnnotator      # https://stanfordnlp.github.io/CoreNLP/depparse.html
      RelationExtractorAnnotator    # https://stanfordnlp.github.io/CoreNLP/relation.html
      CorefAnnotator                # https://stanfordnlp.github.io/CoreNLP/coref.html
      SentimentAnnotator            # https://stanfordnlp.github.io/CoreNLP/sentiment.html - Coming soon...
      RelationExtractorAnnotator    # https://stanfordnlp.github.io/CoreNLP/relation.html - TODO
      NaturalLogicAnnotator         # https://stanfordnlp.github.io/CoreNLP/natlog.html - TODO
      QuoteAnnotator                # https://stanfordnlp.github.io/CoreNLP/quote.html - TODO
  util
    Tree                            # http://www.cs.cornell.edu/courses/cs474/2004fa/lec1.pdf
This library is not maintained by StanfordNLP. However, it's based on and depends on StanfordNLP/CoreNLP to function.
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.
FAQs
A NodeJS CoreNLP library
The npm package corenlp receives a total of 69 weekly downloads. As such, corenlp popularity was classified as not popular.
We found that corenlp demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.