wink-nlp

Package Overview

  • Dependencies: 0
  • Maintainers: 1
  • Versions: 40

Developer friendly NLP ✨


Install size: 498 kB

Changelog

Version 1.5.0 June 22, 2021

⚙️ Updates

  • Exposed the its and as helpers via the winkNLP instance as well (see the sketch below). 🤓
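
A minimal sketch of what this change enables, assuming winkNLP 1.5.0 and the default wink-eng-lite-model are installed; the require-based style shown in Getting Started below continues to work.

// From v1.5.0, "its" and "as" are also available on the winkNLP instance,
// so requiring wink-nlp/src/its.js and wink-nlp/src/as.js is optional.
const winkNLP = require( 'wink-nlp' );
const model = require( 'wink-eng-lite-model' );
const nlp = winkNLP( model );
const its = nlp.its;
const as = nlp.as;

const doc = nlp.readDoc( 'Hello World! How are you?' );
console.log( doc.tokens().out( its.type, as.freqTable ) );
// -> e.g. [ [ 'word', 5 ], [ 'punctuation', 2 ] ]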

Readme

winkNLP

Developer friendly NLP ✨

winkNLP is a JavaScript library for Natural Language Processing (NLP). Designed specifically to make development of NLP solutions easier and faster, winkNLP is optimized for the right balance of performance and accuracy. The package can handle large amounts of raw text at speeds over 525,000 tokens/second. And with a test coverage of ~100%, winkNLP is a tool for building production-grade systems with confidence.

Wink Wizard Showcase

Features

It packs a rich feature set into a small-footprint codebase of under 1500 lines:

  1. Lossless & multilingual tokenizer

  2. Developer friendly and intuitive API

  3. Built-in API to aid text visualization

  4. Easy information extraction from raw text

  5. Extensive text processing features such as bag-of-words, frequency table, stop word removal, readability statistics computation and many more (see the sketch after this list)

  6. Pre-trained models with sizes starting under 3 MB

  7. BM25-based vectorizer

  8. Cosine similarity

  9. Word vector integration

  10. Comprehensive NLP pipeline covering tokenization, sentence boundary detection, negation handling, sentiment analysis, part-of-speech (POS) tagging, lemmatization, named entity extraction, custom entity detection and pattern matching

  11. No external dependencies.

  12. Runs on web browsers
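
The text processing features in item 5 can be exercised through the its and as helpers. A rough sketch follows, assuming the default English lite model is installed; the sample sentence and printed results are illustrative.

// Bag-of-words, frequency table and stop word removal via its/as helpers.
const winkNLP = require( 'wink-nlp' );
const its = require( 'wink-nlp/src/its.js' );
const as = require( 'wink-nlp/src/as.js' );
const model = require( 'wink-eng-lite-model' );
const nlp = winkNLP( model );

const doc = nlp.readDoc( 'The quick brown fox jumps over the lazy dog.' );

// Bag-of-words of token values.
console.log( doc.tokens().out( its.value, as.bow ) );

// Tokens with stop words filtered out.
const contentTokens = doc.tokens().filter( ( t ) => !t.out( its.stopWordFlag ) );
console.log( contentTokens.out() );
// -> e.g. [ 'quick', 'brown', 'fox', 'jumps', 'lazy', 'dog', '.' ]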

Installation

Use npm install:

npm install wink-nlp --save

In order to use winkNLP after its installation, you also need to install a language model. The following command installs the latest version of the default language model, the lightweight English language model called wink-eng-lite-model.

node -e "require( 'wink-nlp/models/install' )"

Any required model can be installed by specifying its name as the last parameter in the above command. For example:

node -e "require( 'wink-nlp/models/install' )" wink-eng-lite-model
How to install for Web Browser

If you're using winkNLP in the browser, use the wink-eng-lite-web-model instead. Learn about its installation and usage in our guide to using winkNLP in the browser.
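
A minimal sketch of browser usage, assuming a bundler (e.g. webpack) resolves the packages and that wink-eng-lite-web-model has been installed from npm; see the linked guide for the authoritative steps.

// In a browser build, use the web model instead of the Node model.
const winkNLP = require( 'wink-nlp' );
const model = require( 'wink-eng-lite-web-model' );
const nlp = winkNLP( model );

const doc = nlp.readDoc( 'Running winkNLP in the browser.' );
console.log( doc.tokens().out() );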

Getting Started

The "Hello World!" in winkNLP is given below. As the next step, we recommend a dive into winkNLP's concepts.

// Load wink-nlp package  & helpers.
const winkNLP = require( 'wink-nlp' );
// Load "its" helper to extract item properties.
const its = require( 'wink-nlp/src/its.js' );
// Load "as" reducer helper to reduce a collection.
const as = require( 'wink-nlp/src/as.js' );
// Load English language model (light version).
const model = require( 'wink-eng-lite-model' );
// Instantiate winkNLP.
const nlp = winkNLP( model );

// NLP Code.
const text = 'Hello   World🌎! How are you?';
const doc = nlp.readDoc( text );

console.log( doc.out() );
// -> Hello   World🌎! How are you?

console.log( doc.sentences().out() );
// -> [ 'Hello   World🌎!', 'How are you?' ]

console.log( doc.entities().out( its.detail ) );
// -> [ { value: '🌎', type: 'EMOJI' } ]

console.log( doc.tokens().out() );
// -> [ 'Hello', 'World', '🌎', '!', 'How', 'are', 'you', '?' ]

console.log( doc.tokens().out( its.type, as.freqTable ) );
// -> [ [ 'word', 5 ], [ 'punctuation', 2 ], [ 'emoji', 1 ] ]

Try sample code at RunKit or head to the showcases to learn from live examples:

Wikipedia Timeline

Reads any Wikipedia article and generates a visual timeline of all its events.

NLP Wizard 🧙

Performs tokenization, sentence boundary detection, POS tagging, named entity detection and sentiment analysis of user-input text in real time.

Hashtag Sentiment 🎭

Analyzes the sentiment of recent tweets containing the given hashtag.

Speed & Accuracy

winkNLP processes raw text at ~525,000 tokens per second with its default language model, wink-eng-lite-model, when benchmarked using "Ch 13 of Ulysses by James Joyce" on a 2.2 GHz Intel Core i7 machine with 16 GB RAM. The processing included the entire NLP pipeline: tokenization, sentence boundary detection, negation handling, sentiment analysis, part-of-speech tagging, and named entity extraction. This speed is well ahead of prevailing speed benchmarks.
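
A rough way to measure throughput on your own corpus is sketched below; the file name is hypothetical and actual numbers will vary with hardware and text.

// Time the full pipeline over a text file and report tokens per second.
const fs = require( 'fs' );
const winkNLP = require( 'wink-nlp' );
const model = require( 'wink-eng-lite-model' );
const nlp = winkNLP( model );

const text = fs.readFileSync( 'corpus.txt', 'utf8' ); // hypothetical file
const start = process.hrtime.bigint();
const doc = nlp.readDoc( text ); // runs the model's entire pipeline
const seconds = Number( process.hrtime.bigint() - start ) / 1e9;
console.log( Math.round( doc.tokens().length() / seconds ), 'tokens/second' );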

The benchmark was conducted on Node.js versions 14.8.0 and 12.18.3.

It POS tags a subset of the WSJ corpus with an accuracy of ~94.7%, which includes tokenization of raw text prior to POS tagging. The current state of the art is at ~97% accuracy, but at lower speeds, and is generally computed using a gold-standard pre-tokenized corpus.
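
For illustration, per-token POS tags can be inspected with the its.pos helper; the sentence and printed output below are illustrative, and the setup mirrors Getting Started.

// Inspect the POS tag assigned to each token.
const winkNLP = require( 'wink-nlp' );
const its = require( 'wink-nlp/src/its.js' );
const nlp = winkNLP( require( 'wink-eng-lite-model' ) );

console.log( nlp.readDoc( 'Cats sleep peacefully.' ).tokens().out( its.pos ) );
// -> something like [ 'NOUN', 'VERB', 'ADV', 'PUNCT' ]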

Its general-purpose sentiment analysis delivers an f-score of ~84.5% when validated using the Amazon Product Review Sentiment Labelled Sentences Data Set at the UCI Machine Learning Repository. The current benchmark accuracy for specifically trained models is around 95%.
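
Sentiment can be read at the document level via the its.sentiment helper; a small sketch with an illustrative review and score.

// Document-level sentiment score (roughly in the -1 to +1 range).
const winkNLP = require( 'wink-nlp' );
const its = require( 'wink-nlp/src/its.js' );
const nlp = winkNLP( require( 'wink-eng-lite-model' ) );

console.log( nlp.readDoc( 'Loved the battery life, great buy!' ).out( its.sentiment ) );
// -> a positive score, e.g. around 0.7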

Memory Requirement

winkNLP delivers this performance with minimal load on RAM. For example, it processes the entire History of India Volume I with a total peak memory requirement of under 80 MB. The book has around 350 pages, which translates to over 125,000 tokens.

Documentation

  • Concepts — everything you need to know to get started.
  • API Reference — explains usage of APIs with examples.
  • Change log — version history along with the details of breaking changes, if any.
  • Showcases — live examples with code to give you a head start.

Need Help?

Usage query 👩🏽‍💻

Please ask at Stack Overflow or discuss it at Wink JS Gitter Lobby.

Bug report 🐛

If you spot a bug that has not yet been reported, raise a new issue or consider fixing it and sending a PR.

New feature ✨

Looking for a new feature? Request it via a new issue or consider becoming a contributor.

About wink

Wink is a family of open-source packages for Natural Language Processing, Machine Learning, and Statistical Analysis in NodeJS. The code is thoroughly documented for easy human comprehension and has a test coverage of ~100%, giving the reliability needed to build production-grade solutions.

Wink NLP is copyright 2017-21 GRAYPE Systems Private Limited.

It is licensed under the terms of the MIT License.
