wink-naive-bayes-text-classifier
Comparing version 2.0.1 to 2.1.0
@@ -40,6 +40,6 @@ # Contributing to Wink | ||
### Documenting | ||
We believe that the documentation must not only explain the API but also narrate the story of logic, algorithms and references used. Wink uses the [JSDoc](http://usejsdoc.org/) standard for API documentation and [Literate-Programming Standards](https://en.wikipedia.org/wiki/Literate_programming) for documenting the logic using [docker](http://jbt.github.io/docker/src/docker.js.html). The API documentation quality is measured using [Inch CI](https://inch-ci.org/) and we expect that your contribution will improve or maintain the current levels. | ||
We believe that the documentation must not only explain the API but also narrate the story of logic, algorithms and references used. Wink uses the [JSDoc](https://jsdoc.app/) standard for API documentation and [Literate-Programming Standards](https://en.wikipedia.org/wiki/Literate_programming) for documenting the logic using [docker](http://jbt.github.io/docker/src/docker.js.html). The API documentation quality is measured using [Inch CI](https://inch-ci.org/) and we expect that your contribution will improve or maintain the current levels. | ||
### Testing | ||
Wink requires a test coverage of **atleast > 99.5%** and aims for 100%. Any new contribution must maintain the existing test coverage level. We use [Chai](http://chaijs.com/), [Mocha](https://mochajs.org/) and [Istanbul](https://inch-ci.org/), [Coveralls](https://coveralls.io/) to run tests and determine coverage. | ||
Wink requires a test coverage of **atleast > 99.5%** and aims for 100%. Any new contribution must maintain the existing test coverage level. We use [Chai](http://chaijs.com/), [Mocha](https://mochajs.org/) and [Istanbul](https://istanbul.js.org/), [Coveralls](https://coveralls.io/) to run tests and determine coverage. | ||
@@ -46,0 +46,0 @@ ### Committing |
{ | ||
"name": "wink-naive-bayes-text-classifier", | ||
"version": "2.0.1", | ||
"version": "2.1.0", | ||
"description": "Configurable Naive Bayes Classifier for text with cross-validation support", | ||
@@ -18,7 +18,7 @@ "keywords": [ | ||
"pretest": "npm run lint && npm run docs", | ||
"test": "istanbul cover _mocha ./test/", | ||
"coveralls": "istanbul cover _mocha --report lcovonly -- -R spec && cat ./coverage/lcov.info | coveralls && rm -rf ./coverage", | ||
"test": "nyc --reporter=html --reporter=text mocha ./test/", | ||
"coverage": "nyc report --reporter=text-lcov | coveralls", | ||
"sourcedocs": "docker -i src -o ./sourcedocs --sidebar no", | ||
"docs": "jsdoc src/*.js -c .jsdoc.json", | ||
"lint": "eslint ./src/*.js ./test/*.js" | ||
"lint": "eslint ./src/*.js ./test/*.js ./runkit/*.js" | ||
}, | ||
@@ -36,17 +36,17 @@ "repository": { | ||
"devDependencies": { | ||
"chai": "^4.2.0", | ||
"coveralls": "^3.0.3", | ||
"docdash": "winkjs/docdash", | ||
"docker": "^1.0.0", | ||
"eslint": "^5.16.0", | ||
"istanbul": "^1.1.0-alpha.1", | ||
"jsdoc": "^3.5.5", | ||
"mocha": "^6.0.2", | ||
"mocha-lcov-reporter": "^1.3.0" | ||
"chai": "^4.3.6", | ||
"coveralls": "^3.1.1", | ||
"docdash": "github:winkjs/docdash", | ||
"docker": "^0.2.14", | ||
"eslint": "^8.26.0", | ||
"jsdoc": "^3.6.11", | ||
"mocha": "^10.1.0", | ||
"nyc": "^15.1.0" | ||
}, | ||
"dependencies": { | ||
"wink-eng-lite-web-model": "^1.4.3", | ||
"wink-helpers": "^2.0.0", | ||
"wink-nlp-utils": "^2.0.4" | ||
"wink-nlp": "^1.12.2" | ||
}, | ||
"runkitExampleFilename": "./runkit/example.js" | ||
} |
@@ -6,10 +6,10 @@ | ||
### [![Build Status](https://api.travis-ci.org/winkjs/wink-naive-bayes-text-classifier.svg?branch=master)](https://travis-ci.org/winkjs/wink-naive-bayes-text-classifier) [![Coverage Status](https://coveralls.io/repos/github/winkjs/wink-naive-bayes-text-classifier/badge.svg?branch=master)](https://coveralls.io/github/winkjs/wink-naive-bayes-text-classifier?branch=master) [![Inline docs](http://inch-ci.org/github/winkjs/wink-naive-bayes-text-classifier.svg?branch=master)](http://inch-ci.org/github/winkjs/wink-naive-bayes-text-classifier) [![dependencies Status](https://david-dm.org/winkjs/wink-naive-bayes-text-classifier/status.svg)](https://david-dm.org/winkjs/wink-naive-bayes-text-classifier) [![devDependencies Status](https://david-dm.org/winkjs/wink-naive-bayes-text-classifier/dev-status.svg)](https://david-dm.org/winkjs/wink-naive-bayes-text-classifier?type=dev) [![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/winkjs/Lobby) | ||
### [![Build Status](https://app.travis-ci.com/winkjs/wink-naive-bayes-text-classifier.svg?branch=master)](https://app.travis-ci.com/winkjs/wink-naive-bayes-text-classifier) [![Coverage Status](https://coveralls.io/repos/github/winkjs/wink-naive-bayes-text-classifier/badge.svg?branch=master)](https://coveralls.io/github/winkjs/wink-naive-bayes-text-classifier?branch=master) [![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/winkjs/Lobby) | ||
<img align="right" src="https://decisively.github.io/wink-logos/logo-title.png" width="100px" > | ||
Classify text, analyse sentiments, recognize user intents for chatbot using **`wink-naive-bayes-text-classifier`**. It's [API](http://winkjs.org/wink-naive-bayes-text-classifier/NaiveBayesTextClassifier.html) offers a rich set of features: | ||
Classify text, analyse sentiments, recognize user intents for chatbot using **`wink-naive-bayes-text-classifier`**. Its [API](http://winkjs.org/wink-naive-bayes-text-classifier/NaiveBayesTextClassifier.html) offers a rich set of features: | ||
1. Configure text preparation task such as **amplify negation**, **tokenize**, **stem**, **remove stop words**, and **propagate negation** using [wink-nlp-utils](https://www.npmjs.com/package/wink-nlp-utils) or any other package of your choice. | ||
2. Configure **Lidstone** or **Lapalce** additive smoothing. | ||
1. Preprocess text using [wink-nlp](https://www.npmjs.com/package/wink-nlp) — tokenize, stem, remove stop words, and handle negation. It also supports [Named Entity Recognition](https://winkjs.org/wink-nlp/getting-started.html) to further enhance preprocessing. A single winkNLP based helper function for preparing text is available that (a) tokenizes, (b) removes punctuations, symbols, numerals, URLs, stop words and (c) stems. It can be required from `wink-naive-bayes-text-classifier/src/prep-text.js`. | ||
2. Configure **Lidstone** or **Laplace** additive smoothing. | ||
3. Configure **Multinomial** or **Binarized Multinomial** Naive Bayes model. | ||
@@ -34,13 +34,21 @@ 4. Export and import learnings in JSON format that can be easily saved on hard-disk. | ||
var nbc = Classifier(); | ||
// Load NLP utilities | ||
var nlp = require( 'wink-nlp-utils' ); | ||
// Configure preparation tasks | ||
nbc.definePrepTasks( [ | ||
// Simple tokenizer | ||
nlp.string.tokenize0, | ||
// Common Stop Words Remover | ||
nlp.tokens.removeWords, | ||
// Stemmer to obtain base word | ||
nlp.tokens.stem | ||
] ); | ||
// Load wink nlp and its model | ||
const winkNLP = require( 'wink-nlp' ); | ||
// Load language model | ||
const model = require( 'wink-eng-lite-web-model' ); | ||
const nlp = winkNLP( model ); | ||
const its = nlp.its; | ||
const prepTask = function ( text ) { | ||
const tokens = []; | ||
nlp.readDoc(text) | ||
.tokens() | ||
// Use only words ignoring punctuations etc and from them remove stop words | ||
.filter( (t) => ( t.out(its.type) === 'word' && !t.out(its.stopWordFlag) ) ) | ||
// Handle negation and extract stem of the word | ||
.each( (t) => tokens.push( (t.out(its.negationFlag)) ? '!' + t.out(its.stem) : t.out(its.stem) ) ); | ||
return tokens; | ||
}; | ||
nbc.definePrepTasks( [ prepTask ] ); | ||
// Configure behavior | ||
@@ -66,3 +74,2 @@ nbc.defineConfig( { considerOnlyPresence: true, smoothingFactor: 0.5 } ); | ||
// -> prepay | ||
``` | ||
@@ -79,8 +86,8 @@ | ||
### About wink | ||
[Wink](http://winkjs.org/) is a family of open source packages for **Statistical Analysis**, **Natural Language Processing** and **Machine Learning** in NodeJS. The code is **thoroughly documented** for easy human comprehension and has a **test coverage of ~100%** for reliability to build production grade solutions. | ||
[Wink](http://winkjs.org/) is a family of open source packages for **Natural Language Processing**, **Statistical Analysis** and **Machine Learning** in NodeJS. The code is **thoroughly documented** for easy human comprehension and has a **test coverage of ~100%** for reliability to build production grade solutions. | ||
### Copyright & License | ||
**wink-naive-bayes-text-classifier** is copyright 2017-19 [GRAYPE Systems Private Limited](http://graype.in/). | ||
**wink-naive-bayes-text-classifier** is copyright 2017-22 [GRAYPE Systems Private Limited](http://graype.in/). | ||
It is licensed under the terms of the MIT License. |
// Load Naive Bayes Text Classifier | ||
var Classifier = require( 'wink-naive-bayes-text-classifier' ); | ||
// Instantiate | ||
var nbc = Classifier(); | ||
// Load NLP utilities | ||
var nlp = require( 'wink-nlp-utils' ); | ||
// Configure preparation tasks | ||
nbc.definePrepTasks( [ | ||
// Simple tokenizer | ||
nlp.string.tokenize0, | ||
// Common Stop Words Remover | ||
nlp.tokens.removeWords, | ||
// Stemmer to obtain base word | ||
nlp.tokens.stem | ||
] ); | ||
var nbc = Classifier(); // eslint-disable-line new-cap | ||
// Load wink nlp and its model | ||
const winkNLP = require( 'wink-nlp' ); | ||
// Load language model | ||
const model = require( 'wink-eng-lite-web-model' ); | ||
const nlp = winkNLP( model ); | ||
const its = nlp.its; | ||
const prepTask = function ( text ) { | ||
const tokens = []; | ||
nlp.readDoc(text) | ||
.tokens() | ||
// Use only words ignoring punctuations etc and from them remove stop words | ||
.filter( (t) => ( t.out(its.type) === 'word' && !t.out(its.stopWordFlag) ) ) | ||
// Handle negation and extract stem of the word | ||
.each( (t) => tokens.push( (t.out(its.negationFlag)) ? '!' + t.out(its.stem) : t.out(its.stem) ) ); | ||
return tokens; | ||
}; | ||
nbc.definePrepTasks( [ prepTask ] ); | ||
// Configure behavior | ||
@@ -17,0 +25,0 @@ nbc.defineConfig( { considerOnlyPresence: true, smoothingFactor: 0.5 } ); |
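The `smoothingFactor: 0.5` passed to `defineConfig` above selects Lidstone additive smoothing, and the source falls back to Laplace add-1 smoothing when the factor is undefined. The following is a minimal sketch of that idea, not the package's implementation: `resolveSmoothingFactor` and `smoothedLikelihood` are hypothetical names, the inclusive 0-1 bounds are an assumption based on the "beyond 0-1 or NaN" comment, and the counts are made up for illustration.

```javascript
// Hypothetical helper (not part of the package's API): resolves the smoothing
// factor the way the diffed source line does, defaulting to 1 (Laplace add-1).
function resolveSmoothingFactor( cfg ) {
  const sf = ( cfg.smoothingFactor === undefined ) ? 1 : parseFloat( cfg.smoothingFactor );
  // Assumption: bounds are inclusive; NaN and values outside 0-1 are rejected.
  if ( isNaN( sf ) || sf < 0 || sf > 1 ) {
    throw Error( 'smoothingFactor must be a number between 0 and 1' );
  }
  return sf;
}

// Additive smoothing:
// P(word | label) = (count + sf) / (total + sf * vocabularySize)
function smoothedLikelihood( count, total, vocabularySize, sf ) {
  return ( count + sf ) / ( total + ( sf * vocabularySize ) );
}

const sf = resolveSmoothingFactor( { smoothingFactor: 0.5 } );
// A word never seen with a label (count 0) still gets a non-zero likelihood.
const unseen = smoothedLikelihood( 0, 10, 4, sf ); // (0 + 0.5) / (10 + 2)
const seen = smoothedLikelihood( 3, 10, 4, sf );   // (3 + 0.5) / (10 + 2)
```

With a factor of 1 the same formula becomes Laplace smoothing; smaller factors keep the estimates closer to the raw counts.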
@@ -5,3 +5,3 @@ // wink-naive-bayes-text-classifier | ||
// | ||
// Copyright (C) 2017-19 GRAYPE Systems Private Limited | ||
// Copyright (C) GRAYPE Systems Private Limited | ||
// | ||
@@ -289,3 +289,3 @@ // This file is part of “wink-naive-bayes-text-classifier”. | ||
// If smoothing factor is undefined set it to lapalce add+1 smoothing. | ||
// If smoothing factor is undefined set it to laplace add+1 smoothing. | ||
var sf = ( cfg.smoothingFactor === undefined ) ? 1 : parseFloat( cfg.smoothingFactor ); | ||
@@ -310,3 +310,4 @@ // Throw error for a value beyond 0-1 or NaN. | ||
* using these function a simple pipeline is built to serially transform the | ||
* input to the output. | ||
* input to the output. A single helper function for preparing text is available that (a) tokenizes, | ||
* (b) removes punctuations, symbols, numerals, URLs, stop words and (c) stems. | ||
* | ||
@@ -320,13 +321,6 @@ * @method NaiveBayesTextClassifier#definePrepTasks | ||
* // Load wink NLP utilities | ||
* var nlp = require( 'wink-nlp-utils' ); | ||
* var prepText = require( 'wink-naive-bayes-text-classifier/src/prep-text.js' ); | ||
* // Define the text preparation tasks. | ||
* myClassifier.definePrepTasks( [ | ||
* // Simple tokenizer to convert input text in to tokens | ||
* nlp.string.tokenize0, | ||
* // Removes stop words from the input tokens | ||
* nlp.tokens.removeWords, | ||
* // Stems each token into its base form | ||
* nlp.tokens.stem | ||
* ] ); | ||
* // -> 3 | ||
* myClassifier.definePrepTasks( [ prepText ] ); | ||
* // -> 1 | ||
* @throws Error if `tasks` is not an array of functions. | ||
@@ -333,0 +327,0 @@ */ |
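The doc comment above describes `definePrepTasks` as building a pipeline that serially transforms the input to the output, and as throwing if `tasks` is not an array of functions. The sketch below illustrates that contract under stated assumptions; it is not the package's code. The standalone `definePrepTasks` here is a hypothetical stand-in, and `tokenize` and `dropStopWords` are toy tasks that use no wink-nlp calls.

```javascript
// Hypothetical stand-in for the classifier's task handling: validates the
// task list and returns the task count plus a runner for the pipeline.
function definePrepTasks( tasks ) {
  if ( !Array.isArray( tasks ) || !tasks.every( ( t ) => typeof t === 'function' ) ) {
    throw Error( 'tasks must be an array of functions' );
  }
  // Each task receives the output of the previous one.
  const run = ( input ) => tasks.reduce( ( acc, task ) => task( acc ), input );
  return { count: tasks.length, run: run };
}

// Toy tasks for illustration only.
const tokenize = ( text ) => text.toLowerCase().split( /\W+/ ).filter( Boolean );
const stopWords = [ 'i', 'to', 'my', 'the' ];
const dropStopWords = ( tokens ) => tokens.filter( ( t ) => !stopWords.includes( t ) );

const pipeline = definePrepTasks( [ tokenize, dropStopWords ] );
const tokens = pipeline.run( 'I want to prepay my loan' );
// tokens is [ 'want', 'prepay', 'loan' ]
```

The `// -> 1` in the doc comment suggests the real method returns the number of tasks; the sketch returns an object instead so the runner can be exercised directly. A single function that does all the preparation, as in the 2.1.0 example, is simply a pipeline of length one.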
Sorry, the diff of this file is not supported yet
+ Added wink-nlp@^1.12.2
- Removed wink-nlp-utils@^2.0.4
- Removed emoji-regex@9.2.2 (transitive)
- Removed wink-distance@2.0.2 (transitive)
- Removed wink-jaro-distance@2.0.0 (transitive)
- Removed wink-nlp-utils@2.1.0 (transitive)
- Removed wink-porter2-stemmer@2.0.1 (transitive)
- Removed wink-tokenizer@5.3.0 (transitive)