cfc-classifier

A Class Feature Centroid Classifier for text categorization


Maintainers: 1
Install size: 40.4 MB

Class Feature Centroid Classifier

This is a simple machine learning algorithm for text categorization, based on the article by Hu Guan et al.

How centroid is calculated
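
As I read the Hu Guan et al. article, each term in a class centroid gets a weight made of two factors: an inner-class factor, b raised to the share of documents in the class that contain the term, and an inter-class factor, the logarithm of the total number of classes divided by the number of classes that contain the term. The sketch below illustrates that reading. It is not this lib's internal code, and the function name `computeCentroids`, the `Map`-based input shape, the natural logarithm, and the default `b = 2` are all assumptions made for illustration.

```javascript
// Sketch only: my reading of the paper's weighting, not the library's internals.
// docsByClass is a hypothetical input shape: Map of className -> array of token arrays.
// b is a constant greater than 1; the default of 2 is purely illustrative.
function computeCentroids (docsByClass, b = 2) {
  const totalClasses = docsByClass.size
  const documentFrequency = new Map() // className -> Map(term -> docs in class containing term)
  const classFrequency = new Map() // term -> number of classes containing term

  for (const [className, documents] of docsByClass) {
    const frequencies = new Map()
    for (const tokens of documents) {
      // A Set counts each term at most once per document
      for (const term of new Set(tokens)) {
        frequencies.set(term, (frequencies.get(term) || 0) + 1)
      }
    }
    for (const term of frequencies.keys()) {
      classFrequency.set(term, (classFrequency.get(term) || 0) + 1)
    }
    documentFrequency.set(className, frequencies)
  }

  const centroids = new Map()
  for (const [className, frequencies] of documentFrequency) {
    const classSize = docsByClass.get(className).length
    const weights = new Map()
    for (const [term, df] of frequencies) {
      // Inner-class factor: grows with the share of class documents containing the term
      const innerClass = Math.pow(b, df / classSize)
      // Inter-class factor: zero when every class contains the term
      const interClass = Math.log(totalClasses / classFrequency.get(term))
      weights.set(term, innerClass * interClass)
    }
    centroids.set(className, weights)
  }
  return centroids
}

const centroids = computeCentroids(new Map([
  ['sports', [['ball', 'goal'], ['ball', 'team']]],
  ['cooking', [['oven', 'team']]]
]))
```

In this toy input, 'team' appears in both classes, so its inter-class factor is log(1) and its weight is zero in both centroids. 'ball' appears in every 'sports' document and in no other class, so it gets the largest weight in that centroid.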

Installation

$ npm i cfc-classifier

Usage

const CFC = require('cfc-classifier')

// Your dataset
const categories = ['a', 'b']
const corpus = [['category A'], ['category B']]

// Create a new classifier instance
const cfc = new CFC(categories, corpus)

// Train the classifier
cfc.train()

// Now you can classify texts
// the function below will return 'a'
cfc.classify('this text will be classified at category A')
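
To give an intuition for what a classify call could be doing once centroids exist, here is a minimal standalone sketch that scores a text against each class by summing the centroid weights of its tokens (a plain dot product) and returns the best class. This is an assumption about the general approach, not this lib's actual implementation. The names `classifySketch` and `tokenize`, the centroid weights, and the tokenization rule are all hypothetical.

```javascript
// Hypothetical centroid weights; in practice these would come from training.
const centroids = new Map([
  ['a', new Map([['apple', 1.4], ['red', 0.9]])],
  ['b', new Map([['banana', 1.4], ['yellow', 0.9]])]
])

// Lowercase the text and split on anything that is not a letter or digit.
const tokenize = (text) => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)

// Score each class by the sum of the centroid weights of the text's tokens
// and return the best-scoring class name, or null when no token matches.
function classifySketch (text, centroids) {
  const tokens = tokenize(text)
  let best = null
  let bestScore = 0
  for (const [className, weights] of centroids) {
    let score = 0
    for (const token of tokens) score += weights.get(token) || 0
    if (score > bestScore) {
      best = className
      bestScore = score
    }
  }
  return best
}

const label = classifySketch('A red apple', centroids)
// label === 'a'
```

Tokens that appear in no centroid contribute nothing, so a text made only of unknown words yields null in this sketch.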

Parsing step

To support stopword removal, word clustering, and similar preprocessing, this lib lets you insert any parsing steps you want. The example below inserts a parsing function that removes only the 'a' tokens.

const CFC = require('cfc-classifier')

const categories = ['a']
const corpus = [['a simple text, with some! interesting. things']]
const cfc = new CFC(categories, corpus)

// Add a parsing step; this could be a stopword-removal
// function or something like that
const removeLetterA = (textTokens) => textTokens.filter(token => token.toLowerCase() !== 'a')
cfc.addParsingStep(removeLetterA)

const tokens = cfc.generateTokens(cfc.corpus)

// tokens === [ 'simple', 'text', 'with', 'some', 'interesting', 'things' ]
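
A parsing step is just a function that receives an array of tokens and returns a new array, so a stopword filter has the same shape as removeLetterA. The sketch below shows two such steps and applies them in sequence with a small helper. The stopword list and the `applySteps` helper are hypothetical, and I am assuming that steps run in insertion order, each one receiving the previous step's output. The sketch runs without the lib, so it shows the shape of the functions rather than the lib's behavior.

```javascript
// Hypothetical stopword list; replace it with one suited to your corpus.
const STOPWORDS = new Set(['a', 'an', 'the', 'with', 'some'])

// Parsing step: drop every token found in the stopword list.
const removeStopwords = (textTokens) =>
  textTokens.filter(token => !STOPWORDS.has(token.toLowerCase()))

// Parsing step: lowercase every token.
const lowercaseTokens = (textTokens) =>
  textTokens.map(token => token.toLowerCase())

// Hypothetical helper: run the steps in order, feeding each one
// the output of the previous step.
const applySteps = (steps, textTokens) =>
  steps.reduce((tokens, step) => step(tokens), textTokens)

const result = applySteps(
  [lowercaseTokens, removeStopwords],
  ['A', 'simple', 'text', 'with', 'some', 'interesting', 'things']
)
// result: [ 'simple', 'text', 'interesting', 'things' ]
```

Either function could be passed to cfc.addParsingStep in the same way as removeLetterA above.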

Testing

You can see examples in the __tests__ folder.

$ npm test
> DEBUG=ava:* nyc ava --color -v

✔ parsingStep › Remove letter A using parsing step
✔ countTermOccurrences › Generate unique terms
✔ tokens › Tokenize documents
✔ classify › Classify a text
✔ uniqueTerms › Generate unique terms

5 tests passed

----------|----------|----------|----------|----------|-------------------|
File      |  % Stmts | % Branch |  % Funcs |  % Lines | Uncovered Line #s |
----------|----------|----------|----------|----------|-------------------|
All files |      100 |      100 |      100 |      100 |                   |
 index.js |      100 |      100 |      100 |      100 |                   |
----------|----------|----------|----------|----------|-------------------|

Last updated on 18 Nov 2018
