Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

cfc-classifier

Package Overview
Dependencies
Maintainers
1
Versions
2
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

cfc-classifier

A Class Feature Centroid Classifier for text categorization

  • 1.0.1
  • latest
  • Source
  • npm
  • Socket score

Version published
Weekly downloads
1
Maintainers
1
Weekly downloads
 
Created
Source

Class Feature Centroid Classifier

This is a simple machine learning algorithm for text categorization based in the Hu Guan et al. (available here) article.

How centroid is calculated

$ npm i cfc-classifier

Usage

const CFC = require('cfc-classifier')

// Your dataset
const categories = ['a', 'b']
const corpus = [['category A'], ['category B']]

// Create a new classifier instance
const cfc = new CFC(categories, corpus)

// Train the classifier
cfc.train()

// Now you can classify texts
// the function below will return 'a'
cfc.classify('this text will be classified at category A')

Parsing step

In function of remove stopwords, word clustering or things like that this lib is open to insert any parsing steps that you want. In the example below I am inserting a parsing function that only remove the 'a' tokens.

const CFC = require('cfc-classifier')

const categories = ['a']
const corpus = [['a simple text, with some! interesting. things']]
const cfc = new CFC(categories, corpus)

// Add a parsing stepthis could be a remove
// stopwords function or something like that
const removeLetterA = (textTokens) => textTokens.filter(token => token.toLowerCase() !== 'a')
cfc.addParsingStep(removeLetterA)

const tokens = cfc.generateTokens(cfc.corpus)

// tokens === [ 'simple', 'text', 'with', 'some', 'interesting', 'things' ]

Testing

You can see examples in the __tests__ folder.

$ npm test
> DEBUG=ava:* nyc ava --color -v

✔ parsingStep › Remove letter A using parsing step
✔ countTermOccurrences › Generate unique terms
✔ tokens › Tokenize documents
✔ classify › Classify a text
✔ uniqueTerms › Generate unique terms

5 tests passed

----------|----------|----------|----------|----------|-------------------|
File      |  % Stmts | % Branch |  % Funcs |  % Lines | Uncovered Line #s |
----------|----------|----------|----------|----------|-------------------|
All files |      100 |      100 |      100 |      100 |                   |
 index.js |      100 |      100 |      100 |      100 |                   |
----------|----------|----------|----------|----------|-------------------|

Keywords

FAQs

Package last updated on 18 Nov 2018

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc