@simonjb/rake-js

A pure JS implementation of the Rapid Automated Keyword Extraction (RAKE) algorithm.

Version: 0.1.1
Maintainers: 1

RAKE.js

A pure JS implementation of the Rapid Automated Keyword Extraction (RAKE) algorithm. Put in any text corpus, get back a bunch of keyphrases and keywords.


Currently supported languages:

  • english
  • german
  • spanish
  • italian
  • dutch
  • portugese
  • swedish

More languages are fairly easy to add; see the stoplist module for details.

How to use

Without any further options:

  import rake from 'rake-js'

  const myKeywords = rake(someTextContent) // ['keyword1', ...]

When the language is known in advance (faster execution):

  import rake from 'rake-js'

  const myKeywords = rake(someTextContent, { language: 'english' })

When the corpus is divided by something other than whitespace (e.g. semicolons):

  import rake from 'rake-js'

  const myKeywords = rake(someTextContent, { delimiters: [';+'] })
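As a rough illustration of what a delimiter pattern such as ';+' does, the sketch below splits a corpus on an array of regular-expression fragments. splitByDelimiters is a hypothetical helper, not part of the package's API, and it assumes that each entry in delimiters is treated as a regular-expression source string.

```javascript
// Hypothetical helper, not part of the package: split a corpus on regex fragments.
function splitByDelimiters(text, delimiters) {
  // Join the fragments into one alternation, e.g. [';+', ','] -> /(?:;+)|(?:,)/
  const pattern = new RegExp(delimiters.map((d) => `(?:${d})`).join('|'))
  return text
    .split(pattern)
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
}

const parts = splitByDelimiters('deep learning;;keyword extraction; text mining', [';+'])
console.log(parts) // ['deep learning', 'keyword extraction', 'text mining']
```

Because ';+' matches one or more semicolons, the doubled ';;' in the input does not produce an empty entry.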

Implementation Details

The algorithm is fast compared with other approaches such as TextRank. The results are surprisingly good for a cross-language algorithm, and in most cases the truly relevant keywords and phrases are included in the result. For more details about the RAKE algorithm, read the original paper.
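For readers who have not seen RAKE before, here is a minimal sketch of its core scoring idea: candidate phrases are runs of words between stopwords and punctuation, each word is scored as degree divided by frequency, and a phrase scores the sum of its word scores. This is an illustration only and not this package's implementation. The tiny stoplist, the punctuation set, and the names candidatePhrases and rakeSketch are all assumptions made for the example.

```javascript
// Minimal RAKE-style sketch (illustrative only, not the package's implementation).
// The tiny stoplist is an assumption made for this example.
const STOPWORDS = new Set(['a', 'an', 'and', 'are', 'for', 'in', 'is', 'of', 'on', 'the', 'to', 'with'])

function candidatePhrases(text, stopwords) {
  const phrases = []
  // Punctuation always ends a candidate phrase.
  for (const fragment of text.toLowerCase().split(/[.,;:!?()\n]+/)) {
    let current = []
    for (const word of fragment.split(/\s+/).filter(Boolean)) {
      if (stopwords.has(word)) {
        // A stopword also ends the current phrase.
        if (current.length > 0) phrases.push(current)
        current = []
      } else {
        current.push(word)
      }
    }
    if (current.length > 0) phrases.push(current)
  }
  return phrases
}

function rakeSketch(text, stopwords = STOPWORDS) {
  const phrases = candidatePhrases(text, stopwords)
  const freq = new Map()
  const degree = new Map()
  for (const phrase of phrases) {
    for (const word of phrase) {
      freq.set(word, (freq.get(word) || 0) + 1)
      // Degree counts co-occurrences within the phrase, including the word itself.
      degree.set(word, (degree.get(word) || 0) + phrase.length)
    }
  }
  const scores = new Map()
  for (const phrase of phrases) {
    const score = phrase.reduce((sum, w) => sum + degree.get(w) / freq.get(w), 0)
    scores.set(phrase.join(' '), score)
  }
  // Highest-scoring phrases first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1])
}

const ranked = rakeSketch('Keyword extraction is fast. Rapid automated keyword extraction is a simple algorithm.')
console.log(ranked[0]) // ['rapid automated keyword extraction', 14]
```

Longer phrases made of words that co-occur often rank highest, which is why the four-word phrase outranks 'keyword extraction' in this input.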

There are still rough edges in the code, but I tried to translate the abstract algorithm into a solid software package that is tested and type-safe. I wrote it because I was very disappointed with the existing solutions on npm, and I hope this repository will be easier to contribute to in the future.

Roadmap:

  • support more languages (only a handful are whitelisted for now)
  • duplicate keyword filtering
  • check browser compatibility
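Since duplicate filtering is still on the roadmap, a caller might filter the result themselves. The sketch below assumes the result is an array of strings, as in the usage examples. dedupeKeywords is a hypothetical helper, not part of the package, and treating keywords that differ only in letter case or surrounding whitespace as duplicates is an assumption.

```javascript
// Hypothetical helper, not part of the package: drop keywords that differ only
// in letter case or surrounding whitespace, keeping the first occurrence.
function dedupeKeywords(keywords) {
  const seen = new Set()
  const unique = []
  for (const keyword of keywords) {
    const key = keyword.trim().toLowerCase()
    if (key.length === 0 || seen.has(key)) continue
    seen.add(key)
    unique.push(keyword.trim())
  }
  return unique
}

const deduped = dedupeKeywords(['Keyword extraction', 'keyword extraction ', 'text mining'])
console.log(deduped) // ['Keyword extraction', 'text mining']
```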

LICENSE:

LGPL-3.0.

You can use this package in all your free or commercial products without any issues, but I want bugfixes and improvements to this algorithm to flow back into the public code repository.


Package last updated on 10 Jun 2020
