extract-lemmatized-nonstop-words


Version 1.0.19 (latest)


Extract lemmatized nonstop words

Extracts a pure list of lemmatized words of a text filtered by stop words.

Features

  • Removes stop words.
  • Removes proper nouns.
  • Converts regular past tense and past participle verbs to present form: created to create
  • Converts third-person present verbs to base present form: creates to create
  • Converts plural nouns to singular: cats to cat
  • Converts gerund verbs to present form: creating to create

Install

Install using Yarn:

yarn add extract-lemmatized-nonstop-words

Install using npm:

npm i --save extract-lemmatized-nonstop-words

Usage

const extract = require('extract-lemmatized-nonstop-words');

const words = extract('He created these categories and they are better.');

returns:

Array (3 items)
    0: Object
        lemma: "create"
        normal: "created"
        pos: "VBD"
        tag: "word"
        value: "created"
        vocabulary: "create"
    1: Object
        lemma: "category"
        normal: "categories"
        pos: "NNS"
        tag: "word"
        value: "categories"
        vocabulary: "category"
    2: Object
        lemma: "good"
        normal: "better"
        pos: "JJR"
        tag: "word"
        value: "better"
        vocabulary: "better"
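Each entry is a plain object, so ordinary array methods are enough for post-processing. The sketch below is only an illustration: the `words` array is copied by hand from the listing above so that it runs without the package installed, and `countLemmas` is a hypothetical helper, not part of the package.

```javascript
// Sample result copied from the listing above. In real use this array
// would come from require('extract-lemmatized-nonstop-words')(text).
const words = [
  { value: 'created', tag: 'word', normal: 'created', pos: 'VBD', lemma: 'create', vocabulary: 'create' },
  { value: 'categories', tag: 'word', normal: 'categories', pos: 'NNS', lemma: 'category', vocabulary: 'category' },
  { value: 'better', tag: 'word', normal: 'better', pos: 'JJR', lemma: 'good', vocabulary: 'better' },
];

// Keep only the lemma of each entry.
const lemmas = words.map((word) => word.lemma);

// Hypothetical helper: count how often each lemma occurs.
function countLemmas(entries) {
  const counts = new Map();
  for (const { lemma } of entries) {
    counts.set(lemma, (counts.get(lemma) || 0) + 1);
  }
  return counts;
}

console.log(lemmas); // [ 'create', 'category', 'good' ]
console.log(countLemmas(words).get('create')); // 1
```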

API

extract(text, filter) ⇒ Array.<Object>

Extracts a pure list of lemmatized words of a text filtered by stop words. It removes non-word tokens, tokens shorter than 3 characters, and tokens that contain non-alphabetic characters.
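One possible reading of that token rule is sketched below. This is not the package's actual source: `keepToken` is a hypothetical name, and treating "alphabetic" as ASCII letters only is an assumption.

```javascript
// Assumed rule: keep a token only when it is tagged as a word, is at
// least 3 characters long, and consists of ASCII letters only.
function keepToken(token) {
  return token.tag === 'word'
    && token.value.length >= 3
    && /^[A-Za-z]+$/.test(token.value);
}

const sample = [
  { value: 'created', tag: 'word' },
  { value: 'he', tag: 'word' },        // shorter than 3 characters
  { value: 'mp3', tag: 'word' },       // contains a digit
  { value: '.', tag: 'punctuation' },  // not a word token
];

const kept = sample.filter(keepToken).map((token) => token.value);
console.log(kept); // [ 'created' ]
```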

| Param | Type | Description |
| --- | --- | --- |
| text | String | Input text |
| filter | Array.<String> | List of custom stop words that replaces the default list. Pass false to skip stop word filtering. |
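The three documented behaviours of `filter` (omitted, custom array, `false`) can be pictured with a small stand-in. Everything here is an assumption made for illustration: `filterStopwords` is a hypothetical function and `DEFAULT_STOPWORDS` is a made-up short list, not the package's real defaults.

```javascript
// Hypothetical stand-in for the package's built-in stop word list.
const DEFAULT_STOPWORDS = ['these', 'and', 'they', 'are'];

// filter omitted       -> use the default stop words
// filter is an array   -> the custom list replaces the defaults
// filter === false     -> skip stop word filtering
function filterStopwords(words, filter) {
  if (filter === false) return words.slice();
  const stopwords = new Set(Array.isArray(filter) ? filter : DEFAULT_STOPWORDS);
  return words.filter((word) => !stopwords.has(word));
}

const input = ['created', 'these', 'categories', 'and', 'they', 'are', 'better'];

console.log(filterStopwords(input));             // [ 'created', 'categories', 'better' ]
console.log(filterStopwords(input, ['better'])); // every word except 'better'
console.log(filterStopwords(input, false));      // all seven words, unfiltered
```

With the real package, the equivalent calls would presumably be `extract(text)`, `extract(text, ['better'])` and `extract(text, false)`.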

Annotation Specification

| Annotation | Name | Example |
| --- | --- | --- |
| NN | Noun | dog man |
| NNS | Plural noun | dogs men |
| NNP | Proper noun | London Alex |
| NNPS | Plural proper noun | Smiths |
| VB | Base form verb | be |
| VBP | Present form verb | throw |
| VBZ | Present form (3rd person) | throws |
| VBG | Gerund form verb | throwing |
| VBD | Past tense verb | threw |
| VBN | Past participle verb | thrown |
| MD | Modal verb | can shall will may must ought |
| JJ | Adjective | big fast |
| JJR | Comparative adjective | bigger |
| JJS | Superlative adjective | biggest |
| RB | Adverb | not quickly closely |
| RBR | Comparative adverb | less-closely faster |
| RBS | Superlative adverb | fastest |
| DT | Determiner | the a some both |
| PDT | Predeterminer | all quite |
| PRP | Personal pronoun | I you he she |
| PRP$ | Possessive pronoun | my your his her |
| POS | Possessive ending | 's |
| IN | Preposition | of by in |
| PR | Particle | up off |
| TO | to | to |
| WDT | Wh-determiner | which that whatever whichever |
| WP | Wh-pronoun | who whoever whom what |
| WP$ | Wh-possessive | whose |
| WRB | Wh-adverb | how where |
| EX | Expletive there | there |
| CC | Coordinating conjunction | & and nor or |
| CD | Cardinal numbers | 1 7 77 one |
| LS | List item marker | 1 B C One |
| UH | Interjection | ah oh oops |
| FW | Foreign words | viva mon toujours |
| , | Comma | , |
| : | Mid-sent punct. | : ; ... |
| . | Sent-final punct. | . ! ? |
| ( | Left parenthesis | ( { [ |
| ) | Right parenthesis | ) } ] |
| # | Pound sign | # |
| $ | Currency symbols | $ £ ¥ |
| SYM | Other symbols | + * / < > |
| EM | Emojis & emoticons | :) |
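Because related tags share a prefix (VB for verbs, NN for nouns, JJ for adjectives, RB for adverbs), the `pos` field of each result can be collapsed into coarser groups. The following is a rough sketch only: `coarseCategory` and `groupByCategory` are hypothetical helpers, and the sample entries are trimmed copies of the Usage output.

```javascript
// Hypothetical helper: map a tag from the table above to a coarse
// category by prefix. Tags such as MD or WRB fall through to 'other'.
function coarseCategory(pos) {
  if (pos.startsWith('VB')) return 'verb';
  if (pos.startsWith('NN')) return 'noun';
  if (pos.startsWith('JJ')) return 'adjective';
  if (pos.startsWith('RB')) return 'adverb';
  return 'other';
}

// Group the lemmas of the extracted entries by coarse category.
function groupByCategory(entries) {
  const groups = {};
  for (const entry of entries) {
    const key = coarseCategory(entry.pos);
    (groups[key] = groups[key] || []).push(entry.lemma);
  }
  return groups;
}

const extracted = [
  { lemma: 'create', pos: 'VBD' },
  { lemma: 'category', pos: 'NNS' },
  { lemma: 'good', pos: 'JJR' },
];

console.log(groupByCategory(extracted));
// { verb: [ 'create' ], noun: [ 'category' ], adjective: [ 'good' ] }
```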

Package last updated on 09 Jun 2019
