lexemic

Tools for working with human language data.

Version: 0.0.8

Version published: 13 years ago

Maintainers: 1

Created: 13 years ago

Requirements

Node.js (v0.10.17+)
npm (v1.3.9+)

Install

npm install lexemic -g

Description

Tools for working with human language data.

Features

Sentiment analysis
Stemming
Tokenization
Statistical analysis

Usage

$ lexemic [command] [implementation] [target...]

NOTE: The target may be an inline string or the path to a text file encoded as UTF-8.

Sentiment analysis

$ lexemic sentiment "I am mad at you." # => {  
                                             :score -1,  
                                             :comparative -0.25,  
                                             :positive {  
                                                        :score 0,  
                                                        :comparative 0,  
                                                        :words ()  
                                                       },  
                                             :negative {  
                                                        :score 1,  
                                                        :comparative 0.25,  
                                                        :words (mad)  
                                                       }  
                                             }

Sentiment analysis attempts to determine the affective state of the speaker or writer. The default implementation returns an EDN map of this analysis. The :score represents the number of emotive words in the text, while the :comparative rates the occurrence of these words with regard to the length of the text. The nested values (i.e. those under :positive and :negative) list the matched :words and take into account only their respective affectivity. The top-level values incorporate both affective states: they are negative for texts with an overall negative affect and positive for texts with an overall positive affect.
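As a rough illustration of how the fields in such a map relate to one another, here is a minimal Python sketch. It is not lexemic's implementation: the word lists, the regex tokenizer and the function name `sentiment` are all assumptions made for the example, and the token count used as the denominator may differ from what lexemic uses.

```python
import re

# Hypothetical word lists; lexemic's actual lexicon is not documented here.
POSITIVE = {"good", "happy", "love"}
NEGATIVE = {"bad", "mad", "sad"}


def sentiment(text):
    """Count matched emotive words and scale the counts by text length."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    length = len(tokens) or 1  # avoid dividing by zero on empty input

    pos_words = [t for t in tokens if t in POSITIVE]
    neg_words = [t for t in tokens if t in NEGATIVE]
    score = len(pos_words) - len(neg_words)

    return {
        "score": score,
        "comparative": score / length,
        "positive": {
            "score": len(pos_words),
            "comparative": len(pos_words) / length,
            "words": pos_words,
        },
        "negative": {
            "score": len(neg_words),
            "comparative": len(neg_words) / length,
            "words": neg_words,
        },
    }


result = sentiment("This makes me mad.")
print(result["score"], result["comparative"])  # -1 -0.25
```

With four tokens and one negative match, the top-level score is -1 and the comparative is -0.25, while the nested :negative values stay positive because they count only their own matches.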

Stemming

$ lexemic stem "My education has been educational" # => #{"My"
                                                          "educ"
                                                          "ha"
                                                          "been"}

Stemming attempts to reduce inflected words to their stem. This is useful for reducing your working set of words and expanding possible search matches. The default implementation uses the Porter algorithm, though you may explicitly specify an implementation: -p and -porter for the Porter algorithm (standard and gentle) or -l and -lancaster for the Lancaster algorithm (much more aggressive). This command returns an EDN set of the reduced working set.
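To show why stemming shrinks the working set, the following Python sketch uses a deliberately tiny suffix stripper. It is a toy and not the Porter or Lancaster algorithm: the suffix list, the minimum stem length and the names `toy_stem` and `stem_set` are assumptions chosen only so that the example sentence collapses in a similar way.

```python
# Toy suffix rules, checked longest first. Real stemmers apply many more
# rules with conditions on the remaining stem.
SUFFIXES = ("ational", "ation", "s")


def toy_stem(word):
    """Strip the first matching suffix if at least two characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def stem_set(text):
    """Return the set of stems, so words sharing a stem collapse into one."""
    return {toy_stem(word) for word in text.split()}


stems = stem_set("My education has been educational")
print(sorted(stems))  # ['My', 'been', 'educ', 'ha']
```

Five input words become four stems because "education" and "educational" both reduce to "educ", which is the effect that expands search matches.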

Tokenization

$ lexemic tokenize "This is a sample sentence." # => ["This"
                                                      "is"
                                                      "a"
                                                      "sample"
                                                      "sentence"
                                                      "."]

Tokenization attempts to break a text into its desired constituent parts, typically words. This command returns an EDN vector of the resulting tokens.
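A word-and-punctuation tokenizer of this kind can be approximated with a single regular expression. The Python sketch below is an assumption about the general approach, not lexemic's tokenizer; the pattern and the function name `tokenize` are illustrative.

```python
import re


def tokenize(text):
    """Split text into word tokens and standalone punctuation marks."""
    # \w+ matches a run of word characters; [^\w\s] matches one
    # punctuation character that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = tokenize("This is a sample sentence.")
print(tokens)  # ['This', 'is', 'a', 'sample', 'sentence', '.']
```

Keeping punctuation as separate tokens preserves sentence boundaries for later processing; a tokenizer aimed only at word counts might discard it instead.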

Statistical analysis

Levenshtein distance

$ lexemic distance "This is a sentence." "This is a similar sentence." # => 8

Levenshtein distance measures the distance between two strings of text, often two documents. Informally, this distance is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other. In the example above, 8 represents the insertion of 1 space and the 7 letters of the word 'similar'. This command returns the distance as an integer.
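The definition above corresponds to a standard dynamic-programming calculation. The Python sketch below is one common way to compute it and is not taken from lexemic's source; the function name `levenshtein` is an assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


distance = levenshtein("This is a sentence.", "This is a similar sentence.")
print(distance)  # 8
```

Only two rows of the table are kept at a time, so memory grows with the length of the second string rather than with the product of both lengths.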

Issues

If you need help, find a bug, want to request a feature or want to contribute, please create an issue.

Copyright

See LICENSE.txt for details.

Package last updated on 30 Aug 2013
