markovian-nlp


NLP tools generate Markov sentences & models

Version 2.1.1 | Source: npm | Maintainers: 1 | Weekly downloads: 6 (-25%)

Setup

Installation

With npm installed, run this command in a terminal:

npm i markovian-nlp

Usage

Module import

Declare method imports at the top of each JavaScript file in which they will be used.

ES2015

import {
  ngramsDistribution,
  sentences,
} from 'markovian-nlp';

CommonJS

const {
  ngramsDistribution,
  sentences,
} = require('markovian-nlp');

Glossary

Learn more about computational linguistics and natural language processing (NLP) on Wikipedia.

The following terms are used in the API documentation:

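| term          | description                     |
| ------------- | ------------------------------- |
| bigram        | 2-gram sequence                 |
| deterministic | repeatable, non-random          |
| endgram       | final gram in a sequence        |
| n-gram        | contiguous gram (word) sequence |
| startgram     | first gram in a sequence        |
| unigram       | 1-gram sequence                 |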
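
To make the glossary terms concrete, here is a minimal sketch in plain JavaScript. The `ngrams` helper is hypothetical and written only for illustration; it is not part of markovian-nlp, and it assumes the text has already been split into words.

```javascript
// Hypothetical helper (not part of markovian-nlp): list every contiguous
// run of n words in an already-tokenized word list.
function ngrams(words, n) {
  const result = [];
  for (let i = 0; i + n <= words.length; i += 1) {
    result.push(words.slice(i, i + n).join(' '));
  }
  return result;
}

const words = 'birds have featured in culture'.split(' ');

console.log(ngrams(words, 1)); // unigrams: one entry per word
console.log(ngrams(words, 2)); // bigrams: 'birds have', 'have featured', ...
console.log(words[0]); // startgram: 'birds'
console.log(words[words.length - 1]); // endgram: 'culture'
```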

API

ngramsDistribution(document)

View the n-grams distribution of text.

Potential applications: Markov models

Example

ngramsDistribution('birds have featured in culture and art since prehistoric times');

Output

{
  and: { _end: 0, _start: 0, art: 1 },
  art: { _end: 0, _start: 0, since: 1 },
  birds: { _end: 0, _start: 1, have: 1 },
  culture: { _end: 0, _start: 0, and: 1 },
  featured: { _end: 0, _start: 0, in: 1 },
  have: { _end: 0, _start: 0, featured: 1 },
  in: { _end: 0, _start: 0, culture: 1 },
  prehistoric: { _end: 0, _start: 0, times: 1 },
  since: { _end: 0, _start: 0, prehistoric: 1 },
  times: { _end: 1, _start: 0 },
}

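Each number is a count of occurrences: _start and _end count how often the unigram begins or ends a sequence, and every other key counts how often that word directly follows the unigram.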

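| startgram | endgram | bigrams                                                     |
| --------- | ------- | ----------------------------------------------------------- |
| "birds"   | "times" | all remaining keys ("have featured", "featured in", etc.)   |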

Input

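| user-defined parameter | type   | implements           | intermediate transformations           |
| ---------------------- | ------ | -------------------- | -------------------------------------- |
| document               | String | compromise(document) | normalization, rule-based text parsing |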

Return value

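| type   | description                                                               |
| ------ | ------------------------------------------------------------------------- |
| Object | distributions of unigrams to startgrams, endgrams, and following bigrams  |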
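
As a rough illustration of how an object of this shape can be built, the sketch below counts starts, ends, and following words in plain JavaScript. `naiveDistribution` is a hypothetical stand-in, not the library's implementation: it only lowercases and splits on whitespace and treats the whole text as a single sequence, whereas markovian-nlp parses text with compromise, so real output may differ on punctuated or multi-sentence input.

```javascript
// Hypothetical stand-in for illustration only (not markovian-nlp's code).
// Assumption: the whole text is one sequence of whitespace-separated words.
function naiveDistribution(text) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  // Prototype-free objects, so words like "constructor" are safe as keys.
  const distribution = Object.create(null);
  words.forEach((word, i) => {
    if (!distribution[word]) {
      distribution[word] = Object.assign(Object.create(null), { _end: 0, _start: 0 });
    }
    const entry = distribution[word];
    if (i === 0) entry._start += 1; // first word of the sequence
    if (i === words.length - 1) {
      entry._end += 1; // last word of the sequence
    } else {
      const next = words[i + 1];
      entry[next] = (entry[next] || 0) + 1; // count the bigram "word next"
    }
  });
  return distribution;
}

const distribution = naiveDistribution(
  'birds have featured in culture and art since prehistoric times'
);
console.log(distribution.birds._start, distribution.birds.have); // 1 1
console.log(distribution.times._end); // 1
```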
Signature
// pseudocode (does not run)
ngramsDistribution(document) => ({
  ...unigrams: {
    ...{ ...bigram: bigramsDistribution },
    _end: endgramsDistribution,
    _start: startgramsDistribution,
  },
});
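
Since Markov models are named as a potential application, one possible next step is to turn the counts for a unigram into next-word probabilities. The sketch below is an assumption about how a caller might post-process the returned object; `transitionProbabilities` and the sample entry are hypothetical, and this version deliberately ignores the _start and _end counters.

```javascript
// Hypothetical post-processing helper (not part of markovian-nlp):
// convert one unigram's follower counts into probabilities.
function transitionProbabilities(entry) {
  const followers = Object.entries(entry).filter(
    ([key]) => key !== '_start' && key !== '_end'
  );
  const total = followers.reduce((sum, [, count]) => sum + count, 0);
  return followers.map(([word, count]) => ({
    word,
    probability: total === 0 ? 0 : count / total,
  }));
}

// Made-up counts for illustration: "have" followed this unigram 3 times, "sing" once.
const sampleEntry = { _end: 1, _start: 0, have: 3, sing: 1 };
console.log(transitionProbabilities(sampleEntry));
// [ { word: 'have', probability: 0.75 }, { word: 'sing', probability: 0.25 } ]
```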

sentences(document)(seed)

sentences({ document[, count][, seed] })

Generate text sentences from a Markov process.

Potential applications: Natural language generation

Examples

One sentence

(example document source)

const document = "That there is constant succession and flux of ideas in our minds..."
const oneSentence = sentences(document);

Nondeterministic

oneSentence();
// output: "i have observed in the chief yet we might be able by a one
//   would promote introduce a contrary habit"

oneSentence();
// output: "this is not angry chiding or so easy to them from running away
//   with our thoughts by a proper and inure them

Deterministic

Providing a seed produces a repeatable result:

oneSentence(1);
// deterministic output: "i would promote introduce a constant succession and hindering the path
//   and application getting the train they cannot keep their roving i would sooner reconcile
//   and contemplative part of the way to direct them"

Multiple sentences

(example document source)

sentences({
  document,
  count: 3,
  seed: 1,
});

// output: [
//   'i would promote introduce a constant succession and hindering the path and application getting the train they cannot keep their roving i would sooner reconcile and contemplative part of the way to direct them',
//   'he that train they seem to be glad to be done as may be avoided of our thoughts close to our thoughts by a proper and inure them',
//   'this wandering of attention and yet for ought i know this wandering thoughts i would promote introduce a contrary habit',
// ]

Input

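| user-defined parameter     | type   | optional | default value | implements           | description                                                                                          |
| -------------------------- | ------ | -------- | ------------- | -------------------- | ---------------------------------------------------------------------------------------------------- |
| document, options.document | String | false    |               | compromise(document) | Text.                                                                                                |
| seed, options.seed         | Number | true     | undefined     | Chance(seed)         | Leave undefined (default) for nondeterministic results, or specify seed for deterministic results.   |
| options                    | Object | true     |               |                      |                                                                                                      |
| options.count              | Number | true     | 1             |                      | Number of sentences to output.                                                                       |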

Return value

| type              | description         |
| ----------------- | ------------------- |
| Array[Strings...] | generated sentences |
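
To show why a seed makes the output repeatable, here is a small sketch of a seeded random walk over a distribution object shaped like the ngramsDistribution output. Everything in it is an assumption made for illustration: `seededRandom` is a basic linear congruential generator (the library's table lists Chance for seeding), `walk` and `weightedPick` are hypothetical helpers, and the tiny model is made up. It is not how markovian-nlp generates sentences internally.

```javascript
// Hypothetical seeded generator: a small linear congruential generator.
function seededRandom(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

// Pick a key from non-empty [key, weight] pairs in proportion to its weight.
function weightedPick(pairs, random) {
  const total = pairs.reduce((sum, [, weight]) => sum + weight, 0);
  let threshold = random() * total;
  for (const [key, weight] of pairs) {
    threshold -= weight;
    if (threshold < 0) return key;
  }
  return pairs[pairs.length - 1][0];
}

// Walk the model from a startgram until an end is drawn or maxWords is reached.
function walk(distribution, seed, maxWords = 20) {
  const random = seed === undefined ? Math.random : seededRandom(seed);
  const starts = Object.entries(distribution)
    .map(([word, entry]) => [word, entry._start])
    .filter(([, weight]) => weight > 0);
  if (starts.length === 0) return '';
  const words = [weightedPick(starts, random)];
  while (words.length < maxWords) {
    const entry = distribution[words[words.length - 1]];
    if (!entry) break;
    // Candidates: each following word, plus null meaning "end the sentence".
    const options = Object.entries(entry).filter(
      ([key, weight]) => key !== '_start' && key !== '_end' && weight > 0
    );
    if (entry._end > 0) options.push([null, entry._end]);
    if (options.length === 0) break;
    const next = weightedPick(options, random);
    if (next === null) break;
    words.push(next);
  }
  return words.join(' ');
}

// Made-up model for illustration.
const model = {
  birds: { _end: 0, _start: 1, have: 1, sing: 1 },
  have: { _end: 0, _start: 0, wings: 1 },
  sing: { _end: 1, _start: 0 },
  wings: { _end: 1, _start: 0 },
};

console.log(walk(model, 1) === walk(model, 1)); // true: same seed, same walk
console.log(walk(model)); // unseeded: may differ between runs
```

With this model, every walk yields either "birds have wings" or "birds sing"; the seed only fixes which of the two is drawn.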

Keywords

computational linguistics

Package last updated on 05 Oct 2018
