distributed-ngram

npm i --save js-spark

Version: 1.0.2 (latest on npm), 1 maintainer

WAT

Simply put: predict the next word the user will write.

HOWTO

installation

    git clone git@github.com:syzer/distributedNgram.git && cd distributedNgram
    npm install

The file nGram.js offers a more compact version of the code:

    npm start

testing basic distributed task

var jsSpark = require('js-spark')({workers: 16});
var task = jsSpark.jsSpark;
var q = jsSpark.q;

task([20, 30, 40, 50])
    // this is executed on the client side
    .map(function addOne(num) {
        return num + 1;
    })
    .reduce(function sumUp(sum, num) {
        return sum + num;
    })
    .run()
    .then(function(data) {
        // this is executed back on the server
        console.log('i finished calculating', data);
    })

tests

    npm test

Tasks

clone https://github.com/syzer/distributedNgram.git

./index.js

load:

  • dracula

  • lodash

  • load helpers

(gist)

// helpers ./lib/index.js

make function prepare()

// remove special characters and lower-case the text
function prepare(str){}
prepare('“Listen to them, the children of the night. What music they make!”')
//=>"listen to them the children of the night what music they make"
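One way to fill in prepare(), as a minimal sketch. It assumes plain ASCII English text (accented letters are dropped along with punctuation), and the regular expressions are an illustration, not the repository's own implementation.

```javascript
// Sketch of prepare(): lower-case the text, replace every character that is
// not a letter, digit or whitespace with a space, then collapse whitespace.
// Assumption: ASCII-only input is acceptable for this exercise.
function prepare(str) {
    return str
        .toLowerCase()
        .replace(/[^a-z0-9\s]/g, ' ')
        .replace(/\s+/g, ' ')
        .trim();
}
```

Note that an apostrophe is treated like any other special character, so "don't" comes out as the two words "don" and "t".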

(gist)

./index.js

make bigramText()

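bigramText("to listen to them the children of the night what music they make");
//=> {to: {listen: 1, them: 1}, listen: {to: 1}, them: {the: 1}, the: {children: 1, night: 1}, ...}
function bigramText(str) {
    // bigramArray is not shown here; as a reduce callback it is called with (acc, word, index, arr)
    var arr = str.split(' ');
    return arr.reduce(bigramArray, {});
}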
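A possible complete version of bigramText(), sketched with a plain loop instead of the bigramArray reducer, which is not shown in this README. It assumes the input has already been through prepare(), so words are separated by single spaces. The count maps are created with Object.create(null) so that ordinary words such as "constructor" cannot collide with properties inherited from Object.prototype.

```javascript
// Sketch: count how often each word is followed by each other word.
// Returns a map of the form { word: { nextWord: hits } }.
function bigramText(str) {
    var words = str.split(' ').filter(Boolean);
    var counts = Object.create(null); // no inherited keys
    for (var i = 0; i < words.length - 1; i++) {
        var current = words[i];
        var next = words[i + 1];
        if (!counts[current]) {
            counts[current] = Object.create(null);
        }
        counts[current][next] = (counts[current][next] || 0) + 1;
    }
    return counts;
}

var counts = bigramText('to listen to them');
// counts.to.listen === 1 and counts.to.them === 1
```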

(gist)

./index.js

function mergeSmall()

  • create 2 tasks ch01, and ch02

  • use tasks to bigram those chapters

  • reduce response with _.merge
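A caveat for the merge step: when the same word pair occurs in both chapters, lodash's _.merge lets the later value overwrite the earlier one instead of adding the two counts. If the hits should add up, a summing merge is needed. The sketch below shows one, written without lodash; mergeCounts is a hypothetical helper name, and the chapter objects are made-up sample data rather than real output for ch01 and ch02.

```javascript
// Sketch: merge two bigram maps of the form { word: { nextWord: hits } },
// adding the hits of pairs that occur in both.
function mergeCounts(a, b) {
    var merged = Object.create(null);
    [a, b].forEach(function (counts) {
        Object.keys(counts).forEach(function (word) {
            if (!merged[word]) {
                merged[word] = Object.create(null);
            }
            Object.keys(counts[word]).forEach(function (next) {
                merged[word][next] = (merged[word][next] || 0) + counts[word][next];
            });
        });
    });
    return merged;
}

// Made-up results standing in for the two chapter tasks.
var ch01 = {the: {count: 2, night: 1}};
var ch02 = {the: {count: 1, castle: 3}, to: {listen: 1}};
var merged = mergeCounts(ch01, ch02);
// merged.the.count === 3, merged.the.castle === 3, merged.to.listen === 1
```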

(gist)

./index.js

function mergeBig(texts)

  • load [ch1, ch2, ch3] or texts

  • make distinct tasks to bigram this text

  • reduce with _.mergeObjectsInArr

  • cache result

  • return result
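The steps above can be sketched locally with promises standing in for distributed tasks. This is not the js-spark version: createMergeBig is a hypothetical factory, the bigramText and mergeCounts arguments are whatever implementations you wrote for the earlier tasks, and the cache is a single in-memory variable that ignores which texts were passed, which is only reasonable for a fixed corpus. _.mergeObjectsInArr is not a built-in lodash function as far as I know, so the reduce below plays its role.

```javascript
// Sketch: one promise per text, merge all results, cache the merged map.
// Assumptions: bigramText(text) returns a bigram map and
// mergeCounts(a, b) returns the combination of two such maps.
function createMergeBig(bigramText, mergeCounts) {
    var cache = null; // naive cache: filled once, never invalidated

    return function mergeBig(texts) {
        if (cache) {
            return Promise.resolve(cache);
        }
        var tasks = texts.map(function (text) {
            // Stand-in for a task sent to a client.
            return Promise.resolve(text).then(bigramText);
        });
        return Promise.all(tasks).then(function (results) {
            cache = results.reduce(function (acc, result) {
                return mergeCounts(acc, result);
            }, {});
            return cache;
        });
    };
}
```

Passing the two functions in keeps the sketch independent of how bigramming and merging are implemented.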

(gist)

./index.js

function predict(word)

  • load appropriate key/word from cache

  • calc total hits

  • sort all hits in order (may use the helper function objToSortedArr(obj))

  • calc frequency/probability of next word
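A minimal sketch of these steps. Unlike the predict(word) signature above, this version takes the cached bigram map as an explicit first argument so that it runs on its own; that parameter, the shape of the returned entries, and the body of objToSortedArr are assumptions made for illustration.

```javascript
// Turn { nextWord: hits } into [{word, hits}], most frequent first.
function objToSortedArr(obj) {
    return Object.keys(obj)
        .map(function (key) {
            return {word: key, hits: obj[key]};
        })
        .sort(function (a, b) {
            return b.hits - a.hits;
        });
}

// Sketch: candidates for the word following `word`, with probabilities.
function predict(cache, word) {
    if (!Object.prototype.hasOwnProperty.call(cache, word)) {
        return []; // word never seen in the training text
    }
    var sorted = objToSortedArr(cache[word]);
    var total = sorted.reduce(function (sum, entry) {
        return sum + entry.hits;
    }, 0);
    return sorted.map(function (entry) {
        return {word: entry.word, hits: entry.hits, probability: entry.hits / total};
    });
}

var predictions = predict({the: {children: 1, night: 3}}, 'the');
// predictions[0] is {word: 'night', hits: 3, probability: 0.75}
```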

(gist)

./index.js

function train(fileName, splitter)

  • load file

  • prepare

  • use splitter(string) to create separate tasks

  • calculate tasks on clients using mergeBig()
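One detail worth watching in the splitter: if the text is cut into disjoint chunks, the bigram that straddles each cut is lost. A possible splitter, sketched below, avoids this by letting consecutive chunks share exactly one word. splitByWords and its chunkSize parameter are hypothetical names, and the input is assumed to be prepared text with words separated by spaces.

```javascript
// Sketch: build a splitter that cuts text into chunks of `chunkSize` word
// pairs each. Consecutive chunks overlap by one word, so every bigram of
// the original text lands in exactly one chunk.
function splitByWords(chunkSize) {
    if (!(chunkSize >= 1)) {
        throw new RangeError('chunkSize must be at least 1');
    }
    return function splitter(str) {
        var words = str.split(' ').filter(Boolean);
        var chunks = [];
        for (var start = 0; start < words.length - 1; start += chunkSize) {
            chunks.push(words.slice(start, start + chunkSize + 1).join(' '));
        }
        return chunks;
    };
}

var chunks = splitByWords(2)('a b c d e');
// chunks is ['a b c', 'c d e']
```

Each chunk can then become one task, and the per-chunk results can be merged by summing without counting any pair twice.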

TODO

[ ] git checkout
[ ] js-spark adventure

Keywords

Machine learning

Package last updated on 04 Aug 2015
