Socket
Socket
Sign inDemoInstall

k-means-plus

Package Overview
Dependencies
0
Maintainers
1
Versions
1
Alerts
File Explorer

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

    k-means-plus

K-means++ clustering a classification of data, so that points assigned to the same cluster are similar (in some sense). It is identical to the K-means algorithm, except for the selection of initial conditions.


Version published
Weekly downloads
1
decreased by-94.12%
Maintainers
1
Install size
154 kB
Created
Weekly downloads
 

Readme

Source

major-colors

K-means++ clustering a classification of data, so that points assigned to the same cluster are similar (in some sense). It is identical to the K-means algorithm, except for the careful selection of initial conditions. It is a way of avoiding the sometimes poor clusterings found by the standard k-means algorithm. k-means++ variance address this major theoretic shortcoming and guarantee an approximation ratio O(log k) in expectation (over the randomness of the algorithm), where k is the number of clusters used.

Install

npm i k-means-plus
yarn add k-means-plus

Usage

import KMeansCluster from 'k-means-plus';
const cluster = new KMeansCluster({ distance, maximumIterations, convergedFn });
const results = cluster.cluster(vectors, k);
import KMeansCluster from 'k-means-plus';

const data = [
  {'company': 'Microsoft' , 'size': 91259, 'revenue': 60420},
  {'company': 'IBM' , 'size': 400000, 'revenue': 98787},
  {'company': 'Skype' , 'size': 700, 'revenue': 716},
  {'company': 'SAP' , 'size': 48000, 'revenue': 11567},
  {'company': 'Yahoo!' , 'size': 14000 , 'revenue': 6426 },
  {'company': 'eBay' , 'size': 15000, 'revenue': 8700},
];

// Create the data 2D-array (vectors) describing the data
let vectors = [];
for (let i = 0 ; i < data.length ; i++) {
  vectors.push([ data[i]['size'] , data[i]['revenue']]);
}

const k = 4;

const cluster = new KMeansCluster();

const {
  model: { observations, centroids, assignments },
  iterations,
  durationMs
} = cluster.cluster(vectors, k);

Constructor

  • distance - (vector1: [number], vector2: [number]) => number distance function used for clustering between two vectors.
  • Default: euclidean distance
  • maximumIterations - number maximum iterations of clustering.
  • Default: 200
  • convergedFn - (centroids1: [[number]], centroids2: [[number]]) => boolean determine if two consecutive set of centroids are converged.
  • Default: the clusters are converged if the cluster assignments are the same.

Inputs

  • vectors - [[number]] point vectors to cluster
  • k - number of final clusters

Outputs

type result = {
  model: {
    observations: [[number]], // the original vectors to cluster
    centroids: [[number]], // vectors of the centers of cluster
    assignments: [number] // mapping from index of original vector to the index of cluter center it belongs to
  },
  iterations: number, // number of iterations ran before converging
  durationMs: number // the duration of the algorithm
}

FAQs

Last updated on 11 Sep 2018

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc