k-means-plus

Dependencies

Maintainers

Versions

Alerts

File Explorer

Advanced tools

npm Scripts

Install Socket

Detect and block malicious and high-risk dependencies

Install

k-means-plus

K-means++ clustering a classification of data, so that points assigned to the same cluster are similar (in some sense). It is identical to the K-means algorithm, except for the selection of initial conditions.

1.0.0

latest

npm

Version published: 6 years ago

Weekly downloads: 1; decreased by-94.12%

Maintainers: 1

Install size: 154 kB

Created: 6 years ago

Weekly downloads

Readme

Source

major-colors

K-means++ clustering a classification of data, so that points assigned to the same cluster are similar (in some sense). It is identical to the K-means algorithm, except for the careful selection of initial conditions. It is a way of avoiding the sometimes poor clusterings found by the standard k-means algorithm. k-means++ variance address this major theoretic shortcoming and guarantee an approximation ratio O(log k) in expectation (over the randomness of the algorithm), where k is the number of clusters used.

Install

npm i k-means-plus
yarn add k-means-plus

Usage

import KMeansCluster from 'k-means-plus';
const cluster = new KMeansCluster({ distance, maximumIterations, convergedFn });
const results = cluster.cluster(vectors, k);

import KMeansCluster from 'k-means-plus';

const data = [
  {'company': 'Microsoft' , 'size': 91259, 'revenue': 60420},
  {'company': 'IBM' , 'size': 400000, 'revenue': 98787},
  {'company': 'Skype' , 'size': 700, 'revenue': 716},
  {'company': 'SAP' , 'size': 48000, 'revenue': 11567},
  {'company': 'Yahoo!' , 'size': 14000 , 'revenue': 6426 },
  {'company': 'eBay' , 'size': 15000, 'revenue': 8700},
];

// Create the data 2D-array (vectors) describing the data
let vectors = [];
for (let i = 0 ; i < data.length ; i++) {
  vectors.push([ data[i]['size'] , data[i]['revenue']]);
}

const k = 4;

const cluster = new KMeansCluster();

const {
  model: { observations, centroids, assignments },
  iterations,
  durationMs
} = cluster.cluster(vectors, k);

Constructor

distance - (vector1: [number], vector2: [number]) => number distance function used for clustering between two vectors.
Default: euclidean distance
maximumIterations - number maximum iterations of clustering.
Default: 200
convergedFn - (centroids1: [[number]], centroids2: [[number]]) => boolean determine if two consecutive set of centroids are converged.
Default: the clusters are converged if the cluster assignments are the same.

Inputs

vectors - [[number]] point vectors to cluster
k - number of final clusters

Outputs

type result = {
  model: {
    observations: [[number]], // the original vectors to cluster
    centroids: [[number]], // vectors of the centers of cluster
    assignments: [number] // mapping from index of original vector to the index of cluter center it belongs to
  },
  iterations: number, // number of iterations ran before converging
  durationMs: number // the duration of the algorithm
}

FAQs

What is k-means-plus?

Is k-means-plus popular?

Is k-means-plus well maintained?

Last updated on 11 Sep 2018

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

k-means-plus

major-colors

Install

Usage

Constructor

Inputs

Outputs

Related posts

The AI Advantage: Reshaping Cybersecurity in the Age of Autonomous Threats

UnitedHealth Group Discloses Protected Health Information Compromised for “Substantial Portion of People in America” in Recent Cyberattack