## What is skmeans?

The skmeans npm package is a simple and efficient implementation of the k-means clustering algorithm. It is used for partitioning a dataset into k distinct, non-overlapping subsets (clusters). The package is designed to be lightweight and easy to use, making it suitable for quick clustering tasks in JavaScript applications.
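
To make the partitioning idea concrete, here is a minimal sketch of the standard k-means (Lloyd's) iteration for one-dimensional data. It illustrates the general algorithm only and is not taken from the skmeans source; `lloyd1d` and its parameters are hypothetical names chosen for this example.

```javascript
// Illustrative sketch of Lloyd's k-means iteration for 1-D data.
// `lloyd1d` is a hypothetical helper, not part of the skmeans package.
function lloyd1d(data, initialCentroids, maxIterations = 100) {
  let centroids = initialCentroids.slice();
  let idxs = [];
  let it = 0;
  for (; it < maxIterations; it++) {
    // Assignment step: each value goes to its nearest centroid.
    const next = data.map((x) => {
      let best = 0;
      for (let c = 1; c < centroids.length; c++) {
        if (Math.abs(x - centroids[c]) < Math.abs(x - centroids[best])) best = c;
      }
      return best;
    });
    // Stop once no assignment changes between passes.
    const changed = next.some((v, i) => v !== idxs[i]);
    idxs = next;
    if (!changed) break;
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = data.filter((_, i) => idxs[i] === c);
      return members.length
        ? members.reduce((a, b) => a + b, 0) / members.length
        : old; // keep an empty cluster's centroid where it was
    });
  }
  return { it, centroids, idxs };
}

const out = lloyd1d([1, 2, 3, 11, 12, 13], [0, 10]);
console.log(out.centroids); // [ 2, 12 ]
```

Starting from the guesses 0 and 10, the centroids settle on the means of the two obvious groups after a single update pass.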

## What are skmeans's main functionalities?

### Basic k-means clustering

This feature allows you to perform basic k-means clustering on a dataset. The code sample demonstrates how to cluster a simple 2D dataset into 2 clusters.

```
const skmeans = require('skmeans');
const data = [[1, 2], [2, 3], [3, 4], [8, 9], [9, 10], [10, 11]];
const k = 2;
const result = skmeans(data, k);
console.log(result);
```

### Custom initialization

This feature allows you to specify custom initial centroids for the k-means algorithm. The code sample demonstrates how to initialize the centroids manually.

```
const skmeans = require('skmeans');
const data = [[1, 2], [2, 3], [3, 4], [8, 9], [9, 10], [10, 11]];
const k = 2;
const initialCentroids = [[1, 2], [8, 9]];
const result = skmeans(data, k, initialCentroids);
console.log(result);
```

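### Custom distance function

This feature allows you to supply your own distance function as the fifth argument. The function takes two points and returns a scalar number. The code sample demonstrates clustering 2D points with Manhattan distance while leaving the centroids and iterations arguments at their defaults.

```
const skmeans = require('skmeans');
const data = [[1, 2], [2, 3], [3, 4], [8, 9], [9, 10], [10, 11]];
const k = 2;
// Manhattan distance between two points given as arrays
const manhattan = (p, q) => p.reduce((sum, v, i) => sum + Math.abs(v - q[i]), 0);
// Pass null for centroids and iterations to keep their defaults
const result = skmeans(data, k, null, null, manhattan);
console.log(result);
```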

## Other packages similar to skmeans

### kmeans-js

kmeans-js is another JavaScript implementation of the k-means clustering algorithm. It offers similar functionality to skmeans but includes additional features like support for different distance metrics and more advanced initialization methods. It is slightly more complex but provides more flexibility for advanced users.

### ml-kmeans

ml-kmeans is part of the machine learning library 'ml' and provides a robust implementation of the k-means algorithm. It is designed to work seamlessly with other machine learning tools in the 'ml' ecosystem, making it a good choice for more comprehensive machine learning projects. It offers more advanced options and better integration with other machine learning algorithms compared to skmeans.

### simple-statistics

simple-statistics is a library that provides a wide range of statistical tools, including k-means clustering. While it is not solely focused on k-means, it offers a comprehensive set of statistical functions that can be useful for data analysis. It is a good choice if you need a broader set of statistical tools in addition to k-means clustering.

## skmeans

Super fast simple k-means and k-means++ implementation for unidimensional and multidimensional data. Works in Node.js and in the browser.

### Installation

```
npm install skmeans
```

### Usage

#### Node.js

```
const skmeans = require("skmeans");
var data = [1,12,13,4,25,21,22,3,14,5,11,2,23,24,15];
var res = skmeans(data,3);
```

#### Browser

```
<!doctype html>
<html>
<head>
<script src="skmeans.js"></script>
</head>
<body>
<script>
var data = [1,12,13,4,25,21,22,3,14,5,11,2,23,24,15];
var res = skmeans(data,3);
console.log(res);
</script>
</body>
</html>
```

### Results

```
{
  it: 2,
  k: 3,
  idxs: [ 2, 0, 0, 2, 1, 1, 1, 2, 0, 2, 0, 2, 1, 1, 0 ],
  centroids: [ 13, 23, 3 ]
}
```
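
One way to read this result: `idxs[i]` is the position in `centroids` of the cluster that `data[i]` belongs to. The sketch below groups the sample data by cluster, using the result shown above copied in as a literal so that it runs without the package. `groupByCluster` is a hypothetical helper, not part of skmeans.

```javascript
// Sample input and result copied from the sections above.
const data = [1, 12, 13, 4, 25, 21, 22, 3, 14, 5, 11, 2, 23, 24, 15];
const res = {
  it: 2,
  k: 3,
  idxs: [2, 0, 0, 2, 1, 1, 1, 2, 0, 2, 0, 2, 1, 1, 0],
  centroids: [13, 23, 3]
};

// Hypothetical helper: collect the values assigned to each centroid.
function groupByCluster(values, result) {
  const groups = result.centroids.map(() => []);
  result.idxs.forEach((clusterIndex, i) => groups[clusterIndex].push(values[i]));
  return groups;
}

const groups = groupByCluster(data, res);
groups.forEach((members, c) => {
  console.log(`centroid ${res.centroids[c]}:`, members);
});
```

Each centroid here is the mean of its group: 11 to 15 average to 13, 21 to 25 to 23, and 1 to 5 to 3.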

### API

#### skmeans(data,k,[centroids],[iterations],[distance function])

Calculates unidimensional and multidimensional k-means clustering on *data*. Parameters are:

- **data** Unidimensional or multidimensional array of values to be clustered. For unidimensional data, it takes the form of a simple array *[1,2,3.....,n]*. For multidimensional data, it takes an NxM array *[[1,2],[2,3]....[n,m]]*.
- **k** Number of clusters.
- **centroids** Optional. Initial centroid values. If not provided, the algorithm will try to choose appropriate ones. Alternative values can be:
  - **"kmrand"** Cluster initialization will be random, but with extra checking, so that no two initial centroids are equal.
  - **"kmpp"** The algorithm will use the k-means++ cluster initialization method.
- **iterations** Optional. Maximum number of iterations. If not provided, it will be set to 10000.
- **distance function** Optional. Custom distance function. Takes two points as arguments and returns a scalar number.
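
For context on the `"kmpp"` option, the following is a minimal sketch of k-means++ seeding for one-dimensional data: the first centroid is picked uniformly at random, and each later one is picked with probability proportional to its squared distance from the nearest centroid chosen so far. It illustrates the general technique and is not skmeans's internal code; `kmppSeed` and its `random` parameter are hypothetical.

```javascript
// Illustrative k-means++ seeding for 1-D data.
// `kmppSeed` is a hypothetical helper; `random` is injectable for testing.
function kmppSeed(data, k, random = Math.random) {
  // First centroid: uniform random choice.
  const centroids = [data[Math.floor(random() * data.length)]];
  while (centroids.length < k) {
    // Squared distance from each value to its nearest chosen centroid.
    const d2 = data.map((x) => Math.min(...centroids.map((c) => (x - c) ** 2)));
    const total = d2.reduce((a, b) => a + b, 0);
    if (total === 0) break; // every value already coincides with a centroid
    // Sample an index with probability d2[i] / total.
    let r = random() * total;
    let pick = d2.indexOf(Math.max(...d2)); // fallback against rounding error
    for (let i = 0; i < d2.length; i++) {
      r -= d2[i];
      if (r < 0) { pick = i; break; }
    }
    centroids.push(data[pick]);
  }
  return centroids;
}

console.log(kmppSeed([1, 2, 3, 11, 12, 13, 21, 22, 23], 3));
```

Because values that already coincide with a centroid have zero weight, the seeds are distinct whenever the data contains at least `k` distinct values.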

The function will return an object with the following data:

- **it** The number of iterations performed until the algorithm converged.
- **k** The number of clusters.
- **centroids** The value of each cluster centroid.
- **idxs** For each value of the data array, the index of its centroid.
- **test** Function to test the cluster membership of a new point.
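
Conceptually, testing the membership of a new point amounts to finding its nearest centroid. The sketch below shows that idea for scalar data with a pluggable distance function. It is an assumption about the general technique, not a description of what `res.test` actually returns; `nearestCentroid` is a hypothetical helper.

```javascript
// Hypothetical helper: find the centroid closest to a new scalar point.
// The default distance is the absolute difference, suitable for 1-D data.
function nearestCentroid(point, centroids, distance = (a, b) => Math.abs(a - b)) {
  let idx = 0;
  for (let c = 1; c < centroids.length; c++) {
    if (distance(point, centroids[c]) < distance(point, centroids[idx])) idx = c;
  }
  return { idx, centroid: centroids[idx] };
}

// Centroids taken from the sample result earlier in this README.
const centroids = [13, 23, 3];
console.log(nearestCentroid(6, centroids)); // { idx: 2, centroid: 3 }
```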

### Examples

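```
const skmeans = require("skmeans");
const data = [1,12,13,4,25,21,22,3,14,5,11,2,23,24,15];
// Default centroid initialization
let res = skmeans(data,3);
// Explicit initial centroids
res = skmeans(data,3,[1,5,9]);
// k-means++ initialization
res = skmeans(data,3,"kmpp");
// Default initialization, at most 10 iterations
res = skmeans(data,3,null,10);
// Custom distance function
res = skmeans(data,3,null,null,(x1,x2)=>Math.abs(x1-x2));
// Test the membership of a new point
res = skmeans(data,3,null,10);
res.test(6);
// Test the membership of a new point with a custom distance function
res.test(6,(x1,x2)=>Math.abs(x1-x2));
```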