github.com/james-bowman/sparse

Package sparse provides implementations of selected sparse matrix formats. Matrices and linear algebra are used extensively in scientific computing and machine learning applications. Large datasets are analysed as collections of vectors of numerical features, each representing some object. The nature of feature encoding schemes, especially those like "one hot", tends to lead to vectors with mostly zero values for many of the features. In text mining applications, where features are typically terms from a vocabulary, it is not uncommon for 99% of the elements within these vectors to contain zero values. Sparse matrix formats take advantage of this fact to optimise memory usage and processing performance by only storing and processing non-zero values.

Sparse matrix formats can broadly be divided into 3 main categories:

1. Creational - Sparse matrix formats suited to construction and building of matrices. Matrix formats in this category include DOK (Dictionary Of Keys) and COO (COOrdinate aka triplet).

2. Operational - Sparse matrix formats suited to arithmetic operations e.g. multiplication. Matrix formats in this category include CSR (Compressed Sparse Row aka CRS - Compressed Row Storage) and CSC (Compressed Sparse Column aka CCS - Compressed Column Storage).

3. Specialised - Specialised matrix formats suiting specific sparsity patterns. Matrix formats in this category include DIA (DIAgonal) for efficiently storing and manipulating symmetric diagonal matrices.

A common practice is to construct sparse matrices using a creational format e.g. DOK or COO and then convert them to an operational format e.g. CSR for arithmetic operations.

All sparse matrix implementations in this package implement the Matrix interface defined within the gonum/mat package and so may be used interchangeably with matrix types defined within the package e.g. mat.Dense, mat.VecDense, etc.
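To make the construct-then-convert practice concrete, the following is a minimal, dependency-free sketch of turning COO triplets into CSR arrays with a counting sort on the row index. The type and field names (cooSketch, csrSketch, rowIdx, indPtr and so on) are hypothetical and chosen for illustration; they are not the package's own types, and the package's real conversion may work differently.

```go
package main

import "fmt"

// cooSketch is a hypothetical COO (triplet) matrix: one (row, column, value)
// entry per non-zero element, in any order. Cheap to append to.
type cooSketch struct {
	rows, cols int
	rowIdx     []int
	colIdx     []int
	vals       []float64
}

// csrSketch is a hypothetical CSR matrix. The non-zeros of row r live in
// ind[indPtr[r]:indPtr[r+1]] (column indices) and data[indPtr[r]:indPtr[r+1]].
type csrSketch struct {
	rows, cols int
	indPtr     []int
	ind        []int
	data       []float64
}

// toCSR groups the triplets by row using a counting sort.
// Assumption: duplicate coordinates are kept as separate entries, not summed.
func (m *cooSketch) toCSR() *csrSketch {
	nnz := len(m.vals)
	indPtr := make([]int, m.rows+1)
	// Count the non-zeros in each row.
	for _, r := range m.rowIdx {
		indPtr[r+1]++
	}
	// A prefix sum turns the counts into row start offsets.
	for r := 0; r < m.rows; r++ {
		indPtr[r+1] += indPtr[r]
	}
	ind := make([]int, nnz)
	data := make([]float64, nnz)
	// next[r] is the next free slot for row r.
	next := make([]int, m.rows)
	copy(next, indPtr[:m.rows])
	for k, r := range m.rowIdx {
		p := next[r]
		ind[p] = m.colIdx[k]
		data[p] = m.vals[k]
		next[r]++
	}
	return &csrSketch{rows: m.rows, cols: m.cols, indPtr: indPtr, ind: ind, data: data}
}

func main() {
	// A 3x2 matrix with 5 at (0,0) and 7 at (2,1), entered out of row order.
	coo := &cooSketch{
		rows: 3, cols: 2,
		rowIdx: []int{2, 0},
		colIdx: []int{1, 0},
		vals:   []float64{7, 5},
	}
	csr := coo.toCSR()
	fmt.Println(csr.indPtr, csr.ind, csr.data) // prints [0 1 1 2] [0 1] [5 7]
}
```

The triplets can be appended in any order, which is what makes the creational format convenient, while the CSR arrays give contiguous access to each row, which is what arithmetic operations benefit from.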


Sparse matrix formats

Implementations of selected sparse matrix formats for linear algebra supporting scientific and machine learning applications. Compatible with the APIs in the Gonum package and interoperable with Gonum dense matrix types.

Overview

Machine learning applications typically model entities as vectors of numerical features so that they may be compared and analysed quantitatively. Typically, the majority of the elements in these vectors are zeros. In the case of text mining applications, each document within a corpus is represented as a vector and its features represent the vocabulary of unique words. A corpus of several thousand documents might utilise a vocabulary of hundreds of thousands (or perhaps even millions) of unique words, but each document will typically only contain a couple of hundred unique words. This means the proportion of non-zero values in the matrix might be 1% or less.

Sparse matrix formats capitalise on this premise by storing only the non-zero values, thereby reducing both the storage/memory requirements and the processing effort for manipulating the data.
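As a rough illustration of the saving, the sketch below compares dense storage with a CSR-style layout for a corpus of the size described above. The concrete figures (5,000 documents, a 200,000 word vocabulary, 200 unique words per document) are assumptions picked from within the ranges mentioned, and the byte counts assume 8-byte values and 8-byte indices; actual memory use depends on the implementation.

```go
package main

import "fmt"

// denseBytes returns the storage needed for a rows x cols matrix of float64
// values (8 bytes each), zeros included.
func denseBytes(rows, cols int64) int64 {
	return rows * cols * 8
}

// csrBytes estimates CSR storage: one 8-byte value and one 8-byte column
// index per non-zero, plus rows+1 row pointers of 8 bytes each.
func csrBytes(rows, nnz int64) int64 {
	return nnz*8 + nnz*8 + (rows+1)*8
}

func main() {
	// Assumed, illustrative corpus dimensions.
	var docs, vocab, wordsPerDoc int64 = 5000, 200000, 200

	nnz := docs * wordsPerDoc
	density := float64(nnz) / float64(docs*vocab)

	fmt.Printf("density: %.2f%%\n", density*100)             // density: 0.10%
	fmt.Printf("dense:   %d bytes\n", denseBytes(docs, vocab)) // dense:   8000000000 bytes
	fmt.Printf("CSR:     %d bytes\n", csrBytes(docs, nnz))     // CSR:     16040008 bytes
}
```

Under these assumptions the dense matrix needs about 8 GB while the sparse layout needs about 16 MB, a reduction of roughly 500 times.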

Features

Usage

The sparse matrices in this package implement the Gonum Matrix interface and so are fully interoperable and mutually compatible with the Gonum APIs and dense matrix types.

// The snippet below assumes these imports:
//   "fmt"
//   "github.com/james-bowman/sparse"

// Construct a new 3x2 DOK (Dictionary Of Keys) matrix
dokMatrix := sparse.NewDOK(3, 2)

// Populate it with some non-zero values
dokMatrix.Set(0, 0, 5)
dokMatrix.Set(2, 1, 7)

// Demonstrate accessing values (could use Gonum's mat.Formatted()
// function to pretty print but this demonstrates element access)
m, n := dokMatrix.Dims()
for i := 0; i < m; i++ {
    for j := 0; j < n; j++ {
        fmt.Printf("%.0f,", dokMatrix.At(i, j))
    }
    fmt.Printf("\n")
}

// Convert DOK matrix to CSR (Compressed Sparse Row) matrix
// just for fun (not required for upcoming multiplication operation)
csrMatrix := dokMatrix.ToCSR()

// Create a random 2x3 COO (COOrdinate) matrix with
// density of 0.5 (half the elements will be non-zero)
cooMatrix := sparse.Random(sparse.COOFormat, 2, 3, 0.5)

// Convert CSR matrix to Gonum mat.Dense matrix just for fun
// (not required for upcoming multiplication operation)
// then transpose so it is the right shape/dimensions for
// multiplication with the original CSR matrix
denseMatrix := csrMatrix.ToDense().T()

// Multiply the 2 matrices together and store the result in the
// sparse receiver (multiplication with sparse product)
var csrProduct sparse.CSR
csrProduct.Mul(csrMatrix, cooMatrix)

// As an alternative, use the sparse BLAS routines for efficient
// sparse matrix multiplication with a Gonum mat.Dense product
// (multiplication with dense product)
denseProduct := sparse.MulMatMat(false, 1, csrMatrix, denseMatrix, nil)

Installation

With Go installed, package installation is performed using go get.

go get -u github.com/james-bowman/sparse/...

Acknowledgements

See Also

License

MIT

Last updated on 29 Jul 2021
