Memory-efficient stack of multiple 2D sparse arrays.
Installation
Requirements
Python 3.8 or higher
Pip Install
Simply install using pip: pip install sparsestack
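After installing, a quick environment check can confirm that the interpreter meets the stated requirement and that the package is visible. This is only a minimal sketch using the standard library; the variable names `python_ok` and `installed_version` are illustrative and not part of sparsestack.

```python
import sys
from importlib import metadata

# The requirement stated above is Python 3.8 or higher.
python_ok = sys.version_info >= (3, 8)
print("Python version OK:", python_ok)

# Look up the installed distribution without importing it.
try:
    installed_version = metadata.version("sparsestack")
except metadata.PackageNotFoundError:
    installed_version = None  # package is not installed in this environment

print("sparsestack version:", installed_version or "not installed")
```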
First code example
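import numpy as np
from sparsestack import StackedSparseArray
# Create some random example data and zero out most entries to make it sparse
scores1 = np.random.random((12, 10))
scores1[scores1 < 0.9] = 0
scores2 = np.random.random((12, 10))
scores2[scores2 < 0.75] = 0
# Build a stack with 12 rows and 10 columns and add both arrays as named layers
sparsestack = StackedSparseArray(12, 10)
sparsestack.add_dense_matrix(scores1, "scores_1")
# join_type="left": only add scores_2 values at positions already present in the stack
sparsestack.add_dense_matrix(scores2, "scores_2", join_type="left")
# Slicing
sparsestack[3, 4]  # all layers at row 3, column 4
sparsestack[3, :]  # all entries in row 3
sparsestack[:, 2]  # all entries in column 2
sparsestack[3, :, 0]  # entries in row 3, first layer only
sparsestack[3, :, "scores_1"]  # same selection, with the layer given by name
# Convert a single layer to a dense numpy array
scores2_after_merge = sparsestack.to_array("scores_2")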
Adding data to a sparsestack-array
Sparsestack provides three options to add data to a new layer.
- .add_dense_matrix(input_array)
  Adds all non-zero elements of input_array to the sparsestack. Depending on the chosen join_type, either all such values are added (join_type="outer" or join_type="right"), or only those at positions already present in the underlying layers (join_type="left" or join_type="inner").
- .add_sparse_matrix(input_coo_matrix)
  Expects a COO-style matrix (e.g. from scipy) with the attributes .row, .col and .data. The join type can again be specified using join_type.
- .add_sparse_data(row, col, data)
  Does essentially the same as .add_sparse_matrix(input_coo_matrix), but can be more flexible in some cases because row, col and data are passed as separate arguments.
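To make the effect of join_type more concrete, the following sketch mimics the behaviour described above using plain numpy. It is not sparsestack's implementation: the helpers `to_triplets` and `select_new_entries` are hypothetical names introduced only for illustration, and the sketch only covers which values of the new layer would be added.

```python
import numpy as np


def to_triplets(dense):
    """Return (row, col, data) for the non-zero entries of a 2D array."""
    row, col = np.nonzero(dense)
    return row, col, dense[row, col]


def select_new_entries(existing_mask, new_dense, join_type="left"):
    """Pick the non-zero entries of new_dense that a join type would add.

    Illustrative only: follows the description above, not sparsestack's code.
    """
    row, col, data = to_triplets(new_dense)
    if join_type in ("outer", "right"):
        # Every non-zero value of the new layer is added.
        keep = np.ones(len(data), dtype=bool)
    elif join_type in ("left", "inner"):
        # Only values at positions already occupied by underlying layers.
        keep = existing_mask[row, col]
    else:
        raise ValueError(f"Unknown join_type: {join_type!r}")
    return row[keep], col[keep], data[keep]


# Positions already filled by an underlying layer.
existing = np.array([[1.0, 0.0, 0.0],
                     [0.0, 2.0, 0.0]])
# Candidate values for a new layer.
new_layer = np.array([[0.5, 0.7, 0.0],
                      [0.0, 0.9, 0.3]])

existing_mask = existing != 0
left_row, left_col, left_data = select_new_entries(existing_mask, new_layer, "left")
outer_row, outer_col, outer_data = select_new_entries(existing_mask, new_layer, "outer")

print(list(zip(left_row.tolist(), left_col.tolist(), left_data.tolist())))
# [(0, 0, 0.5), (1, 1, 0.9)]
print(list(zip(outer_row.tolist(), outer_col.tolist(), outer_data.tolist())))
# [(0, 0, 0.5), (0, 1, 0.7), (1, 1, 0.9), (1, 2, 0.3)]
```

The (row, col, data) triplets produced by `to_triplets` have the same form as the separate arguments expected by .add_sparse_data(row, col, data), and as the .row, .col and .data attributes of a COO-style matrix.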
Accessing data from a sparsestack-array
The collected sparse data can be accessed in multiple ways.
- Slicing. sparsestack allows multiple types of slicing (see also the code example above).
sparsestack[3, 4]
sparsestack[3, :]
sparsestack[:, 2]
sparsestack[3, :, 0]
sparsestack[3, :, "scores_1"]
- .to_array()
  Creates and returns a dense numpy array of size .shape. Can also be used to create a dense numpy array of a single layer only, when called as .to_array(name="layerX").
  Careful: converting to a dense array loses the sparse nature of the data, and all empty positions in the stack are filled with zeros.
- .to_coo(name="layerX")
  Returns a scipy sparse COO-matrix of the specified layer.
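The warning about dense conversion can be illustrated with a rough numpy-only sketch. It does not call sparsestack; the arrays `row`, `col` and `data` are made-up values standing in for one sparse layer, and the sizes are chosen only to show the difference in memory use.

```python
import numpy as np

n_row, n_col = 1000, 1000

# Hypothetical sparse layer in COO form: three parallel arrays.
row = np.array([3, 3, 500, 999])
col = np.array([4, 7, 20, 0])
data = np.array([0.91, 0.95, 0.99, 0.93])

# Dense equivalent: every position not listed above becomes 0.
dense = np.zeros((n_row, n_col), dtype=data.dtype)
dense[row, col] = data

# Compare the memory needed for the triplets with the dense array.
sparse_bytes = row.nbytes + col.nbytes + data.nbytes
dense_bytes = dense.nbytes

print(f"sparse triplets: {sparse_bytes} bytes")
print(f"dense array:     {dense_bytes} bytes")
```

With only four stored values, the dense array still allocates one float for each of the 1000 x 1000 positions, which is why a dense conversion is best limited to small stacks or single layers.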