What is vega-dataflow?
The vega-dataflow npm package is a JavaScript library for building reactive dataflow graphs. It is a core component of the Vega visualization grammar, enabling the construction of data processing pipelines that can react to changes in data, parameters, or user interactions.
What are vega-dataflow's main functionalities?
Data Transformation
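This feature lets you apply transformations to a stream of data tuples. In this example, a filter transform keeps only tuples whose value is greater than 10. In recent releases the transform operators themselves (such as filter and collect) are published in the companion vega-transforms package, predicates are accessor functions rather than expression strings, and a downstream collect operator materializes the filtered tuples as an array. The example assumes vega-dataflow, vega-transforms, and vega-util are installed.
const { Dataflow, changeset } = require('vega-dataflow');
const { collect: Collect, filter: Filter } = require('vega-transforms');
const { accessor } = require('vega-util');
const df = new Dataflow();
const data = df.add(Collect); // source operator that collects inserted tuples
const test = accessor(d => d.value > 10, ['value']); // predicate plus the fields it reads
const filtered = df.add(Filter, {expr: test, pulse: data});
const out = df.add(Collect, {pulse: filtered}); // materializes the filter output
df.pulse(data, changeset().insert([{value: 5}, {value: 15}])).run();
console.log(out.value); // array holding the single tuple {value: 15}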
Reactive Dataflow
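This feature lets you create reactive dataflow graphs whose downstream operators re-evaluate when upstream data changes. In this example, an aggregate transform computes the sum of the data values. As above, the aggregate and collect operators are assumed to come from the vega-transforms package, fields are passed as accessors created with vega-util, and the result is read from a downstream collect operator.
const { Dataflow, changeset } = require('vega-dataflow');
const { aggregate: Aggregate, collect: Collect } = require('vega-transforms');
const { field } = require('vega-util');
const df = new Dataflow();
const data = df.add(Collect); // source operator that collects inserted tuples
const sum = df.add(Aggregate, {fields: [field('value')], ops: ['sum'], pulse: data});
const out = df.add(Collect, {pulse: sum}); // materializes the aggregate output
df.pulse(data, changeset().insert([{value: 5}, {value: 15}])).run();
console.log(out.value); // array holding one tuple with sum_value equal to 20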
Parameter Binding
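This feature lets you bind operator parameters to other operators, so that a transform re-evaluates when a parameter changes. In this example, the filter's threshold is a live reference to a scalar operator, and updating that operator re-runs the filter over the collected tuples. The predicate is called with the tuple and the filter's resolved parameters; the parameter name threshold is chosen for illustration. As above, the transforms are assumed to come from the vega-transforms package.
const { Dataflow, changeset } = require('vega-dataflow');
const { collect: Collect, filter: Filter } = require('vega-transforms');
const { accessor } = require('vega-util');
const df = new Dataflow();
const threshold = df.add(10); // operator holding a scalar parameter value
const data = df.add(Collect); // source operator that collects inserted tuples
// The second argument gives the predicate access to the filter's parameters.
const test = accessor((d, _) => d.value > _.threshold, ['value']);
const filtered = df.add(Filter, {expr: test, threshold: threshold, pulse: data});
const out = df.add(Collect, {pulse: filtered}); // materializes the filter output
df.pulse(data, changeset().insert([{value: 5}, {value: 15}])).run();
console.log(out.value); // array holding the single tuple {value: 15}
df.update(threshold, 20).run();
console.log(out.value); // []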
Other packages similar to vega-dataflow
d3
D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. It uses HTML, SVG, and CSS. While D3 focuses on data-driven document manipulation and visualization, vega-dataflow is more about building reactive data processing pipelines.
rx
RxJS is a library for reactive programming using Observables, to make it easier to compose asynchronous or callback-based code. While RxJS provides a more general-purpose reactive programming model, vega-dataflow is specifically designed for data processing in the context of data visualization.
lodash
Lodash is a JavaScript library that provides utility functions for common programming tasks using a functional programming paradigm. While Lodash offers a wide range of data manipulation utilities, vega-dataflow is focused on building reactive dataflow graphs for data visualization.
vega-dataflow
Reactive dataflow processing.
Defines a reactive dataflow graph that can process both scalar values and
streaming relational data. A central Dataflow instance manages and
schedules a collection of Operator instances, each of which is a node in
a dataflow graph. Each operator maintains a local state value, and may
also process streaming data objects (or tuples) passing through. Operators
may depend on a set of named Parameters, which can either be fixed values
or live references to other operator values.
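To make the distinction between fixed parameter values and live references concrete, here is a minimal toy model in plain JavaScript. The ToyOperator class and its methods are hypothetical and written only for illustration; they are not the vega-dataflow API and say nothing about how the library implements operators internally.

```javascript
// Toy model only: ToyOperator is hypothetical and is NOT the vega-dataflow API.
class ToyOperator {
  constructor(init, update, params = {}) {
    this.value = init;     // local state value
    this.update = update;  // optional function: (resolvedParams) => new value
    this.params = params;  // name -> fixed value or ToyOperator reference
  }

  // Resolve each named parameter: a live reference yields the referenced
  // operator's current value, anything else is passed through unchanged.
  resolve() {
    const resolved = {};
    for (const [name, p] of Object.entries(this.params)) {
      resolved[name] = p instanceof ToyOperator ? p.value : p;
    }
    return resolved;
  }

  // Recompute the local state from the resolved parameters, if possible.
  evaluate() {
    if (this.update) this.value = this.update(this.resolve());
    return this.value;
  }
}

const base = new ToyOperator(10); // scalar state, no update function
const scaled = new ToyOperator(null, _ => _.x * _.k, {
  x: base, // live reference to another operator
  k: 3     // fixed value
});

const first = scaled.evaluate();  // resolves x to 10
base.value = 7;                   // change the upstream operator's state
const second = scaled.evaluate(); // the live reference now resolves to 7
console.log(first, second);       // 30 21
```

In this sketch the caller has to re-evaluate the dependent operator by hand; scheduling that re-evaluation automatically is the job the Dataflow instance is described as performing.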
Upon modifications to operator parameters or input data, changes are
propagated through the graph in topological order. Pulse objects propagate
from operators to their dependencies, and carry queues of added, removed
and/or modified tuples.
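The propagation described above can be sketched with a small toy model in plain JavaScript. Everything here (makeNode, makePulse, propagate, and the numeric rank used to order nodes) is a hypothetical illustration under simplifying assumptions, such as every node having a single input; it is not the library's scheduler.

```javascript
// Toy model only: hypothetical names, not the vega-dataflow implementation.
// Each node has a rank greater than the rank of any node feeding it, so
// processing pending nodes in rank order gives a valid topological order.
function makeNode(name, rank, transform) {
  return { name, rank, transform, targets: [] };
}

// A toy "pulse": queues of added, removed and modified tuples.
function makePulse(add = [], rem = [], mod = []) {
  return { add, rem, mod };
}

function propagate(source, pulse) {
  const queue = [{ node: source, pulse }];
  const visited = [];
  while (queue.length) {
    // Always process the lowest-ranked pending node first.
    queue.sort((a, b) => a.node.rank - b.node.rank);
    const { node, pulse: input } = queue.shift();
    const output = node.transform(input);
    visited.push(node.name);
    for (const target of node.targets) {
      queue.push({ node: target, pulse: output });
    }
  }
  return visited;
}

const collected = [];
const source = makeNode('source', 0, p => p);
const filter = makeNode('filter', 1, p =>
  makePulse(p.add.filter(t => t.value > 10), p.rem, p.mod));
const collect = makeNode('collect', 2, p => {
  for (const t of p.rem) {
    const i = collected.indexOf(t);
    if (i >= 0) collected.splice(i, 1);
  }
  collected.push(...p.add);
  return p;
});
const logger = makeNode('logger', 3, p => p);

// The logger is connected first but still runs last because of its rank.
source.targets.push(logger, filter);
filter.targets.push(collect);

const a = { value: 5 }, b = { value: 15 };
const order = propagate(source, makePulse([a, b]));
const afterInsert = collected.slice();  // snapshot: only b passed the filter
propagate(source, makePulse([], [b]));  // a second pulse removes b again
console.log(order, afterInsert, collected);
```

The first pulse visits source, filter, collect, and logger in that order and leaves only the tuple with value 15 in the collected array; the second pulse carries that tuple in its removal queue, so the array ends up empty.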
This module contains a library of Operator types for data stream query
processing, including data generation, sampling, filtering, binning,
group-by aggregation, and cross-stream lookup operations.
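As an illustration of what a streaming operator such as group-by aggregation involves, the following toy sketch keeps running sums per group and updates them from batches of added and removed tuples instead of rescanning all data. The makeGroupSum helper is hypothetical and is not part of the library; it also assumes that every removed tuple was added earlier.

```javascript
// Toy model only: a hypothetical incremental group-by sum, not library code.
function makeGroupSum(keyFn, valueFn) {
  const sums = new Map();   // group key -> running sum
  const counts = new Map(); // group key -> number of tuples in the group
  return {
    // Apply one batch of added and removed tuples to the running state.
    // Assumes every removed tuple was added in an earlier batch.
    apply({ add = [], rem = [] }) {
      for (const t of add) {
        const k = keyFn(t);
        sums.set(k, (sums.get(k) || 0) + valueFn(t));
        counts.set(k, (counts.get(k) || 0) + 1);
      }
      for (const t of rem) {
        const k = keyFn(t);
        sums.set(k, sums.get(k) - valueFn(t));
        counts.set(k, counts.get(k) - 1);
        // Drop groups that no longer contain any tuples.
        if (counts.get(k) === 0) {
          sums.delete(k);
          counts.delete(k);
        }
      }
      return Object.fromEntries(sums);
    }
  };
}

const agg = makeGroupSum(t => t.k, t => t.v);
const t1 = { k: 'a', v: 5 }, t2 = { k: 'a', v: 15 }, t3 = { k: 'b', v: 2 };
const step1 = agg.apply({ add: [t1, t2, t3] }); // { a: 20, b: 2 }
const step2 = agg.apply({ rem: [t3, t1] });     // { a: 15 }
console.log(step1, step2);
```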