![Dev Dependencies](https://david-dm.org/signicode/scramjet/dev-status.svg)
What does it do?
Scramjet is a powerful yet simple functional stream programming framework, written on top of node.js object streams,
that exposes a standards-inspired JavaScript API and is written fully in native ES6. Thanks to some built-in
optimizations, scramjet is much faster than similar frameworks in asynchronous operations (for instance, calling an API).
It is built upon the logic behind three well-known JavaScript array operations, namely map, filter and reduce. This
means that if you've ever performed operations on an Array in JavaScript, you already know Scramjet like the back of
your hand.
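As a reminder of the three array operations mentioned above, here is a minimal sketch on a plain Array. The `parkings` data is made up for illustration and nothing here calls Scramjet itself; the stream methods of the same names are described in the API section.

```javascript
// Hypothetical sample data, used only for illustration.
const parkings = [
    { name: "A", free: 12 },
    { name: "B", free: 0 },
    { name: "C", free: 7 },
];

// filter: keep only the entries that have free spaces.
const available = parkings.filter((p) => p.free > 0);

// map: turn each remaining entry into a new value.
const labels = available.map((p) => `${p.name}: ${p.free}`);

// reduce: fold all entries into a single result.
const totalFree = parkings.reduce((sum, p) => sum + p.free, 0);

console.log(labels);    // [ 'A: 12', 'C: 7' ]
console.log(totalFree); // 19
```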
The main advantage of scramjet is running asynchronous operations on your data streams. It allows you to
perform transformations both synchronously and asynchronously using the same API, so you can "map" your
stream from whatever source and call any number of APIs consecutively.
The benchmarks are published in the scramjet-benchmark repo.
Example
How about a CSV parser of all the parkings in the city of Wrocław from http://www.wroclaw.pl/open-data/...
const request = require("request");
const StringStream = require("scramjet").StringStream;
let columns = null;
request.get("http://www.wroclaw.pl/open-data/opendata/its/parkingi/parkingi.csv")
    .pipe(new StringStream())
    .split("\n")
    .parse((line) => line.split(";"))
    .pop(1, (data) => columns = data)
    .map((data) => columns.reduce((acc, id, i) => (acc[id] = data[i], acc), {}))
    .on("data", console.log.bind(console));
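The `reduce` call in the `map` step above is fairly dense, so here is the same row-to-object logic pulled out into a standalone function. The helper name `rowToObject` and the sample header and row are hypothetical; the real file's column names may differ.

```javascript
// Builds one object per CSV row by pairing each header name with the
// cell at the same position in the row.
function rowToObject(columns, data) {
    return columns.reduce((acc, id, i) => {
        acc[id] = data[i];
        return acc;
    }, {});
}

// Hypothetical header and row, split on ";" just like in the example above.
const columns = "name;free;total".split(";");
const row = "Example;12;100".split(";");

console.log(rowToObject(columns, row));
// { name: 'Example', free: '12', total: '100' }
```

Note that all values stay strings, since splitting a line never converts types.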
API Docs
Here's the list of the exposed classes and methods. Please review the specific documentation for details.
Note that:
- Most of the methods take a callback argument that operates on the stream items.
- The callback, unless it's stated otherwise, will receive an argument with the next chunk.
- If you want to perform your operations asynchronously, return a Promise, otherwise just return the right value.
The quick reference of the exposed classes:
DataStream ⇐ stream.PassThrough
DataStream is the primary stream type for Scramjet. Once you have parsed your
stream, just pipe it; you can then perform calculations on the data objects
streamed through your flow.
Detailed DataStream docs here
Method | Description | Example |
---|---|---|
new DataStream(opts) | Create the DataStream. | DataStream example |
dataStream.TimeSource : Object | Source of time - must implement the interface of Date. | |
dataStream.setTimeout : function | setTimeout method | |
dataStream.clearTimeout : function | clearTimeout method | |
dataStream.debug(func) ⇒ DataStream | Injects a debugger statement when called. | debug example |
dataStream.use(func) ⇒ * | Calls the passed function in place with the stream as first argument, returns result. | use example |
dataStream.cluster(hashFunc, count, stringify, parse) ⇒ ClusteredDataStream | [NYI] Distributes processing to multiple forked subprocesses. | |
dataStream.separate(func, createOptions) ⇒ DataStream | Separates execution to multiple streams using the hashes returned by the passed callback. | separate example |
dataStream.tee(func) ⇒ DataStream | Duplicate the stream | tee example |
dataStream.slice(start, end, func) ⇒ DataStream | Gets a slice of the stream to the callback function. | slice example |
dataStream.accumulate(func, into) ⇒ Promise | Accumulates data into the object. | accumulate example |
dataStream.reduce(func, into) ⇒ Promise | Reduces the stream into a given accumulator | reduce example |
dataStream.reduceNow(func, into) ⇒ * | Reduces the stream into the given object, returning it immediately. | reduceNow example |
dataStream.remap(func, Clazz) ⇒ DataStream | Remaps the stream into a new stream. | remap example |
dataStream.flatMap(func, Clazz) ⇒ DataStream | Takes any method that returns any iterable and flattens the result. | flatMap example |
dataStream.unshift(item) ↩︎ | Pushes any data at call time | |
dataStream.flatten() ⇒ DataStream | A shorthand for streams of Arrays to flatten them. | |
dataStream.batch(count) ⇒ DataStream | Aggregates chunks in arrays the given number of items long. | batch example |
dataStream.timeBatch(ms, count) ⇒ DataStream | Aggregates chunks to arrays not delaying output by more than the given number of ms. | timeBatch example |
dataStream.each(func) ↩︎ | Performs an operation on every chunk, without changing the stream | |
dataStream.map(func, Clazz) ⇒ DataStream | Transforms stream objects into new ones, just like Array.prototype.map | map example |
dataStream.assign(func) ⇒ DataStream | Transforms stream objects by assigning the properties from the returned | assign example |
dataStream.filter(func) ⇒ DataStream | Filters object based on the function outcome, just like Array.prototype.filter | filter example |
dataStream.shift(count, func) ⇒ DataStream | Shifts the first n items from the stream and pipes the other | shift example |
dataStream.toBufferStream(serializer) ⇒ BufferStream | Creates a BufferStream | toBufferStream example |
dataStream.stringify(serializer) ⇒ StringStream | Creates a StringStream | stringify example |
dataStream.toArray(initial) ⇒ Promise | Aggregates the stream into a single Array | |
DataStream.fromArray(arr) ⇒ DataStream | Create a DataStream from an Array | fromArray example |
DataStream.fromIterator(iter) ⇒ DataStream | Create a DataStream from an Iterator | fromIterator example |
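Of the methods listed above, `batch(count)` is perhaps easiest to picture on a plain array. The sketch below is an array analogue only and does not use or reproduce Scramjet's implementation. It assumes that a final, shorter group is still emitted, which should be checked against the detailed docs.

```javascript
// Array analogue of grouping items into arrays that are `count` items long.
// Assumption: the last group may be shorter than `count` and is still kept.
function batch(items, count) {
    const out = [];
    for (let i = 0; i < items.length; i += count) {
        out.push(items.slice(i, i + count));
    }
    return out;
}

console.log(batch([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```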
StringStream ⇐ DataStream
A stream of string objects for further transformation on top of DataStream.
Detailed StringStream docs here
BufferStream ⇐ DataStream
A facilitation stream created for easy splitting or parsing of buffers
Detailed BufferStream docs here
MultiStream
An object consisting of multiple streams that can be refined or muxed.
Detailed MultiStream docs here
Method | Description | Example |
---|---|---|
new MultiStream(streams, options) | Creates an instance of MultiStream with the specified stream list | MultiStream example |
multiStream.streams : Array | Array of all streams | |
multiStream.map(aFunc) ⇒ MultiStream | Returns new MultiStream with the streams returned by the transform. | map example |
multiStream.filter(func) ⇒ MultiStream | Filters the stream list and returns a new MultiStream with only the | filter example |
multiStream.dedupe(cmp) ⇒ DataStream | Removes duplicate items from stream using the given hash function | dedupe example |
multiStream.mux(cmp) ⇒ DataStream | Muxes the streams into a single one | mux example |
multiStream.add(stream) | Adds a stream to the MultiStream | add example |
multiStream.remove(stream) | Removes a stream from the MultiStream | remove example |
Browserifying
Scramjet works in the browser too, there's a nice, self-contained sample in here, just run it:
git clone https://github.com/signicode/scramjet.git
cd scramjet
npm install .
cd samples/browser
npm start
If you need your scramjet version for the browser, grab browserify and just run:
browserify lib/index --standalone scramjet -o /path/to/your/browserified-scramjet.js
With this you can run your transformations in the browser and use websockets to send them back and forth. If you do and it fails for some reason, please remember to file an issue, as no one person can test all the use cases and I am but one person.
Usage
Scramjet uses functional programming to run transformations on your data streams in a fashion very similar to the well known event-stream node module. Most transformations are done by passing a transform function. You can write your function in two ways:
- Synchronous
Example: a simple stream transform that outputs a stream of objects with the same id property and the length of the value string.
datastream.map(
    (item) => ({id: item.id, length: item.value.length})
)
- Asynchronous (using Promises)
Example: a simple stream that fetches the URL mentioned in the incoming object
datastream.map(
    (item) => new Promise((resolve, reject) => {
        request(item.url, (err, res, data) => {
            if (err)
                reject(err);
            else
                resolve(data);
        });
    })
)
The actual logic of this transform function is as if you passed your function
to the then method of a Promise resolved with the data from the input stream.
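The "as if passed to then" behaviour can be pictured with plain Promises. The sketch below is only an illustration of that idea, not Scramjet's internals; `applyTransform`, `syncLength` and `asyncLength` are hypothetical names, and a timer stands in for a real API call.

```javascript
// Applies a transform as if it were passed to `then` of a Promise
// resolved with the chunk. Plain values and Promises returned by `func`
// end up being handled the same way.
function applyTransform(func, chunk) {
    return Promise.resolve(chunk).then(func);
}

// Synchronous transform: returns the value directly.
const syncLength = (item) => item.value.length;

// Asynchronous transform: returns a Promise (a timer stands in for an API call).
const asyncLength = (item) =>
    new Promise((resolve) => setTimeout(() => resolve(item.value.length), 10));

const item = { id: 1, value: "scramjet" };

const results = Promise.all([
    applyTransform(syncLength, item),
    applyTransform(asyncLength, item),
]);

results.then((r) => console.log(r)); // [ 8, 8 ]
```

Because `then` unwraps returned Promises, the caller cannot tell the two styles apart, which is why the same `map` call accepts both.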
License and contributions
As of version 2.0 Scramjet is MIT Licensed.
Help wanted
The project needs your help! There's lots of work to do: transforming and muxing, joining and splitting, browserifying, modularizing, documenting and filing issues.
If you want to help and be part of the Scramjet team, please reach out to me, signicode on Github or email me: scramjet@signicode.com.