Node Shell
Node shell is an npm package aimed at providing bash-like operations/simplicity within the node ecosystem. The goal is to make working with files/folders, HTTP requests, and transformations as easy as possible. The library is built upon the async generator constructs within ECMAScript as well as the stream constructs within the node ecosystem. This means processing is iterative and real-time, in the same way piping works in a Unix shell.
(remote-tokens.js) Example of processing URLs from the input stream
#!/usr/bin/env -S npx @arcsine/nodesh
$stdin
.$match($pattern.URL, 'extract')
.$fetch()
.$tokens()
.$filter(x =>
x.length >= 6 &&
x.charAt(0) === x.charAt(0).toUpperCase()
)
.$stdout;
NOTE: The shebang defined here uses env's -S flag, which allows for passing arguments in the shebang.
As you can see above, the library's aim is to mimic the pattern of command piping, as well as integrate with stdin/stdout seamlessly. With the shebang applied appropriately, this script can be used just like any other CLI command.
Example of integrating node scripts within the shell
$ find . -name '*.ts' |\
cat |\
./remote-tokens.js |\
sort -u
Goals
This tool is aimed at simple workflows that normally live within the domain of bash scripts. It is meant to be an alternative to staying in bash or jumping over to another language like Python. It aims to let you leverage node libraries and utilities while providing a solid set of foundational elements.
The goal of this tool is not to be:
- a comprehensive streaming framework
- a reactive framework (e.g. rxjs)
- a build system alternative
This tool has aspects of all of the above, but its primary design goal is to focus on providing simplicity in shell-like interaction. To that end, design decisions were made towards simplicity over performance, and towards common patterns versus complete configurability.
Motivation
When solving simple problems involving file systems, file contents, and even HTTP requests, the Unix command line is a great place to operate. The command line is a powerful tool with a wealth of built-in functionality and the simplicity of the Unix philosophy. As bash scripts grow in complexity, though, maintenance and understandability tend to drop off quickly. When piping a file through 5-10+ commands, following the logic can be challenging.
Usually this is the point at which I would switch over to something like Python, given its "batteries included" mentality, as it's a perfectly fine language in its own right. That being said, I find it more and more desirable to be able to leverage common tools/libraries from the node ecosystem in these tasks.
Architecture
The tool revolves around the use of async generators, as denoted by async function *. This allows for iterative operation as well as support for asynchronous operations. This means everything within the framework is non-blocking. It also means the primary way of using the framework is by accessing your data as an async generator. The library has built-in support for converting basic data types into async generators, as well as built-in support for common patterns.
Example of simple async generator
async function * asyncWorker() {
while (true) {
const result = await longOp();
yield result;
}
}
Sources
Out of the box, the following types support the async iterator symbol (AsyncIterable):
Iterables
- Generator - This will return the generator, but as an async generator
- Set - This will return an async generator over the set contents
- Map - This will return an async generator over the map's entries [key, value]
- Array - This will return an async generator over the array contents
- URLSearchParams - This will generate over the key/value pairs
- NodeJS:ReadStream - This will return a line-oriented async generator over the read stream
  - stream.Readable
  - http.IncomingMessage
  - fs.ReadStream
Example of read stream
const lineGenerator = $of(fs.createReadStream('data.txt'));
... or ...
const lineGenerator = fs.createReadStream('data.txt').$map(x => ...);
Primitives
The following primitives are also supported, but will return a generator that only has
a single value, that of the primitive
String
Number
RegExp
Boolean
Buffer
In addition to the built-in functionality, a global function $of
is declared that will allow any value passed in to be converted to an async iterable. If the item is iterable or is a stream, it will return the iteration as a generator; otherwise it returns the value as a single-valued generator.
Example of simple value
const bigIntGen = $of(10000n);
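The dispatch behavior described above can be sketched in plain JavaScript. This is not the library's actual implementation (in particular, it omits the line-oriented handling of streams) and only illustrates the idea: iterables and async iterables are unrolled, while everything else, including strings, becomes a single-element sequence.

```javascript
// Rough sketch of a $of-style helper: dispatch on the input type to
// produce an async generator. Streams are intentionally not handled here.
async function* ofValue(el) {
  const isAsyncIterable = el != null && typeof el[Symbol.asyncIterator] === 'function';
  const isIterable = el != null && typeof el[Symbol.iterator] === 'function'
    && typeof el !== 'string'; // strings stay single-valued
  if (isAsyncIterable || isIterable) {
    yield* el; // unroll the elements
  } else {
    yield el;  // single-valued generator
  }
}
```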
GlobalHelpers
Within the framework there are some common enough patterns that
exposing them globally proves useful.
$of
Will turn any value into a sequence. If the input value is of type:
- Iterable - Returns a sequence of elements
- AsyncIterable - Returns a sequence of elements
- Readable/ReadStream - Returns a sequence of lines read from the stream
- Everything else - Returns a sequence of a single element
static $of(el: Readable): AsyncGenerator<string>;
static $of(el: string): AsyncGenerator<string>;
static $of<T>(el: AsyncIterable<T>): AsyncGenerator<T>;
static $of<T>(el: Iterable<T>): AsyncGenerator<T>;
static $of<T>(el: T[]): AsyncGenerator<T>;
Example
$of([1,2,3])
.$map(x => x ** 2)
[1,2,3]
.$map(x => x ** 2)
$registerOperator
In the process of using the tool, there may be a need for encapsulating common
operations. By default, $wrap
provides an easy path for re-using functionality,
but it lacks the clarity of intent enjoyed by the built-in operators.
static get $registerOperator(): (op: Function) => void;
Example
(reverse.js)
class Custom {
$reverse() {
return this
.$collect()
.$map(x => x.reverse())
.$flatten();
}
}
$registerOperator(Custom);
declare global {
  interface AsyncIterable<T> extends Custom {}
}
require('./reverse')
[1,2,3]
.$iterable
.$reverse()
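One way such registration can work under the hood is by copying a class's methods onto the shared async generator prototype, so every async generator picks them up. The real library's mechanics may differ; this is a sketch of the idea with a hypothetical `$double` operator.

```javascript
// All async generator objects share %AsyncGeneratorPrototype%, reachable
// two levels up the prototype chain from any async generator instance.
const AsyncGeneratorProto = Object.getPrototypeOf(
  Object.getPrototypeOf((async function* () {})())
);

// Sketch of operator registration: copy methods onto the shared prototype
function registerOperatorSketch(cls) {
  for (const name of Object.getOwnPropertyNames(cls.prototype)) {
    if (name !== 'constructor') {
      AsyncGeneratorProto[name] = cls.prototype[name];
    }
  }
}

// Hypothetical custom operator for illustration
class Doubled {
  async * $double() {
    for await (const x of this) {
      yield x * 2;
    }
  }
}
registerOperatorSketch(Doubled);
```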
$argv
The cleaned argv parameters for the running script. Index 0 is the first meaningful parameter for the script. This differs from process.argv by excluding the executable and script name. This is useful as the script may be invoked in many different ways, and the desire is to limit the amount of guessing needed to handle inputs appropriately.
NOTE: If you are going to use a command line parsing tool, then you would continue to
use process.argv
as normal.
static get $argv(): string[];
Example
($argv[0] ?? 'Enter a file name:'.$prompt())
.$read()
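The relationship to process.argv can be sketched simply: the first two entries of process.argv are the node executable and the script path, so the script's own arguments begin at index 2. The library's actual cleanup may be more involved; this only illustrates the offset.

```javascript
// Sketch of the "cleaned argv" idea: drop the executable and script name
// so the script's first real argument lands at index 0.
function cleanedArgv(argv = process.argv) {
  return argv.slice(2);
}
```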
$stdin
Provides direct access to stdin as sequence of lines
static get $stdin(): AsyncIterable<string>;
Example
$stdin
.$map(line => line.split('').reverse().join(''))
.$stdout
$env
A case insensitive map for accessing environment variables. Like process.env
, but
doesn't require knowledge of the case. Useful for simplifying script interactions.
static get $env(): Record<string, string>;
Example
($env.user_name ?? ask('Enter a user name'))
.$map(userName => ... )
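A case-insensitive environment map can be sketched with a Proxy that normalizes key lookups. The library's $env may be implemented differently; this only demonstrates the behavior described above.

```javascript
// Sketch: wrap an env object so property reads ignore case
function caseInsensitiveEnv(env = process.env) {
  const lookup = {};
  for (const key of Object.keys(env)) {
    lookup[key.toLowerCase()] = key;
  }
  return new Proxy(env, {
    get(target, prop) {
      if (typeof prop === 'string') {
        return target[lookup[prop.toLowerCase()] ?? prop];
      }
      return target[prop]; // symbols pass through untouched
    }
  });
}
```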
$pattern
Common patterns that can be used where regular expressions are supported
static get $pattern(): {
  URL: RegExp;
  EMAIL: RegExp;
};
Example
<file>
.$read()
.$match($pattern.URL, 'extract')
.$filter(url => url.endsWith('.com'))
$range
Produces a numeric range, between start (1 by default) and stop (inclusive). A step
parameter can be defined to specify the distance between iterated numbers.
static $range(stop: number, start?: number, step?: number): AsyncIterable<number>;
Example
$range(3)
.$map(x => x**2)
$range(10, 1, 2)
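The semantics of the signature (stop first, start defaulting to 1, stop inclusive) can be sketched as a plain async generator. This is an illustration of the documented behavior, not the library's implementation.

```javascript
// Sketch of $range: stop comes first, start defaults to 1, stop is inclusive
async function* rangeSketch(stop, start = 1, step = 1) {
  for (let i = start; i <= stop; i += step) {
    yield i;
  }
}
```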
Operators
The entirety of this project centers on the set of available operators. These operators can be broken into the following groups:
Core
The core functionality provides some very basic support for sequences
$forEach
This operator is a terminal action that receives each element of the sequence in order, but returns no value. This function produces a promise that should be awaited to ensure the sequence is exhausted.
$forEach<T>(this: AsyncIterable<T>, fn: PromFunc<T, any>): Promise<void>;
Example
fs.createReadStream('<file>')
.$forEach(console.log)
$map
Converts the sequence of data into another, by applying an operation
on each element.
$map<T, U>(this: AsyncIterable<T>, fn: PromFunc<T, U>): $AsyncIterable<U>;
Example
fs.createReadStream('<file>')
.$map(line => line.toUpperCase())
$filter
Determines if items in the sequence are valid or not. Invalid items
are discarded, while valid items are retained.
$filter<T>(this: AsyncIterable<T>, pred: PromFunc<T, boolean>): $AsyncIterable<T>;
Example
fs.createReadStream('<file>')
.$filter(x => x.length > 10)
$flatten
Flattens a sequence of arrays, or a sequence of sequences. This allows operators that return arrays/sequences to be represented as a single sequence.
$flatten<T, U>(this: AsyncIterable<AsyncIterable<U> | Iterable<U>>): $AsyncIterable<U>;
Example
fs.createReadStream('<file>')
.$map(line => line.split(/\s+/g))
.$flatten()
$flatMap
This is a combination of $map and $flatten, as they are common enough in usage to warrant a combined operator. This will map the contents of the sequence (which produces an array or sequence), producing a flattened output.
$flatMap<T, U>(this: AsyncIterable<T>, fn: PromFunc<T, AsyncIterable<U> | Iterable<U>>): $AsyncIterable<U>;
Example
fs.createReadStream('<file>')
.$flatMap(line => line.split(/\s+/g))
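The combined map-then-flatten behavior can be sketched as a standalone async generator over any async iterable source. This is illustrative only, not the library's implementation.

```javascript
// Sketch of a $flatMap-style operator: map each element to an
// array/sequence, then yield its pieces one at a time.
async function* flatMapSketch(source, fn) {
  for await (const item of source) {
    yield* await fn(item); // fn may be sync or async
  }
}
```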
$reduce
This is the standard reduce operator and behaves similarly to Array.prototype.reduce. This operator takes in an accumulation function, which allows for computing a single value based on visiting each element in the sequence. Given that reduce is comprehensive and produces a singular value, this operation cannot stream and will block until the sequence is exhausted. Normally $map and $filter can be understood as being implemented by $reduce, but in this framework they behave differently.
$reduce<T, U>(this: AsyncIterable<T>, fn: PromFunc2<U, T, U> & {init?: () => U;}, acc?: U): $AsyncIterable<U>;
Example
fs.createReadStream('<file>')
.$flatMap(line => line.split(/\s+/g))
.$reduce((acc, token) => {
acc[token] = (acc[token] ?? 0) + 1;
return acc;
}, {});
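Why reduce cannot stream becomes clear in a vanilla sketch: the accumulated value is only available after the entire source has been consumed. The word-count demo mirrors the example above; this is not the library's implementation.

```javascript
// Sketch of $reduce's behavior: drain the whole sequence, folding each
// element into the accumulator, then produce the single final value.
async function reduceSketch(source, fn, acc) {
  for await (const item of source) {
    acc = await fn(acc, item); // nothing downstream sees anything yet
  }
  return acc; // only now is the result available
}
```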
$collect
Gathers the entire sequence output as a single array. This is useful if you need the entire stream to perform an action.
$collect<T>(this: AsyncIterable<T>): $AsyncIterable<T[]>;
Example
fs.createReadStream('<file>')
.$collect()
.$map(lines => lines.join('\n'))
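The gathering behavior can be sketched as an async generator that drains its source into an array and then emits that array as the sole element of a new sequence. This is illustrative, not the library's implementation.

```javascript
// Sketch of $collect: buffer the whole sequence, then yield it once
async function* collectSketch(source) {
  const all = [];
  for await (const item of source) {
    all.push(item);
  }
  yield all; // single element: the full array
}
```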
$wrap
This is the simplest mechanism for extending the framework as the operator takes in a function that operates on the sequence of
data as a whole. It will consume the sequence and produce an entirely new sequence.
$wrap<T, U>(this: AsyncIterable<T>, fn: (input: AsyncIterable<T>) => (AsyncIterable<U> | Iterable<U>)): $AsyncIterable<U>;
Example
async function * translate(lang, gen) {
for await (const line of gen) {
for (const word of line.split(/\s+/g)) {
const translated = await doTranslate(lang, word);
yield translated;
}
}
}
fs.createReadStream('<file>')
.$wrap(translate.bind(null, 'fr'));
$onError
If an error occurs, use the provided sequence instead
$onError<T>(this: AsyncIterable<T>, alt: OrCallable<AsyncIterable<T> | Iterable<T>>): $AsyncIterable<T>;
Example
'<file>'
.$read()
.$onError(() => `Sample Text`)
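The fallback behavior can be sketched by wrapping iteration of the source in a try/catch: elements already produced pass through, and once the source throws, iteration switches to the alternate sequence (or the value a callback produces). This is illustrative only; the library's $onError may differ in detail.

```javascript
// Treat single values (including strings) as one-element sequences
async function* ofAny(v) {
  if (v != null && typeof v !== 'string'
      && typeof v[Symbol.iterator] === 'function') {
    yield* v;
  } else {
    yield v;
  }
}

// Sketch of an $onError-style operator: on failure, use the alternate
async function* onErrorSketch(source, alt) {
  try {
    yield* source;
  } catch {
    const fallback = typeof alt === 'function' ? alt() : alt;
    yield* ofAny(fallback);
  }
}
```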