graph-run
Figure out which items in a dependency graph can be handled in
parallel, and which must be deferred until their dependencies are
completed. Handles cyclic and acyclic graphs, but if cycles are
allowed, then dependency order may not be guaranteed.
For example, given the following directed graph:
flowchart TD
A --> B
A --> C
B --> E
B --> D
C --> D
C --> G
D --> F
F --> E
it will allow you to operate in this order:
- E and G in parallel (since they are leaf nodes)
- F when E is complete
- D when F is complete
- B when D and E are complete (possibly in parallel with C)
- C when D and G are complete (possibly in parallel with B)
- A when B and C are complete
Operation will be maximally parallelized, so that each step in
the graph only waits until its dependencies are completed.
Cycles
If a cycle exists in the graph, then at least one node in the
cycle will by necessity be operated upon before its
dependencies are complete.
Cycles may be disallowed by throwing in the onCycle method
provided.
Contract
- Each node reachable from the initial entry points provided will
be visited exactly once.
- Except in the case of cycles, each node's dependencies will be
visited before itself.
- In the default async forms of the methods provided:
- All methods will be awaited before continuing.
- Any operations that can be done in parallel will be.
- All pending visits and dependency lookups will be skipped if a
provided
AbortSignal fires.
Difference between this and typical topological sort
This is not just a topological sort, and there are better
libraries available that implement Kahn's Topological Sort very
efficiently.
However, topological sorting is insufficient if you wish to know
which items can be operated on in parallel, and Kahn's
algorithm only works when the graph is guaranteed to be acyclic
and all nodes are known ahead of time.
This implementation does not require that the graph be entirely
loaded ahead of time, and provides a mechanism to treat cycles as
a warning rather than an error, with the caveat that if cycles
are allowed then topological ordering is of course not
guaranteed.
Caveat
While this can be used in theory to explore any infinitely large
graph, note that the result set will be stored in memory.
So, if you were to use it to try to crawl all the links in
Wikipedia or something, you're going to have a Bad Time if the
result set gets too big.
USAGE
import {
graphRun,
allSettled,
any,
race,
graphRunSync,
allSettledSync,
anySync,
raceSync,
} from 'graph-run'
const ac = new AbortController()
const results = await graphRun({
graph: [some, nodes, known, at, the, start],
getDeps: async (n) => {
return [the, dependency, node, objects]
},
visit: async (node, signal, path) => {
await doSomething(n)
},
onCycle: (node, cycle, path) => {
console.error(
`warning: while evaluating ${
node
} at path ${
path.join('->')
} encountered cycle: ${
cycle.join('->')
}. Proceeding without this dependency.`
)
},
failFast: true
signal: ac.signal
})
const isDAG = (graph) => {
try {
graphRunSync({
graph: [graph],
getDeps: n => n.children ?? [],
onCycle: () => { throw 'cycle detected' },
visit: () => {}
})
return true
} catch {
return false
}
}
const g = {
children: [
{
name: 'a',
children: [{ name: 'b' }],
}
]
}
console.log(isDAG(g))
g.children[0].children[0].children = g.children
console.log(isDAG(g))
See the typedocs for
complete API details.
Other Options
How This Is Different
- More flexible than most as far as the data structures used. Ie,
Node can be any value type.
- Dependency edges are resolved on-demand, and may be calculated
asynchronously, making it efficient in cases where loading the
entire graph may be expensive.
- Both cyclic and acyclic graphs are supported.