v4.2.0 (2020-04-20)
This update adds experimental multithreaded reading for N-Triples and N-Quads, called 'Scanners': @graphy/content.nt.scan and @graphy/content.nq.scan. These are deserializers that take a single input stream and read it across multiple workers in parallel, allowing users to take advantage of multiprocessor systems.
@graphy/content.nt.scan and @graphy/content.nq.scan are experimental; they have limited test case coverage in node.js and have not been tested in the browser.
How it works:
The main thread populates sections of a preallocated SharedArrayBuffer with chunks copied from the input stream data and allows workers to claim each slot on a first-come, first-served basis (synchronized using Atomics). For each chunk of input data, the main thread handles reading the leading and trailing fragments that belong to RDF statements spanning chunk boundaries. Communication about data exchange is done mostly through unidirectional byte indicators on the SharedArrayBuffer, with occasional help and synchronization from Atomics when necessary. Error handling and gathering of results are done using message passing, i.e., postMessage, which accepts transferable objects when passing large amounts of data.
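The slot-claiming idea can be illustrated with a minimal sketch. This is not graphy's actual implementation: the slot layout, the state constants (EMPTY, READY, CLAIMED), the sizes, and the function names are all assumptions made for illustration. It only shows how Atomics.compareExchange lets exactly one consumer win each slot of a SharedArrayBuffer.

```javascript
// Hypothetical slot states and sizes, chosen for illustration only.
const EMPTY = 0;
const READY = 1;
const CLAIMED = 2;
const SLOT_COUNT = 4;
const SLOT_SIZE = 64; // bytes of payload per slot

// Layout: [SLOT_COUNT state flags][SLOT_COUNT lengths][payload bytes].
const shared = new SharedArrayBuffer(SLOT_COUNT * 8 + SLOT_COUNT * SLOT_SIZE);
const states = new Int32Array(shared, 0, SLOT_COUNT);
const lengths = new Int32Array(shared, SLOT_COUNT * 4, SLOT_COUNT);
const payload = new Uint8Array(shared, SLOT_COUNT * 8);

// Producer side: copy a chunk into the first empty slot, then mark it ready.
function publishChunk(chunk) {
  if (chunk.length > SLOT_SIZE) throw new RangeError('chunk exceeds slot size');
  for (let i = 0; i < SLOT_COUNT; i++) {
    if (Atomics.load(states, i) === EMPTY) {
      payload.set(chunk, i * SLOT_SIZE);
      lengths[i] = chunk.length;
      // The atomic store makes the writes above visible to whoever
      // later observes READY through an atomic operation.
      Atomics.store(states, i, READY);
      return i;
    }
  }
  return -1; // no free slot; a real producer would wait or apply backpressure
}

// Consumer side: compareExchange lets exactly one caller win each ready slot.
function claimSlot() {
  for (let i = 0; i < SLOT_COUNT; i++) {
    if (Atomics.compareExchange(states, i, READY, CLAIMED) === READY) return i;
  }
  return -1; // nothing to claim right now
}

// Copy the claimed bytes out, then hand the slot back to the producer.
function consumeSlot(i) {
  const start = i * SLOT_SIZE;
  const bytes = payload.slice(start, start + lengths[i]);
  Atomics.store(states, i, EMPTY);
  return bytes;
}
```

In a real setup the same SharedArrayBuffer would be handed to each worker, and every worker would call something like claimSlot in a loop; the sketch runs in a single thread only to keep it self-contained.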
How it's used:
Similar to map/reduce, you must provide some code that will be run on each thread to create the reader instance and send updates or submit the final result(s) back to the main thread. Set the .run property on the config object passed to the constructor; it will be eval'd on both the main thread and each of the workers (eval is necessary to allow access to global context, such as require, while providing consistent behavior across node.js and browser). Tasks are therefore intended only for running trusted code. See scan verb documentation for more information.
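The general pattern of shipping a task as source text and eval-ing it where it runs can be sketched as follows. This is a generic illustration, not the signature graphy expects for .run: the (chunk, submit) parameters, the runSource string, and the runTask helper are hypothetical, and the scan verb documentation remains the reference for the real interface.

```javascript
// Hypothetical task source; graphy's real `.run` signature may differ.
// It counts non-empty lines in a chunk and submits the partial result.
const runSource = `(chunk, submit) => {
  let count = 0;
  for (const line of chunk.split('\\n')) {
    if (line.trim()) count++;
  }
  submit(count);
}`;

// Stand-in for what each thread would do: eval the source into a function,
// apply it to its share of the input, and reduce the submitted results.
function runTask(source, chunks) {
  // Direct eval sees the enclosing scope, which in a CommonJS module
  // includes `require`. Only ever eval trusted code.
  const task = eval(source);
  let total = 0;
  const submit = (n) => { total += n; };
  for (const chunk of chunks) task(chunk, submit);
  return total;
}
```

Because the task travels as a string, it cannot close over variables from the file that defines it; anything it needs has to be reachable from the scope where it is evaluated.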
🍭 Features