ipfs-unixfs-importer
JavaScript implementation of the UnixFs importer used by IPFS
Install
$ npm i ipfs-unixfs-importer
Lead Maintainer
Alex Potsides
Usage
Example
Let's create a little directory to import:
> cd /tmp
> mkdir foo
> echo 'hello' > foo/bar
> echo 'world' > foo/quux
And write the importing logic:
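import fs from 'node:fs'
import { importer } from 'ipfs-unixfs-importer'
import { MemoryBlockstore } from 'blockstore-core/memory'

// Where the imported blocks will be stored
const blockstore = new MemoryBlockstore()

// Each entry pairs a path with a stream of that file's bytes
const source = [{
  path: '/tmp/foo/bar',
  content: fs.createReadStream('/tmp/foo/bar')
}, {
  path: '/tmp/foo/quux',
  content: fs.createReadStream('/tmp/foo/quux')
}]

// Optional settings, see the API section for the supported keys
const options = {}

for await (const entry of importer(source, blockstore, options)) {
  console.info(entry)
}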
When run, metadata about DAGNodes in the created tree is printed until the root:
{
cid: CID,
path: 'tmp/foo/bar',
unixfs: UnixFS
}
{
cid: CID,
path: 'tmp/foo/quux',
unixfs: UnixFS
}
{
cid: CID,
path: 'tmp/foo',
unixfs: UnixFS
}
{
cid: CID,
path: 'tmp',
unixfs: UnixFS
}
API
import { importer } from 'ipfs-unixfs-importer'
const stream = importer(source, blockstore [, options])
The importer function returns an async iterator. It takes a source async iterator that yields objects of the form:
{
path: 'a name',
content: (Buffer or iterator emitting Buffers),
mtime: (Number representing seconds since (positive) or before (negative) the Unix Epoch),
mode: (Number representing ugo-rwx, setuid, setguid and sticky bit)
}
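As a minimal sketch of such a source, the async generator below yields two entries in the shape described above. The function name exampleSource and the file contents are made up for illustration; only the field names come from this README.

```javascript
// Hypothetical helper: yields import candidates in the documented shape.
async function * exampleSource () {
  yield {
    path: 'foo/bar',
    // content can be a single Buffer...
    content: Buffer.from('hello\n'),
    mtime: 1700000000, // seconds since the Unix Epoch
    mode: 0o644 // rw-r--r--
  }

  yield {
    path: 'foo/quux',
    // ...or an iterator that emits Buffers
    content: (async function * () {
      yield Buffer.from('wor')
      yield Buffer.from('ld\n')
    })(),
    mode: 0o600 // rw-------
  }
}
```

A generator like this could be passed as the source argument in place of an array of entries.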
stream will output file info objects as files get stored in IPFS. When stats on a node are emitted they are guaranteed to have been written.
blockstore is an instance of a blockstore.
The input's file paths and directory structure will be preserved in the created dag-pb nodes.
options
is a JavaScript object that might include the following keys:
wrapWithDirectory (boolean, defaults to false): if true, a wrapping node will be created
shardSplitThreshold (positive integer, defaults to 1000): the number of directory entries above which we decide to use a sharding directory builder (instead of the default flat one)
chunker (string, defaults to "fixed"): the chunking strategy. Supports:
- fixed
- rabin
avgChunkSize (positive integer, defaults to 262144): the average chunk size (rabin chunker only)
minChunkSize (positive integer): the minimum chunk size (rabin chunker only)
maxChunkSize (positive integer, defaults to 262144): the maximum chunk size
strategy (string, defaults to "balanced"): the DAG builder strategy name. Supports:
- flat: flat list of chunks
- balanced: builds a balanced tree
- trickle: builds a trickle tree
maxChildrenPerNode (positive integer, defaults to 174): the maximum children per node for the balanced and trickle DAG builder strategies
layerRepeat (positive integer, defaults to 4): the maximum repetition of parent nodes for each layer of the tree (only applicable to the trickle DAG builder strategy)
reduceSingleLeafToSelf (boolean, defaults to true): an optimization; when reducing a set of nodes that contains a single node, reduce it to that node
hamtHashFn (async function(string) Buffer): a function that hashes file names to create HAMT shards
hamtBucketBits (positive integer, defaults to 8): the number of bits at each bucket of the HAMT
progress (function): a function that will be called with the byte length of chunks as a file is added to ipfs
onlyHash (boolean, defaults to false): only chunk and hash, do not write to disk
hashAlg (string): multihash hashing algorithm to use
cidVersion (integer, default 0): the CID version to use when storing the data (storage keys are based on the CID, including its version)
rawLeaves (boolean, defaults to false): when a file would span multiple DAGNodes, if this is true the leaf nodes will not be wrapped in UnixFS protobufs and will instead contain the raw file bytes
leafType (string, defaults to 'file'): what type of UnixFS node leaves should be; can be 'file' or 'raw' (ignored when rawLeaves is true)
blockWriteConcurrency (positive integer, defaults to 10): how many blocks to hash and write to the block store concurrently. For small numbers of large files this should be high (e.g. 50).
fileImportConcurrency (number, defaults to 50): how many files to import concurrently. For large numbers of small files this should be high (e.g. 50).
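To get a feel for how maxChunkSize and maxChildrenPerNode interact, here is a rough back-of-the-envelope sketch. It assumes fixed-size chunks and a balanced tree in which each parent takes up to maxChildrenPerNode children; estimateBalancedDag is a hypothetical helper, not part of this module, and the DAG the importer actually builds may differ in detail.

```javascript
// Hypothetical helper: estimates the shape of a balanced DAG for one file.
// Assumes fixed-size chunks and that each parent node holds up to
// maxChildrenPerNode children.
function estimateBalancedDag (fileSize, { maxChunkSize = 262144, maxChildrenPerNode = 174 } = {}) {
  // Every chunk becomes one leaf; an empty file still needs one node
  const leaves = Math.max(1, Math.ceil(fileSize / maxChunkSize))

  // Group nodes under parents, layer by layer, until a single root remains
  let nodesInLayer = leaves
  let parentLayers = 0
  while (nodesInLayer > 1) {
    nodesInLayer = Math.ceil(nodesInLayer / maxChildrenPerNode)
    parentLayers++
  }

  return { leaves, parentLayers }
}

console.info(estimateBalancedDag(1024 * 1024 * 1024))
```

Under these assumptions a 1 GiB file with the default values splits into 4096 leaves beneath two layers of parent nodes.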
Overriding internals
Several aspects of the importer are overridable by specifying functions as part of the options object with these keys:
chunkValidator (function): Optional function that supports the signature async function * (source, options)
- This function takes input from the content field of imported entries. It should transform them into Buffers, throwing an error if it cannot.
- It should yield Buffer objects constructed from the source or throw an Error
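A validator matching that signature might look like the sketch below. The name exampleChunkValidator, the choice to accept strings, and the error message are assumptions made for illustration; they do not describe the default validator.

```javascript
// Hypothetical validator: accepts strings and byte arrays, rejects anything else.
async function * exampleChunkValidator (source, options) {
  for await (const chunk of source) {
    if (typeof chunk === 'string') {
      // Encode text as UTF-8 bytes
      yield Buffer.from(chunk, 'utf8')
    } else if (chunk instanceof Uint8Array) {
      // Buffers are Uint8Arrays too; wrap the same memory without copying
      yield Buffer.from(chunk.buffer, chunk.byteOffset, chunk.byteLength)
    } else {
      throw new Error('Content was invalid')
    }
  }
}
```

It would be supplied as the chunkValidator key of the options object.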
chunker (function): Optional function that supports the signature async function * (source, options) where source is an async generator and options is an options object
- It should yield Buffer objects.
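As an illustration of that signature, the following sketch re-slices incoming Buffers into pieces of at most maxChunkSize bytes. It is a simplified stand-in written for this README, not the module's built-in fixed chunker, and the name exampleFixedSizeChunker is hypothetical.

```javascript
// Hypothetical chunker: emits Buffers of exactly maxChunkSize bytes,
// followed by one shorter Buffer if any bytes are left over.
async function * exampleFixedSizeChunker (source, options = {}) {
  const { maxChunkSize = 262144 } = options
  let pending = Buffer.alloc(0)

  for await (const buf of source) {
    // Append the new bytes to whatever was left from the previous round
    pending = Buffer.concat([pending, buf])

    while (pending.length >= maxChunkSize) {
      yield pending.subarray(0, maxChunkSize)
      pending = pending.subarray(maxChunkSize)
    }
  }

  // Flush the remainder as a final, shorter chunk
  if (pending.length > 0) {
    yield pending
  }
}
```

With maxChunkSize set to 4, inputs of 3, 5 and 2 bytes come out as chunks of 4, 4 and 2 bytes.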
bufferImporter (function): Optional function that supports the signature async function * (entry, blockstore, options)
- This function should read Buffers from entry.content and persist them using blockstore.put or similar
- entry is the { path, content } entry, where entry.content is an async generator that yields Buffers
- It should yield functions that return a Promise that resolves to an object with the properties { cid, unixfs, size } where cid is a CID, unixfs is a UnixFS entry and size is a Number that represents the serialized size of the IPLD node that holds the buffer data.
- Values will be pulled from this generator in parallel; the amount of parallelisation is controlled by the blockWriteConcurrency option (default: 10)
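Yielding functions rather than results lets the consumer decide how many run at once. The sketch below shows one way a consumer could do that, by invoking the functions in batches of a given size; pullInBatches is a hypothetical helper and the importer's own scheduling may work differently.

```javascript
// Hypothetical consumer: runs the yielded functions `concurrency` at a time
// and emits their results in the original order.
async function * pullInBatches (source, concurrency) {
  let batch = []

  for await (const fn of source) {
    batch.push(fn)

    if (batch.length === concurrency) {
      // Start every function in the batch, then wait for all of them
      const results = await Promise.all(batch.map(f => f()))
      yield * results
      batch = []
    }
  }

  // Run whatever is left over as a final, smaller batch
  if (batch.length > 0) {
    const results = await Promise.all(batch.map(f => f()))
    yield * results
  }
}
```

Because nothing starts until the consumer calls a function, a higher concurrency value trades memory and I/O pressure for throughput.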
dagBuilder (function): Optional function that supports the signature async function * (source, blockstore, options)
- This function should read { path, content } entries from source and turn them into DAGs
- It should yield a function that returns a Promise that resolves to { cid, path, unixfs, node } where cid is a CID, path is a string, unixfs is a UnixFS entry and node is a DAGNode.
- Values will be pulled from this generator in parallel; the amount of parallelisation is controlled by the fileImportConcurrency option (default: 50)
treeBuilder (function): Optional function that supports the signature async function * (source, blockstore, options)
- This function should read { cid, path, unixfs, node } entries from source and place them in a directory structure
- It should yield an object with the properties { cid, path, unixfs, size } where cid is a CID, path is a string, unixfs is a UnixFS entry and size is a Number.
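The "place them in a directory structure" step can be pictured as grouping entries by path segment. The sketch below covers only that grouping, using nested Maps; it does not create directory nodes or CIDs, and buildDirectoryTree is a hypothetical helper rather than the module's tree builder.

```javascript
// Hypothetical helper: nests entries by path segment using Maps.
// Directories are { children: Map }, files are { entry }.
async function buildDirectoryTree (source) {
  const root = { children: new Map() }

  for await (const entry of source) {
    // Ignore empty segments so a leading slash does not create an unnamed directory
    const segments = entry.path.split('/').filter(Boolean)
    const name = segments.pop()

    // Walk down from the root, creating intermediate directories on the way
    let dir = root
    for (const segment of segments) {
      if (!dir.children.has(segment)) {
        dir.children.set(segment, { children: new Map() })
      }
      dir = dir.children.get(segment)
    }

    dir.children.set(name, { entry })
  }

  return root
}
```

A real tree builder would then walk such a structure from the leaves up, yielding a { cid, path, unixfs, size } object for each file and directory.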
Contribute
Feel free to join in. All welcome. Open an issue!
This repository falls under the IPFS Code of Conduct.
License
Licensed under either of