@ipld/car
Content Addressable aRchive format reader and writer
Install
$ npm i @ipld/car
Example
import fs from 'fs'
import { Readable } from 'stream'
import { CarReader, CarWriter } from '@ipld/car'
import * as raw from 'multiformats/codecs/raw'
import { CID } from 'multiformats/cid'
import { sha256 } from 'multiformats/hashes/sha2'
async function example () {
  const bytes = new TextEncoder().encode('random meaningless bytes')
  const hash = await sha256.digest(raw.encode(bytes))
  const cid = CID.create(1, raw.code, hash)
  const { writer, out } = await CarWriter.create([cid])
  Readable.from(out).pipe(fs.createWriteStream('example.car'))
  await writer.put({ cid, bytes })
  await writer.close()
  const inStream = fs.createReadStream('example.car')
  const reader = await CarReader.fromIterable(inStream)
  const roots = await reader.getRoots()
  const got = await reader.get(roots[0])
  console.log('Retrieved [%s] from example.car with CID [%s]',
    new TextDecoder().decode(got.bytes),
    roots[0].toString())
}
example().catch((err) => {
  console.error(err)
  process.exit(1)
})
Will output:
Retrieved [random meaningless bytes] from example.car with CID [bafkreihwkf6mtnjobdqrkiksr7qhp6tiiqywux64aylunbvmfhzeql2coa]
See the examples directory for more.
Usage
@ipld/car
is consumed through factory methods on its different classes. Each
class represents a discrete set of functionality. You should select the classes
that make the most sense for your use-case.
Please be aware that @ipld/car does not validate that block data matches
the paired CIDs when reading a CAR. See the verify-car.js example for one
possible approach to validating blocks as they are read. Any CID verification
requires that the hash function that was used to generate the CID be
available; the CAR format does not restrict the allowable multihashes.
The basic CarReader
class is consumed via:
import { CarReader } from '@ipld/car/reader'
Or alternatively: import { CarReader } from '@ipld/car'
. CommonJS require
will also work for the same import paths and references.
CarReader is useful for relatively small CAR archives as it buffers the
entirety of the archive in memory to provide access to its data. This class is
also suitable in a browser environment. The CarReader class provides
random-access get(key) and has(key) methods as well as iterators for blocks()
and cids().
CarReader can be instantiated from a single Uint8Array or from an
AsyncIterable of Uint8Arrays (note that Node.js streams are AsyncIterables and
can be consumed in this way).
The CarIndexedReader
class is a special form of CarReader
and can be
consumed in Node.js only (not in the browser) via:
import { CarIndexedReader } from '@ipld/car/indexed-reader'
Or alternatively: import { CarIndexedReader } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
A CarIndexedReader
provides the same functionality as CarReader
but is
instantiated from a path to a CAR file and also
adds a close()
method that must be called when the reader
is no longer required, to clean up resources.
CarIndexedReader
performs a single full-scan of a CAR file, collecting a list
of CID
s and their block positions in the archive. It then performs
random-access reads when blocks are requested via get()
and the blocks()
and
cids()
iterators.
This class may be suitable for random-access (primarily via has() and get())
to relatively large CAR files.
import { CarBlockIterator } from '@ipld/car/iterator'
import { CarCIDIterator } from '@ipld/car/iterator'
Or alternatively:
import { CarBlockIterator, CarCIDIterator } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
These two classes provide AsyncIterable
s to the blocks or just the CIDs
contained within a CAR archive. These are efficient mechanisms for scanning an
entire CAR archive, regardless of size, if random-access to blocks is not
required.
CarBlockIterator
and CarCIDIterator
can be instantiated from a
single Uint8Array
(see
CarBlockIterator.fromBytes()
and
CarCIDIterator.fromBytes()
) or from
an AsyncIterable
of Uint8Array
s (see
CarBlockIterator.fromIterable()
and
CarCIDIterator.fromIterable()
)—note that
Node.js streams are AsyncIterable
s and can be consumed in this way.
The CarIndexer
class can be used to scan a CAR archive and provide indexing
data on the contents. It can be consumed via:
import { CarIndexer } from '@ipld/car/indexer'
Or alternatively: import { CarIndexer } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
This class is used within CarIndexedReader
and is only
useful in cases where an external index of a CAR needs to be generated and used.
The index data can also be used with CarReader.readRaw() to fetch block data
directly from a file descriptor using the index data for that block.
A CarWriter
is used to create new CAR archives. It can be consumed via:
import { CarWriter } from '@ipld/car/writer'
Or alternatively: import { CarWriter } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
Creation of a CarWriter
involves a "channel", or a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair. The writer
side
of the channel is used to put()
blocks, while the out
side of the channel emits the bytes that form the encoded CAR archive.
In Node.js, you can use the
Readable.from()
API to convert the out
AsyncIterable
to a standard Node.js stream, or it can
be directly fed to a
stream.pipeline()
.
API
class CarReader
Properties:
version
(number)
: The version number of the CAR referenced by this
reader (should be 1
or 2
).
Provides blockstore-like access to a CAR.
Implements the RootsReader
interface:
getRoots()
. And the BlockReader
interface:
get()
, has()
,
blocks()
(defined as a BlockIterator
) and
cids()
(defined as a CIDIterator
).
Load this class with either import { CarReader } from '@ipld/car/reader'
(const { CarReader } = require('@ipld/car/reader')
). Or
import { CarReader } from '@ipld/car'
(const { CarReader } = require('@ipld/car')
).
The former will likely result in smaller bundle sizes where this is
important.
async CarReader#getRoots()
Get the list of roots defined by the CAR referenced by this reader. May be
zero or more CID
s.
async CarReader#has(key)
Check whether a given CID
exists within the CAR referenced by this
reader.
async CarReader#get(key)
Fetch a Block
(a { cid:CID, bytes:Uint8Array }
pair) from the CAR
referenced by this reader matching the provided CID
. In the case where
the provided CID
doesn't exist within the CAR, undefined
will be
returned.
async * CarReader#blocks()
- Returns:
AsyncGenerator<Block>
Returns a BlockIterator
(AsyncIterable<Block>
) that iterates over all
of the Block
s ({ cid:CID, bytes:Uint8Array }
pairs) contained within
the CAR referenced by this reader.
async * CarReader#cids()
- Returns:
AsyncGenerator<CID>
Returns a CIDIterator
(AsyncIterable<CID>
) that iterates over all of
the CID
s contained within the CAR referenced by this reader.
async CarReader.fromBytes(bytes)
Instantiate a CarReader
from a Uint8Array
blob. This performs a
decode fully in memory and maintains the decoded state in memory for full
access to the data via the CarReader
API.
async CarReader.fromIterable(asyncIterable)
Instantiate a CarReader from an AsyncIterable<Uint8Array>, such as
a modern Node.js stream.
This performs a decode fully in memory and maintains the decoded state in
memory for full access to the data via the CarReader
API.
Care should be taken for large archives; this API may not be appropriate
where memory is a concern or the archive is potentially larger than the
amount of memory that the runtime can handle.
async CarReader.readRaw(fd, blockIndex)
-
fd
(fs.promises.FileHandle|number)
: A file descriptor from the
Node.js fs
module. Either an integer, from fs.open()
or a FileHandle
from fs.promises.open()
.
-
blockIndex
(BlockIndex)
: An index pointing to the location of the
Block required. This BlockIndex
should take the form:
{cid:CID, blockLength:number, blockOffset:number}
.
-
Returns: Promise<Block>
: A { cid:CID, bytes:Uint8Array }
pair.
Reads a block directly from a file descriptor for an open CAR file. This
function is only available in Node.js and not a browser environment.
This function can be used in connection with CarIndexer
which emits
the BlockIndex
objects that are required by this function.
The user is responsible for opening and closing the file used in this call.
class CarIndexedReader
Properties:
version
(number)
: The version number of the CAR referenced by this
reader (should be 1
).
A form of CarReader that pre-indexes a CAR archive from a file and
provides random access to blocks within the file using the index data. This
class is only available in Node.js and not a browser environment.
For large CAR files, using this form of CarReader can be significantly more
efficient in terms of memory. The index consists of a list of CIDs and
their location within the archive (see CarIndexer). For large numbers
of blocks, this index can also occupy a significant amount of memory. In some
cases it may be necessary to expand the memory capacity of a Node.js instance
to allow this index to fit (e.g. by running with
NODE_OPTIONS="--max-old-space-size=16384").
As a CarIndexedReader instance maintains an open file descriptor for its
CAR file, an additional CarIndexedReader#close method is attached. This
must be called to have full clean-up of resources after use.
Load this class with either
import { CarIndexedReader } from '@ipld/car/indexed-reader'
(const { CarIndexedReader } = require('@ipld/car/indexed-reader')
). Or
import { CarIndexedReader } from '@ipld/car'
(const { CarIndexedReader } = require('@ipld/car')
). The former will likely
result in smaller bundle sizes where this is important.
async CarIndexedReader#getRoots()
See CarReader#getRoots
async CarIndexedReader#has(key)
See CarReader#has
async CarIndexedReader#get(key)
See CarReader#get
async * CarIndexedReader#blocks()
- Returns:
AsyncGenerator<Block>
See CarReader#blocks
async * CarIndexedReader#cids()
- Returns:
AsyncGenerator<CID>
See CarReader#cids
async CarIndexedReader#close()
Close the underlying file descriptor maintained by this CarIndexedReader
.
This must be called for proper resource clean-up to occur.
async CarIndexedReader.fromFile(path)
Instantiate a CarIndexedReader from a file with the provided path. The CAR
file is first indexed with a full pass that collects CIDs and block
locations. This index is maintained in memory. Subsequent reads operate on a
read-only file descriptor, fetching the block from its in-file location.
For large archives, the initial indexing may take some time. The returned
Promise
will resolve only after this is complete.
class CarBlockIterator
Properties:
version
(number)
: The version number of the CAR referenced by this
iterator (should be 1
).
Provides an iterator over all of the Block
s in a CAR. Implements a
BlockIterator
interface, or AsyncIterable<Block>
. Where a Block
is
a { cid:CID, bytes:Uint8Array }
pair.
As an implementer of AsyncIterable, this class can be used directly in a
for await (const block of iterator) {} loop, where the iterator is
constructed using CarBlockIterator.fromBytes or
CarBlockIterator.fromIterable.
An iteration can only be performed once per instantiation.
CarBlockIterator
also implements the RootsReader
interface and provides
the getRoots()
method.
Load this class with either
import { CarBlockIterator } from '@ipld/car/iterator'
(const { CarBlockIterator } = require('@ipld/car/iterator')
). Or
import { CarBlockIterator } from '@ipld/car'
(const { CarBlockIterator } = require('@ipld/car')
).
async CarBlockIterator#getRoots()
Get the list of roots defined by the CAR referenced by this iterator. May be
zero or more CID
s.
async CarBlockIterator.fromBytes(bytes)
Instantiate a CarBlockIterator from a Uint8Array blob. Rather
than decoding the entire byte array prior to returning the iterator, as in
CarReader.fromBytes, only the header is decoded and the remainder
of the CAR is parsed as the Blocks are yielded.
async CarBlockIterator.fromIterable(asyncIterable)
Instantiate a CarBlockIterator from an AsyncIterable<Uint8Array>,
such as a modern Node.js stream.
Rather than decoding the entire byte array prior to returning the iterator,
as in CarReader.fromIterable, only the header is decoded and the
remainder of the CAR is parsed as the Blocks are yielded.
class CarCIDIterator
Properties:
version
(number)
: The version number of the CAR referenced by this
iterator (should be 1
).
Provides an iterator over all of the CID
s in a CAR. Implements a
CIDIterator
interface, or AsyncIterable<CID>
. Similar to
CarBlockIterator
but only yields the CIDs in the CAR.
As an implementer of AsyncIterable, this class can be used directly in a
for await (const cid of iterator) {} loop, where the iterator is
constructed using CarCIDIterator.fromBytes or
CarCIDIterator.fromIterable.
An iteration can only be performed once per instantiation.
CarCIDIterator
also implements the RootsReader
interface and provides
the getRoots()
method.
Load this class with either
import { CarCIDIterator } from '@ipld/car/iterator'
(const { CarCIDIterator } = require('@ipld/car/iterator')
). Or
import { CarCIDIterator } from '@ipld/car'
(const { CarCIDIterator } = require('@ipld/car')
).
async CarCIDIterator#getRoots()
Get the list of roots defined by the CAR referenced by this iterator. May be
zero or more CID
s.
async CarCIDIterator.fromBytes(bytes)
Instantiate a CarCIDIterator from a Uint8Array blob. Rather
than decoding the entire byte array prior to returning the iterator, as in
CarReader.fromBytes, only the header is decoded and the remainder
of the CAR is parsed as the CIDs are yielded.
async CarCIDIterator.fromIterable(asyncIterable)
Instantiate a CarCIDIterator from an AsyncIterable<Uint8Array>,
such as a modern Node.js stream.
Rather than decoding the entire byte array prior to returning the iterator,
as in CarReader.fromIterable, only the header is decoded and the
remainder of the CAR is parsed as the CIDs are yielded.
class CarIndexer
Properties:
version
(number)
: The version number of the CAR referenced by this
reader (should be 1
).
Provides an iterator over all of the Block
s in a CAR, returning their CIDs
and byte-location information. Implements an AsyncIterable<BlockIndex>
.
Where a BlockIndex
is a
{ cid:CID, length:number, offset:number, blockLength:number, blockOffset:number }
.
As an implementer of AsyncIterable, this class can be used directly in a
for await (const blockIndex of iterator) {} loop, where the iterator is
constructed using CarIndexer.fromBytes or
CarIndexer.fromIterable.
An iteration can only be performed once per instantiation.
CarIndexer
also implements the RootsReader
interface and provides
the getRoots()
method.
Load this class with either
import { CarIndexer } from '@ipld/car/indexer'
(const { CarIndexer } = require('@ipld/car/indexer')
). Or
import { CarIndexer } from '@ipld/car'
(const { CarIndexer } = require('@ipld/car')
). The former will likely
result in smaller bundle sizes where this is important.
async CarIndexer#getRoots()
Get the list of roots defined by the CAR referenced by this indexer. May be
zero or more CID
s.
async CarIndexer.fromBytes(bytes)
Instantiate a CarIndexer
from a Uint8Array
blob. Only the header
is decoded initially, the remainder is processed and emitted via the
iterator as it is consumed.
async CarIndexer.fromIterable(asyncIterable)
Instantiate a CarIndexer from an AsyncIterable<Uint8Array>,
such as a modern Node.js stream. Only the header
is decoded initially; the remainder is processed and emitted via the
iterator as it is consumed.
class CarWriter
Provides a writer interface for the creation of CAR files.
Creation of a CarWriter involves the instantiation of an input / output pair
in the form of a WriterChannel, which is a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair. These two
components form what can be thought of as a stream-like interface. The
writer
component (an instantiated CarWriter
), has methods to
put()
new blocks and close()
the writing operation (finalising the CAR archive). The out
component is
an AsyncIterable
that yields the bytes of the archive. This can be
redirected to a file or other sink. In Node.js, you can use the
Readable.from()
API to convert this to a standard Node.js stream, or it can be directly fed
to a
stream.pipeline()
.
The channel will provide a form of backpressure. The Promise from a
put() won't resolve until the resulting data is drained from the out
iterable.
It is also possible to ignore the Promise from put() calls and allow
the generated data to queue in memory. This should be avoided for large CAR
archives due to the memory costs and potential for memory overflow.
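To illustrate the backpressure idea in isolation, here is a dependency-free sketch of a channel whose put() promise resolves only once the consumer has taken the chunk from out. This is not the library's implementation; createChannel and its internals are hypothetical and exist only to show the behaviour.

```javascript
// A hypothetical writer/out pair, for illustration only.
function createChannel () {
  const queue = [] // pending { chunk, resolve } entries awaiting a consumer
  let wake = null // resolver for a consumer waiting on an empty queue
  let closed = false

  // Wake the consumer if it is currently waiting for data or for close().
  function notify () {
    if (wake) {
      wake()
      wake = null
    }
  }

  const writer = {
    // Resolves only when the consumer pulls this chunk from `out`.
    put (chunk) {
      return new Promise((resolve) => {
        queue.push({ chunk, resolve })
        notify()
      })
    },
    async close () {
      closed = true
      notify()
    }
  }

  async function * drain () {
    while (true) {
      if (queue.length > 0) {
        const { chunk, resolve } = queue.shift()
        resolve() // the chunk has been taken, so release the producer
        yield chunk
      } else if (closed) {
        return
      } else {
        await new Promise((resolve) => { wake = resolve })
      }
    }
  }

  return { writer, out: drain() }
}
```

A producer that awaits each put() therefore never runs ahead of the consumer, while one that ignores the returned promises simply grows the queue in memory.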
Load this class with either
import { CarWriter } from '@ipld/car/writer'
(const { CarWriter } = require('@ipld/car/writer')
). Or
import { CarWriter } from '@ipld/car'
(const { CarWriter } = require('@ipld/car')
). The former will likely
result in smaller bundle sizes where this is important.
async CarWriter#put(block)
-
block
(Block)
: A { cid:CID, bytes:Uint8Array }
pair.
-
Returns: Promise<void>
: The returned promise will only resolve once the
bytes this block generates are written to the out
iterable.
Write a Block
(a { cid:CID, bytes:Uint8Array }
pair) to the archive.
async CarWriter#close()
Finalise the CAR archive and signal that the out
iterable should end once
any remaining bytes are written.
async CarWriter.create(roots)
Create a new CAR writer "channel" which consists of a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair.
async CarWriter.createAppender()
- Returns:
WriterChannel
: The channel takes the form of
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
.
Create a new CAR appender "channel" which consists of a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair.
This appender does not consider roots and does not produce a CAR header.
It is designed to append blocks to an existing CAR archive. It is
expected that out
will be concatenated onto the end of an existing
archive that already has a properly formatted header.
async CarWriter.updateRootsInBytes(bytes, roots)
-
bytes
(Uint8Array)
-
roots
(CID[])
: A new list of roots to replace the existing list in
the CAR header. The new header must take up the same number of bytes as the
existing header, so the roots should collectively be the same byte length
as the existing roots.
-
Returns: Promise<Uint8Array>
Update the list of roots in the header of an existing CAR as represented
in a Uint8Array.
This operation is an overwrite, the total length of the CAR will not be
modified. A rejection will occur if the new header will not be the same
length as the existing header, in which case the CAR will not be modified.
It is the responsibility of the user to ensure that the roots being
replaced encode as the same length as the new roots.
The byte array passed in an argument will be modified and also returned
upon successful modification.
async CarWriter.updateRootsInFile(fd, roots)
-
fd
(fs.promises.FileHandle|number)
: A file descriptor from the
Node.js fs
module. Either an integer, from fs.open()
or a FileHandle
from fs.promises.open()
.
-
roots
(CID[])
: A new list of roots to replace the existing list in
the CAR header. The new header must take up the same number of bytes as the
existing header, so the roots should collectively be the same byte length
as the existing roots.
-
Returns: Promise<void>
Update the list of roots in the header of an existing CAR file. The first
argument must be a file descriptor for a CAR file that is open in read and
write mode (not append), e.g. fs.open or fs.promises.open with 'r+' mode.
This operation is an overwrite, the total length of the CAR will not be
modified. A rejection will occur if the new header will not be the same
length as the existing header, in which case the CAR will not be modified.
It is the responsibility of the user to ensure that the roots being
replaced encode as the same length as the new roots.
This function is only available in Node.js and not a browser
environment.
class CarBufferWriter
A simple CAR writer that writes to a pre-allocated buffer.
CarBufferWriter#addRoot(root, options)
-
root
(CID)
-
options
-
Returns: CarBufferWriter
Add a root to this writer, to be used to create a header when the CAR is
finalized with close().
CarBufferWriter#write(block)
Write a Block
(a { cid:CID, bytes:Uint8Array }
pair) to the archive.
Throws if there is not enough capacity.
CarBufferWriter#close([options])
Finalize the CAR and return it as a Uint8Array
.
CarBufferWriter.blockLength(Block)
-
block
(Block)
-
Returns: number
Calculates the number of bytes required for storing a given block in a CAR.
Useful in estimating the size of an ArrayBuffer for the CarBufferWriter.
-
rootLengths
(number[])
-
Returns: number
Calculates header size given the array of byteLength for roots.
-
options
(object)
-
Returns: number
Calculates header size given the array of roots.
Estimates header size given a count of the roots and the expected byte length
of the root CIDs. The default length works for a standard CIDv1 with a
single-byte multihash code, such as SHA2-256 (i.e. the most common CIDv1).
CarBufferWriter.createWriter(buffer[, options])
Creates a synchronous CAR writer that can be used to encode blocks into a given
buffer. Optionally, you can pass byteOffset and byteLength to specify a
range inside the buffer to write into. If the CAR file is going to have roots,
you need to either pass them under options.roots (from which the header size
will be calculated) or provide options.headerSize to allocate the required
space in the buffer. You may also provide both known roots and headerSize to
allocate space for roots that may not be known ahead of time.
Note: Incorrect headerSize
may lead to copying bytes inside a buffer
which will have a negative impact on performance.
Reads header data from a BytesReader
. The header may either be in the form
of a CarHeader
or CarV2Header
depending on the CAR being read.
async decoder.readBlockHead(reader)
Reads the leading data of an individual block from CAR data from a
BytesReader
. Returns a BlockHeader
object which contains
{ cid, length, blockLength }
which can be used to either index the block
or read the block binary data.
decoder.createDecoder(reader)
-
reader
(BytesReader)
-
Returns: CarDecoder
Creates a CarDecoder from a BytesReader. The CarDecoder is an async
interface that will consume the bytes from the BytesReader to yield a
header() and either blocks() or blocksIndex() data.
decoder.bytesReader(bytes)
-
bytes
(Uint8Array)
-
Returns: BytesReader
Creates a BytesReader
from a Uint8Array
.
decoder.asyncIterableReader(asyncIterable)
Creates a BytesReader
from an AsyncIterable<Uint8Array>
, which allows for
consumption of CAR data from a streaming source.
decoder.limitReader(reader, byteLimit)
-
reader
(BytesReader)
-
byteLimit
(number)
-
Returns: BytesReader
Wraps a BytesReader
in a limiting BytesReader
which limits maximum read
to byteLimit
bytes. It does not update pos
of the original
BytesReader
.
License
Licensed under either of
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.