@ipld/car
A JavaScript Content Addressable aRchive (CAR) file reader and writer.
Example
import fs from 'fs'
import { Readable } from 'stream'
import { CarReader, CarWriter } from '@ipld/car'
import raw from 'multiformats/codecs/raw'
import CID from 'multiformats/cid'
import { sha256 } from 'multiformats/hashes/sha2'
async function example () {
  const bytes = new TextEncoder().encode('random meaningless bytes')
  const hash = await sha256.digest(raw.encode(bytes))
  const cid = CID.create(1, raw.code, hash)

  const { writer, out } = await CarWriter.create([cid])
  Readable.from(out).pipe(fs.createWriteStream('example.car'))

  await writer.put({ cid, bytes })
  await writer.close()

  const inStream = fs.createReadStream('example.car')
  const reader = await CarReader.fromIterable(inStream)

  const roots = await reader.getRoots()
  const got = await reader.get(roots[0])

  console.log('Retrieved [%s] from example.car with CID [%s]',
    new TextDecoder().decode(got.bytes),
    roots[0].toString())
}

example().catch((err) => {
  console.error(err)
  process.exit(1)
})
Will output:
Retrieved [random meaningless bytes] from example.car with CID [bafkreihwkf6mtnjobdqrkiksr7qhp6tiiqywux64aylunbvmfhzeql2coa]
Usage
@ipld/car
is consumed through factory methods on its different classes. Each
class represents a discrete set of functionality. You should select the classes
that make the most sense for your use-case.
The basic CarReader
class is consumed via:
import CarReader from '@ipld/car/reader'
Or alternatively: import { CarReader } from '@ipld/car'
. CommonJS require
will also work for the same import paths and references.
CarReader is useful for relatively small CAR archives as it buffers the
entirety of the archive in memory to provide access to its data. This class is
also suitable in a browser environment. The CarReader class provides
random-access get(key) and has(key) methods as well as iterators for blocks()
and cids().
CarReader
can be instantiated from a
single Uint8Array
or from
an AsyncIterable
of Uint8Array
s (note that
Node.js streams are AsyncIterable
s and can be consumed in this way).
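As a minimal sketch of the random-access methods described above, the helper below checks for a CID with has() and then fetches the block with get(). The name bytesFor is hypothetical and not part of @ipld/car; it only assumes that the reader passed in exposes has(key) and get(key) as described in this section.

```javascript
// Hypothetical helper (not part of @ipld/car): return the bytes stored for
// `cid`, or null when the reader does not hold a block for it.
// `reader` is assumed to expose the async has(key) and get(key) methods.
async function bytesFor (reader, cid) {
  if (!(await reader.has(cid))) {
    return null
  }
  const block = await reader.get(cid)
  return block.bytes
}
```

With a real archive, the reader would typically come from CarReader.fromBytes() or CarReader.fromIterable().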
The CarIndexedReader
class is a special form of CarReader
and can be
consumed in Node.js only (not in the browser) via:
import CarIndexedReader from '@ipld/car/indexed-reader'
Or alternatively: import { CarIndexedReader } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
A CarIndexedReader
provides the same functionality as CarReader
but is
instantiated from a path to a CAR file and also
adds a close()
method that must be called when the reader
is no longer required, to clean up resources.
CarIndexedReader
performs a single full-scan of a CAR file, collecting a list
of CID
s and their block positions in the archive. It then performs
random-access reads when blocks are requested via get()
and the blocks()
and
cids()
iterators.
This class may be suitable for random-access (primarily via has() and get())
to relatively large CAR files.
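Because close() must be called once the reader is no longer needed, one way to make that hard to forget is a try/finally wrapper. This is only a sketch: withReader is a hypothetical helper name, and it assumes nothing beyond an object with an async close() method.

```javascript
// Hypothetical helper (not part of @ipld/car): open a reader, run `fn` with
// it, and always call close(), even if `fn` throws.
// `open` is any function returning a Promise for an object with close().
async function withReader (open, fn) {
  const reader = await open()
  try {
    return await fn(reader)
  } finally {
    await reader.close()
  }
}
```

In Node.js this might be called as withReader(() => CarIndexedReader.fromFile('example.car'), (reader) => reader.getRoots()).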
import { CarBlockIterator } from '@ipld/car/iterator'
import { CarCIDIterator } from '@ipld/car/iterator'
Or alternatively:
import { CarBlockIterator, CarCIDIterator } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
These two classes provide AsyncIterable
s to the blocks or just the CIDs
contained within a CAR archive. These are efficient mechanisms for scanning an
entire CAR archive, regardless of size, if random-access to blocks is not
required.
CarBlockIterator
and CarCIDIterator
can be instantiated from a
single Uint8Array
(see
CarBlockIterator.fromBytes()
and
CarCIDIterator.fromBytes()
) or from
an AsyncIterable
of Uint8Array
s (see
CarBlockIterator.fromIterable()
and
CarCIDIterator.fromIterable()
)—note that
Node.js streams are AsyncIterable
s and can be consumed in this way.
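The scanning pattern described above can be sketched as a single for await loop that keeps only running totals, so memory use does not grow with the size of the archive. The helper name summarize is hypothetical; it accepts any AsyncIterable of { cid, bytes } pairs, which is the shape a CarBlockIterator is described as yielding.

```javascript
// Hypothetical helper (not part of @ipld/car): count the blocks and total
// payload bytes in one pass, without retaining any block.
// `blocks` is assumed to be an AsyncIterable of { cid, bytes } pairs.
async function summarize (blocks) {
  let count = 0
  let totalBytes = 0
  for await (const { bytes } of blocks) {
    count++
    totalBytes += bytes.length
  }
  return { count, totalBytes }
}
```

For a file on disk, the argument could be the result of CarBlockIterator.fromIterable(fs.createReadStream('example.car')).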
The CarIndexer
class can be used to scan a CAR archive and provide indexing
data on the contents. It can be consumed via:
import CarIndexer from '@ipld/car/indexer'
Or alternatively: import { CarIndexer } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
This class is used within CarIndexedReader
and is only
useful in cases where an external index of a CAR needs to be generated and used.
The index data can also be used with CarReader.readRaw() to fetch block data
directly from a file descriptor using the index data for that block.
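One possible way to build such an external index is to collect the entries an indexer yields into a Map keyed by the string form of each CID. This is a sketch under assumptions: buildIndex is a hypothetical name, and the entries are assumed to carry the cid, blockLength and blockOffset fields described in the API section.

```javascript
// Hypothetical helper (not part of @ipld/car): collect index entries into a
// Map keyed by the CID's string form.
// `indexer` is assumed to be an AsyncIterable of objects that include
// { cid, blockLength, blockOffset }.
async function buildIndex (indexer) {
  const index = new Map()
  for await (const { cid, blockLength, blockOffset } of indexer) {
    index.set(cid.toString(), { cid, blockLength, blockOffset })
  }
  return index
}
```

An entry looked up from this Map has the shape that CarReader.readRaw() is documented to accept as its blockIndex argument.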
A CarWriter
is used to create new CAR archives. It can be consumed via:
import CarWriter from '@ipld/car/writer'
Or alternatively: import { CarWriter } from '@ipld/car'
. CommonJS
require
will also work for the same import paths and references.
Creation of a CarWriter
involves a "channel", or a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair. The writer
side
of the channel is used to put()
blocks, while the out
side of the channel emits the bytes that form the encoded CAR archive.
In Node.js, you can use the
Readable.from()
API to convert the out
AsyncIterable
to a standard Node.js stream, or it can
be directly fed to a
stream.pipeline()
.
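A minimal sketch of draining the out side of the channel to a file in Node.js, using only standard library calls. The helper name saveTo is hypothetical, and the block is written with require() so it runs as a plain script; out can be any AsyncIterable of Uint8Array chunks.

```javascript
const fs = require('node:fs')
const { Readable } = require('node:stream')
const { pipeline } = require('node:stream/promises')

// Hypothetical helper (not part of @ipld/car): write every chunk from `out`
// to the file at `path`. The returned Promise settles once the file stream
// has finished, or rejects if either side fails.
async function saveTo (out, path) {
  await pipeline(Readable.from(out), fs.createWriteStream(path))
}
```

Start draining before awaiting writer.put(), for example by calling saveTo(out, 'example.car') without awaiting it, then awaiting the returned Promise after writer.close().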
API
class CarReader
Properties:
version
(number)
: The version number of the CAR referenced by this
reader (should be 1
).
Provides blockstore-like access to a CAR.
Implements the RootsReader
interface:
getRoots()
. And the BlockReader
interface:
get()
, has()
,
blocks()
(defined as a BlockIterator
) and
cids()
(defined as a CIDIterator
).
Load this class with either import CarReader from '@ipld/car/reader'
(const CarReader = require('@ipld/car/reader')
). Or
import { CarReader } from '@ipld/car'
(const { CarReader } = require('@ipld/car')
).
async CarReader#getRoots()
Get the list of roots defined by the CAR referenced by this reader. May be
zero or more CID
s.
async CarReader#has(key)
Check whether a given CID
exists within the CAR referenced by this
reader.
async CarReader#get(key)
Fetch a Block
(a { cid:CID, bytes:Uint8Array }
pair) from the CAR
referenced by this reader matching the provided CID
. In the case where
the provided CID
doesn't exist within the CAR, undefined
will be
returned.
async * CarReader#blocks()
- Returns:
AsyncGenerator<Block>
Returns a BlockIterator
(AsyncIterable<Block>
) that iterates over all
of the Block
s ({ cid:CID, bytes:Uint8Array }
pairs) contained within
the CAR referenced by this reader.
async * CarReader#cids()
- Returns:
AsyncGenerator<CID>
Returns a CIDIterator
(AsyncIterable<CID>
) that iterates over all of
the CID
s contained within the CAR referenced by this reader.
async CarReader.fromBytes(bytes)
Instantiate a CarReader
from a Uint8Array
blob. This performs a
decode fully in memory and maintains the decoded state in memory for full
access to the data via the CarReader
API.
async CarReader.fromIterable(asyncIterable)
Instantiate a CarReader
from an AsyncIterable<Uint8Array>
, such as
a modern Node.js stream.
This performs a decode fully in memory and maintains the decoded state in
memory for full access to the data via the CarReader
API.
Care should be taken for large archives; this API may not be appropriate
where memory is a concern or the archive is potentially larger than the
amount of memory that the runtime can handle.
async CarReader.readRaw(fd, blockIndex)
-
fd
(fs.promises.FileHandle|number)
: A file descriptor from the
Node.js fs
module. Either an integer, from fs.open()
or a FileHandle
from fs.promises.open()
.
-
blockIndex
(BlockIndex)
: An index pointing to the location of the
Block required. This BlockIndex
should take the form:
{cid:CID, blockLength:number, blockOffset:number}
.
-
Returns: Promise<Block>
: A { cid:CID, bytes:Uint8Array }
pair.
Reads a block directly from a file descriptor for an open CAR file. This
function is only available in Node.js and not a browser environment.
This function can be used in connection with CarIndexer
which emits
the BlockIndex
objects that are required by this function.
The user is responsible for opening and closing the file used in this call.
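To illustrate the idea of a positional read, and the open/close responsibility that sits with the caller, here is a standard-library sketch. It is not the library's implementation of readRaw(); readRange is a hypothetical helper that reads blockLength bytes starting at blockOffset and closes the file handle in a finally block.

```javascript
const fsp = require('node:fs/promises')

// Hypothetical helper (not part of @ipld/car): open `path`, read the byte
// range described by a BlockIndex-shaped object, and always close the handle.
async function readRange (path, { blockLength, blockOffset }) {
  const fd = await fsp.open(path, 'r')
  try {
    const bytes = new Uint8Array(blockLength)
    // FileHandle#read(buffer, offset, length, position)
    const { bytesRead } = await fd.read(bytes, 0, blockLength, blockOffset)
    if (bytesRead !== blockLength) {
      throw new Error(`expected ${blockLength} bytes, read ${bytesRead}`)
    }
    return bytes
  } finally {
    await fd.close()
  }
}
```

In practice the body of the try block would call CarReader.readRaw(fd, blockIndex) instead, which also returns the CID alongside the bytes.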
class CarIndexedReader
Properties:
version
(number)
: The version number of the CAR referenced by this
reader (should be 1
).
A form of CarReader
that pre-indexes a CAR archive from a file and
provides random access to blocks within the file using the index data. This
class is only available in Node.js and not a browser environment.
For large CAR files, using this form of CarReader
can be significantly more
efficient in terms of memory. The index consists of a list of CID
s and
their location within the archive (see CarIndexer
). For large numbers
of blocks, this index can also occupy a significant amount of memory. In some
cases it may be necessary to expand the memory capacity of a Node.js instance
to allow this index to fit. (e.g. by running with
NODE_OPTIONS="--max-old-space-size=16384"
).
As a CarIndexedReader
instance maintains an open file descriptor for its
CAR file, an additional CarIndexedReader#close
method is attached. This
must be called to have full clean-up of resources after use.
Load this class with either
import CarIndexedReader from '@ipld/car/indexed-reader'
(const CarIndexedReader = require('@ipld/car/indexed-reader')
). Or
import { CarIndexedReader } from '@ipld/car'
(const { CarIndexedReader } = require('@ipld/car')
).
async CarIndexedReader#getRoots()
See CarReader#getRoots
async CarIndexedReader#has(key)
See CarReader#has
async CarIndexedReader#get(key)
See CarReader#get
async * CarIndexedReader#blocks()
- Returns:
AsyncGenerator<Block>
See CarReader#blocks
async * CarIndexedReader#cids()
- Returns:
AsyncGenerator<CID>
See CarReader#cids
async CarIndexedReader#close()
Close the underlying file descriptor maintained by this CarIndexedReader
.
This must be called for proper resource clean-up to occur.
async CarIndexedReader.fromFile(path)
Instantiate a CarIndexedReader
from a file with the provided
path
. The CAR file is first indexed with a full pass that collects CID
s
and block locations. This index is maintained in memory. Subsequent reads
operate on a read-only file descriptor, fetching the block from its in-file
location.
For large archives, the initial indexing may take some time. The returned
Promise
will resolve only after this is complete.
class CarBlockIterator
Properties:
version
(number)
: The version number of the CAR referenced by this
iterator (should be 1
).
Provides an iterator over all of the Block
s in a CAR. Implements a
BlockIterator
interface, or AsyncIterable<Block>
. Where a Block
is
a { cid:CID, bytes:Uint8Array }
pair.
As an implementer of AsyncIterable
, this class can be used directly in a
for await (const block of iterator) {}
loop. Where the iterator
is
constructed using CarBlockIterator.fromBytes
or
CarBlockIterator.fromIterable
.
An iteration can only be performed once per instantiation.
CarBlockIterator
also implements the RootsReader
interface and provides
the getRoots()
method.
Load this class with either
import { CarBlockIterator } from '@ipld/car/iterator'
(const { CarBlockIterator } = require('@ipld/car/iterator')
). Or
import { CarBlockIterator } from '@ipld/car'
(const { CarBlockIterator } = require('@ipld/car')
).
async CarBlockIterator#getRoots()
Get the list of roots defined by the CAR referenced by this iterator. May be
zero or more CID
s.
async CarBlockIterator.fromBytes(bytes)
Instantiate a CarBlockIterator
from a Uint8Array
blob. Rather
than decoding the entire byte array prior to returning the iterator, as in
CarReader.fromBytes
, only the header is decoded and the remainder
of the CAR is parsed as each Block is yielded.
async CarBlockIterator.fromIterable(asyncIterable)
Instantiate a CarBlockIterator
from an AsyncIterable<Uint8Array>
,
such as a modern Node.js stream.
Rather than decoding the entire byte array prior to returning the iterator,
as in CarReader.fromIterable
, only the header is decoded and the
remainder of the CAR is parsed as each Block is yielded.
class CarCIDIterator
Properties:
version
(number)
: The version number of the CAR referenced by this
iterator (should be 1
).
Provides an iterator over all of the CID
s in a CAR. Implements a
CIDIterator
interface, or AsyncIterable<CID>
. Similar to
CarBlockIterator
but only yields the CIDs in the CAR.
As an implementer of AsyncIterable
, this class can be used directly in a
for await (const cid of iterator) {}
loop. Where the iterator
is
constructed using CarCIDIterator.fromBytes
or
CarCIDIterator.fromIterable
.
An iteration can only be performed once per instantiation.
CarCIDIterator
also implements the RootsReader
interface and provides
the getRoots()
method.
Load this class with either
import { CarCIDIterator } from '@ipld/car/iterator'
(const { CarCIDIterator } = require('@ipld/car/iterator')
). Or
import { CarCIDIterator } from '@ipld/car'
(const { CarCIDIterator } = require('@ipld/car')
).
async CarCIDIterator#getRoots()
Get the list of roots defined by the CAR referenced by this iterator. May be
zero or more CID
s.
async CarCIDIterator.fromBytes(bytes)
Instantiate a CarCIDIterator
from a Uint8Array
blob. Rather
than decoding the entire byte array prior to returning the iterator, as in
CarReader.fromBytes
, only the header is decoded and the remainder
of the CAR is parsed as each CID is yielded.
async CarCIDIterator.fromIterable(asyncIterable)
Instantiate a CarCIDIterator
from an AsyncIterable<Uint8Array>
,
such as a modern Node.js stream.
Rather than decoding the entire byte array prior to returning the iterator,
as in CarReader.fromIterable
, only the header is decoded and the
remainder of the CAR is parsed as each CID is yielded.
class CarIndexer
Properties:
version
(number)
: The version number of the CAR referenced by this
indexer (should be 1
).
Provides an iterator over all of the Block
s in a CAR, returning their CIDs
and byte-location information. Implements an AsyncIterable<BlockIndex>
.
Where a BlockIndex
is a
{ cid:CID, length:number, offset:number, blockLength:number, blockOffset:number }
.
As an implementer of AsyncIterable
, this class can be used directly in a
for await (const blockIndex of iterator) {}
loop. Where the iterator
is
constructed using CarIndexer.fromBytes
or
CarIndexer.fromIterable
.
An iteration can only be performed once per instantiation.
CarIndexer
also implements the RootsReader
interface and provides
the getRoots()
method.
Load this class with either
import CarIndexer from '@ipld/car/indexer'
(const CarIndexer = require('@ipld/car/indexer')
). Or
import { CarIndexer } from '@ipld/car'
(const { CarIndexer } = require('@ipld/car')
).
async CarIndexer#getRoots()
Get the list of roots defined by the CAR referenced by this indexer. May be
zero or more CID
s.
async CarIndexer.fromBytes(bytes)
Instantiate a CarIndexer
from a Uint8Array
blob. Only the header
is decoded initially, the remainder is processed and emitted via the
iterator as it is consumed.
async CarIndexer.fromIterable(asyncIterable)
Instantiate a CarIndexer
from an AsyncIterable<Uint8Array>
,
such as a modern Node.js stream. Only the header
is decoded initially, the remainder is processed and emitted via the
iterator as it is consumed.
class CarWriter
Provides a writer interface for the creation of CAR files.
Creation of a CarWriter
involves the instantiation of an input / output pair
in the form of a WriterChannel
, which is a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair. These two
components form what can be thought of as a stream-like interface. The
writer
component (an instantiated CarWriter
), has methods to
put()
new blocks and close()
the writing operation (finalising the CAR archive). The out
component is
an AsyncIterable
that yields the bytes of the archive. This can be
redirected to a file or other sink. In Node.js, you can use the
Readable.from()
API to convert this to a standard Node.js stream, or it can be directly fed
to a
stream.pipeline()
.
The channel will provide a form of backpressure. The Promise from a put()
won't resolve until the resulting data is drained from the out iterable.
It is also possible to ignore the Promise from put() calls and allow
the generated data to queue in memory. This should be avoided for large CAR
archives because of the memory cost and the potential for memory overflow.
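The backpressure-respecting pattern can be sketched as a loop that awaits each put() before issuing the next one. The helper name writeAll is hypothetical; it assumes only a writer object with async put(block) and close() methods. Note that, given the behaviour described above, the out iterable has to be consumed concurrently (for example by piping it to a file first), otherwise the awaited put() would not resolve.

```javascript
// Hypothetical helper (not part of @ipld/car): write blocks one at a time,
// waiting for each put() to resolve so that at most one block's encoded
// bytes are pending, then finalise the archive.
// `writer` is assumed to expose async put(block) and close().
async function writeAll (writer, blocks) {
  for (const block of blocks) {
    await writer.put(block)
  }
  await writer.close()
}
```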
Load this class with either
import CarWriter from '@ipld/car/writer'
(const CarWriter = require('@ipld/car/writer')
). Or
import { CarWriter } from '@ipld/car'
(const { CarWriter } = require('@ipld/car')
).
async CarWriter#put(block)
-
block
(Block)
: A { cid:CID, bytes:Uint8Array }
pair.
-
Returns: Promise<void>
: The returned promise will only resolve once the
bytes this block generates are written to the out
iterable.
Write a Block
(a { cid:CID, bytes:Uint8Array }
pair) to the archive.
async CarWriter#close()
Finalise the CAR archive and signal that the out
iterable should end once
any remaining bytes are written.
async CarWriter.create(roots)
Create a new CAR writer "channel" which consists of a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair.
async CarWriter.createAppender()
- Returns:
WriterChannel
: The channel takes the form of
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
.
Create a new CAR appender "channel" which consists of a
{ writer:CarWriter, out:AsyncIterable<Uint8Array> }
pair.
This appender does not consider roots and does not produce a CAR header.
It is designed to append blocks to an existing CAR archive. It is
expected that out
will be concatenated onto the end of an existing
archive that already has a properly formatted header.
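One way the concatenation step might look in Node.js is to open the existing file in append mode and pipe out into it. This is a standard-library sketch, not something the package provides: appendTo is a hypothetical helper, and it assumes the file at path already holds a complete CAR header.

```javascript
const fs = require('node:fs')
const { Readable } = require('node:stream')
const { pipeline } = require('node:stream/promises')

// Hypothetical helper (not part of @ipld/car): append every chunk from `out`
// to the end of the existing file at `path`. The 'a' flag opens the file for
// appending rather than truncating it.
async function appendTo (out, path) {
  await pipeline(Readable.from(out), fs.createWriteStream(path, { flags: 'a' }))
}
```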
License
Licensed under either of
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.