What is @discoveryjs/json-ext?
The @discoveryjs/json-ext npm package provides utilities for working with JSON data, including features for streaming, parsing, and stringifying large JSON objects and arrays. It is designed to handle JSON data efficiently, making it easier to work with large JSON files or streams of JSON data.
What are @discoveryjs/json-ext's main functionalities?
Streaming JSON parse
This feature parses JSON from a stream chunk by chunk, enabling efficient processing of large JSON files without loading them entirely into memory.
const { createReadStream } = require('fs');
const { parseChunked } = require('@discoveryjs/json-ext');
const stream = createReadStream('path/to/your/file.json');
parseChunked(stream).then((data) => {
  console.log(data);
});
Stringify JSON to stream
This feature stringifies a value as a sequence of chunks and pipes them to a writable stream, which is useful for writing large JSON files or streaming JSON data over the network.
const { createWriteStream } = require('fs');
const { Readable } = require('stream');
const { stringifyChunked } = require('@discoveryjs/json-ext');
const stream = createWriteStream('path/to/your/output.json');
const data = { items: [{ foo: 'bar' }, { baz: 'qux' }] };
Readable.from(stringifyChunked(data)).pipe(stream);
Other packages similar to @discoveryjs/json-ext
JSONStream
JSONStream offers similar streaming JSON parsing and stringifying capabilities. It allows for parsing JSON files or streams using JSONPath-like expressions. Compared to @discoveryjs/json-ext, JSONStream focuses more on the streaming aspect and might be more suitable for scenarios where JSONPath expressions are needed for selecting data.
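For instance, a minimal JSONStream sketch (assuming an input file shaped like { "rows": [...] }; the path and property name are illustrative):
const fs = require('fs');
const JSONStream = require('JSONStream');

fs.createReadStream('path/to/file.json')
  .pipe(JSONStream.parse('rows.*')) // JSONPath-like pattern: emit each element of "rows"
  .on('data', (row) => console.log(row));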
stream-json
stream-json provides a toolkit for processing JSON as a stream. It includes a variety of stream components for parsing, filtering, and transforming JSON data. While it shares the streaming JSON processing capability with @discoveryjs/json-ext, stream-json offers a more modular approach, allowing users to build custom processing pipelines.
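As a rough sketch of that modular approach (using stream-json's companion stream-chain package; assuming the file contains a top-level JSON array):
const fs = require('fs');
const { chain } = require('stream-chain');
const { parser } = require('stream-json');
const { streamArray } = require('stream-json/streamers/StreamArray');

const pipeline = chain([
  fs.createReadStream('path/to/file.json'),
  parser(),      // tokenize the JSON input
  streamArray()  // emit each element of the top-level array as { key, value }
]);

pipeline.on('data', ({ value }) => console.log(value));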
json-ext
A set of utilities designed to extend JSON's capabilities, especially for handling large JSON data (over 100MB) efficiently:
Key Features
- Optimized to handle large JSON data with minimal resource usage (see benchmarks)
- Works seamlessly with browsers, Node.js, Deno, and Bun
- Supports both Node.js and Web streams
- Available in both ESM and CommonJS
- TypeScript typings included
- No external dependencies
- Compact size: 9.4Kb (minified), 3.8Kb (min+gzip)
Why json-ext?
- Handles large JSON files: Overcomes the limitations of V8 for strings larger than ~500MB, enabling the processing of huge JSON data.
- Prevents main thread blocking: Distributes parsing and stringifying over time, ensuring the main thread remains responsive during heavy JSON operations.
- Reduces memory usage: Traditional JSON.parse() and JSON.stringify() require loading the entire data into memory, leading to high memory consumption and increased garbage collection pressure. parseChunked() and stringifyChunked() process data incrementally, optimizing memory usage.
- Size estimation: stringifyInfo() allows estimating the size of the resulting JSON before generating it, enabling better decision-making for JSON generation strategies (see the sketch after this list).
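A minimal sketch of that kind of decision-making, assuming data holds the value to serialize (the 10MB threshold and file name are illustrative, not part of the library):
import fs from 'node:fs';
import { stringifyInfo, stringifyChunked } from '@discoveryjs/json-ext';

const { bytes } = stringifyInfo(data);

if (bytes < 10 * 1024 * 1024) {
  // Small result: a single JSON.stringify() call is fine
  fs.writeFileSync('output.json', JSON.stringify(data));
} else {
  // Large result: write chunk by chunk to keep memory usage low
  const fd = fs.openSync('output.json', 'w');

  for (const chunk of stringifyChunked(data)) {
    fs.writeFileSync(fd, chunk);
  }

  fs.closeSync(fd);
}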
Install
npm install @discoveryjs/json-ext
API
parseChunked()
Functions like JSON.parse(), iterating over chunks to reconstruct the result object, and returns a Promise.
Note: the reviver parameter is not supported yet.
function parseChunked(input: Iterable<Chunk> | AsyncIterable<Chunk>): Promise<any>;
function parseChunked(input: () => (Iterable<Chunk> | AsyncIterable<Chunk>)): Promise<any>;
type Chunk = string | Buffer | Uint8Array;
Benchmark
Usage:
import { parseChunked } from '@discoveryjs/json-ext';
const data = await parseChunked(chunkEmitter);
The chunkEmitter parameter can be an iterable or async iterable that iterates over chunks, or a function returning such a value. A chunk can be a string, Uint8Array, or Node.js Buffer.
Examples:
- Generator:
parseChunked(function*() {
  yield '{ "hello":';
  yield Buffer.from(' "wor');
  yield new TextEncoder().encode('ld" }');
});
- Async generator:
parseChunked(async function*() {
  for await (const chunk of someAsyncSource) {
    yield chunk;
  }
});
- Array:
parseChunked(['{ "hello":', ' "world"}'])
- Function returning iterable:
parseChunked(() => ['{ "hello":', ' "world"}'])
- Node.js Readable stream:
import fs from 'node:fs';
parseChunked(fs.createReadStream('path/to/file.json'))
- Web stream (e.g., using fetch()):
Note: Iterability for Web streams was added later to the Web platform, and not all environments support it. Consider using parseFromWebStream() for broader compatibility.
const response = await fetch('https://example.com/data.json');
const data = await parseChunked(response.body);
stringifyChunked()
Functions like JSON.stringify(), but returns a generator yielding strings instead of a single string.
Note: Returns "null" when JSON.stringify() returns undefined (since a chunk cannot be undefined).
function stringifyChunked(value: any, replacer?: Replacer, space?: Space): Generator<string, void, unknown>;
function stringifyChunked(value: any, options: StringifyOptions): Generator<string, void, unknown>;
type Replacer =
  | ((this: any, key: string, value: any) => any)
  | (string | number)[]
  | null;
type Space = string | number | null;
type StringifyOptions = {
  replacer?: Replacer;
  space?: Space;
  highWaterMark?: number;
};
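For instance, as the note above describes, a value for which JSON.stringify() returns undefined produces the string "null" instead:
import { stringifyChunked } from '@discoveryjs/json-ext';

console.log(JSON.stringify(undefined));        // undefined
console.log([...stringifyChunked(undefined)]); // ['null']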
Benchmark
Usage:
- Getting an array of chunks:
const chunks = [...stringifyChunked(data)];
- Iterating over chunks:
for (const chunk of stringifyChunked(data)) {
  console.log(chunk);
}
- Specifying the minimum size of a chunk with the highWaterMark option:
const data = [1, "hello world", 42];
console.log([...stringifyChunked(data)]);
console.log([...stringifyChunked(data, { highWaterMark: 16 })]);
console.log([...stringifyChunked(data, { highWaterMark: 1 })]);
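With the default highWaterMark, the first call above should produce a single chunk, ['[1,"hello world",42]'], while smaller highWaterMark values split the same output into more, smaller chunks; the exact chunk boundaries depend on the internal buffering.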
- Streaming into a stream with a Promise (modern Node.js):
import { pipeline } from 'node:stream/promises';
import fs from 'node:fs';
await pipeline(
  stringifyChunked(data),
  fs.createWriteStream('path/to/file.json')
);
- Wrapping into a Promise when streaming into a stream (legacy Node.js):
import { Readable } from 'node:stream';
new Promise((resolve, reject) => {
  Readable.from(stringifyChunked(data))
    .on('error', reject)
    .pipe(stream)
    .on('error', reject)
    .on('finish', resolve);
});
- Writing into a file synchronously:
Note: Slower than JSON.stringify(), but uses much less heap space and has no limitation on string length.
import fs from 'node:fs';
const fd = fs.openSync('output.json', 'w');
for (const chunk of stringifyChunked(data)) {
  fs.writeFileSync(fd, chunk);
}
fs.closeSync(fd);
- Using with fetch (JSON streaming):
Note: This feature has limited support in browsers, see Streaming requests with the fetch API.
Note: ReadableStream.from() has limited support in browsers, use createStringifyWebStream() instead.
fetch('http://example.com', {
  method: 'POST',
  duplex: 'half',
  body: ReadableStream.from(stringifyChunked(data))
});
- Wrapping into ReadableStream:
Note: Use ReadableStream.from() or createStringifyWebStream() when no extra logic is needed.
new ReadableStream({
  start() {
    this.generator = stringifyChunked(data);
  },
  pull(controller) {
    const { value, done } = this.generator.next();

    if (done) {
      controller.close();
    } else {
      controller.enqueue(value);
    }
  },
  cancel() {
    this.generator = null;
  }
});
stringifyInfo()
export function stringifyInfo(value: any, replacer?: Replacer, space?: Space): StringifyInfoResult;
export function stringifyInfo(value: any, options?: StringifyInfoOptions): StringifyInfoResult;
type StringifyInfoOptions = {
  replacer?: Replacer;
  space?: Space;
  continueOnCircular?: boolean;
};
type StringifyInfoResult = {
  bytes: number;
  spaceBytes: number;
  circular: object[];
};
Functions like JSON.stringify(), but returns an object with the expected overall size of the stringify operation and a list of circular references.
Example:
import { stringifyInfo } from '@discoveryjs/json-ext';
console.log(stringifyInfo({ test: true }, null, 4));
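For the object above, the result should be roughly:
{ bytes: 20, spaceBytes: 7, circular: [] }
i.e., 20 bytes for the pretty-printed JSON, 7 of which are whitespace introduced by space = 4.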
Options
continueOnCircular
Type: Boolean
Default: false
Determines whether to continue collecting info for a value when a circular reference is found. Setting this option to true allows finding all circular references.
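A minimal sketch (the object shape is illustrative):
import { stringifyInfo } from '@discoveryjs/json-ext';

const a = { name: 'a' };
const b = { name: 'b', a };
a.b = b; // introduce circular references between a and b

const { circular } = stringifyInfo({ a, b }, { continueOnCircular: true });
console.log(circular.length); // number of circular references found (non-zero here)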
parseFromWebStream()
A helper function to consume JSON from a Web Stream. You can use parseChunked(stream) instead, but @@asyncIterator on ReadableStream has limited support in browsers (see the ReadableStream compatibility table).
import { parseFromWebStream } from '@discoveryjs/json-ext';
const data = await parseFromWebStream(readableStream);
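For example, consuming a fetch() response body (the URL is illustrative):
import { parseFromWebStream } from '@discoveryjs/json-ext';

const response = await fetch('https://example.com/data.json');
const data = await parseFromWebStream(response.body);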
createStringifyWebStream()
A helper function to convert stringifyChunked() into a ReadableStream (Web Stream). You can use ReadableStream.from() instead, but that method has limited support in browsers (see the ReadableStream.from() compatibility table).
import { createStringifyWebStream } from '@discoveryjs/json-ext';
createStringifyWebStream({ test: true });
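For example, a sketch of streaming a request body with fetch(), mirroring the fetch example above (assuming data holds the value to send; the URL is illustrative):
import { createStringifyWebStream } from '@discoveryjs/json-ext';

fetch('http://example.com', {
  method: 'POST',
  duplex: 'half',
  body: createStringifyWebStream(data)
});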
License
MIT