stream-json - npm Package Compare versions

Comparing version 0.2.0 to 0.2.1

package.json
{
  "name": "stream-json",
- "version": "0.2.0",
- "description": "stream-json is a collection of node.js stream components for creating custom standard-compliant JSON processors, which requires a minimal memory footprint. It can parse JSON files far exceeding available memory. Even individual data items are streamed piece-wise. Streaming SAX-inspired event-based API is included as well.",
+ "version": "0.2.1",
+ "description": "stream-json is a SAX-inspired stream components with a minimal memory footprint to parse huge JSON files. Includes utilities to stream Django-like JSON database dumps.",
  "homepage": "http://github.com/uhop/stream-json",

@@ -29,3 +29,6 @@ "bugs": "http://github.com/uhop/stream-json/issues",

"tokenizer",
"parser"
"parser",
"django",
"stream",
"streaming"
],

@@ -32,0 +35,0 @@ "author": "Eugene Lazutkin <eugene.lazutkin@gmail.com> (http://lazutkin.com/)",

@@ -22,3 +22,5 @@ # stream-json

* `Assembler` to assemble full objects from an event stream.
- * `StreamArray` handles a frequent use case: a huge array of relatively small objects. It streams array components individually taking care of assembling them automatically.
+ * `StreamArray` handles a frequent use case: a huge array of relatively small objects similar to [Django](https://www.djangoproject.com/)-produced database dumps. It streams array components individually, taking care of assembling them automatically.
+ * `StreamFilteredArray` is a companion for `StreamArray`. The difference is that it allows filtering out unneeded objects in an efficient way without assembling them fully.
+ * `FilterObjects` filters complete objects and primitives.

@@ -64,6 +66,4 @@ Additionally a helper function is available in the main file, which creates a `Source` object with a default set of stream components.

- This is the workhorse of the package. It is a transform stream, which consumes text, and produces a stream of tokens. It is always the first in a pipe chain being directly fed with a text from a file, a socket, the standard input, or any other text stream.
- Its `Writeable` part operates in a buffer mode, while its `Readable` part operates in an [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
+ This is the workhorse of the package. It is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which consumes text and produces a stream of tokens. It is always the first in a pipe chain, fed directly with text from a file, a socket, the standard input, or any other text stream. Its `Writable` part operates in buffer mode, while its `Readable` part operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
```js
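To illustrate the shape of such a chain, here is a minimal sketch of feeding a file into a `Parser`; the module path follows the `require()` call shown in the next hunk header, while the token format and the optional constructor options are not spelled out in this diff and are kept generic:

```js
var fs = require("fs");
var Parser = require("stream-json/Parser");

// consumes raw JSON text (buffer mode) and produces token objects (object mode)
var parser = new Parser();

fs.createReadStream("sample.json").pipe(parser);

parser.on("data", function(token){
  // each chunk is a low-level token object; normally a Streamer consumes these
  console.log(token);
});
parser.on("end", function(){
  console.log("no more tokens");
});
```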

@@ -85,6 +85,4 @@ var Parser = require("stream-json/Parser");

- `Streamer` is a transform stream, which consumes a stream of tokens, and produces a stream of events. It is always the second in a pipe chain after the `Parser`. It knows JSON semantics and produces actionable events.
- It operates in an [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
+ `Streamer` is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which consumes a stream of tokens and produces a stream of events. It is always the second in a pipe chain, after the `Parser`. It knows JSON semantics and produces actionable events. It operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
```js
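A minimal sketch of chaining `Parser` and `Streamer`; the `{name, value}` shape of the event objects is assumed from the `Assembler` calls in the `StreamArray` code at the bottom of this diff:

```js
var fs = require("fs");
var Parser = require("stream-json/Parser");
var Streamer = require("stream-json/Streamer");

var parser = new Parser(),
    streamer = new Streamer();

// text in, tokens to the Streamer, semantic events out
fs.createReadStream("sample.json").pipe(parser).pipe(streamer);

streamer.on("data", function(event){
  // event.name identifies the event; event.value carries data when applicable
  console.log(event.name, event.value);
});
```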

@@ -142,6 +140,4 @@ var Streamer = require("stream-json/Streamer");

- `Packer` is a transform stream, which passes through a stream of events, optionally assembles keys, strings, and/or numbers from chunks, and adds new events with assembled values. It is a companion for `Streamer`, which frees users from implementing the assembling logic, when it is known that keys, strings, and/or numbers will fit in the available memory.
- It operates in an [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
+ `Packer` is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which passes through a stream of events, optionally assembles keys, strings, and/or numbers from chunks, and adds new events with the assembled values. It is a companion for `Streamer` that frees users from implementing the assembling logic when it is known that keys, strings, and/or numbers will fit in the available memory. It operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
```js
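A minimal sketch of adding a `Packer` to the chain; `packKeys` is documented below, while `packStrings` and `packNumbers` are assumed analogous flags not confirmed by this diff:

```js
var fs = require("fs");
var Parser = require("stream-json/Parser");
var Streamer = require("stream-json/Streamer");
var Packer = require("stream-json/Packer");

// packKeys is documented below; packStrings/packNumbers are assumed analogues
var packer = new Packer({packKeys: true, packStrings: true, packNumbers: true});

fs.createReadStream("sample.json")
  .pipe(new Parser())
  .pipe(new Streamer())
  .pipe(packer);

packer.on("data", function(event){
  // assembled keys/strings/numbers arrive as additional events
  console.log(event.name, event.value);
});
```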

@@ -156,3 +152,3 @@ var Packer = require("stream-json/Packer");

- `options` contains some important parameters, and should be specified. It can contain some technical properties thoroughly documented in [node.js' Stream documentation](http://nodejs.org/api/stream.html). Additionally it recognizes following flags:
+ `options` contains some important parameters and should be specified. It can contain some technical properties thoroughly documented in [node.js' Stream documentation](http://nodejs.org/api/stream.html). Additionally it recognizes the following properties:

@@ -190,6 +186,4 @@ * `packKeys` can be `true` or `false` (the default). If `true`, a key value is returned as a new event:

- `Emitter` is a writeable stream, which consumes a stream of events, and emits them on itself. The standard `finish` event is used to indicate the end of a stream.
- It operates in an [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
+ `Emitter` is a [Writable](https://nodejs.org/api/stream.html#stream_class_stream_writable) stream, which consumes a stream of events and emits them on itself (all streams are instances of [EventEmitter](https://nodejs.org/api/events.html#events_class_events_eventemitter)). The standard `finish` event is used to indicate the end of a stream. It operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
```js
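A minimal sketch of terminating a chain with an `Emitter`; the `startObject` event name is borrowed from the `StreamArray` code at the bottom of this diff, and `finish` is the standard writable-stream event mentioned above:

```js
var fs = require("fs");
var Parser = require("stream-json/Parser");
var Streamer = require("stream-json/Streamer");
var Emitter = require("stream-json/Emitter");

var emitter = new Emitter(),
    objects = 0;

// events from the pipeline are re-emitted on the Emitter itself
emitter.on("startObject", function(){ ++objects; });
emitter.on("finish", function(){
  console.log("objects seen:", objects);
});

fs.createReadStream("sample.json")
  .pipe(new Parser())
  .pipe(new Streamer())
  .pipe(emitter);
```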

@@ -221,6 +215,4 @@ var Emitter = require("stream-json/Emitter");

- `Filter` is an advance selector for sub-objects from a stream of events.
- It operates in an [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
+ `Filter` is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which is an advanced selector for sub-objects from a stream of events. It operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
```js
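A minimal sketch of placing a `Filter` into the chain; only the `separator` option is visible in this diff (see the key/index separator hunk below), so the constructor argument here is illustrative rather than a real selection rule:

```js
var fs = require("fs");
var Parser = require("stream-json/Parser");
var Streamer = require("stream-json/Streamer");
var Filter = require("stream-json/Filter");

// `separator` is documented below; the real selection options are in the full README
var filter = new Filter({separator: "."});

fs.createReadStream("sample.json")
  .pipe(new Parser())
  .pipe(new Streamer())
  .pipe(filter);

filter.on("data", function(event){
  // only events belonging to the selected sub-objects come through
  console.log(event.name, event.value);
});
```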

@@ -235,3 +227,3 @@ var Filter = require("stream-json/Filter");

- `options` contains some important parameters, and should be specified. It can contain some technical properties thoroughly documented in [node.js' Stream documentation](http://nodejs.org/api/stream.html). Additionally it recognizes following flags:
+ `options` contains some important parameters and should be specified. It can contain some technical properties thoroughly documented in [node.js' Stream documentation](http://nodejs.org/api/stream.html). Additionally it recognizes the following properties:

@@ -272,3 +264,3 @@ * `separator` is a string to use to separate key and index values forming a path in a current object. By default it is `.` (a dot).

- `Source` is a convenience object. It connects individual streams with pipes, and attaches itself to the end emitting all events on itself (just like `Emitter`). The standard `end` event is used to indicate the end of a stream.
+ `Source` is a convenience object. It connects individual streams with pipes and attaches itself to the end, emitting all events on itself (just like `Emitter`). The standard `end` event is used to indicate the end of a stream. It is based on [EventEmitter](https://nodejs.org/api/events.html#events_class_events_eventemitter).
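A minimal sketch of the `createSource()` helper mentioned in the hunk header above; the `require("stream-json")` form matches the examples below, while the `source.input` writable entry point is an assumption modeled on the `stream.input`/`stream.output` pattern of the utility wrappers later in this README:

```js
var fs = require("fs");
var createSource = require("stream-json"); // the helper mentioned above

// stream options, if any, would be passed to createSource()
var source = createSource({}),
    objects = 0;

// Source emits the same events as Emitter directly on itself
source.on("startObject", function(){ ++objects; });
source.on("end", function(){
  console.log("objects seen:", objects);
});

// `source.input` as the writable entry point is an assumption, not confirmed here
fs.createReadStream("sample.json").pipe(source.input);
```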

@@ -390,4 +382,6 @@ ```js

- This utility deals with a frequent use case: our JSON is an array of various sub-objects. The assumption is that while individual array items fit in memory, the array itself does not. Such files are frequently produced by various database dump utilities, e.g., [Django's dumpdata](https://docs.djangoproject.com/en/1.8/ref/django-admin/#dumpdata-app-label-app-label-app-label-model).
+ This utility deals with a frequent use case: our JSON is an array of various sub-objects. The assumption is that while individual array items fit in memory, the array itself does not. Such files are frequently produced by various database dump utilities, e.g., [Django](https://www.djangoproject.com/)'s [dumpdata](https://docs.djangoproject.com/en/1.8/ref/django-admin/#dumpdata-app-label-app-label-app-label-model).
+ It is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
`StreamArray` produces a stream of objects in the following format:

@@ -430,5 +424,131 @@

### utils/StreamFilteredArray
This utility handles the same use case as `StreamArray`, but in addition it allows checking objects as they are being built in order to accept or reject them. Rejected objects are not assembled and are filtered out.
It is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
Just like `StreamArray`, `StreamFilteredArray` produces a stream of objects in the following format:
```js
{index, value}
```
Where `index` is a numeric index in the array starting from 0, and `value` is a corresponding value. All objects are produced strictly sequentially.
```js
var fs = require("fs");

var createSource = require("stream-json");
var StreamFilteredArray = require("stream-json/utils/StreamFilteredArray");

function f(assembler){
  // test only top-level objects in the array:
  if(assembler.stack.length == 2 && assembler.key === null){
    // make a decision depending on a boolean property "active":
    if(assembler.current.hasOwnProperty("active")){
      // "true" to accept, "false" to reject
      return assembler.current.active;
    }
  }
  // return undefined to indicate our uncertainty at this moment
}

// `options` and `fname` are placeholders supplied by the caller
var source = createSource(options),
    stream = StreamFilteredArray.make({objectFilter: f});

// Example of use:
stream.output.on("data", function(object){
  console.log(object.index, object.value);
});
stream.output.on("end", function(){
  console.log("done");
});

fs.createReadStream(fname).pipe(stream.input);
```
`StreamFilteredArray` is a constructor, which optionally takes one object: `options`. `options` can contain some technical parameters, which rarely need to be specified. They are thoroughly documented in [node.js' Stream documentation](http://nodejs.org/api/stream.html). Additionally it recognizes the following property:
* `objectFilter` is a function, which takes an `Assembler` instance as its only argument, and may return the following values to indicate its decision:
  * any truthy value indicates that we are interested in this object. `StreamFilteredArray` will stop polling our filter function and will assemble the object for future use.
  * `false` (the exact value) indicates that we should skip this object. `StreamFilteredArray` will stop polling our filter function, and will stop assembling the object, discarding it completely.
  * any other falsy value indicates that we do not have enough information yet (most likely because the object is not assembled far enough to make a decision). `StreamFilteredArray` will poll our filter function the next time the object changes.
The default for `objectFilter` passes all objects.
In general `objectFilter` is called on incomplete objects. It means that if a decision is based on the values of certain properties, those properties could still be unprocessed at that moment. In such a case it is reasonable to delay the decision by returning a falsy (but not `false`) value, like `undefined`.
Values that arrive already complete are not submitted to the filter function and are accepted automatically. It means that all primitive values (booleans, numbers, strings, `null`) are streamed without consulting `objectFilter`.
If you want to filter out complete objects, including primitive values, use `FilterObjects`.
`StreamFilteredArray` instances expose one property:
* `objectFilter` is a function, which is called for every top-level streamable object. It can be replaced with another function at any time. Usually it is replaced between objects, after an accept/reject decision is made.
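A minimal sketch of replacing `objectFilter` on a directly constructed instance; the upstream wiring that feeds it events is omitted here (see the `make()`-based example above for a complete pipeline):

```js
var StreamFilteredArray = require("stream-json/utils/StreamFilteredArray");

function acceptAll(){ return true; }
function rejectAll(){ return false; }

// start permissive; upstream wiring (Parser/Streamer/Packer) is omitted here
var filtered = new StreamFilteredArray({objectFilter: acceptAll});

var seen = 0;
filtered.on("data", function(object){
  if(++seen === 10){
    // after ten accepted objects, discard the rest of the array
    filtered.objectFilter = rejectAll;
  }
});
```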
Directly on `StreamFilteredArray` there is a class-level helper function `make()`, which is an exact clone of `StreamArray.make()`.
The test file for `StreamFilteredArray`: `tests/test_filtered_array.js`.
### utils/FilterObjects
This utility filters complete objects (and primitive values), working with a stream in the same format as `StreamArray` and `StreamFilteredArray`:
```js
{index, value}
```
Where `index` is a numeric index in the array starting from 0, and `value` is a corresponding value. All objects are produced strictly sequentially.
It is a [Transform](https://nodejs.org/api/stream.html#stream_class_stream_transform) stream, which operates in [objectMode](http://nodejs.org/api/stream.html#stream_object_mode).
```js
var fs = require("fs");

var createSource = require("stream-json");
var StreamArray = require("stream-json/utils/StreamArray");
var FilterObjects = require("stream-json/utils/FilterObjects");

function f(item){
  // accept all odd-indexed items, which are:
  // true objects, but not arrays, or nulls
  if(item.index % 2 && item.value &&
      typeof item.value == "object" &&
      !(item.value instanceof Array)){
    return true;
  }
  return false;
}

// `options` and `fname` are placeholders supplied by the caller
var source = createSource(options),
    stream = StreamArray.make(),
    filter = new FilterObjects({itemFilter: f});

// Example of use: pipe the {index, value} stream through the filter
stream.output.pipe(filter);

filter.on("data", function(object){
  console.log(object.index, object.value);
});
filter.on("end", function(){
  console.log("done");
});

fs.createReadStream(fname).pipe(stream.input);
```
`FilterObjects` is a constructor, which optionally takes one object: `options`. `options` can contain some technical parameters, which rarely need to be specified. They are thoroughly documented in [node.js' Stream documentation](http://nodejs.org/api/stream.html). Additionally it recognizes the following property:
* `itemFilter` is a function, which takes an `{index, value}` object as its only argument, and may return the following values to indicate its decision:
  * any truthy value to accept the object.
  * any falsy value to reject the object.
The default for `itemFilter` accepts all objects.
`FilterObjects` instances expose one property:
* `itemFilter` is a function, which is called for every top-level streamable object. It can be replaced with another function at any time.
The test file for `FilterObjects`: `tests/test_filter_objects.js`.
## Advanced use
- The whole library is organized as set of small components, which can be combined to produce the most effective pipeline. All components are based on node.js [streams](http://nodejs.org/api/stream.html), and [events](http://nodejs.org/api/events.html). It is easy to add your own components to solve your unique tasks.
+ The whole library is organized as a set of small components, which can be combined to produce the most effective pipeline. All components are based on node.js [streams](http://nodejs.org/api/stream.html) and [events](http://nodejs.org/api/events.html). They implement all required standard APIs. It is easy to add your own components to solve your unique tasks.
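To illustrate, here is a minimal sketch of a custom pass-through component: a `Transform` in objectMode that counts events by name, relying only on the standard stream API and the `{name, value}` event shape used elsewhere in this README:

```js
var fs = require("fs");
var util = require("util");
var Transform = require("stream").Transform;

var Parser = require("stream-json/Parser");
var Streamer = require("stream-json/Streamer");

// a toy custom component: counts events by name while passing them through
function EventCounter(){
  Transform.call(this, {objectMode: true});
  this.counts = {};
}
util.inherits(EventCounter, Transform);

EventCounter.prototype._transform = function(chunk, encoding, callback){
  this.counts[chunk.name] = (this.counts[chunk.name] || 0) + 1;
  this.push(chunk);
  callback();
};

var counter = new EventCounter();

fs.createReadStream("sample.json")
  .pipe(new Parser())
  .pipe(new Streamer())
  .pipe(counter)
  .resume(); // drain the pass-through output

counter.on("end", function(){
  console.log(counter.counts);
});
```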

@@ -485,2 +605,3 @@ The code of all components are compact and simple. Please take a look at their source code to see how things are implemented, so you can produce your own components in no time.

- 0.2.1 *added utilities to filter objects on the fly.*
- 0.2.0 *new faster parser, formal unit tests, added utilities to assemble objects on the fly.*

@@ -487,0 +608,0 @@ - 0.1.0 *bug fixes, more documentation.*

@@ -6,14 +6,16 @@ "use strict";

- require("./test_classic.js");
- require("./test_parser.js");
- require("./test_streamer.js");
- require("./test_packer.js");
- require("./test_filter.js");
- require("./test_escaped.js");
- require("./test_source.js");
- require("./test_emitter.js");
- require("./test_assembler.js");
- require("./test_array.js");
+ require("./test_classic");
+ require("./test_parser");
+ require("./test_streamer");
+ require("./test_packer");
+ require("./test_filter");
+ require("./test_escaped");
+ require("./test_source");
+ require("./test_emitter");
+ require("./test_assembler");
+ require("./test_array");
+ require("./test_filtered_array");
+ require("./test_filter_objects");
unit.run();

@@ -19,4 +19,4 @@ "use strict";

- this.assembler = null;
- this.counter = 0;
+ this._assembler = null;
+ this._counter = 0;
}

@@ -26,3 +26,3 @@ util.inherits(StreamArray, Transform);

StreamArray.prototype._transform = function transform(chunk, encoding, callback){
- if(!this.assembler){
+ if(!this._assembler){
// first chunk should open an array

@@ -33,19 +33,9 @@ if(chunk.name !== "startArray"){

}
- this.assembler = new Assembler();
+ this._assembler = new Assembler();
}
- this.assembler[chunk.name] && this.assembler[chunk.name](chunk.value);
+ this._assembler[chunk.name] && this._assembler[chunk.name](chunk.value);
- if(!this.assembler.stack.length){
-   switch(chunk.name){
-     case "startArray":
-     case "startObject":
-     case "keyValue":
-       break;
-     default:
-       if(this.assembler.current.length){
-         this.push({index: this.counter++, value: this.assembler.current.pop()});
-       }
-       break;
-   }
+ if(!this._assembler.stack.length && this._assembler.current.length){
+   this.push({index: this._counter++, value: this._assembler.current.pop()});
}

@@ -52,0 +42,0 @@
