avsc - npm Package Compare versions

Comparing version 0.2.0 to 1.0.0

doc/benchmarks.md


doc/advanced.md

@@ -8,9 +8,60 @@ Advanced usage

Avro supports reading data written by another schema (as long as the reader's
and writer's schemas are compatible). We can do this by creating an appropriate
`Resolver`:
```javascript
var avsc = require('avsc');
// A schema's first version.
var v1 = avsc.parse({
type: 'record',
name: 'Person',
fields: [
{name: 'name', type: 'string'},
{name: 'age', type: 'int'}
]
});
// The updated version: `name` is now a union (allowing a structured name),
// and an optional `phone` field has been added:
var v2 = avsc.parse({
type: 'record',
name: 'Person',
fields: [
{
name: 'name', type: [
'string',
{
name: 'Name',
type: 'record',
fields: [
{name: 'first', type: 'string'},
{name: 'last', type: 'string'}
]
}
]
},
{name: 'phone', type: ['null', 'string'], default: null}
]
});
var resolver = v2.createResolver(v1);
var buf = v1.encode({name: 'Ann', age: 25});
var obj = v2.decode(buf, resolver); // === {name: {string: 'Ann'}, phone: null}
```
Type hooks
----------
Using the `typeHook` option, it is possible to introduce custom behavior on any
type. This can for example be used to override a type's `isValid` or `random`
method.
Below we show an example implementing a custom random float generator.
```javascript
var avsc = require('avsc');

/**
 * Hook which allows setting a range for float types.
 *
 * The type is available as `this` and the schema as only argument, so we can
 * override `random` based on a custom `range` attribute in the schema.
 */
function typeHook(schema) {
  if (this.type === 'float' && schema.range) {
    var lo = schema.range[0];
    var hi = schema.range[1];
    this.random = function () { return lo + (hi - lo) * Math.random(); };
  }
}

// Floats generated by this type will now fall within [-1, 1).
var type = avsc.parse({type: 'float', range: [-1, 1]}, {typeHook: typeHook});
```

# API
+ [Parsing schemas](#parsing-schemas)
+ [Avro types](#avro-types)
+ [Records](#records)
+ [Streams](#streams)
## Parsing schemas
### `avsc.parse(schema, [opts])`
Parse a schema and return an instance of the corresponding
[`Type`](#class-type).
+ `schema` {Object|String} Schema (type object or type name string).
+ `opts` {Object} Parsing options. The following keys are currently supported:
+ `namespace` {String} Optional parent namespace.
+ `registry` {Object} Optional registry of predefined type names.
+ `unwrapUnions` {Boolean} By default, Avro expects all unions to be wrapped
inside an object with a single key. Setting this to `true` will prevent
this, slightly improving performance (encoding is then done on the first
type which validates).
+ `typeHook` {Function} Function called after each new Avro type is
instantiated. The new type is available as `this` and the relevant schema
as first and only argument.
### `avsc.parseFile(path, [opts])`
Convenience function to parse a schema file directly.
+ `path` {String} Path to schema file.
+ `opts` {Object} Parsing options (identical to those of `parse`).
## Avro types
All the classes below are available in the `avsc.types` namespace:
+ [`Type`](#class-type)
+ [`PrimitiveType`](#class-primitivetypename)
+ [`ArrayType`](#class-arraytypeschema-opts)
+ [`EnumType`](#class-enumtypeschema-opts)
+ [`FixedType`](#class-fixedtypeschema-opts)
+ [`MapType`](#class-maptypeschema-opts)
+ [`RecordType`](#class-recordtypeschema-opts)
+ [`UnionType`](#class-uniontypeschema-opts)
### Class `Type`
"Abstract" base Avro type class. All implementations inherit from this type.
"Abstract" base Avro type class. All implementations inherit from it.
##### `type.type`

@@ -14,2 +59,3 @@

##### `type.random()`

@@ -19,2 +65,3 @@

##### `type.isValid(obj)`

@@ -26,2 +73,3 @@

##### `type.encode(obj, [size,] [unsafe])`

@@ -39,2 +87,3 @@

##### `type.decode(buf, [resolver,] [unsafe])`

@@ -49,2 +98,3 @@

##### `type.createResolver(writerType)`

@@ -54,13 +104,25 @@

Create a resolver that can be passed to the `type`'s `decode` method. This
will enable decoding objects which had been serialized using `writerType`,
according to the Avro [resolution rules][schema-resolution]. If the schemas are
incompatible, this method will throw an error.
##### `type.toString()`
Returns the [canonical version of the schema][canonical-schema]. This can be
used to compare schemas for equality.
##### `type.createFingerprint(algorithm)`
+ `algorithm` {String} Algorithm to use to generate the schema's
  [fingerprint][]. Defaults to `md5`.
##### `Type.fromSchema(schema, [opts])`
Alias for `avsc.parse`.
#### Class `PrimitiveType(name)`

@@ -167,13 +229,19 @@

## Records
Each [`RecordType`](#class-recordtype-opts) generates a corresponding `Record`
constructor when its schema is parsed. It is available using the `RecordType`'s
`getRecordConstructor` method. This makes decoding records more efficient and
lets us provide the following convenience methods:
### Class `Record(...)`
Specific record class, programmatically generated for each record schema.
Calling the constructor directly can sometimes be a convenient shortcut to
instantiate new records of a given type.
#### `Record.random()`
#### `Record.decode(buf, [resolver,] [unsafe])`
#### `record.$encode([size,] [unsafe])`

@@ -187,13 +255,34 @@ #### `record.$isValid()`

As a convenience, the following function is available to read an Avro object
container file stored locally:
### `avsc.decodeFile(path, [opts])`
+ `path` {String} Path to Avro file.
+ `opts` {Object} Decoding options, passed to `BlockDecoder`.
Returns a readable stream of decoded objects from an Avro container file.
For other use-cases, the following stream classes are available in the
`avsc.streams` namespace:
+ [`BlockDecoder`](#blockdecoderopts)
+ [`RawDecoder`](#rawdecoderopts)
+ [`BlockEncoder`](#blockencoderopts)
+ [`RawEncoder`](#rawencoderopts)
### Class `BlockDecoder([opts])`
+ `opts` {Object} Decoding options. Available keys:
+ `decode` {Boolean} Whether to decode records before returning them.
Defaults to `true`.
+ `parseOpts` {Object} Options passed to instantiate the writer's `Type`.
A duplex stream which decodes bytes coming from an Avro object container file.
#### Event `'metadata'`
+ `type` {Type} The type used to write the file.
+ `codec` {String} The codec's name.
+ `header` {Object} The file's header, containing in particular the raw schema

@@ -204,30 +293,35 @@ and codec.

+ `data` {Object|Buffer} Decoded element or raw bytes.
### Class `RawDecoder(type, [opts])`
+ `type` {Type} Writer type. Required since the input doesn't contain a header.
+ `opts` {Object} Decoding options. Available keys:
+ `decode` {Boolean} Whether to decode records before returning them.
Defaults to `true`.
A duplex stream which can be used to decode a stream of serialized Avro objects
with no headers or blocks.
#### Event `'data'`
+ `data` {Object|Buffer} Decoded element or raw bytes.
### Class `BlockEncoder(type, [opts])`
+ `type` {Type} The type to use for encoding.
+ `opts` {Object} Encoding options. Available keys:
+ `codec` {String} Name of codec to use for encoding.
+ `blockSize` {Number} Maximum uncompressed size of each block of data. A new
block will be started when this number is exceeded. If it is too small to
fit a single element, it will be increased appropriately. Defaults to 64kB.
+ `omitHeader` {Boolean} Don't emit the header. This can be useful when
appending to an existing container file. Defaults to `false`.
+ `unsafe` {Boolean} Whether to skip validating records before encoding them.
  By default, each record is checked against the type before being written.
A duplex stream to create Avro container object files.
#### Event `'data'`

@@ -238,11 +332,22 @@

### Class `RawEncoder(type, [opts])`
+ `type` {Type} The type to use for encoding.
+ `opts` {Object} Encoding options. Available keys:
+ `batchSize` {Number} To increase performance, records are serialized in
batches. Use this option to control how often batches are emitted. If it is
too small to fit a single record, it will be increased automatically.
Defaults to 64kB.
+ `unsafe` {Boolean} Whether to skip validating records before encoding them.
  By default, each record is checked against the type before being written.
The encoding equivalent of `RawDecoder`.
#### Event `'data'`
+ `data` {Buffer} Serialized bytes.
[canonical-schema]: https://avro.apache.org/docs/current/spec.html#Parsing+Canonical+Form+for+Schemas
[schema-resolution]: https://avro.apache.org/docs/current/spec.html#Schema+Resolution
[fingerprint]: https://avro.apache.org/docs/current/spec.html#Schema+Fingerprints


doc/quickstart.md
# Quickstart
`avsc`'s API is built around `Type`s.
## What is a `Type`?
Each Avro type maps to a corresponding JavaScript `Type`:
+ `int`, `long`, and other Avro primitives map to a `PrimitiveType`.
+ `array`s map to `ArrayType`s.
+ `enum`s map to `EnumType`s.
+ ...
An instance of a `Type` knows how to encode and decode its corresponding
objects. For example the `string` `PrimitiveType` knows how to encode and
decode JavaScript strings:
```javascript
var avsc = require('avsc'); // This will be assumed in the snippets below.
var stringType = new avsc.types.PrimitiveType('string');
var buf = stringType.encode('Hi'); // Buffer containing 'Hi''s Avro encoding.
var s = stringType.decode(buf); // === 'Hi'!
```
Each `type` also provides other methods which can be useful. Here are two
(refer to the API for the full list):
+ Type compatibility checks:
```javascript
var b1 = stringType.isValid('hello'); // true, 'hello' is a valid string.
var b2 = stringType.isValid(-2); // false, -2 is not.
```
+ Random object generation:
```javascript
var s = stringType.random(); // A random string.
```
## How do I get a `Type`?
It is possible to instantiate types directly by calling their constructors
(available in the `avsc.types` namespace; this is what we used earlier), but in
the vast majority of use-cases they will be automatically generated by parsing
an existing schema. `avsc` exposes a `parse` method to do the heavy lifting:
```javascript
// Equivalent to what we did earlier.
var stringType = avsc.parse('string');
// A slightly more complex type.
var mapType = avsc.parse({type: 'map', values: 'long'});
// The sky is the limit!
var personType = avsc.parse({
name: 'Person',
type: 'record',
fields: [
{name: 'name', type: 'string'},
{name: 'phone', type: ['null', 'string'], default: null},
{name: 'address', type: {
name: 'Address',
type: 'record',
fields: [
{name: 'city', type: 'string'},
{name: 'zip', type: 'int'}
]
}}
]
});
```
Since schemas are often stored in JSON files, `avsc` also exposes a `parseFile`
method just for that:
```javascript
var couponType = avsc.parseFile('schemas/Coupon.avsc');
```
## What else?
The methods we mentioned earlier can now really shine:
```javascript
personType.isValid({
name: 'Ann',
phone: null,
address: {city: 'Cambridge', zip: 2139} // Note: `02139` isn't a valid literal.
}); // true
personType.isValid({
name: 'Bob',
phone: {string: '617-000-1234'},
address: {city: 'Boston'}
}); // false (Missing the zip code.)
```
## I forgot to ask about files.
Tomorrow.

@@ -5,3 +5,2 @@ /* jshint node: true */

// TODO: Add append option to `createWriteStream`.

@@ -8,0 +7,0 @@ var streams = require('./streams'),

/* jshint node: true */
// TODO: Add snappy support. Or maybe support for custom decompressors.
// TODO: Get `Decoder` to work even for decompressor that don't yield to the
// event loop.

@@ -304,2 +302,3 @@ 'use strict';

* + batchSize
* + unsafe
*

@@ -327,2 +326,7 @@ */

if (!this._unsafe && !this._type.isValid(obj)) {
this.emit('error', new AvscError('invalid object: %j', obj));
return;
}
this._writeObj.call(tap, obj);

@@ -329,0 +333,0 @@ if (!tap.isValid()) {

/* jshint node: true */
'use strict';
// TODO: Add `equals` (and `compare`?) method to each type.
// TODO: Add regex check for valid type and field names.
// TODO: Support JS keywords as record field names (e.g. `null`).
// TODO: Add logging using `debuglog` to help identify schema parsing errors.
// TODO: Enable recursive schema JSON serialization without a replacer.
// TODO: Implement `clone` method.
// TODO: Implement `toString` method on types which returns canonical string.
// TODO: Create `Field` class.
'use strict';
var Tap = require('./tap'),

@@ -15,0 +12,0 @@ utils = require('./utils'),

{
"name": "avsc",
"version": "0.2.0",
"version": "1.0.0",
"description": "A serialization API to make you smile",

@@ -5,0 +5,0 @@ "keywords": [

@@ -6,2 +6,9 @@ # Avsc [![NPM version](https://img.shields.io/npm/v/avsc.svg)](https://www.npmjs.com/package/avsc) [![Build status](https://travis-ci.org/mtth/avsc.svg?branch=master)](https://travis-ci.org/mtth/avsc) [![Coverage status](https://coveralls.io/repos/mtth/avsc/badge.svg?branch=master&service=github)](https://coveralls.io/github/mtth/avsc?branch=master)

## Features
+ Arbitrary Avro schema support.
+ No dependencies.
+ [Fast!](#performance) Did you know that Avro could be faster than JSON?
## Installation

@@ -13,44 +20,83 @@

`avsc` is compatible with [io.js][] and versions of [node.js][] from and
including `0.11`.
## Documentation
+ [Quickstart](https://github.com/mtth/avsc/blob/master/doc/quickstart.md)
+ [API](https://github.com/mtth/avsc/blob/master/doc/api.md)
+ [Advanced usage](https://github.com/mtth/avsc/blob/master/doc/advanced.md)
A few examples to boot:
+ Encode and decode JavaScript objects using an Avro schema file:
```javascript
var avsc = require('avsc'); // Implied in all other examples below.
var type = avsc.parseFile('Person.avsc');
var buf = type.encode({name: 'Ann', age: 25}); // Serialize a JS object.
var obj = type.decode(buf); // And deserialize it back.
```
+ Get a readable record stream from an Avro container file:
```javascript
avsc.decodeFile('records.avro')
.on('data', function (record) { /* Do something with the record. */ });
```
+ Generate a random instance from a schema object:
```javascript
var type = avsc.parse({
name: 'Pet',
type: 'record',
fields: [
{name: 'kind', type: {name: 'Kind', type: 'enum', symbols: ['CAT', 'DOG']}},
{name: 'name', type: 'string'},
{name: 'isFurry', type: 'boolean'}
]
});
var pet = type.random(); // E.g. {kind: 'CAT', name: 'qwXlrew', isFurry: true}
```
+ Create a writable stream to serialize objects on the fly:
```javascript
var type = avsc.parse({type: 'array', items: 'int'});
var encoder = new avsc.streams.RawEncoder(type)
.on('data', function (chunk) { /* Use the encoded chunk somehow. */ });
```
## Performance
Despite being written in pure JavaScript, `avsc` is fast: it supports encoding
and decoding throughput in the hundreds of thousands of operations per second
for complex schemas.
Schema | Decode (operations/sec) | Encode (operations/sec)
---|:-:|:-:
[`ArrayString.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/ArrayString.avsc) | 905k | 280k
[`Coupon.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/Coupon.avsc) | 290k | 302k
[`Person.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/Person.avsc) | 1586k | 620k
[`User.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/User.avsc) | 116k | 284k
In fact, it is generally faster than the built-in JSON parser (also producing
encodings orders of magnitude smaller before compression). See the
[benchmarks][] page for the raw numbers.
## Limitations
+ Protocols aren't yet implemented.
+ JavaScript doesn't natively support the `long` type, so numbers larger than
`Number.MAX_SAFE_INTEGER` (or smaller than the corresponding lower bound)
will suffer a loss of precision.
[io.js]: https://iojs.org/en/
[node.js]: https://nodejs.org/en/
[benchmarks]: https://github.com/mtth/avsc/blob/master/doc/benchmarks.md