Comparing version 0.2.0 to 1.0.0
@@ -8,9 +8,60 @@ Advanced usage | ||
TODO | ||
Avro supports reading data written by another schema (as long as the reader's | ||
and writer's schemas are compatible). We can do this by creating an appropriate | ||
`Resolver`: | ||
```javascript | ||
var avsc = require('avsc'); | ||
// A schema's first version. | ||
var v1 = avsc.parse({ | ||
type: 'record', | ||
name: 'Person', | ||
fields: [ | ||
{name: 'name', type: 'string'}, | ||
{name: 'age', type: 'int'} | ||
] | ||
}); | ||
// The updated version: `name` can now also be a record with first and last | ||
// names, and an optional `phone` field has been added: | ||
var v2 = avsc.parse({ | ||
type: 'record', | ||
name: 'Person', | ||
fields: [ | ||
{ | ||
name: 'name', type: [ | ||
'string', | ||
{ | ||
name: 'Name', | ||
type: 'record', | ||
fields: [ | ||
{name: 'first', type: 'string'}, | ||
{name: 'last', type: 'string'} | ||
] | ||
} | ||
] | ||
}, | ||
{name: 'phone', type: ['null', 'string'], default: null} | ||
] | ||
}); | ||
var resolver = v2.createResolver(v1); | ||
var buf = v1.encode({name: 'Ann', age: 25}); | ||
var obj = v2.decode(buf, resolver); // === {name: {string: 'Ann'}, phone: null} | ||
``` | ||
Type hooks | ||
---------- | ||
Using the `typeHook` option, it is possible to introduce custom behavior on any | ||
type. This can for example be used to override a type's `isValid` or `random` | ||
method. | ||
Below we show an example implementing a custom random float generator. | ||
```javascript | ||
var avsc = require('avsc'); | ||
/** | ||
@@ -17,0 +68,0 @@ * Hook which allows setting a range for float types. |
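The hook example is truncated in this diff. As a rough plain-JavaScript sketch of the idea (the `range` schema attribute below is a custom, illustrative extension — Avro ignores unknown schema attributes — and only the hook contract of `this` being the new type and the schema being the sole argument comes from the docs):

```javascript
// Sketch of a type hook restricting the range of generated floats.
// Per the `typeHook` contract, `this` is the freshly instantiated type
// and `schema` is the raw schema it was built from.
function rangeHook(schema) {
  if (schema.type === 'float' && schema.range) {
    var min = schema.range[0];
    var max = schema.range[1];
    // Override the type's `random` method with a bounded generator.
    this.random = function () {
      return min + Math.random() * (max - min);
    };
  }
}

// Simulate avsc calling the hook on a new type instance.
var floatType = {random: function () { return Math.random(); }};
rangeHook.call(floatType, {type: 'float', range: [0, 10]});
var n = floatType.random(); // A float in [0, 10).
```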
doc/api.md
# API | ||
+ [Parsing schemas](#parsing-schemas) | ||
+ [Avro types](#avro-types) | ||
+ [Records](#records) | ||
+ [Streams](#streams) | ||
## Parsing schemas | ||
### `avsc.parse(schema, [opts])` | ||
Parse a schema and return an instance of the corresponding | ||
[`Type`](#class-type). | ||
+ `schema` {Object|String} Schema (type object or type name string). | ||
+ `opts` {Object} Parsing options. The following keys are currently supported: | ||
+ `namespace` {String} Optional parent namespace. | ||
+ `registry` {Object} Optional registry of predefined type names. | ||
+ `unwrapUnions` {Boolean} By default, Avro expects all unions to be wrapped | ||
inside an object with a single key. Setting this to `true` will prevent | ||
this, slightly improving performance (encoding is then done on the first | ||
type which validates). | ||
+ `typeHook` {Function} Function called after each new Avro type is | ||
instantiated. The new type is available as `this` and the relevant schema | ||
as first and only argument. | ||
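To make the union wrapping concrete, here is a plain-JavaScript illustration (not avsc code) of how the same `['null', 'string']` union value is represented in each mode:

```javascript
// Wrapped (default): non-null union values are objects with a single
// key naming the branch that was used.
var wrapped = {name: {string: 'Ann'}, phone: null};

// Unwrapped (`unwrapUnions: true`): values appear directly, and the
// branch is inferred as the first type that validates.
var unwrapped = {name: 'Ann', phone: null};

// Recovering the underlying value from a wrapped union:
var value = wrapped.name === null ? null : wrapped.name.string; // 'Ann'
```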
### `avsc.parseFile(path, [opts])` | ||
Convenience function to parse a schema file directly. | ||
+ `path` {String} Path to schema file. | ||
+ `opts` {Object} Parsing options (identical to those of `parse`). | ||
## Avro types | ||
All the classes below are available in the `avsc.types` namespace: | ||
+ [`Type`](#class-type) | ||
+ [`PrimitiveType`](#class-primitivetypename) | ||
+ [`ArrayType`](#class-arraytypeschema-opts) | ||
+ [`EnumType`](#class-enumtypeschema-opts) | ||
+ [`FixedType`](#class-fixedtypeschema-opts) | ||
+ [`MapType`](#class-maptypeschema-opts) | ||
+ [`RecordType`](#class-recordtypeschema-opts) | ||
+ [`UnionType`](#class-uniontypeschema-opts) | ||
### Class `Type` | ||
"Abstract" base Avro type class. All implementations inherit from this type. | ||
"Abstract" base Avro type class. All implementations inherit from it. | ||
##### `type.type` | ||
@@ -14,2 +59,3 @@ | ||
##### `type.random()` | ||
@@ -19,2 +65,3 @@ | ||
##### `type.isValid(obj)` | ||
@@ -26,2 +73,3 @@ | ||
##### `type.encode(obj, [size,] [unsafe])` | ||
@@ -39,2 +87,3 @@ | ||
##### `type.decode(buf, [resolver,] [unsafe])` | ||
@@ -49,2 +98,3 @@ | ||
##### `type.createResolver(writerType)` | ||
@@ -54,13 +104,25 @@ | ||
Create a resolver that can be passed to the `type`'s `decode` method. This | ||
will enable decoding objects which had been serialized using `writerType`, | ||
according to the Avro [resolution rules][schema-resolution]. If the schemas are | ||
incompatible, this method will throw an error. | ||
##### `type.toString()` | ||
Return the canonical version of the schema. This can be used to compare schemas | ||
for equality. | ||
Returns the [canonical version of the schema][canonical-schema]. This can be | ||
used to compare schemas for equality. | ||
##### `type.createFingerprint(algorithm)` | ||
+ `algorithm` {String} Algorithm to use to generate the fingerprint. Defaults | ||
to `md5`. | ||
+ `algorithm` {String} Algorithm to use to generate the schema's | ||
[fingerprint][]. Defaults to `md5`. | ||
##### `Type.fromSchema(schema, [opts])` | ||
Alias for `avsc.parse`. | ||
#### Class `PrimitiveType(name)` | ||
@@ -167,13 +229,19 @@ | ||
# Records | ||
## Records | ||
Each [`RecordType`](#class-recordtype-opts) generates a corresponding `Record` | ||
constructor when its schema is parsed. It is available using the `RecordType`'s | ||
`getRecordConstructor` method. This makes decoding records more efficient and | ||
lets us provide the following convenience methods: | ||
### Class `Record(...)` | ||
Specific record class, programmatically generated for each record schema. | ||
Calling the constructor directly can sometimes be a convenient shortcut to | ||
instantiate new records of a given type. | ||
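To illustrate why a generated constructor helps (a conceptual sketch only, not avsc's actual code generation), one can picture each record schema compiling down to something like:

```javascript
// A record type's fields become positional constructor arguments, so
// decoding can fill a fixed object shape (which JavaScript engines
// optimize well) instead of building ad-hoc objects.
function Person(name, age) {
  this.name = name;
  this.age = age;
}

var ann = new Person('Ann', 25);
```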
#### `Record.random()` | ||
#### `Record.decode(buf, [resolver])` | ||
#### `Record.decode(buf, [resolver,] [unsafe])` | ||
#### `record.$encode([opts])` | ||
#### `record.$encode([size,] [unsafe])` | ||
@@ -187,13 +255,34 @@ #### `record.$isValid()` | ||
As a convenience, the following function is available to read an Avro object | ||
container file stored locally: | ||
### `avsc.decodeFile(path, [opts])` | ||
+ `path` {String} Path to Avro file. | ||
+ `opts` {Object} Decoding options, passed to `BlockDecoder`. | ||
Returns a readable stream of decoded objects from an Avro container file. | ||
For other use-cases, the following stream classes are available in the | ||
`avsc.streams` namespace: | ||
+ [`BlockDecoder`](#blockdecoderopts) | ||
+ [`RawDecoder`](#rawdecoderopts) | ||
+ [`BlockEncoder`](#blockencoderopts) | ||
+ [`RawEncoder`](#rawencoderopts) | ||
### Class `BlockDecoder([opts])` | ||
+ `opts` {Object} Decoding options. Available keys: | ||
+ `readerType` {AvroType} Reader type. | ||
+ `includeBuffer` {Boolean} | ||
+ `unordered` {Boolean} | ||
+ `decode` {Boolean} Whether to decode records before returning them. | ||
Defaults to `true`. | ||
+ `parseOpts` {Object} Options passed to instantiate the writer's `Type`. | ||
A duplex stream which decodes bytes coming from an Avro object container file. | ||
#### Event `'metadata'` | ||
+ `writerType` {Type} The type used to write the file. | ||
+ `type` {Type} The type used to write the file. | ||
+ `codec` {String} The codec's name. | ||
+ `header` {Object} The file's header, containing in particular the raw schema | ||
@@ -204,30 +293,35 @@ and codec. | ||
+ `data` {...} Decoded element. If `includeBuffer` was set, `data` will be an | ||
object `{object, buffer}`. | ||
+ `data` {Object|Buffer} Decoded element or raw bytes. | ||
### Class `RawDecoder([opts])` | ||
### Class `RawDecoder(type, [opts])` | ||
+ `type` {Type} Writer type. Required since the input doesn't contain a header. | ||
+ `opts` {Object} Decoding options. Available keys: | ||
+ `writerType` {Type} | ||
+ `readerType` {Type} | ||
+ `includeBuffer` {Boolean} | ||
+ `decode` {Boolean} Whether to decode records before returning them. | ||
Defaults to `true`. | ||
A duplex stream which can be used to decode a stream of serialized Avro objects | ||
with no headers or blocks. | ||
#### Event `'data'` | ||
+ `data` {...} Decoded element. If `includeBuffer` was set, `data` will be an | ||
object `{object, buffer}`. | ||
+ `data` {Object|Buffer} Decoded element or raw bytes. | ||
### Class `BlockEncoder([opts])` | ||
### Class `BlockEncoder(type, [opts])` | ||
+ `type` {Type} The type to use for encoding. | ||
+ `opts` {Object} Encoding options. Available keys: | ||
+ `writerType` {AvroType} Writer type. As a convenience, this will be | ||
inferred if writing `Record` instances (from the first one passed). | ||
+ `codec` {String} | ||
+ `blockSize` {Number} | ||
+ `omitHeader` {Boolean} | ||
+ `unordered` {Boolean} | ||
+ `unsafe` {Boolean} | ||
+ `codec` {String} Name of codec to use for encoding. | ||
+ `blockSize` {Number} Maximum uncompressed size of each block's data. A new | ||
block will be started when this number is exceeded. If it is too small to | ||
fit a single element, it will be increased appropriately. Defaults to 64kB. | ||
+ `omitHeader` {Boolean} Don't emit the header. This can be useful when | ||
appending to an existing container file. Defaults to `false`. | ||
+ `unsafe` {Boolean} Skip validity checks on each record before encoding it. | ||
By default, records are checked. | ||
A duplex stream to create Avro container object files. | ||
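The `blockSize` behavior described above can be sketched in plain JavaScript (illustrative only, not avsc's implementation):

```javascript
// Group encoded buffers into blocks holding at most `blockSize` bytes
// of uncompressed data, starting a new block when the limit would be
// exceeded. A single buffer larger than `blockSize` still gets its own
// block, mirroring the documented "increased appropriately" behavior.
function groupIntoBlocks(buffers, blockSize) {
  var blocks = [];
  var current = [];
  var size = 0;
  buffers.forEach(function (buf) {
    if (size > 0 && size + buf.length > blockSize) {
      blocks.push(current); // Flush the current block.
      current = [];
      size = 0;
    }
    current.push(buf);
    size += buf.length;
  });
  if (current.length) {
    blocks.push(current);
  }
  return blocks;
}
```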
#### Event `'data'` | ||
@@ -238,11 +332,22 @@ | ||
### Class `RawEncoder([opts])` | ||
### Class `RawEncoder(type, [opts])` | ||
+ `type` {Type} The type to use for encoding. | ||
+ `opts` {Object} Encoding options. Available keys: | ||
+ `writerType` {AvroType} Writer type. As a convenience, this will be | ||
inferred if writing `Record` instances (from the first one passed). | ||
+ `unsafe` {Boolean} | ||
+ `batchSize` {Number} To increase performance, records are serialized in | ||
batches. Use this option to control how often batches are emitted. If it is | ||
too small to fit a single record, it will be increased automatically. | ||
Defaults to 64kB. | ||
+ `unsafe` {Boolean} Skip validity checks on each record before encoding it. | ||
By default, records are checked. | ||
The encoding equivalent of `RawDecoder`. | ||
#### Event `'data'` | ||
+ `data` {Buffer} Serialized bytes. | ||
[canonical-schema]: https://avro.apache.org/docs/current/spec.html#Parsing+Canonical+Form+for+Schemas | ||
[schema-resolution]: https://avro.apache.org/docs/current/spec.html#Schema+Resolution | ||
[fingerprint]: https://avro.apache.org/docs/current/spec.html#Schema+Fingerprints |
# Quickstart | ||
`avsc`'s API is built around `Type`s. | ||
## Parsing schemas | ||
### `avsc.parse(schema, [opts])` | ||
## What is a `Type`? | ||
Parse a schema and return an instance of the corresponding `Type`. | ||
Each Avro type maps to a corresponding JavaScript `Type`: | ||
+ `schema` {Object|String} Schema (type object or type name string). | ||
+ `opts` {Object} Parsing options. The following keys are currently supported: | ||
+ `namespace` {String} Optional parent namespace. | ||
+ `registry` {Object} Optional registry of predefined type names. | ||
+ `unwrapUnions` {Boolean} By default, Avro expects all unions to be wrapped | ||
inside an object with a single key. Setting this to `true` will prevent | ||
this, slightly improving performance (encoding is then done on the first | ||
type which validates). | ||
+ `typeHook` {Function} Function called after each new Avro type is | ||
instantiated. The new type is available as `this` and the relevant schema | ||
as first and only argument. | ||
+ `int`, `long`, and other Avro primitives map to a `PrimitiveType`. | ||
+ `array`s map to `ArrayType`s | ||
+ `enum`s map to `EnumType`s | ||
+ ... | ||
### `avsc.parseFile(path, [opts])` | ||
An instance of a `Type` knows how to encode and decode its corresponding | ||
objects. For example, the `string` `PrimitiveType` knows how to encode and | ||
decode JavaScript strings: | ||
Convenience function to parse a schema file directly. | ||
```javascript | ||
var avsc = require('avsc'); // This will be assumed in the snippets below. | ||
+ `path` {String} Path to schema file. | ||
+ `opts` {Object} Parsing options (identical to those of `parse`). | ||
var stringType = new avsc.types.PrimitiveType('string'); | ||
var buf = stringType.encode('Hi'); // Buffer containing the Avro encoding of 'Hi'. | ||
var s = stringType.decode(buf); // === 'Hi'! | ||
``` | ||
Each `type` also provides other methods which can be useful. Here are two | ||
(refer to the API for the full list): | ||
## Decoding Avro files | ||
+ Type compatibility checks: | ||
### `avsc.decodeFile(path, [opts])` | ||
```javascript | ||
var b1 = stringType.isValid('hello'); // true, 'hello' is a valid string. | ||
var b2 = stringType.isValid(-2); // false, -2 is not. | ||
``` | ||
+ `path` {String} Path to Avro file. | ||
+ `opts` {Object} Decoding options, passed either to `BlockDecoder` or | ||
`RawDecoder`, as appropriate. | ||
+ Random object generation: | ||
Returns a readable stream of the contents of an Avro file (either an object | ||
container file or raw fragments). | ||
```javascript | ||
var s = stringType.random(); // A random string. | ||
``` | ||
## How do I get a `Type`? | ||
It is possible to instantiate types directly by calling their constructors | ||
(available in the `avsc.types` namespace; this is what we did earlier), but in | ||
the vast majority of use-cases they will be automatically generated by parsing | ||
an existing schema. `avsc` exposes a `parse` method to do the heavy lifting: | ||
```javascript | ||
// Equivalent to what we did earlier. | ||
var stringType = avsc.parse('string'); | ||
// A slightly more complex type. | ||
var mapType = avsc.parse({type: 'map', values: 'long'}); | ||
// The sky is the limit! | ||
var personType = avsc.parse({ | ||
name: 'Person', | ||
type: 'record', | ||
fields: [ | ||
{name: 'name', type: 'string'}, | ||
{name: 'phone', type: ['null', 'string'], default: null}, | ||
{name: 'address', type: { | ||
name: 'Address', | ||
type: 'record', | ||
fields: [ | ||
{name: 'city', type: 'string'}, | ||
{name: 'zip', type: 'int'} | ||
] | ||
}} | ||
] | ||
}); | ||
``` | ||
Since schemas are often stored in JSON files, `avsc` also exposes a `parseFile` | ||
method just for that: | ||
```javascript | ||
var couponType = avsc.parseFile('schemas/Coupon.avsc'); | ||
``` | ||
## What else? | ||
The methods we mentioned earlier can now really shine: | ||
```javascript | ||
personType.isValid({ | ||
name: 'Ann', | ||
phone: null, | ||
address: {city: 'Cambridge', zip: 2139} | ||
}); // true | ||
personType.isValid({ | ||
name: 'Bob', | ||
phone: {string: '617-000-1234'}, | ||
address: {city: 'Boston'} | ||
}); // false (Missing the zip code.) | ||
``` | ||
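The kind of check `isValid` performs can be sketched for this particular record (a simplified illustration; avsc's real validation follows the full Avro rules, including wrapped unions):

```javascript
// Simplified structural check for the `Person` record above: every
// field must be present with the right JavaScript representation.
function isValidPerson(obj) {
  return (
    typeof obj.name === 'string' &&
    // The ['null', 'string'] union: null, or a wrapped string branch.
    (obj.phone === null ||
      (obj.phone != null && typeof obj.phone.string === 'string')) &&
    obj.address != null &&
    typeof obj.address.city === 'string' &&
    typeof obj.address.zip === 'number' &&
    obj.address.zip === (obj.address.zip | 0) // An int, not a float.
  );
}
```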
## I forgot to ask about files. | ||
Tomorrow. |
@@ -5,3 +5,2 @@ /* jshint node: true */ | ||
// TODO: Add append option to `createWriteStream`. | ||
@@ -8,0 +7,0 @@ var streams = require('./streams'), |
/* jshint node: true */ | ||
// TODO: Add snappy support. Or maybe support for custom decompressors. | ||
// TODO: Get `Decoder` to work even for decompressors that don't yield to the | ||
// event loop. | ||
@@ -304,2 +302,3 @@ 'use strict'; | ||
* + batchSize | ||
* + unsafe | ||
* | ||
@@ -327,2 +326,7 @@ */ | ||
if (!this._unsafe && !this._type.isValid(obj)) { | ||
this.emit('error', new AvscError('invalid object: %j', obj)); | ||
return; | ||
} | ||
this._writeObj.call(tap, obj); | ||
@@ -329,0 +333,0 @@ if (!tap.isValid()) { |
/* jshint node: true */ | ||
'use strict'; | ||
// TODO: Add `equals` (and `compare`?) method to each type. | ||
// TODO: Add regex check for valid type and field names. | ||
// TODO: Support JS keywords as record field names (e.g. `null`). | ||
// TODO: Add logging using `debuglog` to help identify schema parsing errors. | ||
// TODO: Enable recursive schema JSON serialization without a replacer. | ||
// TODO: Implement `clone` method. | ||
// TODO: Implement `toString` method on types which returns canonical string. | ||
// TODO: Create `Field` class. | ||
'use strict'; | ||
var Tap = require('./tap'), | ||
@@ -15,0 +12,0 @@ utils = require('./utils'), |
{ | ||
"name": "avsc", | ||
"version": "0.2.0", | ||
"version": "1.0.0", | ||
"description": "A serialization API to make you smile", | ||
@@ -5,0 +5,0 @@ "keywords": [ |
README.md
@@ -6,2 +6,9 @@ # Avsc [![NPM version](https://img.shields.io/npm/v/avsc.svg)](https://www.npmjs.com/package/avsc) [![Build status](https://travis-ci.org/mtth/avsc.svg?branch=master)](https://travis-ci.org/mtth/avsc) [![Coverage status](https://coveralls.io/repos/mtth/avsc/badge.svg?branch=master&service=github)](https://coveralls.io/github/mtth/avsc?branch=master) | ||
## Features | ||
+ Arbitrary Avro schema support. | ||
+ No dependencies. | ||
+ [Fast!](#performance) Did you know that Avro could be faster than JSON? | ||
## Installation | ||
@@ -13,44 +20,83 @@ | ||
`avsc` is compatible with [io.js][] and versions of [node.js][] from and | ||
including `0.11`. | ||
## Example | ||
```javascript | ||
var avsc = require('avsc'); | ||
## Documentation | ||
// Generate an Avro type. | ||
var type = avsc.parse({ | ||
type: 'record', | ||
name: 'Person', | ||
fields: [ | ||
{name: 'name', type: 'string'}, | ||
{name: 'age', type: 'int'} | ||
] | ||
}); | ||
+ [Quickstart](https://github.com/mtth/avsc/blob/master/doc/quickstart.md) | ||
+ [API](https://github.com/mtth/avsc/blob/master/doc/api.md) | ||
+ [Advanced usage](https://github.com/mtth/avsc/blob/master/doc/advanced.md) | ||
// Use it to serialize a JavaScript object. | ||
var buf = type.encode({name: 'Ann', age: 25}); | ||
A few examples to boot: | ||
// And deserialize it back. | ||
var obj = type.decode(buf); | ||
``` | ||
+ Encode and decode JavaScript objects using an Avro schema file: | ||
```javascript | ||
var avsc = require('avsc'); // Implied in all other examples below. | ||
## Documentation | ||
var type = avsc.parseFile('Person.avsc'); | ||
var buf = type.encode({name: 'Ann', age: 25}); // Serialize a JS object. | ||
var obj = type.decode(buf); // And deserialize it back. | ||
``` | ||
+ [Quickstart](https://github.com/mtth/avsc/blob/master/doc/quickstart.md) | ||
+ [API](https://github.com/mtth/avsc/blob/master/doc/api.md) | ||
+ Get a readable record stream from an Avro container file: | ||
```javascript | ||
avsc.decodeFile('records.avro') | ||
.on('data', function (record) { /* Do something with the record. */ }); | ||
``` | ||
## Status | ||
+ Generate a random instance from a schema object: | ||
What's already there: | ||
```javascript | ||
var type = avsc.parse({ | ||
name: 'Pet', | ||
type: 'record', | ||
fields: [ | ||
{name: 'kind', type: {name: 'Kind', type: 'enum', symbols: ['CAT', 'DOG']}}, | ||
{name: 'name', type: 'string'}, | ||
{name: 'isFurry', type: 'boolean'} | ||
] | ||
}); | ||
var pet = type.random(); // E.g. {kind: 'CAT', name: 'qwXlrew', isFurry: true} | ||
``` | ||
+ Parsing and resolving schemas (including schema evolution). | ||
+ Encoding, decoding, validating, and generating data. | ||
+ Reading and writing container files. | ||
+ Create a writable stream to serialize objects on the fly: | ||
Coming up: | ||
```javascript | ||
var type = avsc.parse({type: 'array', items: 'int'}); | ||
var encoder = new avsc.streams.RawEncoder(type) | ||
.on('data', function (chunk) { /* Use the encoded chunk somehow. */ }); | ||
``` | ||
+ Protocols. | ||
+ Sort order. | ||
+ Logical types. | ||
## Performance | ||
Despite being written in pure JavaScript, `avsc` is still fast, supporting | ||
encoding and decoding throughput in the hundreds of thousands of operations | ||
per second even for complex schemas. | ||
Schema | Decode (operations/sec) | Encode (operations/sec) | ||
---|:-:|:-: | ||
[`ArrayString.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/ArrayString.avsc) | 905k | 280k | ||
[`Coupon.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/Coupon.avsc) | 290k | 302k | ||
[`Person.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/Person.avsc) | 1586k | 620k | ||
[`User.avsc`](https://github.com/mtth/avsc/blob/master/benchmarks/schemas/User.avsc) | 116k | 284k | ||
In fact, it is generally faster than the built-in JSON parser (also producing | ||
encodings orders of magnitude smaller before compression). See the | ||
[benchmarks][] page for the raw numbers. | ||
## Limitations | ||
+ Protocols aren't yet implemented. | ||
+ JavaScript doesn't natively support the `long` type, so numbers larger than | ||
`Number.MAX_SAFE_INTEGER` (or smaller than the corresponding lower bound) | ||
will suffer a loss of precision. | ||
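This precision loss is easy to observe directly in plain JavaScript:

```javascript
// Avro `long` values beyond 2^53 - 1 cannot be represented exactly as
// JavaScript numbers, so adjacent integers collide after that bound.
var big = Math.pow(2, 53); // Number.MAX_SAFE_INTEGER + 1
var collides = (big + 1 === big); // true: 2^53 + 1 rounds back to 2^53
```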
[io.js]: https://iojs.org/en/ | ||
[node.js]: https://nodejs.org/en/ | ||
[benchmarks]: https://github.com/mtth/avsc/blob/master/doc/benchmarks.md |