What is avsc?
The avsc npm package is a library for encoding and decoding data in the Avro serialization format. It provides tools for working with Avro schemas, serializing and deserializing data, and performing schema evolution.
What are avsc's main functionalities?
Schema Definition
This feature allows you to define Avro schemas using JSON. The code sample demonstrates how to define a simple Avro schema for a record with 'name' and 'age' fields.
const avro = require('avsc');
const type = avro.Type.forSchema({
  type: 'record',
  fields: [
    {name: 'name', type: 'string'},
    {name: 'age', type: 'int'}
  ]
});
Serialization
This feature allows you to serialize JavaScript objects into Avro binary format. The code sample shows how to serialize an object with 'name' and 'age' fields into a buffer.
const avro = require('avsc');
const type = avro.Type.forSchema({
  type: 'record',
  fields: [
    {name: 'name', type: 'string'},
    {name: 'age', type: 'int'}
  ]
});
const buf = type.toBuffer({name: 'John Doe', age: 30});
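To make the binary layout concrete, here is a hand-rolled sketch of what the Avro specification prescribes for this record. It is not avsc's implementation, and the helper names (encodeInt, encodeString, encodePerson) are made up for illustration. It assumes the spec's rules: an int is written as a zigzag-encoded variable-length integer, a string as its byte length followed by its UTF-8 bytes, and a record as its fields concatenated in schema order.

```javascript
// Hypothetical helpers illustrating the Avro binary encoding; not part of avsc.

// Encode a 32-bit integer as a zigzag varint.
function encodeInt(n) {
  let z = ((n << 1) ^ (n >> 31)) >>> 0; // zigzag: 0 -> 0, -1 -> 1, 1 -> 2, ...
  const bytes = [];
  while (z > 0x7f) {
    bytes.push((z & 0x7f) | 0x80); // low 7 bits with the continuation bit set
    z >>>= 7;
  }
  bytes.push(z);
  return Buffer.from(bytes);
}

// Encode a string as its UTF-8 byte length followed by the bytes themselves.
function encodeString(s) {
  const utf8 = Buffer.from(s, 'utf8');
  return Buffer.concat([encodeInt(utf8.length), utf8]);
}

// Encode the record by concatenating its fields in schema order.
function encodePerson(person) {
  return Buffer.concat([encodeString(person.name), encodeInt(person.age)]);
}

const encoded = encodePerson({name: 'John Doe', age: 30});
console.log(encoded.toString('hex')); // 104a6f686e20446f653c
```

The result is 10 bytes: one length byte, eight bytes of text, and one byte for the age. Field names and types never appear in the output, which is why a reader needs the schema to make sense of the bytes.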
Deserialization
This feature allows you to deserialize Avro binary data back into JavaScript objects. The code sample demonstrates how to deserialize a buffer back into an object.
const avro = require('avsc');
const type = avro.Type.forSchema({
  type: 'record',
  fields: [
    {name: 'name', type: 'string'},
    {name: 'age', type: 'int'}
  ]
});
const buf = type.toBuffer({name: 'John Doe', age: 30});
const obj = type.fromBuffer(buf);
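Decoding can be sketched by hand as well. The snippet below is an illustration under the Avro specification's rules, not avsc's code: decodeInt, decodeString and decodePerson are hypothetical helpers, and the input bytes are the spec-defined encoding of {name: 'John Doe', age: 30}, hard-coded so the example is self-contained.

```javascript
// Hypothetical helpers illustrating Avro binary decoding; not part of avsc.

// Spec-defined encoding of {name: 'John Doe', age: 30}:
// 0x10 (length 8), the UTF-8 bytes of 'John Doe', then 0x3c (30).
const bytes = Buffer.from('104a6f686e20446f653c', 'hex');

// Read a zigzag varint at `pos`; return the value and the next position.
function decodeInt(buf, pos) {
  let z = 0;
  let shift = 0;
  let byte;
  do {
    byte = buf[pos++];
    z |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return {value: (z >>> 1) ^ -(z & 1), pos}; // undo the zigzag mapping
}

// Read a length-prefixed UTF-8 string at `pos`.
function decodeString(buf, pos) {
  const len = decodeInt(buf, pos);
  const end = len.pos + len.value;
  return {value: buf.toString('utf8', len.pos, end), pos: end};
}

// The schema dictates the order: first the string field, then the int field.
function decodePerson(buf) {
  const name = decodeString(buf, 0);
  const age = decodeInt(buf, name.pos);
  return {name: name.value, age: age.value};
}

const person = decodePerson(bytes);
console.log(person); // { name: 'John Doe', age: 30 }
```

Because the bytes carry no field names or type tags, the decoder only works when it follows the same schema the writer used.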
Schema Evolution
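This feature supports schema evolution, allowing you to read data written with an older schema using a newer schema. The code sample adds a new field with a default value and uses a resolver, created with createResolver, to decode data that was written with the old schema.
const avro = require('avsc');
// Writer schema: the shape the data was originally encoded with.
const oldType = avro.Type.forSchema({
  type: 'record',
  fields: [
    {name: 'name', type: 'string'}
  ]
});
// Reader schema: adds 'age' with a default so that old data stays readable.
const newType = avro.Type.forSchema({
  type: 'record',
  fields: [
    {name: 'name', type: 'string'},
    {name: 'age', type: 'int', 'default': 0}
  ]
});
const buf = oldType.toBuffer({name: 'John Doe'});
// Without a resolver, newType would try to read an 'age' value that is not in the buffer.
const resolver = newType.createResolver(oldType);
const obj = newType.fromBuffer(buf, resolver); // name is 'John Doe', age is 0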
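The rule that makes this kind of evolution work can be sketched on plain objects. The snippet below is a simplified illustration of the record-resolution rule in the Avro specification, not avsc's algorithm, and resolveRecord is a hypothetical helper: each reader field takes the writer's value when one exists, falls back to its default otherwise, and fails if neither is available. Fields known only to the writer are dropped.

```javascript
// Hypothetical helper illustrating Avro's record-resolution rule; not part of avsc.
function resolveRecord(writerValue, readerFields) {
  const out = {};
  for (const field of readerFields) {
    if (Object.prototype.hasOwnProperty.call(writerValue, field.name)) {
      out[field.name] = writerValue[field.name]; // value present in the old data
    } else if ('default' in field) {
      out[field.name] = field.default; // new field: fall back to its default
    } else {
      throw new Error('no value or default for field "' + field.name + '"');
    }
  }
  return out; // writer-only fields are ignored
}

const readerFields = [
  {name: 'name', type: 'string'},
  {name: 'age', type: 'int', default: 0}
];
const resolved = resolveRecord({name: 'John Doe'}, readerFields);
console.log(resolved); // { name: 'John Doe', age: 0 }
```

This is why a field added to a schema needs a default: without one, data written before the field existed cannot be resolved.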
Other packages similar to avsc
avro-js
avro-js is another library for working with Avro data in JavaScript. It provides similar functionality for schema definition, serialization, and deserialization. However, avro-js is generally considered less performant than avsc.
node-avro-io
node-avro-io is a library for Avro serialization and deserialization in Node.js. It offers similar features to avsc but is less actively maintained and has fewer features related to schema evolution.
Avsc
Pure JavaScript implementation of the Avro specification.
Features
- Arbitrary Avro schema support.
- No dependencies.
- Fast! Did you know that Avro could be faster than JSON?
Installation
$ npm install avsc
avsc is compatible with io.js and versions of node.js from and including 0.11.
Documentation
A few examples to boot:
- Encode and decode JavaScript objects using an Avro schema file:
    var avsc = require('avsc');
    var type = avsc.parseFile('Person.avsc');
    var buf = type.encode({name: 'Ann', age: 25});
    var obj = type.decode(buf);
- Get a readable record stream from an Avro container file:
    avsc.decodeFile('records.avro')
      .on('data', function (record) { });
- Generate a random instance from a schema object:
    var type = avsc.parse({
      name: 'Pet',
      type: 'record',
      fields: [
        {name: 'kind', type: {name: 'Kind', type: 'enum', symbols: ['CAT', 'DOG']}},
        {name: 'name', type: 'string'},
        {name: 'isFurry', type: 'boolean'}
      ]
    });
    var pet = type.random();
- Create a writable stream to serialize objects on the fly:
    var type = avsc.parse({type: 'array', items: 'int'});
    var encoder = new avsc.streams.RawEncoder(type)
      .on('data', function (chunk) { });
Performance
Despite being written in pure JavaScript, avsc is still fast, supporting
encoding and decoding throughput rates in the 100,000s per second for complex
schemas.
In fact, it is generally faster than the built-in JSON parser (also producing
encodings orders of magnitude smaller before compression). See the
benchmarks page for the raw numbers.
Limitations
- Protocols aren't yet implemented.
- JavaScript doesn't natively support the long type, so numbers larger than Number.MAX_SAFE_INTEGER (or smaller than the corresponding lower bound) will suffer a loss of precision.
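The precision limit can be seen without any Avro code at all. The sketch below parses a value two past Number.MAX_SAFE_INTEGER, first as a regular number and then as a BigInt. BigInt is used here only for comparison and is an assumption about the runtime: it exists in modern JavaScript engines, not in the node.js versions mentioned above, and nothing here implies that avsc uses it.

```javascript
// A 64-bit value that fits an Avro long but not a JavaScript number exactly.
const text = '9007199254740993'; // Number.MAX_SAFE_INTEGER + 2

const asNumber = Number(text); // rounded to the nearest representable double
const asBigInt = BigInt(text); // exact (modern runtimes only)

console.log(String(asNumber));              // 9007199254740992
console.log(asBigInt.toString());           // 9007199254740993
console.log(Number.isSafeInteger(asNumber)); // false
```

A check such as Number.isSafeInteger is one way for calling code to detect values that may already have lost precision.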