# clarinet
`clarinet` is a sax-like streaming parser for JSON. it works in the browser and in node.js. `clarinet` is inspired by (and forked from) sax-js. just like you shouldn't use sax when you need the dom, you shouldn't use `clarinet` when you need `JSON.parse`. for a more detailed introduction and a performance study please refer to this article.
## design goals

`clarinet` is very much like yajl but written in javascript:

- written in javascript
- portable
- robust (~110 tests passed before the project was even announced)
- data representation independent
- fast
- generates verbose, useful error messages, including the context of where the error occurs in the input text
- can parse json data off a stream, incrementally
- simple to use
- tiny
## motivation

the reason behind this work was to create better full-text support in node. creating indexes out of large (or many) json files doesn't require a full understanding of the json file, but it does require something like `clarinet`.
## installation

### node.js

- install npm
- `npm install clarinet`
- `var clarinet = require('clarinet');`

### browser

- minimize `clarinet.js`
- load it into your webpage
## usage

### basics

``` js
var clarinet = require("clarinet")
  , parser = clarinet.parser()
  ;

parser.onerror = function (e) {
  // an error happened. e is the error.
};
parser.onvalue = function (v) {
  // got some value. v is the value. can be string, double, bool, or null.
};
parser.onopenobject = function (key) {
  // opened an object. key is the first key.
};
parser.onkey = function (key) {
  // got a subsequent key. key is the current key.
};
parser.oncloseobject = function () {
  // closed an object.
};
parser.onopenarray = function () {
  // opened an array.
};
parser.onclosearray = function () {
  // closed an array.
};
parser.onend = function () {
  // parser stream is done, and ready to have more stuff written to it.
};

parser.write('{"foo": "bar"}').close();
```
or use the stream interface:

``` js
var fs = require("fs")
  , stream = require("clarinet").createStream(options);

stream.on("error", function (e) {
  // unhandled errors will throw, since this is a proper node event emitter
  console.error("error!", e);
  // clear the error and resume parsing
  this._parser.error = null;
  this._parser.resume();
});

stream.on("openobject", function (key) {
  // same as the onopenobject handler above; key is the first key
});

// pipe is supported, and it's readable/writable:
// the same chunks coming in also go out
fs.createReadStream("file.json")
  .pipe(stream)
  .pipe(fs.createWriteStream("file-altered.json"));
```
## arguments

pass the following arguments to the parser function. all are optional.

`opt` - object bag of settings regarding string formatting. all default to `false`.

settings supported:

- `trim` - boolean. whether or not to trim text nodes.
- `normalize` - boolean. if true, then turn any whitespace into a single space.
## methods

- `write` - write bytes onto the stream. you don't have to do this all at once. you can keep writing as much as you want.
- `close` - close the stream. once closed, no more data may be written until it is done processing the buffer, which is signaled by the `end` event.
- `resume` - to gracefully handle errors, assign a listener to the `error` event. then, when the error is taken care of, you can call `resume` to continue parsing. otherwise, the parser will not continue while in an error state.
## members

at all times, the parser object will have the following members:

- `line`, `column`, `position` - indications of the position in the json document where the parser is currently looking.
- `closed` - boolean indicating whether or not the parser can be written to. if it's `true`, then wait for the `ready` event to write again.
- `opt` - any options passed into the constructor.

and a bunch of other stuff that you probably shouldn't touch.
## events

all events emit with a single argument. to listen to an event, assign a function to `on<eventname>`. functions get executed in the this-context of the parser object. the list of supported events is also in the exported `EVENTS` array.

when using the stream interface, assign handlers using the EventEmitter `on` function in the normal fashion.

- `error` - indication that something bad happened. the error will be hanging out on `parser.error`, and must be deleted before parsing can continue. by listening to this event, you can keep an eye on that kind of stuff. argument: instance of `Error`.
- `value` - a json value. argument: the value, which can be a bool, null, string or number.
- `openobject` - an object was opened. argument: key, a string with the first key of the object (if any).
- `key` - an object key. argument: key, a string with the current key.
- `closeobject` - indication that an object was closed.
- `openarray` - indication that an array was opened.
- `closearray` - indication that an array was closed.
- `end` - indication that the closed stream has ended.
- `ready` - indication that the stream has reset, and is ready to be written to.
## samples

some samples are available to help you get started: one that creates a list of the top npm contributors, and another that gets a bunch of data from twitter and generates valid json.
## roadmap

check the issues.
## contribute

everyone is welcome to contribute. patches, bug-fixes, new features:

1. create an issue so the community can comment on your idea
2. fork `clarinet`
3. create a new branch: `git checkout -b my_branch`
4. create tests for the changes you made
5. make sure you pass both existing and newly inserted tests
6. commit your changes
7. push to your branch: `git push origin my_branch`
8. create a pull request
helpful tips:

check `index.html`. there are two env vars you can set, `CRECORD` and `CDEBUG`.

- `CRECORD` allows you to record the event sequence from a new json test so you don't have to write everything.
- `CDEBUG` can be set to `info` or `debug`. `info` will `console.log` all emits, `debug` will `console.log` what happens to each char.

in `test/clarinet.js` there are two lines you might want to change:

- `#8`, where you define `seps`. if you are isolating a test you probably just want to run one sep, so change this array to `[undefined]`.
- `#718`, which says `for (var key in docs) {`, is where you can change the docs you want to run. e.g. to run `foobar` i would do something like `for (var key in {foobar:''}) {`.
## meta

(oO)--',- in caos