clarinet
clarinet is a sax-like streaming parser for JSON (pun intended). works in the browser and node.js. clarinet is inspired by (and forked from) sax-js. just like you shouldn't use sax when you need dom, you shouldn't use clarinet when you need JSON.parse.
design goals
clarinet is very much like yajl but written in javascript:
- written in javascript
- portable
- robust (~110 tests pass before even announcing the project)
- data representation independent
- fast
- generates verbose, useful error messages including context of where
the error occurs in the input text.
- can parse json data off a stream, incrementally
- simple to use
- tiny
motivation
the reason behind this work was to create better full text support in node. creating indexes out of large (or many) json files doesn't require a full understanding of the json file, but it does require something like clarinet.
installation
node.js
- install npm
- npm install clarinet
- var clarinet = require('clarinet');
browser
- minimize clarinet.js
- load it into your webpage
usage
basics
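var clarinet = require("clarinet")
  , parser = clarinet.parser()
  ;

parser.onerror = function (e) {
  // an error happened. e is the error.
};
parser.onvalue = function (v) {
  // got some value. v is the value: a string, number, bool, or null.
};
parser.onopenobject = function (key) {
  // opened an object. key is the first key, if the object has one.
};
parser.onkey = function (key) {
  // got a subsequent key in an object.
};
parser.oncloseobject = function () {
  // closed an object.
};
parser.onopenarray = function () {
  // opened an array.
};
parser.onclosearray = function () {
  // closed an array.
};
parser.onend = function () {
  // parser stream is done.
};

parser.write('{"foo": "bar"}').close();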
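// stream usage
var fs = require("fs");
var options = {}; // optional settings, see "arguments" below
var stream = require("clarinet").createStream(options);
stream.on("error", function (e) {
  // unhandled errors will throw, since this is a node event emitter
  console.error("error!", e)
  // clear the error and carry on
  this._parser.error = null
  this._parser.resume()
})
stream.on("openobject", function (node) {
  // node is the first key of the object, if any
})
// read from one file, run it through the parser stream, write to another file
fs.createReadStream("file.json")
  .pipe(stream)
  .pipe(fs.createWriteStream("file-altered.json"))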
arguments
pass the following arguments to the parser function. all are optional.
opt - object bag of settings regarding string formatting. all default to false.
settings supported:
- trim - boolean. whether or not to trim text and comment nodes.
- normalize - boolean. if true, then turn any whitespace into a single space.
methods
write - write bytes onto the stream. you don't have to do this all at once. you can keep writing as much as you want.
close - close the stream. once closed, no more data may be written until it is done processing the buffer, which is signaled by the end event.
resume - to gracefully handle errors, assign a listener to the error event. then, when the error is taken care of, you can call resume to continue parsing. otherwise, the parser will not continue while in an error state.
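a minimal sketch of how the three methods fit together, assuming only what is described above: write can be called repeatedly, close finishes the document, and resume continues after an error once parser.error has been cleared. the helper names writeInChunks, recoverOnError and demo, and the chunk size, are made up for this example and are not part of clarinet.

```javascript
// feed a document to a clarinet-style parser a few characters at a time.
// writeInChunks and chunkSize are hypothetical names for this sketch.
function writeInChunks(parser, text, chunkSize) {
  for (var i = 0; i < text.length; i += chunkSize) {
    parser.write(text.slice(i, i + chunkSize));
  }
  parser.close();
}

// on error: hand the error to the caller, clear it, then ask the parser to go on.
// recoverOnError is a hypothetical helper, not a clarinet function.
function recoverOnError(parser, onError) {
  parser.onerror = function (e) {
    onError(e);
    parser.error = null;
    parser.resume();
  };
}

// wiring against the real library; nothing here runs until you call demo().
function demo() {
  var parser = require("clarinet").parser();
  recoverOnError(parser, function (e) {
    console.error("parse error:", e.message);
  });
  parser.onvalue = function (v) { console.log("value:", v); };
  parser.onend = function () { console.log("end"); };
  writeInChunks(parser, '{"foo": "bar", "list": [1, 2, 3]}', 5);
}
```

the chunk boundaries are arbitrary on purpose: the point of write is that the caller does not need to know where values start or stop.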
members
at all times, the parser object will have the following members:
line, column, position - indications of the position in the json document where the parser currently is looking.
closed - boolean indicating whether or not the parser can be written to. if it's true, then wait for the ready event to write again.
opt - any options passed into the constructor.
and a bunch of other stuff that you probably shouldn't touch.
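the position members are most useful inside an error handler. a small sketch, assuming the handler reads them at the moment the error fires; describePosition and reportErrors are names invented here, not clarinet functions.

```javascript
// turn the parser's position members into a short, readable location.
function describePosition(parser) {
  return "line " + parser.line +
    ", column " + parser.column +
    ", position " + parser.position;
}

// report each error together with where the parser was looking.
// log is any function taking a string, for example console.error.
function reportErrors(parser, log) {
  parser.onerror = function (e) {
    log(e.message + " (" + describePosition(parser) + ")");
  };
}
```

with a real parser this would be wired up as reportErrors(clarinet.parser(), console.error) before the first write.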
events
all events emit with at most a single argument. to listen to an event, assign a function to on<eventname>. functions get executed in the this-context of the parser object. the list of supported events is also in the exported EVENTS array.
when using the stream interface, assign handlers using the EventEmitter on function in the normal fashion.
error - indication that something bad happened. the error will be hanging out on parser.error, and must be deleted before parsing can continue. by listening to this event, you can keep an eye on that kind of stuff. note: this happens much more in strict mode. argument: instance of Error.
value - a json value. argument: value, can be a bool, null, string or number
openobject - object was opened. argument: key, a string with the first key of the object (if any)
key - an object key. argument: key, a string with the current key
closeobject - indication that an object was closed
openarray - indication that an array was opened
closearray - indication that an array was closed
end - indication that the closed stream has ended.
ready - indication that the stream has reset, and is ready to be written to.
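to show how the events relate to each other, here is a sketch that rebuilds a plain javascript value from them. it assumes the behaviour listed above, in particular that openobject carries the first key (or nothing for an empty object) and that key fires for every later key. attachTreeBuilder and done are names made up for this example. if you really want the whole document in memory, JSON.parse is the better tool; this is only meant to make the event order concrete.

```javascript
// rebuild a value from clarinet-style events and pass it to done() on end.
// attachTreeBuilder is a hypothetical helper, not part of clarinet.
function attachTreeBuilder(parser, done) {
  var stack = []; // containers that are currently open, innermost last
  var keys = [];  // pending key for each open container (unused for arrays)
  var root;

  // put a finished scalar or a freshly opened container where it belongs
  function add(value) {
    if (stack.length === 0) { root = value; return; }
    var top = stack[stack.length - 1];
    if (Array.isArray(top)) top.push(value);
    else top[keys[keys.length - 1]] = value;
  }

  function open(container, firstKey) {
    add(container);
    stack.push(container);
    keys.push(firstKey);
  }

  function close() {
    stack.pop();
    keys.pop();
  }

  parser.onopenobject = function (firstKey) { open({}, firstKey); };
  parser.onkey = function (key) { keys[keys.length - 1] = key; };
  parser.oncloseobject = close;
  parser.onopenarray = function () { open([], undefined); };
  parser.onclosearray = close;
  parser.onvalue = add;
  parser.onend = function () { done(root); };
}
```

with a real parser: call attachTreeBuilder(parser, console.log) and then parser.write(text).close().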
samples
added some samples to help you get started. one that creates a list of top npm contributors, and another that gets a bunch of data from twitter and generates valid json. first sample is a good use for clarinet, second not so much. i just needed data and this was a good way to battle test it. but the twitter sample is a good example of when not to use clarinet. http requests take forever and you had more than enough time to JSON.parse. so what i did is stupid.
roadmap
check issues
contribute
everyone is welcome to contribute. patches, bug-fixes, new features
- create an issue so the community can comment on your idea
- fork clarinet
- create a new branch: git checkout -b my_branch
- create tests for the changes you made
- make sure you pass both existing and newly inserted tests
- commit your changes
- push to your branch: git push origin my_branch
- create a pull request
helpful tips:
check index.html. there are two env vars you can set, CRECORD and CDEBUG.
- CRECORD allows you to record the event sequence from a new json test so you don't have to write everything.
- CDEBUG can be set to info or debug. info will console.log all emits, debug will console.log what happens to each char.
in test/clarinet.js there are two lines you might want to change. #8 is where you define seps; if you are isolating a test you probably just want to run one sep, so change this array to [undefined]. #718, which says for (var key in docs) {, is where you can change the docs you want to run. e.g. to run foobar i would do something like for (var key in {foobar:''}) {.
this is not ideal so if you improve it send a pull request.
differences to yajl
- yajl is written in c, probably won't work in the browser
- clarinet does not do validations.
- openobject emits the name of the first key, while in yajl the first key is emitted separately
- clarinet emits value for any value. yajl emits string for string, null for null, and so on.
if these differences bother you feel free to send in a pull request.
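if you only need the last two differences papered over in your own code, a thin adapter is enough. the sketch below assumes the event behaviour described in this document and forwards to a consumer with a separate key event and one event per value type. attachYajlStyle and the handler names on out (onstartmap, onmapkey, onstring and so on) are invented for this example; they are loosely modelled on yajl's callbacks and are not an api of either library.

```javascript
// forward clarinet-style events to a consumer that expects a separate key
// event and typed value events. all names on `out` are hypothetical.
function attachYajlStyle(parser, out) {
  parser.onopenobject = function (firstKey) {
    out.onstartmap();
    // clarinet folds the first key into openobject; split it back out
    if (firstKey !== undefined) out.onmapkey(firstKey);
  };
  parser.onkey = function (key) { out.onmapkey(key); };
  parser.oncloseobject = function () { out.onendmap(); };
  parser.onopenarray = function () { out.onstartarray(); };
  parser.onclosearray = function () { out.onendarray(); };
  parser.onvalue = function (v) {
    // clarinet has one value event; dispatch on the javascript type
    if (v === null) out.onnull();
    else if (typeof v === "string") out.onstring(v);
    else if (typeof v === "number") out.onnumber(v);
    else out.onboolean(v);
  };
}
```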
meta
(oO)--',-
in caos