dirty-json
npm install dirty-json
A JSON parser that tries to handle non-conforming or otherwise invalid JSON.
You can play around with a demo here: http://rmarcus.info/dirty-json/
You might also be interested in my blog post about the parser.
Turn this:
[5, .5, 'single quotes', "quotes in "quotes" in quotes"]
Into this:
[5,0.5,"single quotes","quotes in \"quotes\" in quotes"]
Why?
We all love JSON. But sometimes, out in that scary place called "the real world", we see something like this:
{ "user": "<div class="user">Ryan</div>" }
Or even something like this:
{ user: '<div class="user">
Ryan
</div>' }
While these are obviously cringe-worthy, we still want a way to parse them. dirty-json
provides a library to do exactly that.
Examples
dirty-json
does not require object keys to be quoted, and can handle single-quoted value strings.
const dJSON = require('dirty-json');
const r = dJSON.parse("{ test: 'this is a test'}")
console.log(JSON.stringify(r));
dirty-json
can handle embedded quotes in strings.
const dJSON = require('dirty-json');
const r = dJSON.parse('{ "test": "some text "a quote" more text"}');
console.log(JSON.stringify(r));
dirty-json
can handle newlines inside of a string.
const dJSON = require('dirty-json');
const r = dJSON.parse('{ "test": "each \n on \n new \n line"}');
console.log(JSON.stringify(r));
Optionally, dirty-json
can handle duplicate keys differently from standard JSON.
const dJSON = require('dirty-json');
const r = dJSON.parse('{"key": 1, "key": 2, \'key\': [1, 2, 3]}');
console.log(JSON.stringify(r));
const r = dJSON.parse('{"key": 1, "key": 2, \'key\': [1, 2, 3]}', {"duplicateKeys": true});
console.log(JSON.stringify(r));
But what about THIS ambiguous example?
Since dirty-json
is handling malformed JSON, it will not always produce the result that you "think" it should. That's why you should only use this when you absolutely need it. Malformed JSON is malformed for a reason.
How does it work?
Currently dirty-json
uses a lexer powered by lex and a hand-written LR(1)
parser. It shouldn't be used in any environment that requires reliable or fast results.
Security concerns
This package makes heavy use of regular expressions in its lexer. As a result, it may be vulnerable to a REDOS attack. Versions prior to 0.5.1
and after 0.0.5
were definitely vulnerable (thanks to Jamie Davis for pointing this out). I believe version 0.5.1
and later are safe, but since I do not know of any tool to verify a RegEx, I can't prove it.
Acknowledgements
Thanks to user Moai- and 0x0a0dfor fixing array prototype leakage.
License
Copyright 2020, 2018, 2016, 2015, 2014 Ryan Marcus
dirty-json is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
dirty-json is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with dirty-json. If not, see http://www.gnu.org/licenses/.