Tippex
Erase comments, strings and regular expressions from JavaScript code.
Why?
Say you want to do some very simple code analysis, such as finding import and export statements. You could just skim over the code with a regex, but you'll get bad results if matches exist inside comments or strings:
import a from './a.js';
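To make the problem concrete, here is a small sketch of the naive approach. The second input line (a commented-out import) and the simplified pattern are borrowed from the usage example further down; the variable names are just for illustration.

```javascript
// One real import and one commented-out import.
const source = [
  "import a from './a.js';",
  "// import b from './b.js'; TODO do we need this?"
].join('\n');

// A deliberately simple pattern, for illustration only.
const importPattern = /import (.+?) from '([^']+)'/g;

// Collect the imported names with no awareness of comments or strings.
const naiveMatches = Array.from(source.matchAll(importPattern), m => m[1]);

// The commented-out 'b' is reported alongside the real 'a'.
console.log(naiveMatches);
```

The regex has no way of knowing that the second match sits inside a comment, which is exactly the kind of false positive described above.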
Instead, you might generate an abstract syntax tree with a parser like Acorn, and traverse the AST looking for nodes of a specific type. But for a lot of simple tasks that's overkill – parsing is expensive, traversing is a lot less simple than using regular expressions, and if you're doing anything in the browser it's better to avoid large dependencies.
Tippex offers some middle ground. It's as robust as a full-fledged parser, but minuscule – and an order of magnitude faster. (Americans: Tippex is what you oddballs call 'Liquid Paper' or 'Wite-Out'.)
What does it do?
Tippex simply replaces comments with equivalent whitespace, and removes the contents of strings (including ES6 template strings) and regular expressions.
So this...
var a = 1;
var b = 2;
var c = /\w+/;
var d = 'some text';
var e = "some more text";
var f = `an ${ 'unnecessarily' ? `${'complicated'}` : `${'template'}` } string`;
...becomes this:
var a = 1;
var b = 2;
var c = /   /;
var d = '         ';
var e = "              ";
var f = `   ${ '             ' ? `${'           '}` : `${'        '}` }       `;
Once that's done, you can search for patterns (such as var or =) in complete confidence that you won't get any false positives.
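If you're curious how this kind of erasing can work, here is a heavily simplified sketch. It is not Tippex's implementation: `eraseSketch` is a hypothetical helper that only handles line comments, block comments and plain quoted strings, and it ignores regular expressions and template strings entirely. It also assumes that keeping the output the same length as the input (and keeping newlines inside block comments) is the goal, so that character offsets still line up with the original code.

```javascript
// Simplified sketch: blanks out comments and the contents of plain quoted
// strings. It does NOT handle regex literals or template strings.
function eraseSketch(code) {
  let out = '';
  let i = 0;

  while (i < code.length) {
    const ch = code[i];
    const next = code[i + 1];

    if (ch === '/' && next === '/') {
      // Line comment: blank everything up to (not including) the newline.
      while (i < code.length && code[i] !== '\n') {
        out += ' ';
        i += 1;
      }
    } else if (ch === '/' && next === '*') {
      // Block comment: blank it out, but keep newlines so line numbers hold.
      const close = code.indexOf('*/', i + 2);
      const end = close === -1 ? code.length : close + 2;
      for (; i < end; i += 1) out += code[i] === '\n' ? '\n' : ' ';
    } else if (ch === '"' || ch === "'") {
      // Quoted string: keep the quotes, blank the contents.
      out += ch;
      i += 1;
      while (i < code.length && code[i] !== ch) {
        // An escape sequence occupies two characters.
        const width = code[i] === '\\' && i + 1 < code.length ? 2 : 1;
        out += ' '.repeat(width);
        i += width;
      }
      if (i < code.length) {
        out += ch; // closing quote
        i += 1;
      }
    } else {
      out += ch;
      i += 1;
    }
  }

  return out;
}

const input = "var d = 'some text'; // var e";
const erased = eraseSketch(input);

// Same length as the input, but 'var e' inside the comment is gone.
console.log(JSON.stringify(erased));
console.log(erased.length === input.length);
```

Because the output has the same length as the input, the index of any match found in the erased text points at the same position in the original.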
Installation
npm install --save tippex
...or download from npmcdn.com (UMD version, ES6 exports version).
Usage
import * as tippex from 'tippex';
var erased = tippex.erase( 'var a = 1; // line comment' );
var found = tippex.find( 'var a = 1; // line comment' );
Sometimes you might need to match a regular expression against the original string, but ignoring comments etc. For that you can use tippex.match:
var code = `
import a from './a.js';
// import b from './b.js'; TODO do we need this?
`;
var importPattern = /import (.+?) from '([^']+)'/g;
var importDeclarations = [];
tippex.match( code, importPattern, ( match, name, source ) => {
  importDeclarations.push({ name, source });
});
console.log( importDeclarations );
(A complete regular expression for ES6 imports would be a bit more complicated; this is for illustrative purposes.)
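One way to think about this kind of matching is "run the regex against the original string, then throw away any match that starts inside a comment or string". The sketch below illustrates that idea under stated assumptions. `matchOutside` is a hypothetical helper, not part of Tippex, and the `{ start, end }` shape of the ranges is assumed for the example rather than taken from Tippex's API.

```javascript
// Hypothetical helper: call `callback` for each match of `pattern` in `code`
// whose start index does not fall inside any of the `ranges` to ignore.
// `ranges` is assumed to be a list of { start, end } offsets (end exclusive).
function matchOutside(code, pattern, ranges, callback) {
  // Work on a global copy so exec() advances and the caller's regex is untouched.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const globalPattern = new RegExp(pattern.source, flags);

  let m;
  while ((m = globalPattern.exec(code)) !== null) {
    const ignored = ranges.some(r => m.index >= r.start && m.index < r.end);
    if (!ignored) callback(...m);
    // Guard against infinite loops on zero-length matches.
    if (m[0].length === 0) globalPattern.lastIndex += 1;
  }
}

const code = [
  "import a from './a.js';",
  "// import b from './b.js'; TODO do we need this?"
].join('\n');

// Range covering the line comment, located with indexOf for this example.
const ranges = [{ start: code.indexOf('//'), end: code.length }];

const importDeclarations = [];
matchOutside(code, /import (.+?) from '([^']+)'/g, ranges, (match, name, source) => {
  importDeclarations.push({ name, source });
});

// Only the real import survives; the commented-out one is skipped.
console.log(importDeclarations);
```

The callback receives the full match followed by the capture groups, mirroring the callback signature used in the example above.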
To replace occurrences of a pattern that aren't inside strings or comments, use tippex.replace:
code = tippex.replace( code, importPattern, ( match, name, source ) => {
  return `var ${name} = require('${source}')`;
});
Known issues
It's extremely difficult to distinguish between regular expression literals and division operators in certain edge cases at the lexical level. Fortunately, these cases are rare and generally somewhat contrived. If you encounter one in the wild, please raise an issue so we can try to accommodate it.
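As a rough illustration of why this is hard, the same characters can be division in one place and a regular expression in another, depending only on what came before them. The variable names here are made up for the example.

```javascript
const a = 12, b = 3, g = 2;

// Here both slashes are division operators: (12 / 3) / 2.
const quotient = a / b / g;

// Here the same characters `/ b /g` form a regex literal with the `g` flag,
// because a `/` at the start of an expression cannot be division.
const pattern = / b /g;

console.log(quotient);
console.log(pattern instanceof RegExp);
```

A tool that works without building a full syntax tree has to infer which case applies from the preceding tokens, and that inference is where the contrived edge cases come from.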
License
MIT
Follow @Rich_Harris on Twitter for more artisanal, hand-crafted JavaScript.