Security News
JSR Working Group Kicks Off with Ambitious Roadmap and Plans for Open Governance
At its inaugural meeting, the JSR Working Group outlined plans for an open governance model and a roadmap to enhance JavaScript package management.
@digitak/grubber
Advanced tools
Grubber is a lightweight and friendly utility to parse code with regular expressions in a 100% safe way - without having to use an AST 🐛
In a higher level, Grubber also exposes helper functions to parse the dependencies of a file in many languages (Javascript, Typescript, Css, Scss, Python, Rust, C / C++, Nim, ...).
The problem with parsing a source file with regular expressions is that you cannot be sure your match is not commented or inside a string.
For example, let's say you are looking for all const
statements in a Javascript file - you would use a regular expression similar to:
/\bconst\s+/g
But what if the file you want to parse is something like:
const x = 12;
// const y = 13;
let z = "const ";
Then you would match three const
when only one should be matched.
Grubber understands what is a string, what is a comment and what is code so that you can overcome the issue very easily:
import { grub } from "@digitak/grubber";
const content = `
const x = 12
// const y = 13
let z = "const "
`;
const results = grub(content).find(/\bconst\s+/);
console.log(results.length); // will print 1 as expected
For the sake of the demonstration we used a simple regex, but remember that Ecmascript is a tricky language! Effectively finding all
const
statements would require a more refined regex. Ex:foo.const = 12
would be matched. Languages that use semi-colon at the end of every statement or strict indentation are much easier to parse in a 100% safe way.
Use your favorite package manager:
npm install @digitak/grubber
Grubber exports one main function grub
:
export function grub(
source: string,
languageOrRules: LanguageName | Rule[] = "es"
): {
// find one or more expressions and return an array of fragments
find: (...expressions: Array<string | RegExp>) => Fragment[],
// replace one or more expressions and return the patched string
replace: (...fromTos: Array<{
from: string | RegExp,
to: string | RegExp
}>) => string,
// find all dependencies (ex: `imports` in Typescript, `use` in Rust)
findDependencies: () => Fragment[],
// replace all dependencies by the given value
// you can use special replace patterns like "$1" to replace
// with the first captured group
replaceDependencies: (to: string) => string,
}
The find
and findDependencies
methods both return an array of fragments:
export type Fragment = {
slice: string // the matched substring
start: number // start of the matched substring
end: number // end of the matched substring
groups: string[] = [] // the captured groups
};
You can use any of the preset languages:
export type LanguageName =
| 'es' // Ecmascript (Javascript / Typescript / Haxe): the default
| 'rs' // Rust
| 'css'
| 'scss'
| 'sass'
| 'c'
| 'cpp'
| 'py' // Python
| 'nim'
;
Example:
// find all semi-colons inside the rust source code
grub(rustCodeToParse, "rs").find(";");
You may define custom rules for the grubber parser, ie. what should be ignored an treated as "not code".
A Rule
has the following type:
export type Rule =
| {
expression: string | RegExp // the expression to ignore
// if returns false, the match is ignored
onExpressionMatch?: (match: RegExpMatchArray) => boolean | void
}
| {
startAt: string | RegExp // start of the expression to ignore
stopAt: string | RegExp // stop of the expression to ignore
// if returns false, the match is ignored
onStartMatch?: (match: RegExpMatchArray) => boolean | void
onStopMatch?: (match: RegExpMatchArray) => boolean | void
}
;
For example, the rules used for the C
language are:
const rules: Rule[] = [
{
// string
expression: /".*?[^\\](?:\\\\)*"/,
},
{
// single line comment
expression: /\/\/.*/,
},
{
// multiline comment
expression: /\/\*((?:.|\s)*?)\*\//,
},
];
Rules are quite simple for most languages but get complicated for Ecmascript because of the
${...}
syntax. Hopefully the job is already done for you!
🌿 🐛 🌿
FAQs
Parse code files and patch it without having to use an AST
We found that @digitak/grubber demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
At its inaugural meeting, the JSR Working Group outlined plans for an open governance model and a roadmap to enhance JavaScript package management.
Security News
Research
An advanced npm supply chain attack is leveraging Ethereum smart contracts for decentralized, persistent malware control, evading traditional defenses.
Security News
Research
Attackers are impersonating Sindre Sorhus on npm with a fake 'chalk-node' package containing a malicious backdoor to compromise developers' projects.