
Security News
Dutch National Police Disrupt Redline and Meta Malware Operations
Dutch National Police and FBI dismantle Redline and Meta infostealer malware-as-a-service operations in Operation Magnus, seizing servers and source code.
tree-sitter
Tree-sitter is a parser generator tool and an incremental parsing library. It is used to build parsers for programming languages and to parse code into syntax trees. Tree-sitter is designed to be fast and efficient, making it suitable for real-time applications like code editors.
Parsing Code
This feature allows you to parse source code into a syntax tree. In this example, we parse a simple JavaScript code snippet and print the resulting syntax tree.
const Parser = require('tree-sitter');
const JavaScript = require('tree-sitter-javascript');
const parser = new Parser();
parser.setLanguage(JavaScript);
const sourceCode = 'const x = 1 + 2;';
const tree = parser.parse(sourceCode);
console.log(tree.rootNode.toString());
Querying Syntax Trees
This feature allows you to query syntax trees using a pattern-matching language. In this example, we query the syntax tree for binary expressions with an identifier on the left and a number on the right.
const Parser = require('tree-sitter');
const JavaScript = require('tree-sitter-javascript');
// Query is exported by the tree-sitter package itself.
const { Query } = Parser;
const parser = new Parser();
parser.setLanguage(JavaScript);
// The left operand must be an identifier for the pattern below to match.
const sourceCode = 'const y = x + 2;';
const tree = parser.parse(sourceCode);
const q = new Query(JavaScript, '(binary_expression left: (identifier) @left right: (number) @right)');
const matches = q.matches(tree.rootNode);
console.log(matches);
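The raw match objects are verbose when logged. As a minimal sketch, assuming the query uses named captures (for example `@left` and `@right`) and that each match has the shape `{pattern, captures: [{name, node}]}` with a `text` property on every node, the helper below reduces matches to plain objects. `summarizeMatches` and `mockMatches` are hypothetical names for illustration; the mock stands in for the result of a real query so the snippet runs without the native module.

```javascript
// Hypothetical helper: map each match to {captureName: capturedText}.
function summarizeMatches(matches) {
  return matches.map((match) => {
    const summary = {};
    for (const {name, node} of match.captures) {
      summary[name] = node.text;
    }
    return summary;
  });
}

// Stand-in data with the assumed match shape (not produced by a real parse).
const mockMatches = [
  {
    pattern: 0,
    captures: [
      {name: 'left', node: {type: 'identifier', text: 'x'}},
      {name: 'right', node: {type: 'number', text: '2'}},
    ],
  },
];

console.log(summarizeMatches(mockMatches));
```

With a real tree, the same helper could be called on the array returned by the query's `matches` method.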
Incremental Parsing
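This feature allows you to incrementally parse code, which is useful for real-time applications like code editors. In this example, we first parse a JavaScript code snippet, then describe the change to the old tree with tree.edit and parse the modified code, passing the old tree so that unchanged parts can be reused.
const Parser = require('tree-sitter');
const JavaScript = require('tree-sitter-javascript');
const parser = new Parser();
parser.setLanguage(JavaScript);
let sourceCode = 'const x = 1 + 2;';
let tree = parser.parse(sourceCode);
// Tell the old tree what changed: ' + 3' is inserted at index 15, before the ';'.
tree.edit({
  startIndex: 15, oldEndIndex: 15, newEndIndex: 19,
  startPosition: {row: 0, column: 15},
  oldEndPosition: {row: 0, column: 15},
  newEndPosition: {row: 0, column: 19},
});
sourceCode = 'const x = 1 + 2 + 3;';
tree = parser.parse(sourceCode, tree);
console.log(tree.rootNode.toString());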
Esprima is a high-performance, standard-compliant ECMAScript parser. It parses JavaScript code into an abstract syntax tree (AST). Compared to Tree-sitter, Esprima is specifically focused on JavaScript and does not support incremental parsing.
Acorn is a small, fast, JavaScript-based JavaScript parser. It generates an abstract syntax tree (AST) and is known for its performance and modularity. Unlike Tree-sitter, Acorn is limited to JavaScript and does not support incremental parsing.
Incremental parsers for node
npm install tree-sitter
First, you'll need a Tree-sitter grammar for the language you want to parse. There are many existing grammars such as tree-sitter-javascript and tree-sitter-go. You can also develop a new grammar using the Tree-sitter CLI.
Once you've got your grammar, create a parser with that grammar.
const Parser = require('tree-sitter');
const JavaScript = require('tree-sitter-javascript');
const parser = new Parser();
parser.setLanguage(JavaScript);
Then you can parse some source code,
const sourceCode = 'let x = 1; console.log(x);';
const tree = parser.parse(sourceCode);
and inspect the syntax tree.
console.log(tree.rootNode.toString());
// (program
//   (lexical_declaration
//     (variable_declarator (identifier) (number)))
//   (expression_statement
//     (call_expression
//       (member_expression (identifier) (property_identifier))
//       (arguments (identifier)))))
const callExpression = tree.rootNode.child(1).firstChild;
console.log(callExpression);
// { type: 'call_expression',
//   startPosition: {row: 0, column: 11},
//   endPosition: {row: 0, column: 25},
//   startIndex: 11,
//   endIndex: 25 }
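For larger trees, an indented outline can be easier to read than the single-line S-expression. The sketch below relies only on the `type` and `namedChildren` properties of a node; `outline`, `leaf`, and `mockRoot` are hypothetical names, and the mock object stands in for `tree.rootNode` so the snippet runs on its own.

```javascript
// Hypothetical helper: return one line per named node, indented by depth.
function outline(node, depth = 0) {
  const lines = ['  '.repeat(depth) + node.type];
  for (const child of node.namedChildren) {
    lines.push(...outline(child, depth + 1));
  }
  return lines;
}

// Stand-in for a parsed tree (assumption: same `type`/`namedChildren` shape).
const leaf = (type) => ({type, namedChildren: []});
const mockRoot = {
  type: 'program',
  namedChildren: [
    {
      type: 'lexical_declaration',
      namedChildren: [
        {type: 'variable_declarator', namedChildren: [leaf('identifier'), leaf('number')]},
      ],
    },
  ],
};

console.log(outline(mockRoot).join('\n'));
```

With a real parse, `outline(tree.rootNode)` would be the equivalent call.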
If your source code changes, you can update the syntax tree. This will take less time than the first parse.
// Replace 'let' with 'const'
const newSourceCode = 'const x = 1; console.log(x);';
tree.edit({
  startIndex: 0,
  oldEndIndex: 3,
  newEndIndex: 5,
  startPosition: {row: 0, column: 0},
  oldEndPosition: {row: 0, column: 3},
  newEndPosition: {row: 0, column: 5},
});
// Pass the edited old tree so unchanged nodes can be reused.
const newTree = parser.parse(newSourceCode, tree);
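Writing the edit object by hand is error-prone. One possible approach, sketched below, is to derive it by comparing the old and new text. `positionAt` and `computeEdit` are hypothetical helpers, not part of tree-sitter, and the sketch assumes that indices and columns are counted in JavaScript string units.

```javascript
// Hypothetical helper: convert a string index to a {row, column} position.
function positionAt(text, index) {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < index; i++) {
    if (text[i] === '\n') {
      row++;
      lineStart = i + 1;
    }
  }
  return {row, column: index - lineStart};
}

// Hypothetical helper: build an edit description from two versions of the text.
function computeEdit(oldText, newText) {
  // Skip the common prefix.
  let start = 0;
  const maxStart = Math.min(oldText.length, newText.length);
  while (start < maxStart && oldText[start] === newText[start]) start++;

  // Skip the common suffix without overlapping the prefix.
  let oldEnd = oldText.length;
  let newEnd = newText.length;
  while (oldEnd > start && newEnd > start && oldText[oldEnd - 1] === newText[newEnd - 1]) {
    oldEnd--;
    newEnd--;
  }

  return {
    startIndex: start,
    oldEndIndex: oldEnd,
    newEndIndex: newEnd,
    startPosition: positionAt(oldText, start),
    oldEndPosition: positionAt(oldText, oldEnd),
    newEndPosition: positionAt(newText, newEnd),
  };
}

const edit = computeEdit('let x = 1; console.log(x);', 'const x = 1; console.log(x);');
console.log(edit);
```

For the 'let' to 'const' change, this reports the range 0 to 2 in the old text and 0 to 4 in the new text, because the trailing 't' is shared by both keywords. That is a narrower but equally valid description of the same change. The result would be passed to `tree.edit` before re-parsing.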
If you have source code stored in a superstring TextBuffer, you can parse that source code on a background thread with a Promise-based interface:
const {TextBuffer} = require('superstring');
async function test() {
  const buffer = new TextBuffer('const x= 1; console.log(x);');
  const newTree = await parser.parseTextBuffer(buffer, oldTree);
}
Using a background thread can introduce a slight delay, so you may want to allow some work to be done on the main thread, in the hopes that parsing will complete so quickly that you won't even need a background thread:
async function test2() {
  const buffer = new TextBuffer('const x= 1; console.log(x);');
  const newTree = await parser.parseTextBuffer(buffer, oldTree, {
    syncOperationCount: 1000
  });
}
FAQs
The npm package tree-sitter receives a total of 420,847 weekly downloads. As such, tree-sitter is classified as a popular package.
We found that tree-sitter demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 9 open source maintainers collaborating on the project.