
web-tree-sitter
web-tree-sitter is a JavaScript library that provides a way to parse and analyze code using the Tree-sitter parsing library. It is designed to work in web environments and allows developers to build syntax trees for various programming languages, enabling tasks such as syntax highlighting, code navigation, and static analysis.
Initialize Parser
This code demonstrates how to initialize the Tree-sitter parser with a specific language (JavaScript in this case). It loads the language WASM file and sets it to the parser.
const { Parser, Language } = require('web-tree-sitter');
async function initializeParser() {
  // Load the core tree-sitter.wasm runtime before creating any Parser.
  await Parser.init();
  const parser = new Parser();
  // Load the compiled JavaScript grammar and attach it to the parser.
  const Lang = await Language.load('tree-sitter-javascript.wasm');
  parser.setLanguage(Lang);
  return parser;
}
initializeParser().then(parser => console.log('Parser initialized'));
Parse Code
This code sample shows how to parse a string of JavaScript code into a syntax tree. The resulting tree can be used for further analysis or manipulation.
const { Parser, Language } = require('web-tree-sitter');
async function parseCode(code) {
  await Parser.init();
  const parser = new Parser();
  const Lang = await Language.load('tree-sitter-javascript.wasm');
  parser.setLanguage(Lang);
  // Parse the source string into a syntax tree.
  const tree = parser.parse(code);
  return tree;
}
parseCode('const x = 42;').then(tree => console.log(tree.rootNode.toString()));
Query Syntax Tree
This example demonstrates how to query a syntax tree using Tree-sitter's query language. It searches for function declarations and captures the function names.
const { Parser, Language, Query } = require('web-tree-sitter');
async function querySyntaxTree(code) {
  await Parser.init();
  const parser = new Parser();
  const Lang = await Language.load('tree-sitter-javascript.wasm');
  parser.setLanguage(Lang);
  const tree = parser.parse(code);
  // Capture the name of every function declaration as @function-name.
  const query = new Query(Lang, '(function_declaration name: (identifier) @function-name)');
  const matches = query.matches(tree.rootNode);
  return matches;
}
querySyntaxTree('function foo() {}').then(matches => console.log(matches));
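The logged matches are nested objects, so it is often handy to flatten them into plain strings. The sketch below assumes each match has a `captures` array whose entries carry a `name` and a `node` with a `text` property. The helper `captureTexts` and the hand-written `sampleMatches` are hypothetical stand-ins, not part of the library.

```javascript
// Collect the source text of every capture with the given name.
// Assumes each match looks like { captures: [{ name, node: { text } }] }.
function captureTexts(matches, captureName) {
  const texts = [];
  for (const match of matches) {
    for (const capture of match.captures) {
      if (capture.name === captureName) texts.push(capture.node.text);
    }
  }
  return texts;
}

// Hand-written stand-in for the result of querying 'function foo() {}'.
const sampleMatches = [
  { captures: [{ name: 'function-name', node: { text: 'foo' } }] },
];
const names = captureTexts(sampleMatches, 'function-name');
```

With real query results, the same helper could be called as `captureTexts(matches, 'function-name')`.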
Esprima is a high-performance, standard-compliant ECMAScript parser written in JavaScript. It is used for parsing JavaScript code into an abstract syntax tree (AST). Compared to web-tree-sitter, Esprima is specifically focused on JavaScript and does not support other languages.
Acorn is a small, fast, JavaScript-based JavaScript parser. It generates an abstract syntax tree (AST) for JavaScript code. Similar to Esprima, Acorn is focused on JavaScript and is known for its performance and modularity. Unlike web-tree-sitter, it does not support multiple languages.
WebAssembly bindings to the Tree-sitter parsing library.
You can download the tree-sitter.js and tree-sitter.wasm files from the latest GitHub release and load them using a standalone script:
<script src="/the/path/to/tree-sitter.js"></script>
<script>
const { Parser } = window.TreeSitter;
Parser.init().then(() => { /* the library is ready */ });
</script>
You can also install the web-tree-sitter module from npm and load it using a system like Webpack:
const { Parser } = require('web-tree-sitter');
Parser.init().then(() => { /* the library is ready */ });
or Vite:
import { Parser } from 'web-tree-sitter';
Parser.init().then(() => { /* the library is ready */ });
With Vite, you also need to make sure your server provides the tree-sitter.wasm file to your public directory. You can do this automatically with a postinstall script in your package.json:
"postinstall": "cp node_modules/web-tree-sitter/tree-sitter.wasm public"
You can also use this module with Deno:
import { Parser } from "npm:web-tree-sitter";
await Parser.init();
// the library is ready
To use the debug version of the library, replace your import of web-tree-sitter with web-tree-sitter/debug:
import { Parser } from 'web-tree-sitter/debug'; // or require('web-tree-sitter/debug')
Parser.init().then(() => { /* the library is ready */ });
This will load the debug versions of the .js and .wasm files, which include debug symbols and assertions.
Note: the tree-sitter.js file on GitHub releases is an ES6 module. If you need a pure CommonJS library, for example for Electron, note that the npm package uses conditional exports to provide both the ES6 and CommonJS modules. If your project is set up correctly and needs CommonJS, your package manager will handle this for you automatically. At the time of writing, we do not host a CommonJS version of the library on GitHub releases, so if you do not use the npm registry, you will have to build the library yourself.
First, create a parser:
const parser = new Parser();
Then assign a language to the parser. Tree-sitter languages are packaged as individual .wasm files (more on this below):
const { Language } = require('web-tree-sitter');
const JavaScript = await Language.load('/path/to/tree-sitter-javascript.wasm');
parser.setLanguage(JavaScript);
Now you can parse source code:
const sourceCode = 'let x = 1; console.log(x);';
const tree = parser.parse(sourceCode);
and inspect the syntax tree.
console.log(tree.rootNode.toString());
// (program
// (lexical_declaration
// (variable_declarator (identifier) (number)))
// (expression_statement
// (call_expression
// (member_expression (identifier) (property_identifier))
// (arguments (identifier)))))
const callExpression = tree.rootNode.child(1).firstChild;
console.log(callExpression);
// { type: 'call_expression',
// startPosition: {row: 0, column: 11},
// endPosition: {row: 0, column: 25},
// startIndex: 11,
// endIndex: 25 }
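To look at more than one node, a tree can be traversed recursively. The following is a minimal sketch that assumes nodes expose a `type` string and a `children` array. The `walk` generator and the hand-built `sample` tree are hypothetical, so the snippet runs without a parser.

```javascript
// Depth-first walk over a node-like object, yielding [depth, type] pairs.
// Assumes every node has a `type` string and a `children` array.
function* walk(node, depth = 0) {
  yield [depth, node.type];
  for (const child of node.children) {
    yield* walk(child, depth + 1);
  }
}

// Hand-built stand-in for a small syntax tree.
const sample = {
  type: 'program',
  children: [
    { type: 'lexical_declaration', children: [] },
    {
      type: 'expression_statement',
      children: [{ type: 'call_expression', children: [] }],
    },
  ],
};

// Indent each node type by its depth to get a readable outline.
const outline = [...walk(sample)].map(([depth, type]) => '  '.repeat(depth) + type);
```

Under the same assumptions, `walk(tree.rootNode)` would visit every node of a parsed tree.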
If your source code changes, you can update the syntax tree. This will take less time than the first parse.
// Replace 'let' with 'const'
const newSourceCode = 'const x = 1; console.log(x);';
tree.edit({
startIndex: 0,
oldEndIndex: 3,
newEndIndex: 5,
startPosition: {row: 0, column: 0},
oldEndPosition: {row: 0, column: 3},
newEndPosition: {row: 0, column: 5},
});
const newTree = parser.parse(newSourceCode, tree);
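Writing the edit object by hand is error-prone, so it can help to derive it from the old text and the replaced range. The sketch below does that with two hypothetical helpers, `pointAt` and `describeEdit`. It assumes offsets and columns are counted in JavaScript string units, which matches the single-line example above but is not verified here for every kind of input.

```javascript
// Convert a character offset into a {row, column} point by counting newlines.
function pointAt(text, index) {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < index; i++) {
    if (text[i] === '\n') {
      row++;
      lineStart = i + 1;
    }
  }
  return { row, column: index - lineStart };
}

// Describe replacing oldText[startIndex, oldEndIndex) with `inserted`.
function describeEdit(oldText, startIndex, oldEndIndex, inserted) {
  const newText = oldText.slice(0, startIndex) + inserted + oldText.slice(oldEndIndex);
  const newEndIndex = startIndex + inserted.length;
  return {
    newText,
    edit: {
      startIndex,
      oldEndIndex,
      newEndIndex,
      startPosition: pointAt(oldText, startIndex),
      oldEndPosition: pointAt(oldText, oldEndIndex),
      newEndPosition: pointAt(newText, newEndIndex),
    },
  };
}

// Replace 'let' with 'const', as in the example above.
const { newText, edit } = describeEdit('let x = 1; console.log(x);', 0, 3, 'const');
```

The resulting `edit` could then be passed to `tree.edit(edit)`, followed by `parser.parse(newText, tree)`.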
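If your text is stored in a data structure other than a single string, you can parse it by supplying a callback to parse instead of a string:
const sourceLines = [
  'let x = 1;',
  'console.log(x);'
];
const tree = parser.parse((index, position) => {
  const line = sourceLines[position.row];
  // Re-append the newline that splitting into lines removed, so the parser
  // advances to the next row; returning undefined signals the end of input.
  if (line !== undefined) return line.slice(position.column) + '\n';
});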
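When the text is held as arbitrary chunks rather than lines, the callback can use the `index` argument instead of the position. The sketch below assumes `index` counts the same units as JavaScript string lengths and that a return value of `undefined` marks the end of input. `makeChunkReader` is a hypothetical helper, not a library function.

```javascript
// Build an input callback over an array of text chunks.
function makeChunkReader(chunks) {
  // Precompute the starting offset of every chunk.
  const starts = [];
  let total = 0;
  for (const chunk of chunks) {
    starts.push(total);
    total += chunk.length;
  }
  // Return the rest of the chunk containing `index`, or undefined past the end.
  return (index) => {
    if (index >= total) return undefined;
    let i = starts.length - 1;
    while (starts[i] > index) i--;
    return chunks[i].slice(index - starts[i]);
  };
}

const readChunk = makeChunkReader(['let x = 1;\n', 'console.', 'log(x);']);
```

Under these assumptions, the reader could be passed directly as `parser.parse(readChunk)`.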
Getting the .wasm language files
There are several options on how to get the .wasm files for the languages you want to parse.
The recommended way is to install the package from npm. For example, to parse JavaScript, you can install the tree-sitter-javascript package:
npm install tree-sitter-javascript
Then you can find the .wasm file in the node_modules/tree-sitter-javascript directory.
You can also download the .wasm files from GitHub releases, so long as the repository uses our reusable workflow to publish them.
For example, you can download the JavaScript .wasm file from the tree-sitter-javascript releases page.
Generating .wasm files
You can also generate the .wasm file for your desired grammar. Shown below is an example of how to generate the .wasm file for the JavaScript grammar.
IMPORTANT: One of Emscripten, Docker, or Podman needs to be installed.
First install tree-sitter-cli and the tree-sitter language for which to generate the .wasm file (tree-sitter-javascript in this example):
npm install --save-dev tree-sitter-cli tree-sitter-javascript
Then use the tree-sitter CLI to generate the .wasm file:
npx tree-sitter build --wasm node_modules/tree-sitter-javascript
If the build succeeds, the file tree-sitter-javascript.wasm is generated in the current directory.
Note that executing .wasm files in Node.js is considerably slower than running the Node.js bindings. However, this can be useful for testing purposes:
const { Parser, Language } = require('web-tree-sitter');
(async () => {
  await Parser.init();
  const parser = new Parser();
  const Lang = await Language.load('tree-sitter-javascript.wasm');
  parser.setLanguage(Lang);
  const tree = parser.parse('let x = 1;');
  console.log(tree.rootNode.toString());
})();
web-tree-sitter can run in the browser, but there are some common pitfalls.
web-tree-sitter needs to load the tree-sitter.wasm file. By default, it assumes that this file is available in the same path as the JavaScript code. Therefore, if the code is being served from http://localhost:3000/bundle.js, then the wasm file should be at http://localhost:3000/tree-sitter.wasm.
For server-side frameworks like Next.js, this can be tricky, as pages are often served from a path such as http://localhost:3000/_next/static/chunks/pages/index.js. The loader will therefore look for the wasm file at http://localhost:3000/_next/static/chunks/pages/tree-sitter.wasm. The solution is to pass a locateFile function in the moduleOptions argument to Parser.init():
await Parser.init({
locateFile(scriptName: string, scriptDirectory: string) {
return scriptName;
},
});
locateFile takes two parameters: scriptName, i.e. the wasm file name, and scriptDirectory, i.e. the directory where the loader expects the script to be. It returns the path where the loader will look for the wasm file. In the Next.js case, we want to return just the scriptName so that the loader will look at http://localhost:3000/tree-sitter.wasm and not http://localhost:3000/_next/static/chunks/pages/tree-sitter.wasm.
For more information on the module options you can pass in, see the Emscripten documentation.
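The two lookups described above can be modelled with the standard URL class. This is only a sketch of the resolution rules as stated in the text, not the loader's actual code. `defaultWasmUrl` is a hypothetical name, and the page address `http://localhost:3000/` is an assumption.

```javascript
// Default behaviour as described: resolve the wasm file next to the script.
function defaultWasmUrl(scriptUrl, scriptName) {
  return new URL(scriptName, scriptUrl).href;
}

// The locateFile override from the text: return only the file name.
function locateFile(scriptName, scriptDirectory) {
  return scriptName;
}

const scriptUrl = 'http://localhost:3000/_next/static/chunks/pages/index.js';
const pageUrl = 'http://localhost:3000/'; // assumed address of the page itself

const byDefault = defaultWasmUrl(scriptUrl, 'tree-sitter.wasm');
// A bare file name is assumed to resolve relative to the page, not the script.
const overridden = new URL(locateFile('tree-sitter.wasm', ''), pageUrl).href;
```

If pages are served from nested routes, a bare file name may resolve under that route, so returning an absolute path such as `'/' + scriptName` is one possible variation.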
Most bundlers will notice that the tree-sitter.js file is attempting to import fs, i.e. Node's file system library. Since this doesn't exist in the browser, the bundlers will get confused. For Webpack, you can fix this by adding the following to your webpack config:
{
  resolve: {
    fallback: {
      fs: false
    }
  }
}
FAQs
Tree-sitter bindings for the web
The npm package web-tree-sitter receives a total of 522,981 weekly downloads. As such, web-tree-sitter popularity was classified as popular.
We found that web-tree-sitter demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 8 open source maintainers collaborating on the project.