ast-generator

Helper to generate a TypeScript or JavaScript module for an arbitrary AST definition from a specification.

0.6.0

Version published: 7 months ago

Maintainers: 1

Created: 4 years ago

AST generator is a command-line tool that helps you generate TypeScript code for arbitrary ASTs.

flowchart LR
    A["ast.grammar"] --> B["Run generate-ast"]
    B --> C["generated-ast.ts"]

    A@{ shape: doc }
    B@{ shape: proc }
    C@{ shape: doc }

It’s recommended to create the following standard file structure:

mylang/
  ast.grammar        // The input grammar
  generated-ast.ts   // The generated TypeScript module
  index.ts           // You define semantics here

Example grammar

Let’s define an example AST for a simple drawing program.

The following grammar definition (in a file called ast.grammar) describes three nodes (Document, Circle, Rect), and one union (Shape), with various properties.

// In ast.grammar

Document {
  version?: 1 | 2
  shapes: Shape*
}

Shape =
  | Circle
  | Rect

Circle {
  cx: number
  cy: number
  r: number
}

Rect {
  x: number
  y: number
  width: number
  height: number
}

What will be generated?

This definition will generate a TypeScript module with the following things in it.

Types for nodes and unions

export type Node = Document | Circle | Rect

export type Document = {
  type: "Document"
  version: 1 | 2 | null
  shapes: Shape[]
}

export type Shape = Circle | Rect

export type Circle = {
  type: "Circle"
  cx: number
  cy: number
  r: number
}

export type Rect = {
  type: "Rect"
  x: number
  y: number
  width: number
  height: number
}

Constructors for nodes

Each node will get a lowercased function to construct the associated node type.

export function document(version: 1 | 2 | null, shapes: Shape[]): Document {}
export function circle(cx: number, cy: number, r: number): Circle {}
export function rect(x: number, y: number, width: number, height: number): Rect {}

Note: There is no constructor for a "shape". A shape is either a circle or a rect.
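The function bodies are elided in the signatures above. As a rough sketch of what such a constructor might do (an assumption; the code the tool actually generates may differ, for example by validating its arguments), it can return a plain object tagged with the discriminator field:

```typescript
// Hypothetical sketch of a generated constructor; not the tool's actual output.
export type Circle = {
  type: "Circle"
  cx: number
  cy: number
  r: number
}

export function circle(cx: number, cy: number, r: number): Circle {
  // The "type" field is the discriminator that identifies the node at runtime.
  return { type: "Circle", cx, cy, r }
}

const c = circle(10, 10, 5)
console.log(c.type) // "Circle"
console.log(c.r) // 5
```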

Predicates for nodes and unions

// Predicates for all nodes
export function isDocument(value: unknown): value is Document {}
export function isCircle(value: unknown): value is Circle {}
export function isRect(value: unknown): value is Rect {}

// Predicates for all unions
export function isNode(value: unknown): value is Node {}
export function isShape(value: unknown): value is Shape {}
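The predicate bodies are elided as well. One plausible way to implement them, shown here as a self-contained sketch and not as the tool's real output, is to inspect the discriminator field. The helper discriminatorOf is a hypothetical name introduced for this example.

```typescript
// Hypothetical sketch; the real generated predicates may be implemented differently.
export type Circle = { type: "Circle"; cx: number; cy: number; r: number }
export type Rect = { type: "Rect"; x: number; y: number; width: number; height: number }
export type Shape = Circle | Rect

// Reads the discriminator of an unknown value, or undefined if it has none.
function discriminatorOf(value: unknown): unknown {
  return typeof value === "object" && value !== null && "type" in value
    ? (value as { type: unknown }).type
    : undefined
}

export function isCircle(value: unknown): value is Circle {
  return discriminatorOf(value) === "Circle"
}

export function isRect(value: unknown): value is Rect {
  return discriminatorOf(value) === "Rect"
}

// A union predicate accepts any member of the union.
export function isShape(value: unknown): value is Shape {
  return isCircle(value) || isRect(value)
}

const shapes: Shape[] = [
  { type: "Circle", cx: 0, cy: 0, r: 1 },
  { type: "Rect", x: 0, y: 0, width: 2, height: 3 },
]

// Type predicates also work as filter callbacks: the result is typed Circle[].
const circles = shapes.filter(isCircle)
console.log(circles.length) // 1
```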

Usage

This definition will generate a TypeScript module you can use as follows in your index.ts:

import type { Document, Shape, Rect, Circle } from "./generated-ast"
import { document, rect, circle } from "./generated-ast"
import { isShape } from "./generated-ast"

Another way to import is with a namespace import (import * as):

import * as G from "./generated-ast"

A full example:

import * as G from "./generated-ast"

const mydoc = G.document(1, [
  G.circle(10, 10, 5),
  G.rect(0, 0, 10, 10),
  G.circle(20, 20, 10),
])

console.log(mydoc.shapes[0].type) // "Circle"
const first = mydoc.shapes[0]
if (G.isCircle(first)) {
  console.log(first.cx) // 10 (cx is only accessible once narrowed to Circle)
}

console.log(G.isShape(mydoc)) // false
console.log(G.isShape(mydoc.shapes[0])) // true
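Because every node carries the discriminator field, a switch on type narrows a Shape to its concrete node type, which makes node-specific fields such as r or width accessible. The sketch below declares the types inline so that it runs without the generated module; describeShape is a hypothetical helper, not part of the generated output.

```typescript
// Types declared inline to keep the sketch self-contained.
export type Circle = { type: "Circle"; cx: number; cy: number; r: number }
export type Rect = { type: "Rect"; x: number; y: number; width: number; height: number }
export type Shape = Circle | Rect

// Hypothetical helper: each case sees the narrowed node type.
export function describeShape(shape: Shape): string {
  switch (shape.type) {
    case "Circle":
      return `circle of radius ${shape.r} at (${shape.cx}, ${shape.cy})`
    case "Rect":
      return `${shape.width}x${shape.height} rect at (${shape.x}, ${shape.y})`
  }
}

console.log(describeShape({ type: "Circle", cx: 10, cy: 10, r: 5 }))
// "circle of radius 5 at (10, 10)"
```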

Settings

To change the default discriminator field on all nodes:

// In ast.grammar
settings {
  discriminator = "_kind"
}

This would produce the node types as:

export type Document = {
  _kind: "Document" // 👈
  version: 1 | 2 | null
  shapes: Shape[]
}

export type Circle = {
  _kind: "Circle" // 👈
  cx: number
  cy: number
  r: number
}

You can use the following settings to configure the generated output:

| Setting | Default Value | Description |
|---|---|---|
| `output` | `"generated-ast.ts"` | Where to write the generated output to (relative to the grammar file) |
| `discriminator` | `"type"` | The discriminating field added to every node to identify its node type |

Assigning semantic meaning to nodes

An abstract syntax tree represents something you want to give meaning to. To do this, you can define custom properties and methods that will be available on every node.

For example:

// In ast.grammar

semantic property area
semantic method prettify()
semantic method check()

Document {
  version?: 1 | 2
  shapes: Shape*
}

// etc

Note: Don't forget to re-run the code generator after changing the grammar.

After this, it will be as if every node (Document, Circle, Rect) has an area property and prettify and check methods.

But what does the area property return? And what do prettify or check do? That’s completely up to you!

Defining a semantic property

In your index.ts, let’s define the area property:

// index.ts
import * as G from "./generated-ast"

declare module "./generated-ast" {
  interface Semantics {
    area: number // 1️⃣
  }
}

// 2️⃣
G.defineProperty("area", {
  Circle: (node) => Math.PI * node.r * node.r,
  Rect: (node) => node.width * node.height,
})

const mydoc = G.document(1, [
  G.circle(10, 10, 5),
  G.rect(0, 0, 10, 10),
  G.circle(20, 20, 10),
])

console.log(mydoc.shapes[0].area) // 78.54
console.log(mydoc.shapes[1].area) // 100
console.log(mydoc.area) // Error: Semantic property 'area' is only partially defined and missing definition for 'Document'

Step 1️⃣ is to augment the Semantics interface. This will make TypeScript understand that every node in the AST will have an area property that will be a number.

Step 2️⃣ is to define how the area property should be computed for each specified node type. The return types will have to match the type you specified in the Semantics augmentation.

Note that in this case, we defined the property partially. An area is not defined on the Document node type. This is a choice. If it makes sense, we could also choose to implement it there, for example, by summing the areas of all the shapes inside it.
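To illustrate that last option, here is a sketch of the summing logic. It is written with plain functions and inline types so that it runs without the generated module; shapeArea and documentArea are hypothetical names. With the generated module you would presumably put the same reduce under a Document key in defineProperty.

```typescript
// Types declared inline to keep the sketch self-contained.
export type Circle = { type: "Circle"; cx: number; cy: number; r: number }
export type Rect = { type: "Rect"; x: number; y: number; width: number; height: number }
export type Shape = Circle | Rect
export type Document = { type: "Document"; version: 1 | 2 | null; shapes: Shape[] }

// Same formulas as the Circle and Rect definitions of the area property.
function shapeArea(shape: Shape): number {
  switch (shape.type) {
    case "Circle":
      return Math.PI * shape.r * shape.r
    case "Rect":
      return shape.width * shape.height
  }
}

// The area of a document is the sum of the areas of its shapes.
function documentArea(doc: Document): number {
  return doc.shapes.reduce((sum, shape) => sum + shapeArea(shape), 0)
}

const doc: Document = {
  type: "Document",
  version: 1,
  shapes: [
    { type: "Rect", x: 0, y: 0, width: 10, height: 10 },
    { type: "Rect", x: 0, y: 0, width: 2, height: 3 },
  ],
}

console.log(documentArea(doc)) // 106
```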

Defining semantic methods

// index.ts
import * as G from "./generated-ast"

declare module "./generated-ast" {
  interface Semantics {
    area: number

    // 1️⃣ Add these
    prettify(): string
    check(): void
  }
}

// 2️⃣
G.defineMethod("prettify", {
  Node: (node) => JSON.stringify(node, null, 2),
})

// 2️⃣
G.defineMethod("check", {
  Circle: (node) => {
    if (node.r < 0) {
      throw new Error("Radius must be positive")
    }
  },

  Rect: (node) => {
    if (node.width < 0 || node.height < 0) {
      throw new Error("Width and height must be positive")
    }
  },
})

const mydoc = G.document(1, [
  G.circle(10, 10, 5),
  G.rect(0, 0, 10, 10),
  G.circle(20, 20, 10),
])

console.log(mydoc.shapes[0].prettify()) // the first circle, pretty-printed as JSON
mydoc.shapes[0].check() // passes: the radius is not negative

Should I use a property or method?

It depends on what you want. Both are lazily evaluated, but a property is evaluated at most once per node and then cached, whereas a method is re-evaluated every time you call it.

To clarify the difference, suppose you define random twice with the same implementation: once as a semantic method and once as a semantic property.

G.defineMethod("random", {
  Node: (node) => Math.random(),
})

mynode.random() // 0.168729
mynode.random() // 0.782916

Versus:

G.defineProperty("random", {
  Node: (node) => Math.random(),
})

mynode.random // 0.437826
mynode.random // 0.437826 (cached!)
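The "at most once per node" behavior can be pictured as memoization keyed by node. The sketch below is only an illustration of that idea, not the library's implementation; lazyCached is a hypothetical helper, and a WeakMap is one assumed way to hold the cache.

```typescript
// Hypothetical illustration of per-node caching; not the library's implementation.
function lazyCached<T extends object, R>(compute: (node: T) => R): (node: T) => R {
  const cache = new WeakMap<T, R>()
  return (node) => {
    // Compute on first access only; later accesses reuse the stored value.
    if (!cache.has(node)) {
      cache.set(node, compute(node))
    }
    return cache.get(node) as R
  }
}

const node = { type: "Circle", cx: 0, cy: 0, r: 2 }

// Property-like: the computation runs once per node.
let propertyCalls = 0
const areaProperty = lazyCached((n: typeof node) => {
  propertyCalls++
  return Math.PI * n.r * n.r
})
areaProperty(node)
areaProperty(node)
console.log(propertyCalls) // 1

// Method-like: the computation runs on every call.
let methodCalls = 0
const areaMethod = (n: typeof node) => {
  methodCalls++
  return Math.PI * n.r * n.r
}
areaMethod(node)
areaMethod(node)
console.log(methodCalls) // 2
```

Caching is a good fit when the result depends only on the node itself. For anything that should reflect changing outside state, a method is the safer choice.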

Cross-calling

Both methods and properties can use other semantic properties or methods in their definitions, which makes them very powerful. As long as the definitions do not call each other in an infinite loop, you're free to write them however you like.

For example, in the definition of check, we could choose to rely on the area property:

G.defineMethod("check", {
  Circle: (node) => {
    if (node.area < 0) {
      throw new Error("Area must be positive")
    }
  },
  Rect: (node) => {
    if (node.area < 0) {
      throw new Error("Area must be positive")
    }
  },
})

Partial or exhaustive?

When authoring semantic properties or methods, you can choose to define them partially (e.g. not all node types necessarily have an area) or to define them exhaustively (e.g. all nodes should have a prettify() output defined). This depends on the use case.

When defining the semantics, you can pick between:

defineProperty() allows partial definitions
definePropertyExhaustively() will require a definition for every node type

The benefit of using definePropertyExhaustively is that if you add a new node to the grammar, TypeScript will help you remember to also define the semantics for it.

Similarly:

defineMethod()
defineMethodExhaustively()
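How TypeScript can enforce the difference may be sketched with a mapped type over the discriminator values. This is an assumption about the general technique, not a description of the library's internals; Handlers, dispatchExhaustively, and dispatchPartially are hypothetical names used only in this example.

```typescript
// Types declared inline to keep the sketch self-contained.
export type Circle = { type: "Circle"; cx: number; cy: number; r: number }
export type Rect = { type: "Rect"; x: number; y: number; width: number; height: number }
export type Document = { type: "Document"; version: 1 | 2 | null; shapes: (Circle | Rect)[] }
export type Node = Document | Circle | Rect

// One handler per node type, keyed by the discriminator value.
type Handlers<R> = {
  [K in Node["type"]]: (node: Extract<Node, { type: K }>) => R
}

// Exhaustive flavor: every key is required, so a missing one is a compile-time error.
function dispatchExhaustively<R>(handlers: Handlers<R>, node: Node): R {
  const handler = handlers[node.type] as (n: Node) => R
  return handler(node)
}

// Partial flavor: keys are optional, so a missing one only fails at runtime.
function dispatchPartially<R>(handlers: Partial<Handlers<R>>, node: Node): R {
  const handler = handlers[node.type] as ((n: Node) => R) | undefined
  if (handler === undefined) {
    throw new Error(`No definition for '${node.type}'`)
  }
  return handler(node)
}

const circleNode: Node = { type: "Circle", cx: 0, cy: 0, r: 1 }

const names: Handlers<string> = {
  Document: () => "doc",
  Circle: (n) => `circle r=${n.r}`,
  Rect: (n) => `rect ${n.width}x${n.height}`,
}
console.log(dispatchExhaustively(names, circleNode)) // "circle r=1"

const onlyRect: Partial<Handlers<string>> = { Rect: () => "rect" }
let message = ""
try {
  dispatchPartially(onlyRect, circleNode)
} catch (error) {
  message = (error as Error).message
}
console.log(message) // "No definition for 'Circle'"
```

With the required-keys shape, adding a new node type to Node makes every existing Handlers object fail to compile until the new key is supplied, which is the reminder effect described above.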

[0.6.0] - 2025-01-31

Breaking: Put settings in a new settings block, i.e.
```
settings {
  output = "../gen-here-plz.ts"
}
```
instead of the old (no longer supported):
```
set output "../gen-here-plz.ts"
```
The settings understood for now are discriminator and output. More settings will be added later.
Generate to generated-ast.ts by default, but allow specifying it through output = "../somewhere-else.ts"
No longer support passing output file as a CLI argument

Package last updated on 31 Jan 2025
