@anduril-code/compact.md

Token-efficient, context-aware compression for agent pipelines.

Version: 0.1.6

Version published: 4 weeks ago

Maintainers: 1

Created: 4 weeks ago

compact.md

Token-efficient Markdown compression and document intelligence for agent pipelines.

Why compact.md

Markdown has become the lingua franca of AI agents, but it wastes 30–50% of tokens on formatting syntax: table borders, heading markers, repetitive delimiters, whitespace padding. Every token spent on structure is a token not spent on content.

compact.md gives agents a spectrum of strategies for fitting more useful content into a context window:

Lossless compression — compact()/expand() deterministically encode and decode Markdown with zero information loss. expand(compact(md)) === md, always.
Targeted extraction — pull out only the sections an agent needs, with optional truncation limits.
AI summarization — abstractive LLM summaries (~200 tokens by default) for breadth-first exploration of large docs, with results cached so repeated calls are free.

The library and CLI expose the lossless path, along with lossy one-way helpers for diffs and logs. The MCP server exposes all three strategies.

Features

Lossless round-trip — expand(compact(md)) === md, always, verified by property tests
30–50% token reduction on typical agent documents (lossless path)
Zero runtime dependencies for the core encode/decode path
Library + CLI + MCP server — one package, three interfaces
Stage-based pipeline — structural, whitespace, dedup, and semantic stages, each independently toggleable
Readable without expansion — compact format is parseable by LLMs even before expanding
Section navigation — list document structure with per-section token counts before loading any content
Targeted extraction — retrieve specific sections verbatim with character/row/item truncation limits
AI summarization — LLM-powered abstractive summaries with docType-aware prompts and in-process caching

Installation

npm install compact.md
# or
bun add compact.md

Quick Start

import { compact, expand, verify } from 'compact.md';

const md = `# Project Status

## Tasks

- [x] Database migration
- [ ] Frontend integration

| Name  | Role    | Status |
|-------|---------|--------|
| Alice | Lead    | Active |
| Bob   | Backend | Active |
`;

const result = compact(md);
console.log(result.output);
// # Project Status
// ## Tasks
// [x] Database migration
// [] Frontend integration
// |: Name, Role, Status
// | Alice, Lead, Active
// | Bob, Backend, Active

const restored = expand(result.output);
// restored === md  ✓

console.log(verify(md)); // true

With options and stats:

const { output, stats } = compact(md, {
  dedup: true,
  semantic: true,
  stats: true,
});

console.log(stats.savings); // e.g. 0.38 (38% fewer tokens)

API Reference

Library

import { compact, compactDiff, expand, pruneLog, verify, createPipeline } from 'compact.md';

`compact(markdown, options?): CompactResult`

Compresses a Markdown string. Returns { output: string, stats? }.

Option	Type	Default	Description
`dedup`	`boolean`	`false`	Enable deduplication stage (dictionary substitution for repeated substrings)
`semantic`	`boolean`	`false`	Enable semantic stage (strip redundant markup, normalize unicode punctuation)
`keepComments`	`boolean`	`false`	Preserve HTML comments (stripped by default)
`onlySections`	`string[]`	—	Keep only the listed heading sections
`stripSections`	`string[]`	—	Remove the listed heading sections
`unwrapLines`	`boolean`	`false`	Join soft-wrapped paragraph lines into a single line
`tableDelimiter`	`string`	`","`	Cell delimiter used in compact table rows
`versionMarker`	`boolean`	`false`	Prepend `%compact.md:1` version header
`stats`	`boolean`	`false`	Compute and return token-saving statistics

`expand(compactText, options?): string`

Expands compact.md format back to standard Markdown.

Option	Type	Default	Description
`tableDelimiter`	`string`	`","`	Cell delimiter used when reading compact table rows

`verify(markdown, options?): boolean`

Returns true if expand(compact(markdown)) === markdown.
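
The round-trip property can also serve as a guard before compact output is handed to an agent. The sketch below is a minimal illustration, not part of the package: `Codec`, `compactOrFallback`, and `toyCodec` are hypothetical names, and the codec is passed in as a parameter so the example runs without the library installed. In real use you would presumably pass the library's own `compact` and `expand`.

```typescript
// Hypothetical helper: only use compact output when it round-trips.
// The codec is injected so this sketch is self-contained.
interface Codec {
  compact(markdown: string): { output: string };
  expand(compactText: string): string;
}

function compactOrFallback(markdown: string, codec: Codec): string {
  const { output } = codec.compact(markdown);
  // The same equality that verify() is documented to check.
  const roundTrips = codec.expand(output) === markdown;
  return roundTrips ? output : markdown;
}

// Toy codec for demonstration only: it handles a single construct,
// the incomplete task-list item, and nothing else.
const toyCodec: Codec = {
  compact: (md) => ({ output: md.replace(/^- \[ \] /gm, "[] ") }),
  expand: (text) => text.replace(/^\[\] /gm, "- [ ] "),
};

const guarded = compactOrFallback("- [ ] Todo\n", toyCodec);
console.log(JSON.stringify(guarded)); // prints "[] Todo\n"
```

Falling back to the original text keeps the caller correct even if an input ever fails the round-trip check, at the cost of losing the token savings for that document.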

`compactDiff(diffText, options?): string`

Compresses unified git diff text (lossy, one-way). Useful for PR review and change analysis.

Option	Type	Default	Description
`context`	`number`	`1`	Context lines to keep around changed lines (`0` strips all context)
`compactHeaders`	`boolean`	`true`	Replace `diff/index/---/+++` header block with `=== path`
`changesOnly`	`boolean`	`false`	Emit only file path + changed lines (`+`/`-`)
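
To make the options above concrete, here is a toy sketch of what "compact headers" combined with "changes only" could look like on a unified diff. It is not the library's implementation; `sketchChangesOnly` is a hypothetical name, and the real output format may differ in detail.

```typescript
// Illustrative only: a toy "compact headers + changes only" pass.
// Known limitation: a removed line whose text starts with "-- " looks
// like a "--- " header here; a real parser would track hunk state.
function sketchChangesOnly(diffText: string): string {
  const out: string[] = [];
  for (const line of diffText.split("\n")) {
    if (line.startsWith("+++ ")) {
      // "+++ b/path" names the new file; emit it as "=== path".
      out.push("=== " + line.slice(4).replace(/^b\//, ""));
    } else if (
      line.startsWith("diff ") ||
      line.startsWith("index ") ||
      line.startsWith("--- ") ||
      line.startsWith("@@")
    ) {
      continue; // header and hunk markers carry no changed content
    } else if (line.startsWith("+") || line.startsWith("-")) {
      out.push(line); // keep added and removed lines verbatim
    }
    // Context lines (leading space) are dropped.
  }
  return out.join("\n");
}

const sampleDiff = [
  "diff --git a/src/app.ts b/src/app.ts",
  "index 1111111..2222222 100644",
  "--- a/src/app.ts",
  "+++ b/src/app.ts",
  "@@ -1,3 +1,3 @@",
  " const a = 1;",
  "-const b = 2;",
  "+const b = 3;",
  " const c = 4;",
].join("\n");

console.log(sketchChangesOnly(sampleDiff));
// === src/app.ts
// -const b = 2;
// +const b = 3;
```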

`pruneLog(logText, options?): LogPruneResult`

Lossy log/terminal output pruning for test, build, and CI output.

Option	Type	Default	Description
`stripAnsi`	`boolean`	`true`	Strip ANSI and terminal control sequences
`foldProgress`	`boolean`	`true`	Fold spinner/progress runs
`stripTimestamps`	`'auto' | 'strip' | 'keep'`	`'auto'`	Timestamp pruning mode
`elidePassingTests`	`boolean`	`true`	Remove passing tests when failures exist
`foldDebugLines`	`boolean`	`true`	Fold debug-level log lines into a summary count
`elideHealthChecks`	`boolean`	`true`	Remove `/health`/`/readyz`-style noise
`foldJsonLines`	`boolean`	`true`	Aggregate JSON-per-line logs by severity
`foldFrameworkStartup`	`boolean`	`true`	Fold startup banner and boot boilerplate
`stripUserAgents`	`boolean`	`true`	Replace long user-agent strings with `<ua>`
`dedupeStackTraces`	`boolean`	`true`	Collapse repeated stack traces in retry loops
`foldRepeatedLines`	`boolean`	`true`	Fold repetitive normalized lines
`foldGlobalRepeats`	`boolean`	`true`	Fold non-consecutive repeated normalized lines
`allowTokenExpansion`	`boolean`	`false`	Keep transformed output even if token count increases
`thresholdTokens`	`number`	—	Optional token gate threshold metadata
`profile`	`'test' | 'ci' | 'lint' | 'runtime'`	—	Preset pruning strategy; can be overridden by explicit options
`customRules`	`LogCustomRule[]`	—	Optional strip/fold/block rules

pruneLog() also accepts an optional tokenCounter ({ count(text): number }) for custom tokenization parity in no-regression decisions.
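
As a rough illustration of the `{ count(text): number }` shape, the sketch below defines a crude word-based counter and a no-regression gate in the spirit of `allowTokenExpansion: false`. Only the counter shape comes from this README; `TokenCounter`, `wordCounter`, and `keepIfNotLarger` are hypothetical names, and how the counter is actually passed to `pruneLog()` is not shown here because the README does not specify it.

```typescript
// The { count(text): number } shape is taken from the README; the
// whitespace heuristic and all names below are assumptions.
interface TokenCounter {
  count(text: string): number;
}

// Crude stand-in for a real tokenizer: counts whitespace-separated words.
const wordCounter: TokenCounter = {
  count: (text) => {
    const trimmed = text.trim();
    return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  },
};

// A no-regression gate: prefer the pruned text only if it is not larger
// than the original under the supplied counter.
function keepIfNotLarger(
  original: string,
  pruned: string,
  counter: TokenCounter,
): string {
  return counter.count(pruned) <= counter.count(original) ? pruned : original;
}
```

Supplying the same tokenizer the downstream model uses would make the keep-or-discard decision match real token costs rather than an approximation.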

`createPipeline(stages): Pipeline`

Assembles a custom pipeline from an ordered array of Stage objects for advanced use cases.

CLI

Install globally or run via npx:

npx compact.md <command> [options]

Command	Description
`compact`	Compress a Markdown file to compact.md format
`changes`	Compress unified diff output for lower token usage
`prune`	Lossy prune of terminal/log output
`expand`	Expand a compact.md file back to Markdown
`extract`	Extract and compress specific sections only
`verify`	Assert lossless round-trip for a file
`metrics`	Report token savings without writing output
`sections`	List the heading sections in a document
`locate`	Search sections by keyword

# Compress
compact.md compact input.md -o output.cmd

# Expand
compact.md expand output.cmd -o restored.md

# Verify round-trip
compact.md verify input.md

# Stats only
compact.md metrics input.md

# Pipe-friendly
cat doc.md | compact.md compact > compressed.cmd
git diff | compact.md changes --changes-only
cat test-output.log | compact.md prune --stats
cat lint.log | compact.md prune --profile lint --stats
cat server.log | compact.md prune --profile runtime

# With options
compact.md compact input.md --dedup --semantic --stats

MCP Server

Add to your MCP client config:

{
  "mcpServers": {
    "compact-md": {
      "command": "npx",
      "args": ["compact-md-mcp"]
    }
  }
}

The MCP server exposes a spectrum of token-reduction strategies. Tools are grouped below by fidelity tier — from lossless to AI-summarized:

Compression (lossless for Markdown; one-way and lossy for diffs and logs)

Tool	Description
`compact_md_compact`	Compress Markdown to compact.md format — fully reversible
`compact_md_expand`	Expand compact.md format back to standard Markdown
`compact_md_verify`	Assert that round-trip is lossless for a given input
`compact_md_metrics`	Report token savings without writing any output
`compact_md_changes`	Compress unified git diff text (one-way, lossy)
`compact_md_prune`	Lossy pruning for logs/terminal output with token gate + optional summarize fallback

Section navigation (start here for unknown documents)

Tool	Description
`compact_md_sections`	List the section TOC with per-section token counts — use this first to budget context before loading content
`compact_md_locate`	Search sections by keyword to find relevant content without reading the whole document
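
The idea behind a section listing with token counts can be sketched in a few lines. This is a toy under stated assumptions, not the tool's implementation: `listSections` and `SectionInfo` are hypothetical names, tokens are estimated as one per four characters, and the real tool's tokenizer and output shape may differ.

```typescript
// Toy section lister: splits on ATX headings and estimates tokens as
// ceil(chars / 4). The heuristic and the names are assumptions.
interface SectionInfo {
  heading: string;
  level: number;
  approxTokens: number;
}

function listSections(markdown: string): SectionInfo[] {
  const fence = "`".repeat(3); // code fence marker
  const found: { heading: string; level: number; chars: number }[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    if (line.startsWith(fence)) inFence = !inFence; // skip "#" inside code
    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      const [, hashes = "", title = ""] = match;
      found.push({ heading: title.trim(), level: hashes.length, chars: 0 });
    } else {
      const last = found[found.length - 1];
      if (last) last.chars += line.length + 1; // +1 for the newline
    }
  }
  return found.map(({ heading, level, chars }) => ({
    heading,
    level,
    approxTokens: Math.ceil(chars / 4),
  }));
}
```

An agent could sum `approxTokens` across the result to decide whether to read the document directly or ask for a single section.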

Targeted extraction (verbatim content, optionally truncated)

Tool	Description
`compact_md_extract`	Retrieve exact section content, with optional `maxChars` / `maxListItems` / `maxTableRows` truncation

AI summarization (lossy, cached, higher token reduction)

Tool	Description
`compact_md_summarize`	Abstractive LLM summary (~200 tokens by default). Supports `docType`: `auto` | `guide` | `reference` | `spec`. Results are cached, so repeated calls on unchanged files are instant.
`compact_md_batch`	Summarize multiple files in parallel in a single round-trip. Ideal for repo onboarding.
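
The README only states that summaries are cached; it does not describe how. One plausible shape is an in-process map keyed by document type and content, sketched below. `Summarizer` and `withSummaryCache` are hypothetical names, and the key scheme is an assumption rather than the server's actual behavior.

```typescript
// Hypothetical in-process cache keyed by docType + content.
// The key scheme and every name here are assumptions for illustration.
type Summarizer = (content: string, docType: string) => Promise<string>;

function withSummaryCache(summarize: Summarizer): Summarizer {
  const cache = new Map<string, Promise<string>>();
  return (content, docType) => {
    const key = docType + "\u0000" + content;
    let pending = cache.get(key);
    if (!pending) {
      pending = summarize(content, docType);
      cache.set(key, pending);
      // Failed calls are evicted so a later call can retry.
      pending.catch(() => cache.delete(key));
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved string means two concurrent requests for the same unchanged file share a single underlying LLM call.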

Recommended agent workflow

1. compact_md_sections          → see document structure + token sizes
2a. doc is small (<500 tokens)  → read it directly
2b. need a high-level gist      → compact_md_summarize
2c. need a specific section     → compact_md_extract with onlySections
2d. need compressed full doc    → compact_md_compact
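
The workflow above amounts to a small routing decision. The sketch below encodes it as a function; the tool names come from this README, while `Need`, `chooseTool`, and the `"read-directly"` label are made up for illustration. `totalTokens` is assumed to be the sum of the per-section counts returned in step 1.

```typescript
// Hypothetical router mirroring the recommended workflow.
type Need = "gist" | "section" | "full";

const TOOL_FOR_NEED: Record<Need, string> = {
  gist: "compact_md_summarize", // step 2b
  section: "compact_md_extract", // step 2c
  full: "compact_md_compact", // step 2d
};

function chooseTool(totalTokens: number, need: Need): string {
  // Step 2a: small documents are cheaper to read than to process.
  if (totalTokens < 500) return "read-directly";
  return TOOL_FOR_NEED[need];
}
```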

Compact Format Reference

Every transformation is lossless and reverses exactly on expand. Most of the token savings come from tables, list syntax, and tight block packing — not from rewriting every construct.

Construct	Standard Markdown	compact.md output
Heading	`## Section`	`## Section` (unchanged)
Ordered list item	`1. First`	`+ First`
Nested unordered item	`··- Nested` (2-space indent)	`..- Nested`
Table header row	`| A | B |` + `|---|---|`	`|: A, B`
Table data row	`| 1 | 2 |`	`| 1, 2`
Task list (incomplete)	`- [ ] Todo`	`[] Todo`
Task list (complete)	`- [x] Done`	`[x] Done`
Code fence	```python … ```	```python … ``` (unchanged)
Horizontal rule	`---`	`---` (unchanged)
Version marker (optional)	—	`%compact.md:1`

What changes: tables (separator row and padding eliminated), ordered list numbers (1. → +), nested list indentation (spaces → .. per level), and task list brackets (- [ ] → []). Consecutive compact blocks (headings, tables, HR) are also tightly packed with a single newline between them instead of a blank line.

What passes through unchanged: headings, code blocks, horizontal rules, paragraphs, blockquotes, bold, italic, inline code, links, images, and frontmatter.

Note: The parser also accepts a shorthand heading syntax (:1 Title, :2 Section, …) and single-backtick code fences (`python … `) for manually authored compact input, but compact() does not produce these forms.
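
As an illustration of the table rows in the reference above, here is a toy one-way encoder that produces the same shape as the Quick Start output. It is not the library's encoder: `sketchCompactTable` is a hypothetical name, it discards cell padding, and it makes no attempt at the lossless expansion the real format guarantees.

```typescript
// Toy one-way table encoder. Assumes rows[0] is the header and rows[1]
// is the separator; escaped pipes inside cells are not handled.
function sketchCompactTable(rows: string[], delimiter = ","): string[] {
  const isSeparator = (row: string): boolean =>
    row.includes("-") && /^[\s:|-]+$/.test(row);
  const cells = (row: string): string[] =>
    row
      .trim()
      .replace(/^\|/, "")
      .replace(/\|$/, "")
      .split("|")
      .map((cell) => cell.trim());

  const out: string[] = [];
  rows.forEach((row, index) => {
    if (index === 1 && isSeparator(row)) return; // separator row is dropped
    const prefix = index === 0 ? "|: " : "| "; // header rows get "|:"
    out.push(prefix + cells(row).join(delimiter + " "));
  });
  return out;
}

const compactRows = sketchCompactTable([
  "| Name  | Role    | Status |",
  "|-------|---------|--------|",
  "| Alice | Lead    | Active |",
]);
// compactRows is ["|: Name, Role, Status", "| Alice, Lead, Active"]
```

Most of the savings in this case come from dropping the separator row and the alignment padding, which matches the claim that tables are where the format gains the most.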

Dedup dictionary

When dedup: true and savings exceed 5%, repeated substrings are replaced with §N tokens and a dictionary is prepended:

§1=repeated substring here
§2=another repeated phrase
§§
(rest of compact content)
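
Reading the dictionary layout back can be sketched as follows. This is a toy decoder under stated assumptions, not the library's `expand()`: `sketchExpandDictionary` is a hypothetical name, and details the README does not cover, such as how a literal `§` in the content is escaped, are ignored.

```typescript
// Toy decoder for the dictionary layout shown above.
function sketchExpandDictionary(text: string): string {
  const lines = text.split("\n");
  const end = lines.indexOf("§§");
  if (end === -1) return text; // no dictionary header present

  const dict = new Map<string, string>();
  for (const entry of lines.slice(0, end)) {
    const match = /^§(\d+)=(.*)$/.exec(entry);
    if (!match) return text; // not a well-formed dictionary; leave as-is
    const [, id = "", value = ""] = match;
    dict.set(id, value);
  }

  const body = lines.slice(end + 1).join("\n");
  // Match whole numbers so §10 is never read as §1 followed by "0".
  return body.replace(/§(\d+)/g, (token, id: string) => dict.get(id) ?? token);
}

const expanded = sketchExpandDictionary(
  "§1=hello world\n§2=foo\n§§\n§1 and §2, §1!",
);
// expanded is "hello world and foo, hello world!"
```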

Development

bun install         # install dependencies
bun test            # run tests
bun run build       # compile ESM + CJS + type declarations
bun run lint        # biome check (lint + format)
bun run typecheck   # tsc --noEmit

Contributing

Read AGENTS.md before contributing — it documents the architecture invariants, the one-way dependency graph, and the rules that keep files small and the core zero-dependency.

The primary invariant is lossless round-trip: expand(compact(md)) === md for all inputs, always. When in doubt between two approaches, prefer the one that makes this guarantee easier to maintain.

License

MIT

Package last updated on 27 Feb 2026
