ctx

Token-efficient Markdown compression and document intelligence for agent pipelines.

License: MIT · Node >=20 · Bun >=1.3

Why ctx

Markdown has become the lingua franca of AI agents, but it wastes 30–50% of tokens on formatting syntax: table borders, heading markers, repetitive delimiters, whitespace padding. Every token spent on structure is a token not spent on content.

ctx gives agents a spectrum of strategies for fitting more useful content into a context window:

  • Lossless compression — `compact()`/`expand()` deterministically encode and decode Markdown with zero information loss. `expand(compact(md)) === md`, always.
  • Targeted extraction — pull out only the sections an agent needs, with optional truncation limits.
  • AI summarization — abstractive LLM summaries (~200 tokens by default) for breadth-first exploration of large docs, with results cached so repeated calls are free.

The library and CLI expose the lossless path. The MCP server exposes all three.

Features

  • Lossless round-trip — `expand(compact(md)) === md`, always, verified by property tests
  • 30–50% token reduction on typical agent documents (lossless path)
  • Zero runtime dependencies for the core encode/decode path
  • Library + CLI + MCP server — one package, three interfaces
  • Stage-based pipeline — structural, whitespace, dedup, and semantic stages, each independently toggleable
  • Readable without expansion — compact format is parseable by LLMs even before expanding
  • Section navigation — list document structure with per-section token counts before loading any content
  • Targeted extraction — retrieve specific sections verbatim with character/row/item truncation limits
  • AI summarization — LLM-powered abstractive summaries with docType-aware prompts and in-process caching

Installation

```sh
npm install @anduril-code/ctx
# or
bun add @anduril-code/ctx
```

Quick Start

```ts
import { compact, expand, verify } from '@anduril-code/ctx';

const md = `# Project Status

## Tasks

- [x] Database migration
- [ ] Frontend integration

| Name  | Role    | Status |
|-------|---------|--------|
| Alice | Lead    | Active |
| Bob   | Backend | Active |
`;

const result = compact(md);
console.log(result.output);
// # Project Status
// ## Tasks
// [x] Database migration
// [] Frontend integration
// |: Name, Role, Status
// | Alice, Lead, Active
// | Bob, Backend, Active

const restored = expand(result.output);
// restored === md  ✓

console.log(verify(md)); // true
```

With options and stats:

```ts
const { output, stats } = compact(md, {
  dedup: true,
  semantic: true,
  stats: true,
});

console.log(stats.savings); // e.g. 0.38 (38% fewer tokens)
```

API Reference

Library

```ts
import { compact, compactDiff, expand, pruneLog, verify, createPipeline } from '@anduril-code/ctx';
```

compact(markdown, options?): CompactResult

Compresses a Markdown string. Returns { output: string, stats? }.

| Option | Type | Default | Description |
|---|---|---|---|
| `dedup` | `boolean` | `false` | Enable deduplication stage (dictionary substitution for repeated substrings) |
| `semantic` | `boolean` | `false` | Enable semantic stage (strip redundant markup, normalize unicode punctuation) |
| `keepComments` | `boolean` | `false` | Preserve HTML comments (stripped by default) |
| `onlySections` | `string[]` | — | Keep only the listed heading sections |
| `stripSections` | `string[]` | — | Remove the listed heading sections |
| `unwrapLines` | `boolean` | `false` | Join soft-wrapped paragraph lines into a single line |
| `tableDelimiter` | `string` | `","` | Cell delimiter used in compact table rows |
| `versionMarker` | `boolean` | `false` | Prepend `%ctx:1` version header |
| `stats` | `boolean` | `false` | Compute and return token-saving statistics |
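As a rough illustration of what `onlySections` does, the sketch below approximates its semantics in plain JavaScript (this is not the library's implementation): a named section is kept together with its body, up to the next heading of equal or shallower depth.

```javascript
// Sketch only: approximates the semantics of compact()'s onlySections option.
// Keeps each named section's heading plus its body, until a heading of
// equal or shallower depth ends the section.
function filterSections(markdown, onlySections) {
  const lines = markdown.split('\n');
  const kept = [];
  let keeping = false;
  let keepDepth = 0;
  for (const line of lines) {
    const m = /^(#+)\s+(.*)$/.exec(line);
    if (m) {
      const depth = m[1].length;
      if (onlySections.includes(m[2].trim())) {
        keeping = true;
        keepDepth = depth;
      } else if (keeping && depth <= keepDepth) {
        keeping = false;
      }
    }
    if (keeping) kept.push(line);
  }
  return kept.join('\n');
}
```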

expand(compactText, options?): string

Expands compact format back to standard Markdown.

| Option | Type | Default | Description |
|---|---|---|---|
| `tableDelimiter` | `string` | `","` | Cell delimiter used when reading compact table rows |

verify(markdown, options?): boolean

Returns true if expand(compact(markdown)) === markdown.

compactDiff(diffText, options?): string

Compresses unified git diff text (lossy, one-way). Useful for PR review and change analysis.

| Option | Type | Default | Description |
|---|---|---|---|
| `context` | `number` | `1` | Context lines to keep around changed lines (`0` strips all context) |
| `compactHeaders` | `boolean` | `true` | Replace diff/index/`---`/`+++` header block with `=== path` |
| `changesOnly` | `boolean` | `false` | Emit only file path + changed lines (`+`/`-`) |
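To picture the `changesOnly` behavior, here is a behavioral sketch in plain JavaScript (not the library's code): it replaces each file's header block with `=== path` and keeps only added/removed lines.

```javascript
// Sketch only: approximates compactDiff with changesOnly enabled.
// Replaces each file's unified-diff header block with "=== path"
// and keeps only the +/- change lines.
function changesOnly(diffText) {
  const out = [];
  for (const line of diffText.split('\n')) {
    const header = /^\+\+\+ b\/(.*)$/.exec(line);
    if (header) {
      out.push('=== ' + header[1]);
    } else if (
      (line.startsWith('+') || line.startsWith('-')) &&
      !line.startsWith('+++') &&
      !line.startsWith('---')
    ) {
      out.push(line);
    }
  }
  return out.join('\n');
}
```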

pruneLog(logText, options?): LogPruneResult

Lossy log/terminal output pruning for test, build, and CI output.

| Option | Type | Default | Description |
|---|---|---|---|
| `stripAnsi` | `boolean` | `true` | Strip ANSI and terminal control sequences |
| `foldProgress` | `boolean` | `true` | Fold spinner/progress runs |
| `stripTimestamps` | `'auto' \| 'strip' \| 'keep'` | `'auto'` | Timestamp pruning mode |
| `elidePassingTests` | `boolean` | `true` | Remove passing tests when failures exist |
| `foldDebugLines` | `boolean` | `true` | Fold debug-level log lines into a summary count |
| `elideHealthChecks` | `boolean` | `true` | Remove `/health`/`/readyz`-style noise |
| `foldJsonLines` | `boolean` | `true` | Aggregate JSON-per-line logs by severity |
| `foldFrameworkStartup` | `boolean` | `true` | Fold startup banner and boot boilerplate |
| `stripUserAgents` | `boolean` | `true` | Replace long user-agent strings with `<ua>` |
| `dedupeStackTraces` | `boolean` | `true` | Collapse repeated stack traces in retry loops |
| `foldRepeatedLines` | `boolean` | `true` | Fold repetitive normalized lines |
| `foldGlobalRepeats` | `boolean` | `true` | Fold non-consecutive repeated normalized lines |
| `allowTokenExpansion` | `boolean` | `false` | Keep transformed output even if token count increases |
| `thresholdTokens` | `number` | — | Optional token gate threshold metadata |
| `profile` | `'test' \| 'ci' \| 'lint' \| 'runtime'` | — | Preset pruning strategy; can be overridden by explicit options |
| `customRules` | `LogCustomRule[]` | — | Optional strip/fold/block rules |

pruneLog() also accepts an optional `tokenCounter` (`{ count(text): number }`) so that its no-regression decision — whether pruning actually reduced the token count — is made with the same tokenizer your model uses.
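A `tokenCounter` can be as simple as a characters-per-token heuristic or a wrapper around your model's real tokenizer; the `{ count }` shape below matches the signature described above (the `pruneLog` call is shown commented as illustrative usage):

```javascript
// A minimal tokenCounter using the rough ~4-characters-per-token heuristic.
// Any object with count(text) => number satisfies the documented interface.
const approxTokenCounter = {
  count(text) {
    return Math.ceil(text.length / 4);
  },
};

// Illustrative usage with the library function documented above:
// pruneLog(logText, { tokenCounter: approxTokenCounter });
```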

createPipeline(stages): Pipeline

Assembles a custom pipeline from an ordered array of Stage objects for advanced use cases.
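The library defines the actual `Stage` interface; as a rough illustration of the stage-pipeline idea only, assume each stage is a named text transform applied in order:

```javascript
// Sketch only: the general shape of an ordered stage pipeline.
// The library's real Stage interface may differ; here a stage is assumed
// to be { name, apply(text) => text }.
function runPipeline(stages, input) {
  return stages.reduce((text, stage) => stage.apply(text), input);
}

// Two toy stages in the spirit of the whitespace stage described above.
const demoStages = [
  { name: 'trim', apply: (t) => t.trim() },
  { name: 'collapse-blank', apply: (t) => t.replace(/\n{3,}/g, '\n\n') },
];
```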

CLI

Install globally or run via npx:

```sh
npx @anduril-code/ctx <command> [options]
```

| Command | Description |
|---|---|
| `compact` | Compress a Markdown file to compact format |
| `changes` | Compress unified diff output for lower token usage |
| `prune` | Lossy prune of terminal/log output |
| `expand` | Expand a compact format file back to Markdown |
| `extract` | Extract and compress specific sections only |
| `verify` | Assert lossless round-trip for a file |
| `metrics` | Report token savings without writing output |
| `sections` | List the heading sections in a document |
| `locate` | Search sections by keyword |
```sh
# Compress
ctx compact input.md -o output.cmd

# Expand
ctx expand output.cmd -o restored.md

# Verify round-trip
ctx verify input.md

# Stats only
ctx metrics input.md

# Pipe-friendly
cat doc.md | ctx compact > compressed.cmd
git diff | ctx changes --changes-only
cat test-output.log | ctx prune --stats
cat lint.log | ctx prune --profile lint --stats
cat server.log | ctx prune --profile runtime

# With options
ctx compact input.md --dedup --semantic --stats
```

MCP Server

Add to your MCP client config:

```json
{
  "mcpServers": {
    "ctx": {
      "command": "npx",
      "args": ["ctx-mcp"]
    }
  }
}
```

The MCP server exposes a spectrum of token-reduction strategies. Tools are grouped below by fidelity tier — from lossless to AI-summarized:

Lossless compression

| Tool | Description |
|---|---|
| `ctx_compact` | Compress Markdown to compact format — fully reversible |
| `ctx_expand` | Expand compact format back to standard Markdown |
| `ctx_verify` | Assert that round-trip is lossless for a given input |
| `ctx_metrics` | Report token savings without writing any output |
| `ctx_changes` | Compress unified git diff text (one-way, lossy) |
| `ctx_prune` | Lossy pruning for logs/terminal output with token gate + optional summarize fallback |

Section navigation (start here for unknown documents)

| Tool | Description |
|---|---|
| `ctx_sections` | List the section TOC with per-section token counts — use this first to budget context before loading content |
| `ctx_locate` | Search sections by keyword to find relevant content without reading the whole document |

Targeted extraction (verbatim content, optionally truncated)

| Tool | Description |
|---|---|
| `ctx_extract` | Retrieve exact section content, with optional `maxChars` / `maxListItems` / `maxTableRows` truncation |

AI summarization (lossy, cached, higher token reduction)

| Tool | Description |
|---|---|
| `ctx_summarize` | Abstractive LLM summary (~200 tokens by default). Supports `docType: auto \| guide \| reference \| spec`. Results are cached — repeated calls on unchanged files are instant. |
| `ctx_batch` | Summarize multiple files in parallel in a single round-trip. Ideal for repo onboarding. |

Recommended agent workflow

```
1.  ctx_sections                 → see document structure + token sizes
2a. doc is small (<500 tokens)   → read it directly
2b. need a high-level gist       → ctx_summarize
2c. need a specific section      → ctx_extract with onlySections
2d. need compressed full doc     → ctx_compact
```

Compact Format Reference

Every transformation is lossless and reverses exactly on expand. Most of the token savings come from tables, list syntax, and tight block packing — not from rewriting every construct.

| Construct | Standard Markdown | ctx output |
|---|---|---|
| Heading | `## Section` | `## Section` (unchanged) |
| Ordered list item | `1. First` | `+ First` |
| Nested unordered item | `  - Nested` (2-space indent) | `..- Nested` |
| Table header row | `\| A \| B \|` + separator row | `\|: A, B` |
| Table data row | `\| 1 \| 2 \|` | `\| 1, 2` |
| Task list (incomplete) | `- [ ] Todo` | `[] Todo` |
| Task list (complete) | `- [x] Done` | `[x] Done` |
| Code fence | ```` ```python … ``` ```` | ```` ```python … ``` ```` (unchanged) |
| Horizontal rule | `---` | `---` (unchanged) |
| Version marker (optional) | — | `%ctx:1` |

What changes: tables (separator row and padding eliminated), ordered list numbers (`1.` → `+`), nested list indentation (spaces → `..` per level), and task list brackets (`- [ ]` → `[]`). Consecutive compact blocks (headings, tables, HR) are also tightly packed with a single newline between them instead of a blank line.

What passes through unchanged: headings, code blocks, horizontal rules, paragraphs, blockquotes, bold, italic, inline code, links, images, and frontmatter.

Note: The parser also accepts a shorthand heading syntax (:1 Title, :2 Section, …) and single-backtick code fences (`python … `) for manually authored compact input, but compact() does not produce these forms.
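The table-row transformation can be sketched in a few lines of plain JavaScript. This is a simplified illustration, not the library's codec: it assumes no cell contains the delimiter, and it normalizes cell padding to single spaces on expand, whereas the real lossless round-trip must restore the original alignment padding exactly.

```javascript
// Sketch only: data-row compaction and its inverse, per the table above.
// "| 1 | 2 |" <-> "| 1, 2". Assumes cells contain no "," delimiter.
function compactRow(row) {
  const cells = row.split('|').slice(1, -1).map((c) => c.trim());
  return '| ' + cells.join(', ');
}

function expandRow(compactText) {
  const cells = compactText.slice(2).split(', ');
  return '| ' + cells.join(' | ') + ' |';
}
```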

Dedup dictionary

When dedup: true and savings exceed 5%, repeated substrings are replaced with §N tokens and a dictionary is prepended:

```
§1=repeated substring here
§2=another repeated phrase
§§
(rest of compact content)
```
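Expanding that dictionary format is mechanical; the sketch below illustrates the idea (not the library's implementation): parse `§N=value` header lines up to the `§§` separator, then substitute `§N` tokens in the body.

```javascript
// Sketch only: expands the dedup dictionary format shown above.
// Header lines "§N=value" end at the "§§" separator; §N tokens in the
// body are then replaced with their dictionary values.
function expandDedup(text) {
  const lines = text.split('\n');
  const sep = lines.indexOf('§§');
  if (sep === -1) return text; // no dictionary present
  const dict = new Map();
  for (const line of lines.slice(0, sep)) {
    const m = /^§(\d+)=(.*)$/.exec(line);
    if (m) dict.set(m[1], m[2]);
  }
  return lines
    .slice(sep + 1)
    .join('\n')
    .replace(/§(\d+)/g, (tok, n) => dict.get(n) ?? tok);
}
```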

Development

```sh
bun install         # install dependencies
bun test            # run tests
bun run build       # compile ESM + CJS + type declarations
bun run lint        # biome check (lint + format)
bun run typecheck   # tsc --noEmit
```

Contributing

Read AGENTS.md before contributing — it documents the architecture invariants, the one-way dependency graph, and the rules that keep files small and the core zero-dependency.

The primary invariant is lossless round-trip: expand(compact(md)) === md for all inputs, always. When in doubt between two approaches, prefer the one that makes this guarantee easier to maintain.

License

MIT

Package last updated on 27 Feb 2026