
@anduril-code/compact.md
Token-efficient Markdown compression and document intelligence for agent pipelines.
Markdown has become the lingua franca of AI agents, but it wastes 30–50% of tokens on formatting syntax: table borders, heading markers, repetitive delimiters, whitespace padding. Every token spent on structure is a token not spent on content.
compact.md gives agents a spectrum of strategies for fitting more useful content into a context window:
compact()/expand() deterministically encode and decode Markdown with zero information loss: expand(compact(md)) === md, always.
The library and CLI expose the lossless path. The MCP server exposes all three.
expand(compact(md)) === md, always, verified by property tests
docType-aware prompts and in-process caching
npm install compact.md
# or
bun add compact.md
import { compact, expand, verify } from 'compact.md';
const md = `# Project Status
## Tasks
- [x] Database migration
- [ ] Frontend integration
| Name | Role | Status |
|-------|---------|--------|
| Alice | Lead | Active |
| Bob | Backend | Active |
`;
const result = compact(md);
console.log(result.output);
// # Project Status
// ## Tasks
// [x] Database migration
// [] Frontend integration
// |: Name, Role, Status
// | Alice, Lead, Active
// | Bob, Backend, Active
const restored = expand(result.output);
// restored === md ✓
console.log(verify(md)); // true
With options and stats:
const { output, stats } = compact(md, {
dedup: true,
semantic: true,
stats: true,
});
console.log(stats.savings); // e.g. 0.38 (38% fewer tokens)
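The savings figure reads as the fraction of tokens removed. A minimal sketch of that arithmetic follows, assuming savings is defined as 1 - compactTokens / originalTokens and using a crude whitespace tokenizer. Both the formula and the helper names are assumptions for illustration; this README does not show the library's tokenizer or its exact calculation.

```typescript
// Hypothetical helper: counts whitespace-separated chunks as "tokens".
// A real tokenizer (BPE, etc.) would give different absolute numbers.
function roughTokenCount(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Assumed definition: fraction of the original tokens that were removed.
function savingsFraction(original: string, compacted: string): number {
  const before = roughTokenCount(original);
  if (before === 0) return 0; // avoid dividing by zero on empty input
  return 1 - roughTokenCount(compacted) / before;
}

const originalRow = "| Alice | Lead | Active |"; // 7 rough tokens
const compactRow = "| Alice, Lead, Active"; // 4 rough tokens
console.log(savingsFraction(originalRow, compactRow).toFixed(2)); // prints "0.43"
```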
import { compact, compactDiff, expand, pruneLog, verify, createPipeline } from 'compact.md';
compact(markdown, options?): CompactResult
Compresses a Markdown string. Returns { output: string, stats? }.
| Option | Type | Default | Description |
|---|---|---|---|
| dedup | boolean | false | Enable deduplication stage (dictionary substitution for repeated substrings) |
| semantic | boolean | false | Enable semantic stage (strip redundant markup, normalize unicode punctuation) |
| keepComments | boolean | false | Preserve HTML comments (stripped by default) |
| onlySections | string[] | — | Keep only the listed heading sections |
| stripSections | string[] | — | Remove the listed heading sections |
| unwrapLines | boolean | false | Join soft-wrapped paragraph lines into a single line |
| tableDelimiter | string | "," | Cell delimiter used in compact table rows |
| versionMarker | boolean | false | Prepend %compact.md:1 version header |
| stats | boolean | false | Compute and return token-saving statistics |
expand(compactText, options?): string
Expands compact.md format back to standard Markdown.
| Option | Type | Default | Description |
|---|---|---|---|
| tableDelimiter | string | "," | Cell delimiter used when reading compact table rows |
verify(markdown, options?): boolean
Returns true if expand(compact(markdown)) === markdown.
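The round-trip check can be pictured as a small generic predicate. The sketch below uses a toy codec that only rewrites task-list items; `Codec`, `roundTrips`, and `toyCodec` are hypothetical names, and the toy codec is not the library's encoder. It also shows why such a property is worth testing on many inputs: the toy codec fails on a line that already looks like compact output.

```typescript
type Codec = {
  encode: (text: string) => string;
  decode: (text: string) => string;
};

// True when decoding the encoded text reproduces the input exactly.
function roundTrips(codec: Codec, input: string): boolean {
  return codec.decode(codec.encode(input)) === input;
}

// Toy codec (not compact.md's implementation): rewrites task-list items only.
const toyCodec: Codec = {
  encode: (t) =>
    t.replace(/^- \[( |x)\] /gm, (_m, mark: string) => (mark === "x" ? "[x] " : "[] ")),
  decode: (t) =>
    t.replace(/^\[(x?)\] /gm, (_m, mark: string) => (mark === "x" ? "- [x] " : "- [ ] ")),
};

console.log(roundTrips(toyCodec, "- [x] Done\n- [ ] Todo\n")); // true
// A line that already starts with "[] " is left alone by encode but rewritten
// by decode, so the toy codec is not lossless for it.
console.log(roundTrips(toyCodec, "[] literal text\n")); // false
```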
compactDiff(diffText, options?): string
Compresses unified git diff text (lossy, one-way). Useful for PR review and change analysis.
| Option | Type | Default | Description |
|---|---|---|---|
| context | number | 1 | Context lines to keep around changed lines (0 strips all context) |
| compactHeaders | boolean | true | Replace diff/index/---/+++ header block with === path |
| changesOnly | boolean | false | Emit only file path + changed lines (+/-) |
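To make the changesOnly idea concrete, here is a toy reduction of a unified diff to file paths plus changed lines. It is a sketch under assumptions, not the library's compactDiff: the function name is hypothetical, and reusing the `=== path` header form in this mode is a guess based on the compactHeaders description.

```typescript
// Toy reduction of a unified diff to "=== path" headers plus +/- lines.
// Limitation: a removed line whose text starts with "-- " looks like a
// "--- " header and is dropped by this sketch.
function changesOnlySketch(diffText: string): string {
  const out: string[] = [];
  for (const line of diffText.split("\n")) {
    if (line.startsWith("+++ ")) {
      // "+++ b/src/app.ts" -> "=== src/app.ts"
      out.push("=== " + line.slice(4).replace(/^b\//, ""));
    } else if (
      line.startsWith("--- ") ||
      line.startsWith("diff ") ||
      line.startsWith("index ") ||
      line.startsWith("@@")
    ) {
      continue; // header and hunk markers carry no changed content
    } else if (line.startsWith("+") || line.startsWith("-")) {
      out.push(line); // keep added and removed lines verbatim
    }
    // context lines (leading space) fall through and are dropped
  }
  return out.join("\n");
}

const sampleDiff = [
  "diff --git a/src/app.ts b/src/app.ts",
  "index 1111111..2222222 100644",
  "--- a/src/app.ts",
  "+++ b/src/app.ts",
  "@@ -1,3 +1,3 @@",
  " const a = 1;",
  "-const b = 2;",
  "+const b = 3;",
  " const c = 4;",
].join("\n");

console.log(changesOnlySketch(sampleDiff));
```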
pruneLog(logText, options?): LogPruneResult
Lossy log/terminal output pruning for test, build, and CI output.
| Option | Type | Default | Description |
|---|---|---|---|
| stripAnsi | boolean | true | Strip ANSI and terminal control sequences |
| foldProgress | boolean | true | Fold spinner/progress runs |
| stripTimestamps | 'auto' \| 'strip' \| 'keep' | 'auto' | Timestamp pruning mode |
| elidePassingTests | boolean | true | Remove passing tests when failures exist |
| foldDebugLines | boolean | true | Fold debug-level log lines into a summary count |
| elideHealthChecks | boolean | true | Remove /health and /readyz-style noise |
| foldJsonLines | boolean | true | Aggregate JSON-per-line logs by severity |
| foldFrameworkStartup | boolean | true | Fold startup banner and boot boilerplate |
| stripUserAgents | boolean | true | Replace long user-agent strings with <ua> |
| dedupeStackTraces | boolean | true | Collapse repeated stack traces in retry loops |
| foldRepeatedLines | boolean | true | Fold repetitive normalized lines |
| foldGlobalRepeats | boolean | true | Fold non-consecutive repeated normalized lines |
| allowTokenExpansion | boolean | false | Keep transformed output even if token count increases |
| thresholdTokens | number | — | Optional token gate threshold metadata |
| profile | 'test' \| 'ci' \| 'lint' \| 'runtime' | — | Preset pruning strategy; can be overridden by explicit options |
| customRules | LogCustomRule[] | — | Optional strip/fold/block rules |
pruneLog() also accepts an optional tokenCounter ({ count(text): number }) for custom tokenization parity in no-regression decisions.
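A token counter of the shape `{ count(text): number }` and the no-regression decision it feeds can be sketched as follows. This is an illustration under assumptions: the whitespace counter, the `gateOnTokens` helper, and the rule "fall back to the original when the pruned text costs more tokens" are hypothetical stand-ins for whatever pruneLog() does internally.

```typescript
interface TokenCounter {
  count(text: string): number;
}

// Crude stand-in tokenizer: whitespace-separated chunks.
const whitespaceCounter: TokenCounter = {
  count: (text) => (text.trim() === "" ? 0 : text.trim().split(/\s+/).length),
};

// Keep the pruned text only if it does not cost more tokens than the input,
// unless the caller explicitly allows expansion.
function gateOnTokens(
  original: string,
  pruned: string,
  counter: TokenCounter,
  allowTokenExpansion = false,
): string {
  if (allowTokenExpansion) return pruned;
  return counter.count(pruned) <= counter.count(original) ? pruned : original;
}

console.log(gateOnTokens("PASS a PASS b FAIL c", "FAIL c", whitespaceCounter)); // "FAIL c"
console.log(gateOnTokens("ok", "[folded 1 line]", whitespaceCounter)); // "ok"
```

Passing the same counter the downstream model uses keeps the gate's decision aligned with the real token budget, which is presumably why the option exists.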
createPipeline(stages): Pipeline
Assembles a custom pipeline from an ordered array of Stage objects for advanced use cases.
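The idea of an ordered stage array can be illustrated without the library. In the sketch below, `SketchStage` and `runStages` are hypothetical: the README does not document the real Stage interface, so this only shows why stage order matters when each stage feeds the next.

```typescript
// Hypothetical stage shape for illustration; the library's Stage type may differ.
interface SketchStage {
  name: string;
  apply(text: string): string;
}

// Runs stages left to right, feeding each stage's output into the next.
function runStages(stages: SketchStage[], input: string): string {
  return stages.reduce((text, stage) => stage.apply(text), input);
}

const stripTrailingSpaces: SketchStage = {
  name: "strip-trailing-spaces",
  apply: (t) => t.replace(/[ \t]+$/gm, ""),
};

const collapseBlankLines: SketchStage = {
  name: "collapse-blank-lines",
  apply: (t) => t.replace(/\n{3,}/g, "\n\n"),
};

const messy = "a  \n \n\n\nb";
// Stripping first turns the whitespace-only line into a blank one, so the
// collapse stage sees four consecutive newlines.
console.log(JSON.stringify(runStages([stripTrailingSpaces, collapseBlankLines], messy))); // "a\n\nb"
// Reversed, the collapse stage runs before that line is blank.
console.log(JSON.stringify(runStages([collapseBlankLines, stripTrailingSpaces], messy))); // "a\n\n\nb"
```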
Install globally or run via npx:
npx compact.md <command> [options]
| Command | Description |
|---|---|
| compact | Compress a Markdown file to compact.md format |
| changes | Compress unified diff output for lower token usage |
| prune | Lossy prune of terminal/log output |
| expand | Expand a compact.md file back to Markdown |
| extract | Extract and compress specific sections only |
| verify | Assert lossless round-trip for a file |
| metrics | Report token savings without writing output |
| sections | List the heading sections in a document |
| locate | Search sections by keyword |
# Compress
compact.md compact input.md -o output.cmd
# Expand
compact.md expand output.cmd -o restored.md
# Verify round-trip
compact.md verify input.md
# Stats only
compact.md metrics input.md
# Pipe-friendly
cat doc.md | compact.md compact > compressed.cmd
git diff | compact.md changes --changes-only
cat test-output.log | compact.md prune --stats
cat lint.log | compact.md prune --profile lint --stats
cat server.log | compact.md prune --profile runtime
# With options
compact.md compact input.md --dedup --semantic --stats
Add to your MCP client config:
{
"mcpServers": {
"compact-md": {
"command": "npx",
"args": ["compact-md-mcp"]
}
}
}
The MCP server exposes a spectrum of token-reduction strategies. Tools are grouped below by fidelity tier — from lossless to AI-summarized:
Compression (lossless unless marked lossy)
| Tool | Description |
|---|---|
| compact_md_compact | Compress Markdown to compact.md format (fully reversible) |
| compact_md_expand | Expand compact.md format back to standard Markdown |
| compact_md_verify | Assert that round-trip is lossless for a given input |
| compact_md_metrics | Report token savings without writing any output |
| compact_md_changes | Compress unified git diff text (one-way, lossy) |
| compact_md_prune | Lossy pruning for logs/terminal output with token gate + optional summarize fallback |
Section navigation (start here for unknown documents)
| Tool | Description |
|---|---|
| compact_md_sections | List the section TOC with per-section token counts; use this first to budget context before loading content |
| compact_md_locate | Search sections by keyword to find relevant content without reading the whole document |
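A section table of contents with per-section token counts can be approximated in a few lines. The sketch below is not the MCP tool's implementation: `listSectionsSketch` and the whitespace-based token estimate are assumptions, and it ignores edge cases such as `#` lines inside code fences.

```typescript
interface SectionInfo {
  heading: string;
  level: number;
  approxTokens: number; // rough count of whitespace-separated chunks in the body
}

// Walks the document once, opening a new section at every ATX heading and
// charging each following non-blank line to the most recent heading.
function listSectionsSketch(markdown: string): SectionInfo[] {
  const sections: SectionInfo[] = [];
  let current: SectionInfo | undefined;
  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6}) (.+)$/.exec(line);
    if (match) {
      current = { heading: match[2] ?? "", level: (match[1] ?? "").length, approxTokens: 0 };
      sections.push(current);
    } else if (current) {
      const trimmed = line.trim();
      if (trimmed !== "") current.approxTokens += trimmed.split(/\s+/).length;
    }
  }
  return sections;
}

const doc = "# Project Status\nAll green.\n## Tasks\n- [x] Database migration\n- [ ] Frontend integration\n";
for (const s of listSectionsSketch(doc)) {
  console.log(`${"#".repeat(s.level)} ${s.heading}: ~${s.approxTokens} tokens`);
}
```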
Targeted extraction (verbatim content, optionally truncated)
| Tool | Description |
|---|---|
| compact_md_extract | Retrieve exact section content, with optional maxChars / maxListItems / maxTableRows truncation |
AI summarization (lossy, cached, higher token reduction)
| Tool | Description |
|---|---|
| compact_md_summarize | Abstractive LLM summary (~200 tokens by default). Supports docType: auto \| guide \| reference \| spec. Results are cached, so repeated calls on unchanged files are instant. |
| compact_md_batch | Summarize multiple files in parallel in a single round-trip. Ideal for repo onboarding. |
Recommended agent workflow
1. compact_md_sections → see document structure + token sizes
2a. doc is small (<500 tokens) → read it directly
2b. need a high-level gist → compact_md_summarize
2c. need a specific section → compact_md_extract with onlySections
2d. need compressed full doc → compact_md_compact
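The workflow above amounts to a small decision function. The sketch below encodes it directly; `Need` and `chooseTool` are hypothetical names, and the function only returns a tool name rather than calling any MCP client.

```typescript
type Need = "gist" | "section" | "full";

// Mirrors steps 2a-2d: small documents are read directly, otherwise the
// tool depends on what the agent needs from the document.
function chooseTool(totalTokens: number, need: Need): string {
  if (totalTokens < 500) return "read directly";
  switch (need) {
    case "gist":
      return "compact_md_summarize";
    case "section":
      return "compact_md_extract";
    case "full":
      return "compact_md_compact";
  }
}

console.log(chooseTool(320, "full")); // "read directly"
console.log(chooseTool(4200, "gist")); // "compact_md_summarize"
console.log(chooseTool(4200, "section")); // "compact_md_extract"
```

The total token count would come from the per-section counts reported by compact_md_sections in step 1.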
Every transformation is lossless and reverses exactly on expand. Most of the token savings come from tables, list syntax, and tight block packing — not from rewriting every construct.
| Construct | Standard Markdown | compact.md output |
|---|---|---|
| Heading | ## Section | ## Section (unchanged) |
| Ordered list item | 1. First | + First |
| Nested unordered item | ··- Nested (2-space indent) | ..- Nested |
| Table header row | `\| A \| B \|` + `\|---\|---\|` | `\|: A, B` |
| Table data row | `\| 1 \| 2 \|` | `\| 1, 2` |
| Task list (incomplete) | - [ ] Todo | [] Todo |
| Task list (complete) | - [x] Done | [x] Done |
| Code fence | ```python … ``` | ```python … ``` (unchanged) |
| Horizontal rule | --- | --- (unchanged) |
| Version marker (optional) | — | %compact.md:1 |
What changes: tables (separator row and padding eliminated), ordered list numbers (1. → +), nested list indentation (spaces → .. per level), and task list brackets (- [ ] → []). Consecutive compact blocks (headings, tables, HR) are also tightly packed with a single newline between them instead of a blank line.
What passes through unchanged: headings, code blocks, horizontal rules, paragraphs, blockquotes, bold, italic, inline code, links, images, and frontmatter.
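The table rewrite described above can be sketched in one direction. This toy version reproduces the compact form shown in the quick-start example (`|:` for the header row, `|` for data rows, separator row dropped), but it discards cell padding and does not handle cells that contain the delimiter, so on its own it does not provide the library's round-trip guarantee. All function names are hypothetical.

```typescript
// Splits "| a | b |" into ["a", "b"], trimming cell padding.
function splitRow(row: string): string[] {
  return row.trim().replace(/^\|/, "").replace(/\|$/, "").split("|").map((c) => c.trim());
}

// Recognizes separator rows such as "|---|---|" or "| :--- | ---: |".
function isSeparator(row: string): boolean {
  return /^\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)*\|?\s*$/.test(row);
}

// Converts the lines of one Markdown table to the compact row form.
function compactTableSketch(lines: string[], delimiter = ","): string[] {
  const out: string[] = [];
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i] ?? "";
    if (isSeparator(line)) continue; // implied by the "|:" header marker
    const next = lines[i + 1] ?? "";
    const prefix = isSeparator(next) ? "|: " : "| ";
    out.push(prefix + splitRow(line).join(delimiter + " "));
  }
  return out;
}

const table = [
  "| Name | Role | Status |",
  "|-------|---------|--------|",
  "| Alice | Lead | Active |",
  "| Bob | Backend | Active |",
];
console.log(compactTableSketch(table).join("\n"));
// |: Name, Role, Status
// | Alice, Lead, Active
// | Bob, Backend, Active
```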
Note: The parser also accepts a shorthand heading syntax (:1 Title, :2 Section, …) and single-backtick code fences (`python … `) for manually authored compact input, but compact() does not produce these forms.
When dedup: true and savings exceed 5%, repeated substrings are replaced with §N tokens and a dictionary is prepended:
§1=repeated substring here
§2=another repeated phrase
§§
(rest of compact content)
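Reading the dictionary layout above, expansion is a header parse followed by token substitution. The sketch below assumes the header is a run of `§N=value` lines ending at a line containing only `§§`, and that values are literal text with no nesting. How the library escapes a literal `§` in content is not described here, so this toy does not handle it; `expandDictionarySketch` is a hypothetical name.

```typescript
// Parses "§N=value" header lines up to the "§§" terminator, then replaces
// each §N token in the body with its dictionary value.
function expandDictionarySketch(text: string): string {
  const lines = text.split("\n");
  const end = lines.indexOf("§§");
  if (end === -1) return text; // no dictionary header present

  const dict = new Map<string, string>();
  for (const line of lines.slice(0, end)) {
    const match = /^§(\d+)=(.*)$/.exec(line);
    if (!match) return text; // not a well-formed header; leave input untouched
    dict.set(match[1] ?? "", match[2] ?? "");
  }

  const body = lines.slice(end + 1).join("\n");
  // Matching the full digit run keeps §10 from being read as §1 followed by "0".
  return body.replace(/§(\d+)/g, (token, id: string) => dict.get(id) ?? token);
}

const encoded = ["§1=repeated substring here", "§2=another repeated phrase", "§§", "A: §1", "B: §2, then §1"].join("\n");
console.log(expandDictionarySketch(encoded));
// A: repeated substring here
// B: another repeated phrase, then repeated substring here
```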
bun install # install dependencies
bun test # run tests
bun run build # compile ESM + CJS + type declarations
bun run lint # biome check (lint + format)
bun run typecheck # tsc --noEmit
Read AGENTS.md before contributing — it documents the architecture invariants, the one-way dependency graph, and the rules that keep files small and the core zero-dependency.
The primary invariant is lossless round-trip: expand(compact(md)) === md for all inputs, always. When in doubt between two approaches, prefer the one that makes this guarantee easier to maintain.
MIT
We found that @anduril-code/compact.md demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.