# Agent Context Explorer

Companion tool for Sanity Agent Context. Explores your Agent Context server, documents what works and what doesn't, and produces `exploration-results.md` — ready to copy directly into your Agent Context Document.
## Why This Tool?
When building AI agents that work with Agent Context, those agents need to know:
- What data exists and what's reliably populated
- How to query effectively for different question types
- What caveats and hazards will trip them up (the most valuable knowledge)
Manual exploration is time-consuming and doesn't scale. This tool automates that process, producing structured documentation that helps production agents work correctly from day one.
## What You Get
After running the explorer, you'll have documentation that tells agents:
- Which document types to query (and which to ignore)
- Working query patterns for different question types
- Critical hazards — data gaps, external system dependencies, and silent failure modes that would otherwise cause wrong answers
## Installation

```bash
# Install globally
npm install -g @sanity/agent-context-explorer

# Or add it to a project
npm install @sanity/agent-context-explorer
```
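Verify the install (the full flag list is under CLI Options below):

```bash
agent-context-explorer --help
```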
## Quick Start

- Create a questions file (`questions.json`). Always include `expected_answer`; the explorer uses it to guide exploration and validate results:
```json
{
  "questions": [
    { "question": "What sizes does the Trailblazer Hiking Boot come in?", "expected_answer": "US 7-13, including half sizes" },
    { "question": "Is the Ultralight Tent waterproof?", "expected_answer": "Yes, it has a 3000mm waterproof rating with taped seams" },
    { "question": "What's the difference between the Summit Pack and the Daybreak Pack?", "expected_answer": "Summit is 65L for multi-day trips with a frame; Daybreak is 28L for day hikes" }
  ]
}
```
- Run the explorer:

```bash
agent-context-explorer \
  --mcp-url https://api.sanity.io/vX/agent-context/YOUR_PROJECT_ID/YOUR_DATASET/YOUR_SLUG \
  --questions ./questions.json \
  --sanity-token $SANITY_API_READ_TOKEN \
  --anthropic-api-key $ANTHROPIC_API_KEY
```
- Copy the contents of `exploration-results.md` into your Agent Context Document's `instructions` field. The output directory is timestamped (e.g., `./explorer-output-2026-02-11T09-22-30/`).
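If you installed the package as a project dependency rather than globally, the run step works the same through npx:

```bash
npx agent-context-explorer \
  --mcp-url https://api.sanity.io/vX/agent-context/YOUR_PROJECT_ID/YOUR_DATASET/YOUR_SLUG \
  --questions ./questions.json \
  --sanity-token $SANITY_API_READ_TOKEN \
  --anthropic-api-key $ANTHROPIC_API_KEY
```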
## Question File Format
```
{
  "questions": [
    {
      "question": string,
      "expected_answer"?: string,
      "id"?: string
    }
  ]
}
```
Always include `expected_answer`. The explorer uses it in two ways:
- Guides exploration — injected as a hint so the agent knows what to look for
- Validates results — the agent's answer is graded against yours, and this flows into confidence levels in the output
Without expected answers, the explorer can't tell if it found the right data. Even a rough or partial answer is better than none.
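Before spending API credits on a run, it can help to lint the file yourself. Below is a minimal sketch (a hypothetical helper script, not something the package ships) that flags questions missing `expected_answer`:

```ts
// check-questions.ts: a hypothetical helper, not part of the package.
// Flags questions missing expected_answer before you run the explorer.
import { readFileSync } from "node:fs";

interface Question {
  question: string;
  expected_answer?: string;
  id?: string;
}

const { questions } = JSON.parse(
  readFileSync("./questions.json", "utf8")
) as { questions: Question[] };

const missing = questions.filter((q) => !q.expected_answer);
console.log(`${questions.length} questions, ${missing.length} missing expected_answer`);
for (const q of missing) {
  console.warn(`  - "${q.question}"`);
}
```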
Tips for writing good questions:
- Always provide `expected_answer` — even approximate answers help the explorer validate findings
- Cover different question types (specs, compatibility, comparisons, troubleshooting)
- Include questions you expect to succeed AND ones that might fail
- 20-50 questions is typically sufficient for good coverage; more questions yield richer documentation
- Use an AI assistant to help generate questions — start by listing questions your customers actually ask, then use an AI to expand them into a diverse set covering different categories and edge cases
## Output Files
The tool generates several files in the output directory:
### `exploration-results.md` (Primary Output)
This is the file you copy into your Agent Context Document. It contains LLM-ready instructions with five sections:
- Schema Reference — Document types, what they're used for, and key fields
- Query Patterns — Working query examples organized by use case, with inline warnings
- Critical Rules — "Always do X" and "Never do Y" statements derived from exploration
- Known Limitations — What data is NOT available (null fields, external dependencies)
- Exploration Coverage — What was tested and confidence levels
The synthesis agent may add additional sections when findings naturally cluster (e.g., "Locale Handling" if locale issues were prominent).
Failures are the most valuable output — they document what would trip up a naive agent, preventing wrong answers and wasted queries.
### `logs/*.json`
Individual exploration logs for each question, containing:
- All tool calls made
- Learnings and caveats discovered
- Validation results against expected answers
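The log schema isn't documented here; as a rough mental model only (every field name below is an assumption, so inspect a real log file to confirm), an entry covers:

```ts
// Illustrative shape only; field names are assumptions, not the tool's
// documented schema. Check an actual logs/*.json file for the real keys.
interface ExplorationLog {
  question: string;
  toolCalls: Array<{ tool: string; input: unknown; output: unknown }>;
  learnings: string[]; // what worked
  caveats: string[];   // what failed or surprised the agent
  validation?: "full" | "partial" | "none" | "gap_identified";
}
```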
### `metrics.json`
Aggregated statistics: success rates, confidence distribution, category coverage, and validation results.
## Answer Validation
When you provide `expected_answer` in your questions, the explorer validates the agent's answer against yours using an LLM comparison. This appears in the CLI output as:
```
[1/12] ✓ Success (high confidence)
       Validation: full
```
Match levels:

| Level | Meaning |
|-------|---------|
| `full` | Agent's answer conveys the same information as expected (even if worded differently) |
| `partial` | Answer contains some expected information but is missing parts |
| `none` | Answer is different or contradictory |
| `gap_identified` | Agent correctly determined the data doesn't exist in this dataset |
Validation results flow into the final `exploration-results.md`: `full` matches produce [High] confidence patterns, `partial` matches produce [Medium], and failures are documented in the Known Limitations section.
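The explorer performs this comparison internally; conceptually it is an LLM-as-judge call. The sketch below shows the idea using the Anthropic SDK directly — the prompt wording and the `gradeAnswer` helper are illustrative assumptions, not the tool's actual implementation:

```ts
// Conceptual LLM-as-judge sketch; NOT the explorer's actual code.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function gradeAnswer(expected: string, actual: string): Promise<string> {
  const res = await client.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 10,
    messages: [{
      role: "user",
      content:
        `Expected answer: ${expected}\n` +
        `Agent's answer: ${actual}\n` +
        `Reply with exactly one word: full, partial, none, or gap_identified.`,
    }],
  });
  const block = res.content[0];
  return block.type === "text" ? block.text.trim() : "none";
}
```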
## CLI Options
| Option | Description | Default |
|--------|-------------|---------|
| `--mcp-url <url>` | Agent Context server URL (required) | — |
| `--questions <path>` | Path to questions JSON file (required) | — |
| `--sanity-token <token>` | Sanity API read token for authentication | — |
| `--anthropic-api-key <key>` | Anthropic API key (or set `ANTHROPIC_API_KEY` env var) | — |
| `--model <model>` | Claude model for exploration | `claude-sonnet-4-20250514` |
| `--output <dir>` | Output directory | `./explorer-output-{timestamp}` |
| `--concurrency <n>` | Number of questions to explore in parallel | `3` |
| `--help` | Show help message | — |
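For example, to widen parallelism and pin the output directory:

```bash
agent-context-explorer \
  --mcp-url https://api.sanity.io/vX/agent-context/YOUR_PROJECT_ID/YOUR_DATASET/YOUR_SLUG \
  --questions ./questions.json \
  --sanity-token $SANITY_API_READ_TOKEN \
  --anthropic-api-key $ANTHROPIC_API_KEY \
  --concurrency 5 \
  --output ./exploration
```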
## How It Works
- **Connect** — Establishes a connection to the Agent Context server and discovers available tools
- **Explore** — For each question, an AI agent attempts to answer it using the available tools
- **Learn** — The agent documents what worked, what failed, and what surprised it
- **Synthesize** — All exploration logs are synthesized into a unified knowledge document
The key insight: failures are more valuable than successes. A naive agent can figure out what works through trial and error. What it can't discover is why something that looks right doesn't work, or why data that should exist is missing.
## Example Output

Here's a snippet from a generated `exploration-results.md`:
````markdown
## Schema Reference

| Document Type | Use For | Key Fields |
|---------------|---------|------------|
| product | Product info, specs | name, description, specs, variants |
| category | Product categorization | title, slug, products[] |
| support-article | Help content | title, body, relatedProducts[] |

## Query Patterns

### Product Details

**When to use:** User asks about a specific product's features or specifications

```groq
*[_type == "product" && name match $productName][0]{
  name, description, specs, variants
}
```

**Important:** Always use `name` not `title` — the `title` field is null on products.

### Product Comparison

**When to use:** User wants to compare two or more products

```groq
*[_type == "product" && name in $productNames]{
  name, specs, variants
}
```

**Important:** Filter results by locale if your dataset has multiple language variants.

## Critical Rules

- Always use `name` for product lookups — `title` is null on all product documents
- Always filter by locale when querying products to avoid duplicate results
- Never query `inventory` or `stockLevel` — these fields are always null (managed in external system)

## Known Limitations

- Inventory and stock data lives in Shopify, not this dataset [High confidence]
- Pricing data lives in Commerce API [High confidence]
- The `title` field on products is always null — use `name` instead [High confidence]

## Exploration Coverage

**Validated areas:** product specs, product comparison, category browsing, support content
**Confidence:** High — 12 questions explored with 92% success rate
**Not explored:** user reviews, order history, real-time inventory
````
## Using the Output
After running the explorer:
- Open `exploration-results.md` in your output directory
- Review the generated instructions — adjust if needed for your specific use case
- Copy the entire contents into your Agent Context Document's `instructions` field
This gives your agent dataset-specific knowledge from day one.
## Requirements
- Node.js 20+
- An Anthropic API key (pass via `--anthropic-api-key` or set the `ANTHROPIC_API_KEY` env var)
- A Sanity Agent Context server URL (see Agent Context setup)
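To run the Quick Start command as written, export both values in your shell first (the values below are placeholders):

```bash
export ANTHROPIC_API_KEY="<your-anthropic-key>"
export SANITY_API_READ_TOKEN="<your-sanity-read-token>"
```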
## License
MIT