@superagent-ai/mastra
Superagent security processors for Mastra AI agents. Protect your AI applications with threat detection (Guard) and PII redaction (Redact).
Built on top of the @superagent-ai/safety-agent SDK.
Installation
npm install @superagent-ai/mastra
Features
- Guard Processor - Detect and block prompt injection, system prompt extraction, and data exfiltration attempts
- Redact Processor - Automatically remove PII/PHI from user inputs before processing
- TypeScript - Full type safety with exported types
- Multi-Provider Support - Use various LLM providers (OpenAI, Anthropic, Google, etc.) through the safety-agent SDK
Environment Variables
The processors require specific environment variables depending on which features you use:
SUPERAGENT_API_KEY=your-superagent-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key
OPENAI_API_KEY=your-openai-api-key
GOOGLE_API_KEY=your-google-api-key
Note: The Guard processor uses Superagent's hosted model by default and only requires SUPERAGENT_API_KEY. The Redact processor requires an additional API key for the LLM provider you choose.
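A startup check can surface a missing key before the first request fails. The following is a minimal sketch, not part of the package: `missingEnvVars` is a hypothetical helper, and the variable names come from the list above.

```typescript
// Hypothetical helper (not exported by @superagent-ai/mastra): returns the
// names of required variables that are unset or blank in the given environment.
function missingEnvVars(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Example: Guard with the default model plus Redact with an Anthropic model.
// The values here are placeholders, not real keys.
const exampleEnv = { SUPERAGENT_API_KEY: "sk-example", ANTHROPIC_API_KEY: "" };
const missingKeys = missingEnvVars(
  ["SUPERAGENT_API_KEY", "ANTHROPIC_API_KEY"],
  exampleEnv,
);
console.log(missingKeys); // the array contains only "ANTHROPIC_API_KEY"
```

In an application you would pass `process.env` as the second argument and fail fast if the returned array is not empty.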
Quick Start
import { Agent } from "@mastra/core/agent";
import {
  SuperagentGuardInputProcessor,
  SuperagentRedactInputProcessor,
} from "@superagent-ai/mastra";

const agent = new Agent({
  name: "secure-agent",
  instructions: "You are a helpful assistant.",
  model: "anthropic/claude-3-5-haiku-20241022",
  inputProcessors: [
    new SuperagentGuardInputProcessor({
      apiKey: process.env.SUPERAGENT_API_KEY!,
    }),
    new SuperagentRedactInputProcessor({
      apiKey: process.env.SUPERAGENT_API_KEY!,
      model: "anthropic/claude-3-5-haiku-20241022",
    }),
  ],
});
Processors
SuperagentGuardInputProcessor
Analyzes user inputs for security threats before they reach your agent. Uses Superagent's optimized guard model by default.
new SuperagentGuardInputProcessor({
  apiKey: "your-api-key", // required: your Superagent API key
  model: "superagent/guard-1.7b", // optional: model in "provider/model" format
  systemPrompt: "Custom instructions for classification", // optional
});
Detects:
- Prompt injection attempts
- System prompt extraction attacks
- Data exfiltration attempts
Response when blocked:
When a threat is detected, the processor triggers a tripwire that aborts the request and reports the detected violation types (e.g., prompt_injection, system_prompt_extraction).
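How a tripwire surfaces to calling code depends on your Mastra version. The sketch below assumes the agent result carries a boolean `tripwire` flag and a `tripwireReason` string; both field names are assumptions to verify against the Mastra documentation, and `AgentResultLike` and `userFacingReply` are hypothetical names used only for illustration.

```typescript
// Minimal structural type for the fields this sketch reads. The names
// `tripwire` and `tripwireReason` are assumptions about the agent result.
interface AgentResultLike {
  text: string;
  tripwire?: boolean;
  tripwireReason?: string;
}

// Hypothetical helper: decide what to show the user for a given result.
function userFacingReply(res: AgentResultLike): string {
  if (res.tripwire) {
    // Log the violation details server-side instead of echoing them back.
    console.warn("Request blocked:", res.tripwireReason ?? "unknown reason");
    return "Sorry, I can't help with that request.";
  }
  return res.text;
}

// A blocked result yields the generic refusal; a normal result passes through.
const blockedReply = userFacingReply({
  text: "",
  tripwire: true,
  tripwireReason: "prompt_injection",
});
const allowedReply = userFacingReply({ text: "Hello!" });
```

Returning a generic refusal keeps the violation types out of the user-visible response, which avoids giving an attacker feedback on which check fired.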
SuperagentRedactInputProcessor
Removes sensitive information from user inputs before the agent processes them.
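new SuperagentRedactInputProcessor({
  apiKey: "your-api-key", // required: your Superagent API key
  model: "anthropic/claude-3-5-haiku-20241022", // optional: model used for redaction
  entities: ["email addresses", "social security numbers", "phone numbers"], // optional: custom entity types
  rewrite: false, // optional: true rewrites text contextually instead of inserting placeholders
});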
Default entities redacted:
- SSNs, Driver's License, Passport Numbers
- API Keys, Secrets, Passwords
- Names, Addresses, Phone Numbers
- Emails, Credit Card Numbers
Example output:
Input: "My email is john@example.com and SSN is 123-45-6789"
Output: "My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>"
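For auditing, it can help to record which entity types were redacted without storing the original values. This is a minimal sketch that assumes placeholders always follow the `<TYPE_REDACTED>` shape seen in the example output; `redactedEntityTypes` is a hypothetical helper, and it does not apply when `rewrite` is enabled, since that mode does not insert placeholders.

```typescript
// Assumption: placeholders look like <EMAIL_REDACTED> or <SSN_REDACTED>,
// as in the example output. Other formats would need a different pattern.
function redactedEntityTypes(text: string): string[] {
  const pattern = /<([A-Z][A-Z0-9_]*)_REDACTED>/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const label = match[1];
    // Keep each entity type once, in order of first appearance.
    if (found.indexOf(label) === -1) found.push(label);
  }
  return found;
}

const entityTypes = redactedEntityTypes(
  "My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>",
);
console.log(entityTypes); // the array contains "EMAIL" and "SSN", in that order
```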
Configuration
Common Options
| Option | Type | Required | Description |
|--------|------|----------|-------------|
| apiKey | string | Yes | Your Superagent API key |
| model | string | No | Model in "provider/model" format |
Guard Options
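| Option | Type | Required | Default | Description |
|--------|------|----------|---------|-------------|
| systemPrompt | string | No | - | Custom instructions to steer classification behavior |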
Redact Options
| Option | Type | Required | Default | Description |
|--------|------|----------|---------|-------------|
| model | string | No | anthropic/claude-3-5-haiku-20241022 | Model for redaction |
| entities | string[] | No | Standard PII | Custom entity types to redact |
| rewrite | boolean | No | false | Rewrite text contextually instead of using placeholders |
Supported Models
Use the provider/model format when specifying models:
| Provider | Model format | Required environment variable |
|----------|--------------|-------------------------------|
| Superagent | superagent/{model} | None (default for guard) |
| Anthropic | anthropic/{model} | ANTHROPIC_API_KEY |
| OpenAI | openai/{model} | OPENAI_API_KEY |
| Google | google/{model} | GOOGLE_API_KEY |
| Groq | groq/{model} | GROQ_API_KEY |
| Fireworks | fireworks/{model} | FIREWORKS_API_KEY |
| AWS Bedrock | bedrock/{model} | AWS_BEDROCK_API_KEY |
| OpenRouter | openrouter/{provider}/{model} | OPENROUTER_API_KEY |
| Vercel AI Gateway | vercel/{provider}/{model} | AI_GATEWAY_API_KEY |
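The mapping above can be turned into a small lookup that validates a model string and reports which provider key it needs. This is an illustrative sketch only: `PROVIDER_ENV_VARS` and `describeModel` are hypothetical names, not exports of the package, and the SDK may parse model strings differently.

```typescript
// Environment variable per provider prefix, copied from the table above.
// `null` means no provider-specific key is listed for that prefix.
const PROVIDER_ENV_VARS: Record<string, string | null> = {
  superagent: null,
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
  groq: "GROQ_API_KEY",
  fireworks: "FIREWORKS_API_KEY",
  bedrock: "AWS_BEDROCK_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
  vercel: "AI_GATEWAY_API_KEY",
};

interface ModelInfo {
  provider: string;
  modelId: string;
  envVar: string | null;
}

// Hypothetical helper: split "provider/model" at the first slash and look up
// the environment variable for the provider prefix.
function describeModel(spec: string): ModelInfo {
  const slash = spec.indexOf("/");
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(`Expected "provider/model", got "${spec}"`);
  }
  const provider = spec.slice(0, slash);
  if (!Object.prototype.hasOwnProperty.call(PROVIDER_ENV_VARS, provider)) {
    throw new Error(`Unknown provider prefix "${provider}"`);
  }
  // For gateway prefixes (openrouter, vercel) the rest keeps its inner slash.
  return {
    provider,
    modelId: spec.slice(slash + 1),
    envVar: PROVIDER_ENV_VARS[provider] ?? null,
  };
}

const haikuInfo = describeModel("anthropic/claude-3-5-haiku-20241022");
console.log(haikuInfo.envVar); // "ANTHROPIC_API_KEY"
```

Splitting at the first slash only means a gateway model such as `openrouter/{provider}/{model}` keeps its nested provider in `modelId`.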
Direct SDK Usage
For advanced use cases, you can use the safety-agent SDK directly:
import { createClient } from "@superagent-ai/mastra";

const client = createClient({ apiKey: process.env.SUPERAGENT_API_KEY });

// Classify the input and report any detected violation types
const guardResult = await client.guard({
  input: "user message to analyze",
});
if (guardResult.classification === "block") {
  console.log("Blocked:", guardResult.violation_types);
}

// Redact sensitive data using the chosen model
const redactResult = await client.redact({
  input: "My email is john@example.com",
  model: "anthropic/claude-3-5-haiku-20241022",
});
console.log(redactResult.redacted);
API Reference
For more information about the underlying APIs:
License
MIT