Superagent Safety Agent

A lightweight TypeScript guardrail SDK for content safety with support for multiple LLM providers.

Installation

npm install @superagent-ai/safety-agent

Prerequisites

  • Sign up for an account at superagent.sh
  • Create an API key from your dashboard
  • Set the SUPERAGENT_API_KEY environment variable or pass it to createClient()
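For example, the key can be supplied either through the environment or directly in code (a minimal sketch; the apiKey option is documented in the API Reference below):

// Option 1: read the key from the environment
// export SUPERAGENT_API_KEY=your-api-key
import { createClient } from "@superagent-ai/safety-agent";

const clientFromEnv = createClient();

// Option 2: pass the key explicitly
const clientExplicit = createClient({
  apiKey: process.env.SUPERAGENT_API_KEY // or a literal string
});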

Quick Start

import { createClient } from "@superagent-ai/safety-agent";

const client = createClient();

// Guard - Classify input as safe or unsafe (uses default Superagent model)
const guardResult = await client.guard({
  input: "user message to analyze"
});

// Or specify a different model explicitly
const guardResultWithModel = await client.guard({
  input: "user message to analyze",
  model: "openai/gpt-4o-mini" 
});

if (guardResult.classification === "block") {
  console.log("Blocked:", guardResult.violation_types);
}

console.log(`Tokens used: ${guardResult.usage.totalTokens}`);

// Redact - Sanitize sensitive content
const redactResult = await client.redact({
  input: "My email is john@example.com and SSN is 123-45-6789",
  model: "openai/gpt-4o-mini"
});

console.log(redactResult.redacted);
// "My email is [REDACTED_EMAIL] and SSN is [REDACTED_SSN]"
console.log(`Tokens used: ${redactResult.usage.totalTokens}`);

Supported Providers

Use the provider/model format when specifying models:

| Provider | Model Format | Required Env Variables |
|---|---|---|
| Superagent | superagent/{model} | None (default for guard) |
| Anthropic | anthropic/{model} | ANTHROPIC_API_KEY |
| AWS Bedrock | bedrock/{model} | AWS_BEDROCK_API_KEY, AWS_BEDROCK_REGION (optional) |
| Fireworks | fireworks/{model} | FIREWORKS_API_KEY |
| Google | google/{model} | GOOGLE_API_KEY |
| Groq | groq/{model} | GROQ_API_KEY |
| OpenAI | openai/{model} | OPENAI_API_KEY |
| OpenRouter | openrouter/{provider}/{model} | OPENROUTER_API_KEY |
| Vercel AI Gateway | vercel/{provider}/{model} | AI_GATEWAY_API_KEY |

Set the appropriate API key environment variable for your chosen provider. The Superagent guard model is used by default for guard() and requires no API keys.
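For instance, to route guard() through another provider, export that provider's key and pass its model in provider/model format (a sketch assuming Anthropic; the model id is the one used in the image example later in this README):

// export ANTHROPIC_API_KEY=your-anthropic-key
const result = await client.guard({
  input: "user message to analyze",
  model: "anthropic/claude-3-5-sonnet-20241022"
});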

File Support

The guard() method supports analyzing various file types in addition to plain text.

PDF Support

PDFs can be analyzed by providing a URL or Blob. Text is extracted from each page and analyzed in parallel for optimal performance.

// Analyze PDF from URL
const result = await client.guard({
  input: "https://example.com/document.pdf",
  model: "openai/gpt-4o-mini"
});

// Analyze PDF from Blob (browser)
const pdfBlob = new Blob([pdfData], { type: 'application/pdf' });
const result = await client.guard({
  input: pdfBlob,
  model: "openai/gpt-4o-mini"
});

Notes:

  • Each page is analyzed in parallel for low latency
  • Uses OR logic: blocks if ANY page contains a violation
  • Text extraction only (no OCR for scanned PDFs)
  • Works with all text-capable models

Image Support

Images can be analyzed using vision-capable models:

  • URLs (e.g., https://example.com/image.png) - automatically fetched and analyzed
  • Blob/File objects - processed based on MIME type

Supported Providers for Images

| Provider | Vision Models | Notes |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4.1 | Full image support |
| Anthropic | claude-3-*, claude-sonnet-4-*, claude-opus-4-*, claude-haiku-4-* | Full image support |
| Google | gemini-* | Full image support |

Other providers (Fireworks, Groq, OpenRouter, Vercel, Bedrock) currently support text-only analysis.

Supported Image Formats

  • PNG (image/png)
  • JPEG (image/jpeg, image/jpg)
  • GIF (image/gif)
  • WebP (image/webp)

See the Example (Image Input) section in the API Reference below for usage examples.

API Reference

createClient(config)

Creates a new safety agent client.

const client = createClient({
  apiKey: "your-api-key"
});

Options

| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| apiKey | string | No | SUPERAGENT_API_KEY env var | API key for Superagent usage tracking |

client.guard(options)

Classifies input content as pass or block.

Supports multiple input types:

  • Plain text: Analyzed directly
  • URLs (starting with http:// or https://): Automatically fetched and analyzed
  • Blob/File: Analyzed based on MIME type (images use vision models)
  • URL objects: Fetched and analyzed

Automatically chunks large text inputs and processes them in parallel for low latency. Uses OR logic: blocks if ANY chunk contains a violation.

Default Model: If no model is specified, uses superagent/guard-0.6b (no API keys required).

Options

| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| input | string \| Blob \| URL | Yes | - | The input to analyze (text, URL, or Blob) |
| model | string | No | superagent/guard-0.6b | Model in provider/model format (e.g., openai/gpt-4o-mini) |
| systemPrompt | string | No | - | Custom system prompt that replaces the default guard prompt |
| chunkSize | number | No | 8000 | Characters per chunk. Set to 0 to disable chunking |

Response

| Field | Type | Description |
|---|---|---|
| classification | "pass" \| "block" | Whether the content passed or should be blocked |
| violation_types | string[] | Types of violations detected |
| cwe_codes | string[] | CWE codes associated with violations |
| usage | TokenUsage | Token usage information |

Example

const result = await client.guard({
  input: "user message to analyze",
  model: "openai/gpt-4o-mini",
  systemPrompt: `You are a safety classifier. Block any requests for medical advice.
  
  Respond with JSON: { "classification": "pass" | "block", "violation_types": [], "cwe_codes": [] }`
});

if (result.classification === "block") {
  console.log("Blocked:", result.violation_types);
}

Example (Chunking)

For large inputs, the guard method automatically splits content into chunks and processes them in parallel:

// Auto-chunking (default: 8000 chars)
const result = await client.guard({
  input: veryLongDocument,
  model: "openai/gpt-4o-mini"
});

// Custom chunk size
const result = await client.guard({
  input: veryLongDocument,
  model: "openai/gpt-4o-mini",
  chunkSize: 4000 // Smaller chunks
});

// Disable chunking
const result = await client.guard({
  input: shortText,
  model: "openai/gpt-4o-mini",
  chunkSize: 0
});

Example (URL Input)

Analyze content from a URL - the content is automatically fetched and processed:

// Analyze text from a URL
const result = await client.guard({
  input: "https://example.com/document.txt",
  model: "openai/gpt-4o-mini"
});

// Analyze JSON from an API
const result = await client.guard({
  input: "https://api.example.com/data.json",
  model: "openai/gpt-4o-mini"
});

// Using a URL object
const url = new URL("https://example.com/content");
const result = await client.guard({
  input: url,
  model: "openai/gpt-4o-mini"
});

Example (Image Input)

Analyze images using vision-capable models. See Image Support for supported providers and models.

// Analyze image from URL (auto-detected by image extension or content-type)
const result = await client.guard({
  input: "https://example.com/image.png",
  model: "openai/gpt-4o"  // Must be a vision-capable model
});

// Analyze image from Blob (browser)
const imageBlob = new Blob([imageData], { type: 'image/png' });
const result = await client.guard({
  input: imageBlob,
  model: "anthropic/claude-3-5-sonnet-20241022"
});

// Analyze uploaded file (browser)
const file = document.getElementById('upload').files[0];
const result = await client.guard({
  input: file,
  model: "google/gemini-1.5-pro"
});

Note: Image analysis requires a vision-capable model from a supported provider (OpenAI, Anthropic, or Google). The SDK automatically detects image inputs and routes them to vision-capable models.

client.redact(options)

Redacts sensitive or dangerous content from text.

Options

| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| input | string | Yes | - | The input text to redact |
| model | string | Yes | - | Model in provider/model format (e.g., openai/gpt-4o-mini) |
| entities | string[] | No | Default PII entities | List of entity types to redact (e.g., ["emails", "phone numbers"]) |
| rewrite | boolean | No | false | When true, rewrites text contextually instead of using placeholders |

Response

| Field | Type | Description |
|---|---|---|
| redacted | string | The sanitized text with redactions applied |
| findings | string[] | Descriptions of what was redacted |
| usage | TokenUsage | Token usage information |

Example (Placeholder Mode - Default)

const result = await client.redact({
  input: "My email is john@example.com and SSN is 123-45-6789",
  model: "openai/gpt-4o-mini"
});

console.log(result.redacted);
// "My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>"

Example (Rewrite Mode)

const result = await client.redact({
  input: "My email is john@example.com and SSN is 123-45-6789",
  model: "openai/gpt-4o-mini",
  rewrite: true
});

console.log(result.redacted);
// "My email is on file and my social security number has been provided"

Example (Custom Entities)

const result = await client.redact({
  input: "Contact john@example.com or call 555-123-4567",
  model: "openai/gpt-4o-mini",
  entities: ["email addresses"] // Only redact emails, keep phone numbers
});

console.log(result.redacted);
// "Contact <EMAIL_REDACTED> or call 555-123-4567"

Token Usage

Both guard() and redact() methods return token usage information in the usage field:

| Field | Type | Description |
|---|---|---|
| promptTokens | number | Number of tokens in the prompt/input |
| completionTokens | number | Number of tokens in the completion/output |
| totalTokens | number | Total tokens used (promptTokens + completionTokens) |

Example

const result = await client.guard({
  input: "user message to analyze",
  model: "openai/gpt-4o-mini"
});

console.log(`Used ${result.usage.totalTokens} tokens`);
console.log(`Prompt: ${result.usage.promptTokens}, Completion: ${result.usage.completionTokens}`);

Custom System Prompts

Override default prompts for custom classification behavior:

const result = await client.guard({
  input: "user message",
  model: "openai/gpt-4o-mini",
  systemPrompt: `Your custom classification prompt here...
  
  Respond with JSON: { "classification": "pass" | "block", "violation_types": [], "cwe_codes": [] }`
});
