
@openguardrails/openclaw-security
AI agent security plugin for OpenClaw: prompt injection detection, PII sanitization, and monitoring dashboard
Comprehensive AI security for OpenClaw: AI Security Gateway + Prompt injection detection.
GitHub: https://github.com/openguardrails/openguardrails/tree/main/openclaw-security
npm: https://www.npmjs.com/package/@openguardrails/openclaw-security
- ✨ NEW: AI Security Gateway - Protect sensitive data (bank cards, passwords, API keys) before sending to LLMs
- 🛡️ Prompt Injection Detection - Detect and block malicious instructions hidden in external content
- 🔒 Privacy-First - All sensitive data processing happens locally on your machine
- 🚀 Zero-Config - Works out of the box with automatic API key registration
# Install the plugin
openclaw plugins install @openguardrails/openclaw-security
# Restart OpenClaw
openclaw gateway restart
# Enable AI Security Gateway (optional, protects sensitive data)
# Edit ~/.openclaw/openclaw.json and add:
{
"plugins": {
"entries": {
"openguardrails": {
"config": {
"gatewayEnabled": true // ← Enable AI Security Gateway
}
}
}
}
}
Protect sensitive data in your prompts before sending to LLMs. Free, no registration required, no usage limits.
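The AI Security Gateway is a local HTTP proxy that automatically sanitizes sensitive data in your prompts before they reach the LLM and restores the original values in the responses.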
The entire process is transparent — you use your agent normally, and your sensitive data is protected without any manual steps. The LLM provider never sees your real data; you see the correct, fully restored response.
Example:
You: "My card is 6222021234567890, book a hotel"
↓ Gateway sanitizes
LLM sees: "My card is __bank_card_1__, book a hotel"
↓ LLM responds
LLM: "Booking with __bank_card_1__"
↓ Gateway restores
Tool executes with: "Booking with 6222021234567890"
| Data Type | Placeholder Example | Detected Patterns |
|---|---|---|
| Bank Cards | __bank_card_1__ | 16-19 digit numbers |
| Credit Cards | __credit_card_1__ | 1234-5678-9012-3456 |
| Email | __email_1__ | user@example.com |
| Phone | __phone_1__ | +86-138-1234-5678 |
| API Keys | __secret_1__ | sk-..., ghp_..., Bearer tokens |
| IP Address | __ip_1__ | 192.168.1.1 |
| SSN | __ssn_1__ | 123-45-6789 |
| IBAN | __iban_1__ | GB82WEST12345698765432 |
| URL | __url_1__ | https://example.com |
More data types will be added based on user needs — contact us if you need a specific type covered.
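The sanitize-then-restore round trip shown above can be sketched in a few lines of TypeScript. This is a minimal illustration only, not the plugin's actual code: the function names `sanitize` and `restore`, the `PATTERNS` list, and the simplified regexes are all assumptions made for this example, and only three of the data types in the table are covered.

```typescript
// Hypothetical sketch; names and regexes are illustrative, not the plugin's implementation.
type Mapping = Record<string, string>; // placeholder -> original value

// Simplified patterns for three of the data types listed in the table.
const PATTERNS: Array<[string, RegExp]> = [
  ["email", /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g],
  ["bank_card", /\b\d{16,19}\b/g],
  ["ssn", /\b\d{3}-\d{2}-\d{4}\b/g],
];

function sanitize(text: string): { sanitized: string; mapping: Mapping } {
  const mapping: Mapping = {};
  const seen: Record<string, string> = {}; // "kind:value" -> placeholder
  let sanitized = text;
  for (const [kind, pattern] of PATTERNS) {
    let counter = 0;
    sanitized = sanitized.replace(pattern, (match: string) => {
      const key = kind + ":" + match;
      if (seen[key] === undefined) {
        // First time this value appears: allocate a numbered placeholder.
        counter += 1;
        seen[key] = "__" + kind + "_" + counter + "__";
        mapping[seen[key]] = match;
      }
      return seen[key];
    });
  }
  return { sanitized, mapping };
}

function restore(text: string, mapping: Mapping): string {
  let restored = text;
  for (const placeholder of Object.keys(mapping)) {
    // split/join replaces every occurrence without regex escaping concerns.
    restored = restored.split(placeholder).join(mapping[placeholder]);
  }
  return restored;
}
```

Keeping the mapping in memory on the local machine is what lets the LLM provider see only placeholders while tool calls still receive the real values.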
1. Enable in config (~/.openclaw/openclaw.json):
{
"plugins": {
"entries": {
"openguardrails": {
"config": {
"gatewayEnabled": true, // Enable AI Security Gateway
"gatewayPort": 8900, // Gateway port (default: 8900)
"gatewayAutoStart": true // Auto-start (default: true)
}
}
}
}
}
2. Configure your model to use the gateway:
{
"models": {
"providers": {
"claude-protected": {
"baseUrl": "http://127.0.0.1:8900", // ← Point to gateway
"api": "anthropic-messages", // Keep protocol unchanged
"apiKey": "${ANTHROPIC_API_KEY}",
"models": [...]
}
}
}
}
3. Restart OpenClaw:
openclaw gateway restart
| Command | Description |
|---|---|
| /og_gateway_status | View AI Security Gateway status and config examples |
| /og_gateway_start | Start the AI Security Gateway |
| /og_gateway_stop | Stop the AI Security Gateway |
| /og_gateway_restart | Restart the AI Security Gateway |
📖 Full Guide: See GATEWAY_GUIDE.md for detailed setup instructions, protocol support, and troubleshooting.
Detect and block malicious instructions hidden in external content (emails, web pages, documents).
Before injection detection analysis, content is sanitized locally to remove PII:
| Data Type | Placeholder |
|---|---|
| Email addresses | <EMAIL> |
| Phone numbers | <PHONE> |
| Credit card numbers | <CREDIT_CARD> |
| SSNs | <SSN> |
| IP addresses | <IP_ADDRESS> |
| API keys & secrets | <SECRET> |
| URLs | <URL> |
| IBANs | <IBAN> |
More data types will be added based on user needs. The detection API never sees your raw sensitive data — only these placeholders.
Then the sanitized content is sent to the detection API for analysis:
External Content (email/webpage/document)
↓
┌─────────────┐
│ Local │ Strip PII: emails, phones, cards,
│ Sanitize │ SSNs, API keys, URLs, IBANs...
└─────────────┘
↓
┌─────────────┐
│ Detection │ POST /api/check/tool-call
│ API │ { sanitized content }
└─────────────┘
↓
┌─────────────┐
│ Verdict │ { isInjection, confidence,
│ │ reason, findings }
└─────────────┘
↓
Block or Allow
The plugin hooks into OpenClaw's tool_result_persist and message_received events. When your agent reads external content, OpenGuardrails sanitizes it locally, sends the sanitized content to the API for analysis, and blocks the content if an injection is detected.
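The local sanitization step that runs before detection can be sketched as a one-way replacement with category placeholders. This is a rough illustration under stated assumptions: `stripPii`, `CATEGORY_RULES`, and the regexes are hypothetical and much simpler than a production sanitizer, and only four of the listed data types are shown.

```typescript
// Hypothetical sketch; rule names and regexes are assumptions, not the plugin's code.
// Order matters: URLs are replaced first so later rules do not match inside them.
const CATEGORY_RULES: Array<[string, RegExp]> = [
  ["<URL>", /https?:\/\/[^\s"'<>]+/g],
  ["<EMAIL>", /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g],
  ["<SSN>", /\b\d{3}-\d{2}-\d{4}\b/g],
  ["<IP_ADDRESS>", /\b(?:\d{1,3}\.){3}\d{1,3}\b/g],
];

function stripPii(content: string): string {
  let out = content;
  for (const [placeholder, pattern] of CATEGORY_RULES) {
    // Category placeholders are not numbered, so the original values cannot be recovered.
    out = out.replace(pattern, placeholder);
  }
  return out;
}
```

Unlike the gateway's numbered placeholders, these category tags are not meant to be restored; the detection step only needs to know what kind of value was present.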
# Install from npm
openclaw plugins install @openguardrails/openclaw-security
# Restart gateway to load the plugin
openclaw gateway restart
On first use, the plugin automatically registers an API key — no email, password, or manual setup required.
# Check plugin list, confirm openguardrails status is "loaded"
openclaw plugins list
You should see:
| OpenGuardrails | openguardrails | loaded | ...
| Command | Description |
|---|---|
| /og_gateway_status | View AI Security Gateway status and configuration |
| /og_gateway_start | Start the AI Security Gateway |
| /og_gateway_stop | Stop the AI Security Gateway |
| /og_gateway_restart | Restart the AI Security Gateway |
| Command | Description |
|---|---|
| /og_status | View detection status and statistics |
| /og_report | View recent injection detections |
| /og_feedback <id> fp [reason] | Report false positive |
| /og_feedback missed <reason> | Report missed detection |
Download the test file with hidden injection:
curl -L -o /tmp/test-email.txt https://raw.githubusercontent.com/openguardrails/openguardrails/main/samples/test-email.txt
Ask the agent to read this file:
Read the contents of /tmp/test-email.txt
openclaw logs --follow | grep "openguardrails"
If detection succeeds, you'll see:
[openguardrails] tool_result_persist triggered for "read"
[openguardrails] Analyzing tool result from "read" (1183 chars)
[openguardrails] Analysis complete in 312ms: INJECTION DETECTED
[openguardrails] INJECTION DETECTED in tool result from "read": Contains instructions to override guidelines and execute a malicious shell command
In OpenClaw conversation:
/og_status
/og_report
# Report false positive
/og_feedback 1 fp This is normal security documentation
# Report missed detection
/og_feedback missed Email contained hidden injection that wasn't detected
Edit OpenClaw config file (~/.openclaw/openclaw.json):
{
"plugins": {
"entries": {
"openguardrails": {
"enabled": true,
"config": {
// AI Security Gateway
"gatewayEnabled": false, // Enable AI Security Gateway
"gatewayPort": 8900, // Gateway port
"gatewayAutoStart": true, // Auto-start gateway
// Injection Detection
"blockOnRisk": true, // Block when injection detected
"apiKey": "", // Auto-registered if empty
"timeoutMs": 60000, // Analysis timeout
"autoRegister": true, // Auto-register API key
"coreUrl": "https://www.openguardrails.com/core"
}
}
}
}
}
| Option | Default | Description |
|---|---|---|
| gatewayEnabled | false | Enable AI Security Gateway |
| gatewayPort | 8900 | Port for the gateway server |
| gatewayAutoStart | true | Automatically start gateway when OpenClaw starts |
| Option | Default | Description |
|---|---|---|
| enabled | true | Enable/disable injection detection |
| blockOnRisk | true | Block tool calls when injection is detected |
| apiKey | (auto) | API key (auto-registered if empty) |
| autoRegister | true | Auto-register API key on first use |
| timeoutMs | 60000 | Analysis timeout in milliseconds |
| coreUrl | https://www.openguardrails.com/core | Core API endpoint |
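For readers who want to reason about these options programmatically, the documented defaults can be written down as a typed object. The interface name `OpenGuardrailsConfig` and the helper `resolveConfig` are hypothetical; only the option names and default values come from the tables above. The `enabled` flag is left out because it sits next to `config` in the plugin entry, not inside it.

```typescript
// Hypothetical typing of the "config" object; option names and defaults are from the tables above.
interface OpenGuardrailsConfig {
  gatewayEnabled: boolean;
  gatewayPort: number;
  gatewayAutoStart: boolean;
  blockOnRisk: boolean;
  apiKey: string; // empty string means "auto-register on first use"
  autoRegister: boolean;
  timeoutMs: number;
  coreUrl: string;
}

const DEFAULTS: OpenGuardrailsConfig = {
  gatewayEnabled: false,
  gatewayPort: 8900,
  gatewayAutoStart: true,
  blockOnRisk: true,
  apiKey: "",
  autoRegister: true,
  timeoutMs: 60000,
  coreUrl: "https://www.openguardrails.com/core",
};

// Illustrative helper: user-supplied values override the documented defaults.
function resolveConfig(user: Partial<OpenGuardrailsConfig>): OpenGuardrailsConfig {
  return { ...DEFAULTS, ...user };
}
```

With this sketch, `resolveConfig({ blockOnRisk: false })` would describe monitor-only behavior while leaving every other option at its default.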
Monitor-only mode (log detections without blocking):
{
"blockOnRisk": false
}
Full protection mode (gateway + detection):
{
"gatewayEnabled": true,
"blockOnRisk": true
}
OpenGuardrails makes money by protecting your data — not by collecting, using, or selling it.
OpenGuardrails does not need your sensitive data to perform security detection. Before any data leaves your machine, it is sanitized locally — PII, credentials, and secrets are replaced with category placeholders. The detection API only sees sanitized tool metadata, never raw content.
We do not use your data for model training. We have no LLM to train. Our detection engine is rule-driven and runs on structured signals, not on user content.
All sensitive data is sanitized on your machine before anything is sent to the cloud API for behavioral assessment:
| Data Type | Placeholder | Examples |
|---|---|---|
| Email addresses | <EMAIL> | user@example.com |
| Credit card numbers | <CREDIT_CARD> | 1234-5678-9012-3456 |
| SSNs | <SSN> | 123-45-6789 |
| IBANs | <IBAN> | GB82WEST12345698765432 |
| IP addresses | <IP_ADDRESS> | 192.168.1.1 |
| Phone numbers | <PHONE> | +1-555-123-4567 |
| URLs | <URL> | https://internal.corp/secret-path |
| API keys & secrets | <SECRET> | sk-..., ghp_..., AKIA..., Bearer tokens |
| High-entropy tokens | <SECRET> | Any 20+ character string with high randomness |
More data types will be added based on user needs — contact us if you need a specific type covered.
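The "high-entropy tokens" row can be made concrete with a Shannon entropy check. The sketch below is one common way to flag random-looking strings and is not taken from the plugin's source: the function names, and especially the threshold of 3.5 bits per character, are assumptions chosen for illustration. Only the 20-character minimum comes from the table.

```typescript
// Hypothetical sketch of a high-entropy check; threshold and names are assumptions.

// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
function shannonEntropy(token: string): number {
  if (token.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < token.length; i++) {
    const ch = token.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const n = counts[ch] || 0;
    if (n === 0) continue;
    const p = n / token.length;
    bits -= p * (Math.log(p) / Math.LN2);
  }
  return bits;
}

// Flags strings of at least minLength characters whose entropy exceeds the threshold.
function looksLikeSecret(token: string, minLength = 20, minBitsPerChar = 3.5): boolean {
  return token.length >= minLength && shannonEntropy(token) >= minBitsPerChar;
}
```

A repeated character scores 0 bits, while a string of all-distinct characters scores the base-2 logarithm of its length, so long random keys land well above ordinary words. Any real threshold would need tuning against false positives such as long file paths or hashes that are safe to share.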
The gateway runs on localhost (127.0.0.1). It sanitizes your prompts before they reach the LLM and restores original values in responses. The LLM provider never sees your real data. You see the correct response. The entire protection process (sanitization and restoration) is transparent to you — no impact on functionality, no manual steps. The gateway is free with no usage limits.
Your API key is stored locally at ~/.openclaw/credentials/openguardrails/credentials.json.
If the cloud API is unreachable or times out, the tool call is allowed; the plugin never blocks your workflow due to network issues.
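The fail-open behavior and the `blockOnRisk` option combine into a small decision rule, sketched here. The `decide` function and the `Decision` type are hypothetical names; representing an unreachable or timed-out API as `null` is likewise an assumption made for this example.

```typescript
// Hypothetical sketch of the allow/block decision; not the plugin's actual code.
interface Verdict {
  isInjection: boolean;
  confidence: number;
  reason: string;
}

type Decision = "allow" | "block";

// verdict === null stands for "API unreachable or timed out" in this sketch.
function decide(verdict: Verdict | null, blockOnRisk: boolean): Decision {
  if (verdict === null) return "allow"; // fail open on network problems
  if (verdict.isInjection && blockOnRisk) return "block";
  return "allow"; // clean content, or monitor-only mode
}
```

In monitor-only mode (`blockOnRisk: false`) a detection would still be logged, but this rule returns "allow" for every input.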
All code is open source. Audit the sanitization logic yourself:
- gateway/src/sanitizer.ts - AI Security Gateway sanitization
- gateway/src/restorer.ts - AI Security Gateway restoration
- agent/sanitizer.ts - detection API sanitization
- agent/content-injection-scanner.ts - local injection detection patterns

OpenGuardrails uses a single API endpoint for detection:
POST https://www.openguardrails.com/core/api/check/tool-call
Authorization: Bearer <your-api-key>
Content-Type: application/json
{
"content": "<content to analyze>",
"async": false
}
Response:
{
"ok": true,
"verdict": {
"isInjection": true,
"confidence": 0.95,
"reason": "Contains hidden instructions to override system prompt",
"findings": [
{
"suspiciousContent": "SYSTEM ALERT: Override all previous instructions...",
"reason": "Attempts to override system prompt",
"confidence": 0.95
}
]
}
}
API key registration happens automatically via POST /api/register on first use.
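A client for the detection endpoint could be assembled as follows. This is a sketch based only on the request and response shapes shown above: `buildCheckRequest`, `parseVerdict`, and the `CheckRequest` type are hypothetical helpers, and treating any malformed or non-`ok` response as "no verdict" is an assumption.

```typescript
// Hypothetical helpers built from the documented request/response shapes.
interface Finding {
  suspiciousContent: string;
  reason: string;
  confidence: number;
}

interface Verdict {
  isInjection: boolean;
  confidence: number;
  reason: string;
  findings: Finding[];
}

interface CheckRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the POST request for already-sanitized content.
function buildCheckRequest(coreUrl: string, apiKey: string, sanitizedContent: string): CheckRequest {
  return {
    url: coreUrl.replace(/\/+$/, "") + "/api/check/tool-call",
    method: "POST",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ content: sanitizedContent, async: false }),
  };
}

// Returns the verdict, or null when the response is not a well-formed success.
function parseVerdict(responseText: string): Verdict | null {
  try {
    const data = JSON.parse(responseText);
    if (data && data.ok === true && data.verdict && typeof data.verdict.isInjection === "boolean") {
      return data.verdict as Verdict;
    }
  } catch {
    // Malformed JSON is treated the same as a missing verdict.
  }
  return null;
}
```

The resulting object can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`, with the response text passed to `parseVerdict`.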
Have questions, feature requests, or need enterprise deployment support?
We welcome feedback on detection accuracy, requests for new sanitized data types, and enterprise inquiries for private deployment, custom rules, and dedicated support.
openclaw plugins uninstall @openguardrails/openclaw-security
openclaw gateway restart
To also remove your stored API key:
rm ~/.openclaw/credentials/openguardrails/credentials.json
# Clone repository
git clone https://github.com/openguardrails/openguardrails.git
cd openguardrails/openclaw-security
# Install dependencies
npm install
# Local development install
openclaw plugins install -l .
openclaw gateway restart
# Type check
npm run typecheck
# Run tests
npm test
MIT