
@apideck/agent-analytics
Track AI agent and bot traffic to your Next.js / Vercel app — PostHog, webhooks, or any custom analytics backend. Detects Claude, ChatGPT, Perplexity, Google-Extended, and more.
Drop-in Next.js / Vercel middleware that tracks ClaudeBot, GPTBot, Perplexity, and 20+ AI crawlers in PostHog — or any analytics backend you already pay for.
Client-side analytics libraries run in the browser. AI crawlers don't. That means every time ClaudeBot, GPTBot, or Perplexity fetches a page on your site, your dashboard stays empty.
You can't see:
Server logs have the data, but turning them into analytics is a pipeline project. This library is the one-line version.
import { NextResponse, type NextRequest } from 'next/server'
import { trackVisit, posthogAnalytics } from '@apideck/agent-analytics'

const analytics = posthogAnalytics({ apiKey: process.env.POSTHOG_KEY! })

export function middleware(req: NextRequest) {
  void trackVisit(req, { analytics }) // ← that's the whole thing
  return NextResponse.next()
}
One line of middleware. Fire-and-forget. Zero impact on your response latency. Events in PostHog within seconds.
{
  "event": "doc_view",
  "distinct_id": "anon_7f3a1b2c", // hashed ip:ua, no person profile
  "timestamp": "2026-04-19T08:30:00.000Z",
  "properties": {
    "$process_person_profile": false, // PostHog: don't create a person
    "$current_url": "https://example.com/docs/intro",
    "path": "/docs/intro",
    "method": "GET",
    "user_agent": "ClaudeBot/1.0 (+https://claude.ai/bot)",
    "is_ai_bot": true, // strict: matches a branded AI crawler
    "bot_name": "Claude", // 'Claude' | 'ChatGPT' | ... | 'curl' | 'axios' | 'Electron' | 'Browser' | 'Other'
    "ua_category": "declared-crawler", // 'declared-crawler' | 'coding-agent-hint' | 'browser' | 'other'
    "coding_agent_hint": false, // loose: HTTP-library / automation UA (curl, axios, got, colly, Electron, ...)
    "referer": "https://claude.ai/",
    "source": "page-view" // whatever label you passed
  }
}
Now you can build:
npm install @apideck/agent-analytics
# or
pnpm add @apideck/agent-analytics
# or
yarn add @apideck/agent-analytics
Zero dependencies. Runs on Node 18+, Edge, Bun, and anywhere the Web Fetch API exists.
1. Pick an adapter
Ships with PostHog, webhook, and custom adapters. BYO analytics.
2. Wire the middleware
Works in any middleware that hands you a Request.
3. Ship it
Hit any page with a spoofed UA, for example curl -A ClaudeBot against one of your URLs.
The event lands in PostHog within seconds.
        Request                         Response (unchanged)
Agent ─────────────────────► middleware ───────────────────► Agent
                                  │
                                  │ fire-and-forget
                                  │ keepalive: true
                                  ▼
                         ┌──────────────────┐
                         │ AnalyticsAdapter │
                         │ (PostHog /       │
                         │  webhook /       │
                         │  custom fn)      │
                         └──────────────────┘
The middleware call:
1. Reads req.headers.get('user-agent')
2. Tests it against AI_BOT_PATTERN (ClaudeBot, GPTBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, Bytespider, Amazonbot, Meta-ExternalAgent, MistralAI-User, Cursor, Windsurf, and more)
3. Hashes ip:ua with djb2 → stable anon distinct_id (same bot from same network = same visitor, no PII)
4. Sends the event to the adapter with keepalive: true so the request survives after the response returns

By default every request is captured so coding-agent traffic (axios, curl, Electron, …) surfaces alongside branded crawlers. Pass onlyBots: true to restrict capture to UAs matching the built-in AI bot pattern.
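The hashing step can be sketched as follows. This is a minimal illustration of the classic djb2 string hash applied to an ip:ua key. The helper names `djb2` and `anonId`, the unsigned 32-bit truncation, and the `anon_` plus eight hex digits format are assumptions modelled on the sample event, not the library's confirmed implementation.

```typescript
// Classic djb2: start at 5381, then hash = hash * 33 + charCode per character.
function djb2(input: string): number {
  let hash = 5381
  for (let i = 0; i < input.length; i++) {
    // hash stays below 2^32, so hash * 33 is still an exact double;
    // >>> 0 then wraps the result to an unsigned 32-bit integer.
    hash = (hash * 33 + input.charCodeAt(i)) >>> 0
  }
  return hash
}

// Hypothetical helper: derive a stable anonymous ID from IP and user agent.
function anonId(ip: string, userAgent: string): string {
  const hex = djb2(`${ip}:${userAgent}`).toString(16)
  // Left-pad to 8 hex digits so every ID has the same length.
  return 'anon_' + ('00000000' + hex).slice(-8)
}
```

Because the hash is deterministic, the same bot arriving from the same network always maps to the same ID, while the raw IP never leaves the process.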
| Agent | UA signature | bot_name label |
|---|---|---|
| Anthropic | ClaudeBot, Claude-User, Anthropic-* | Claude |
| OpenAI | ChatGPT-User, GPTBot, OAI-SearchBot | ChatGPT |
| Perplexity | PerplexityBot, Perplexity-User | Perplexity |
| Google | Google-Extended, Googlebot | Google |
| Apple | Applebot-Extended, Applebot | Apple |
| Meta | Meta-ExternalAgent, FacebookBot | Meta |
| Amazon | Amazonbot | Amazon |
| Bytedance | Bytespider | Bytespider |
| Common Crawl | CCBot | Common Crawl |
| Mistral | MistralAI-User | Mistral |
| Cohere | cohere-ai | Cohere |
| DuckDuckGo | DuckAssistBot | DuckDuckGo |
| You.com | YouBot | You.com |
| AI2 | AI2Bot | AI2 |
| Diffbot | Diffbot | Diffbot |
| Coding agents | Cursor, Windsurf | Cursor / Windsurf |
New agents appear every month. Patch releases ship as the list grows — watch the repo for updates. Raise a PR if you spot one we're missing.
Coding-agent hints (coding_agent_hint: true)
Coding agents like Claude Code, Cline, Cursor, and Windsurf don't identify themselves by name in their user agent. They use whatever HTTP library they're built on, so detection is a loose heuristic: the UAs below are also used by legitimate curl scripts, CI jobs, and server-to-server traffic.
is_ai_bot stays false for these so your strict AI-traffic segment is clean. The coding_agent_hint property is the wider net; pair it with other signals (path patterns, JA4 fingerprints via Vercel Log Drains, HEAD-then-GET request shape) when you need higher confidence.
| Agent | Signature observed | bot_name |
|---|---|---|
| Claude Code | axios/1.8.4 | axios |
| Cline / Junie | curl/8.4.0 | curl |
| Cursor | got (sindresorhus/got) | got |
| Windsurf | colly (Go) | colly |
| VS Code | Electron/ marker | Electron |
| Other automation | node-fetch, python-requests, Go-http-client, okhttp, aiohttp, Deno | exact library name |
Playwright-based agents (Aider, OpenCode) spoof full Mozilla/Safari UAs and are indistinguishable from real browsers by UA alone. They'll show up as bot_name: Browser, ua_category: browser. Catching those needs TLS fingerprinting (JA4) or behavioural analysis.
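The split between the strict is_ai_bot flag and the looser coding_agent_hint can be sketched like this. The two regular expressions are small illustrative subsets taken from the tables above, and `classifyUserAgent` is a hypothetical name; the library's real patterns and internals may differ.

```typescript
type UaCategory = 'declared-crawler' | 'coding-agent-hint' | 'browser' | 'other'

interface UaClassification {
  isAiBot: boolean
  codingAgentHint: boolean
  uaCategory: UaCategory
}

// Illustrative subsets only (assumption): branded crawlers vs. HTTP libraries.
const AI_BOT_SKETCH = /ClaudeBot|Claude-User|GPTBot|ChatGPT-User|OAI-SearchBot|PerplexityBot|Google-Extended/i
const HTTP_LIBRARY_SKETCH = /^(axios|curl|got|colly|node-fetch|python-requests|Go-http-client|okhttp)\b|Electron\//i

function classifyUserAgent(userAgent: string | null): UaClassification {
  const ua = userAgent === null ? '' : userAgent
  // Strict check first: a branded crawler is never downgraded to a hint.
  if (AI_BOT_SKETCH.test(ua)) {
    return { isAiBot: true, codingAgentHint: false, uaCategory: 'declared-crawler' }
  }
  // Loose check second: Electron UAs also start with Mozilla/, so this must
  // run before the browser check.
  if (HTTP_LIBRARY_SKETCH.test(ua)) {
    return { isAiBot: false, codingAgentHint: true, uaCategory: 'coding-agent-hint' }
  }
  if (/^Mozilla\//.test(ua)) {
    return { isAiBot: false, codingAgentHint: false, uaCategory: 'browser' }
  }
  return { isAiBot: false, codingAgentHint: false, uaCategory: 'other' }
}
```

Checking the strict pattern before the loose one keeps the AI-traffic segment clean, and Playwright-style agents with full browser UAs fall through to the browser category, as described above.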
Credit: coding-agent signatures catalogued by Addy Osmani.
posthogAnalytics
import { posthogAnalytics } from '@apideck/agent-analytics'

const analytics = posthogAnalytics({
  apiKey: process.env.NEXT_PUBLIC_POSTHOG_KEY!,
  host: 'https://eu.i.posthog.com' // optional; defaults to US cloud
})
Host can be the PostHog cloud (us.i.posthog.com, eu.i.posthog.com) or your own reverse-proxy domain (e.g. https://svc.yourdomain.com) to dodge ad-blockers. Scheme is optional — both 'https://host' and 'host' work.
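One way an optional scheme could be handled is sketched below. `normalizeHost` is a hypothetical helper written for illustration; defaulting to https and stripping trailing slashes are assumptions, not documented behaviour of the adapter.

```typescript
// Hypothetical helper: accept 'host' or 'https://host' and return a base URL.
function normalizeHost(host: string): string {
  // Trim whitespace and drop trailing slashes so paths can be appended safely.
  const trimmed = host.trim().replace(/\/+$/, '')
  // Keep an explicit scheme; otherwise assume https.
  return /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`
}
```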
webhookAnalytics
import { webhookAnalytics } from '@apideck/agent-analytics'

const analytics = webhookAnalytics({
  url: 'https://collector.example.com/events',
  headers: { Authorization: `Bearer ${process.env.TOKEN}` },
  transform: (event) => ({ // optional: reshape for your backend
    type: event.event,
    user: event.distinctId,
    ...event.properties
  })
})
customAnalytics
import { customAnalytics } from '@apideck/agent-analytics'
import Mixpanel from 'mixpanel' // CommonJS module: default import needs esModuleInterop

const mp = Mixpanel.init(process.env.MIXPANEL_TOKEN!)

const analytics = customAnalytics((event) => {
  mp.track(event.event, { distinct_id: event.distinctId, ...event.properties })
})
Any { capture(event): Promise<void> | void } object is a valid adapter. Compose multiple by fanning out in a custom callback.
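Fanning out to several backends might look like the sketch below. The interfaces are local stand-ins that follow the adapter shape described above and the event fields used in the adapter examples; `fanOut` is a hypothetical helper, not an export of the package.

```typescript
// Local stand-ins for the shapes described in the text (assumption).
interface AnalyticsEvent {
  event: string
  distinctId: string
  properties: Record<string, unknown>
}

interface AnalyticsAdapter {
  capture(event: AnalyticsEvent): Promise<void> | void
}

// Hypothetical helper: forward each event to every adapter and isolate
// failures so one broken backend cannot block the others.
function fanOut(...adapters: AnalyticsAdapter[]): AnalyticsAdapter {
  return {
    async capture(event) {
      await Promise.all(
        adapters.map(async (adapter) => {
          try {
            await adapter.capture(event)
          } catch {
            // Drop this adapter's failure; the remaining adapters still run.
          }
        })
      )
    }
  }
}
```

The combined object satisfies the same adapter shape, so under these assumptions it could be passed wherever a single adapter is accepted.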
Content-heavy sites should serve clean Markdown when an agent asks for it — that's what makes your docs actually useful to coding agents, not just indexable. The /markdown subpath exports the helpers that power developers.apideck.com's agent-readiness stack:
import {
markdownServeDecision, // decide if this request should get Markdown
markdownHeaders, // Content-Type, Content-Signal, x-markdown-tokens
synthesizeMarkdownPointer // fallback for URLs without a mirror
} from '@apideck/agent-analytics/markdown'
Three triggers, one decision helper:
| Trigger | Example | reason |
|---|---|---|
| AI-bot UA on any URL | curl -A ClaudeBot /docs/intro | ua-rewrite |
| .md suffix | curl /docs/intro.md | md-suffix |
| Accept: text/markdown header | curl -H "Accept: text/markdown" /docs/intro | accept-header |
Full middleware example: README.md → Markdown mirror helpers section, or copy from the reference implementation.
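The three triggers can be expressed as one small decision function. This is a standalone sketch, not the signature of markdownServeDecision: `decideMarkdown`, the input shape, the abbreviated bot pattern, and the precedence order (suffix, then Accept header, then UA) are all assumptions.

```typescript
type MarkdownReason = 'ua-rewrite' | 'md-suffix' | 'accept-header'

interface MarkdownRequestInfo {
  pathname: string
  userAgent: string | null
  accept: string | null
}

// Illustrative subset of AI-bot UAs (assumption, not the library's pattern).
const AI_BOT_UA = /ClaudeBot|GPTBot|PerplexityBot/i

// Hypothetical helper: returns why Markdown should be served, or null.
function decideMarkdown(info: MarkdownRequestInfo): MarkdownReason | null {
  // Explicit signals from the client win over UA sniffing.
  if (/\.md$/i.test(info.pathname)) return 'md-suffix'
  if (info.accept !== null && /\btext\/markdown\b/i.test(info.accept)) return 'accept-header'
  if (info.userAgent !== null && AI_BOT_UA.test(info.userAgent)) return 'ua-rewrite'
  return null
}
```

Returning the reason rather than a boolean lets the caller log which trigger fired, which matches the reason column in the table.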
Peec.ai's Agent analytics product ingests a CSV/CLF access log and produces dashboards on top of it. The Peec docs assume you have a Vercel Log Drain → Axiom (or similar) pipeline that emits these eight columns: timestamp, request_method, request_url, response_status, client_ip, user_agent, country_code, referer.
If you're already running this library, you can skip the log drain — your PostHog agent_visit events are a near-superset of that schema. Opt into the two privacy-sensitive fields:
void trackVisit(req, {
  analytics,
  captureCountry: true, // emits country_code from x-vercel-ip-country / cf-ipcountry / x-country-code
  captureGeo: true,     // emits region, city, latitude, longitude, timezone from x-vercel-ip-* (URL-decoded)
  captureIp: true       // emits raw client_ip (first hop of x-forwarded-for)
})
All three default to off so the library stays PII-free out of the box. Enable them only on the deployments you intend to export. captureGeo is more identifying than captureCountry (city resolution vs. country) — opt in deliberately.
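To make the header sources concrete, here is a sketch of reading the country code and the first x-forwarded-for hop. `readCountry`, `readClientIp`, and the `HeaderReader` interface are hypothetical names, and the header lookup order is an assumption based on the order the headers are listed in above.

```typescript
// Minimal accessor so the sketch works with Fetch Headers or a test stub.
interface HeaderReader {
  get(name: string): string | null
}

// Country headers named in the text; the lookup order is an assumption.
const COUNTRY_HEADERS = ['x-vercel-ip-country', 'cf-ipcountry', 'x-country-code']

function readCountry(headers: HeaderReader): string | null {
  for (const name of COUNTRY_HEADERS) {
    const value = headers.get(name)
    if (value) return value.trim()
  }
  return null
}

// First hop of x-forwarded-for: the left-most address in the list.
function readClientIp(headers: HeaderReader): string | null {
  const forwarded = headers.get('x-forwarded-for')
  if (!forwarded) return null
  const first = forwarded.split(',')[0].trim()
  return first.length > 0 ? first : null
}
```

Both helpers return null when the header is absent, so a deployment without a geo-aware proxy simply emits no value.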
Then export from PostHog with a SQL insight:
SELECT
  timestamp AS timestamp,
  coalesce(properties.method, 'GET') AS request_method,
  properties.$current_url AS request_url,
  '200' AS response_status, -- middleware runs pre-response
  coalesce(properties.client_ip, properties.$ip) AS client_ip,
  properties.user_agent AS user_agent,
  coalesce(properties.country_code,
           properties.$geoip_country_code) AS country_code,
  properties.referer AS referer
FROM events
WHERE event = 'agent_visit'
  AND properties.is_ai_bot = true
  AND timestamp >= now() - INTERVAL 30 DAY
ORDER BY timestamp DESC
coalesce makes the query work on historical events that predate the new fields and on events where captureCountry / captureIp are off (PostHog's built-in $ip and $geoip_country_code enrichment fills the gap). Click Export → CSV and upload to Peec.
Caveats:
- response_status is hardcoded to 200 because middleware runs before the response. If Peec filters on status, use the Vercel Log Drain path instead.
- Remove is_ai_bot = true from the WHERE clause to also include coding-agent / scraper traffic (curl, axios, headless browsers).

| | @apideck/agent-analytics | DIY middleware | Dark Visitors SaaS | Cloudflare AI Labyrinth |
|---|---|---|---|---|
| Tracks agents in your analytics | ✓ | ✓ (after N hours of glue code) | ✓ (external dashboard) | ✗ (it blocks them instead) |
| Reuses your analytics backend | ✓ PostHog / webhook / any | ✓ | ✗ (their dashboard) | ✗ |
| Zero runtime dependencies | ✓ | ✓ | ✗ (SaaS) | ✗ (Cloudflare) |
| Ships maintained UA list | ✓ | ✗ | ✓ | ✓ |
| Markdown-mirror helpers | ✓ | ✗ | ✗ | ✗ |
| Monthly cost | $0 | $0 + engineering time | $$$ | Requires CF plan |
trackVisit does not slow down your responses. It returns a promise you don't await, and the underlying fetch uses keepalive: true so the runtime keeps the request alive after your response returns. Your critical path is: req.headers.get('user-agent') + a regex test + a void fetch(...). Sub-millisecond.
The adapter call is wrapped in try/catch — trackVisit never throws, even if PostHog / your webhook / your custom callback crashes. You lose the event, not the response.
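The never-throws guarantee can be illustrated with a small wrapper. This is a sketch of the general pattern rather than the library's source; `safeCapture` and the simplified `Adapter` type are hypothetical.

```typescript
interface Adapter {
  capture(event: { event: string }): Promise<void> | void
}

// Hypothetical wrapper: resolves to true if the event was delivered and
// false if it was dropped. It never throws and never rejects.
async function safeCapture(adapter: Adapter, event: { event: string }): Promise<boolean> {
  try {
    // Awaiting inside try covers both synchronous throws and async rejections.
    await adapter.capture(event)
    return true
  } catch {
    // The event is lost; the caller's response is unaffected.
    return false
  }
}
```

Because the returned promise always resolves, a caller can write `void safeCapture(adapter, event)` in middleware without risking an unhandled rejection.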
It does not create person profiles in PostHog. The event includes $process_person_profile: false, which tells PostHog to skip profile creation. Distinct IDs are djb2 hashes of ip:ua, so same-bot-same-network collapses into one anonymous visitor for journey analysis, but no "person" row gets created.
import { isAiBot, parseBotName } from '@apideck/agent-analytics'
if (isAiBot(req.headers.get('user-agent'))) {
// serve Markdown, skip personalisation, add rate limits, etc.
}
parseBotName('ClaudeBot/1.0') // → 'Claude'
It also works outside Next.js. The primary API takes a standard Web Fetch Request object, so it runs in Hono, Bun, Cloudflare Workers, Deno Deploy, and Node 18+ HTTP handlers: anywhere you can get a Request.
PostHog's bot filter excludes bots from your metrics. This library does the opposite: it makes bots visible so you can analyse them deliberately. Complementary — segment by is_ai_bot to split the populations.
AI crawlers keep appearing. We publish patch releases whenever the list changes — npm update @apideck/agent-analytics picks them up. If you spot a missing agent, send a PR with a link to the bot's official docs; merges ship the same day.
createMarkdownMiddleware(): a batteries-included Next.js middleware for the full agent-readiness stack

File a feature request if something's missing from your setup.
PRs welcome — especially new UA signatures, adapters, and docs.
git clone https://github.com/apideck-libraries/agent-analytics
cd agent-analytics
npm install
npm test
Publishing to npm is fully automated by two workflows:
- release.yml watches package.json on main. When the version field bumps to something without a matching v<version> tag, it creates the GitHub Release.
- publish.yml fires on release: published, runs typecheck + tests + build, and publishes to npm with --provenance via OIDC trusted publishing (no secrets required).

So cutting a release is just:
# Pick a level (patch | minor | major), or edit package.json directly.
npm version patch
git push
The push lands on main, release.yml notices the new version, cuts the
release, and publish.yml publishes. No CLI juggling, no secrets to manage.
OIDC trusted publishing is configured at
npmjs.com/package/@apideck/agent-analytics/access
— the GitHub repo + publish.yml workflow are registered as the sole
trusted publisher.
Built on learnings from: