@amplitude/ai

License: MIT

Agent analytics for Amplitude. Track every LLM call, user message, tool call, and quality signal as events in your Amplitude project — then build funnels, cohorts, and retention charts across AI and product behavior.

npm install @amplitude/ai @amplitude/analytics-node

import { AmplitudeAI, OpenAI } from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const openai = new OpenAI({ amplitude: ai, apiKey: process.env.OPENAI_API_KEY });
const agent = ai.agent('my-agent');

app.post('/chat', async (req, res) => {
  const session = agent.session({ userId: req.userId, sessionId: req.sessionId });

  const result = await session.run(async (s) => {
    s.trackUserMessage(req.body.message);
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: req.body.messages,
    });
    return response.choices[0].message.content;
  });

  await ai.flush();
  res.json({ response: result });
});
// Events: [Agent] User Message, [Agent] AI Response (with model, tokens, cost, latency),
//         [Agent] Session Start, [Agent] Session End — all tied to userId and sessionId

How to Get Started

npm install @amplitude/ai
npx amplitude-ai

The CLI prints a prompt to paste into any AI coding agent (Cursor, Claude Code, Windsurf, Copilot, Codex, etc.):

Instrument this app with @amplitude/ai. Follow node_modules/@amplitude/ai/amplitude-ai.md

The agent reads the guide, scans your project, discovers your agents and LLM call sites, and instruments everything — provider wrappers, session lifecycle, multi-agent delegation, tool tracking, scoring, and a verification test. You review and approve each step.

Manual setup

Whether you use a coding agent or set up manually, the goal is the same: full instrumentation — agents + sessions + provider wrappers. This gives you every event type, per-user analytics, and server-side enrichment.

Follow the code example above to get started. The pattern is:

  • Swap your LLM import: import { OpenAI } from '@amplitude/ai' (or Anthropic, Gemini, etc.)
  • Create an agent: ai.agent('my-agent') to name and track your AI component
  • Wrap in a session: agent.session({ userId, sessionId }).run(async (s) => { ... }) for per-user analytics, funnels, cohorts, and server-side enrichment
  • Track user messages: s.trackUserMessage(...) for conversation context
  • Score responses: s.score(...) for quality measurement

patch() exists for quick verification or legacy codebases where you can't modify call sites, but it only captures [Agent] AI Response without user identity — no funnels, no cohorts, no retention. Start with full instrumentation; fall back to patch() only if you can't modify call sites.

| Property | Value |
| --- | --- |
| Name | @amplitude/ai |
| Version | 0.3.9 |
| Runtime | Node.js |
| Peer dependency | @amplitude/analytics-node >= 1.3.0 |
| Optional peers | openai, @anthropic-ai/sdk, @google/generative-ai, @mistralai/mistralai, @aws-sdk/client-bedrock-runtime, @pydantic/genai-prices (cost), tiktoken or js-tiktoken (token counting) |

Installation

npm install @amplitude/ai @amplitude/analytics-node

Install provider SDKs based on what you use (for example: openai, @anthropic-ai/sdk, @google/generative-ai, @mistralai/mistralai, @aws-sdk/client-bedrock-runtime).

Quick Start

5-minute quick start

  • Install: npm install @amplitude/ai @amplitude/analytics-node
  • Get your API key: In Amplitude, go to Settings > Projects and copy the API key.
  • Instrument: Run npx amplitude-ai and paste the printed prompt into your AI coding agent. Or follow the manual setup steps — the goal is the same: agents + sessions + provider wrappers.
  • Set your API key in the generated .env file and replace the placeholder userId/sessionId.
  • Run your app. You should see [Agent] User Message, [Agent] AI Response, and [Agent] Session End within 30 seconds.

To verify locally before checking Amplitude, add debug: true:

const ai = new AmplitudeAI({
  apiKey: process.env.AMPLITUDE_AI_API_KEY!,
  config: new AIConfig({ debug: true }),
});
// Prints: [amplitude-ai] [Agent] AI Response | model=gpt-4o | tokens=847 | cost=$0.0042 | latency=1,203ms

Tip: Call enableLivePriceUpdates() at startup so cost tracking stays accurate when new models are released. See Cache-Aware Cost Tracking.

Current Limitations

| Area | Status |
| --- | --- |
| Runtime | Node.js only (no browser). Python SDK available separately (amplitude-ai on PyPI). |
| Zero-code patching | OpenAI, Anthropic, Azure OpenAI, Gemini, Mistral, Bedrock (Converse/ConverseStream only). |
| CrewAI | Python-only; the Node.js export throws ProviderError by design. Use LangChain or OpenTelemetry integrations instead. |
| OTEL scope filtering | Not yet supported (Python SDK has allowed_scopes/blocked_scopes). |
| Streaming cost tracking | Automatic for OpenAI and Anthropic. Manual token counts required for other providers' streamed responses. |

Is this for me?

Yes, if you're building an AI-powered feature (chatbot, copilot, agent, RAG pipeline) and you want to measure how it impacts real user behavior. AI events land in the same Amplitude project as your product events, so you can build funnels from "user asks a question" to "user converts," create cohorts of users with low AI quality scores, and measure retention without stitching data across tools.

Already using an LLM observability tool? Keep it. The OTEL bridge adds Amplitude as a second destination in one line. Your existing traces stay, and you get product analytics on top.

Why this SDK?

Most AI observability tools give you traces. This SDK gives you per-turn events that live in your product analytics so you can:

  • Build funnels from "user opens chat" through "AI responds" to "user converts"
  • Create cohorts of users with low AI quality scores and measure their 7-day retention
  • Answer "is this AI feature helping or hurting?" without moving data between tools

The structural difference is the event model. Trace-centric tools typically produce spans per LLM call. This SDK produces one event per conversation turn with 40+ properties: model, tokens, cost, latency, reasoning, implicit feedback signals (regeneration, copy, abandonment), cache breakdowns, agent hierarchy, and experiment context. Each event is independently queryable in Amplitude's charts, cohorts, funnels, and retention analysis.

Every AI event carries your product user_id. No separate identity system, no data joining required. Build a funnel from "user opens chat" to "AI responds" to "user upgrades" directly in Amplitude.

Server-side enrichment does the evals for you. When content is available (contentMode: 'full'), Amplitude's enrichment pipeline runs automatically on every session after it closes. You get topic classifications, quality rubrics, behavioral flags, and session outcomes without writing or maintaining any eval code. Define your own topics and scoring rubrics; the pipeline applies them to every session automatically. Results appear as [Agent] Score events with rubric scores, [Agent] Topic Classification events with category labels, and [Agent] Session Evaluation summaries, all queryable in charts, cohorts, and funnels alongside your product events.

Quality signals from every source in one event type. User thumbs up/down (source: 'user'), automated rubric scores from the enrichment pipeline (source: 'ai'), and reviewer assessments (source: 'reviewer') all produce [Agent] Score events differentiated by [Agent] Evaluation Source. One chart shows all three side by side. Filter by source or view them together. Filter by [Agent] Agent ID for per-agent quality attribution.

Three content-control tiers. full sends content and Amplitude runs enrichments for you. metadata_only sends zero content (you still get cost, latency, tokens, session grouping). customer_enriched sends zero content but lets you provide your own structured labels via trackSessionEnrichment().

Cache-aware cost tracking. Pass cacheReadTokens and cacheCreationTokens for accurate blended costs. Without this breakdown, naive cost calculation can overestimate by 2-5x for cache-heavy workloads.

What you can build

Once AI events are in Amplitude alongside your product events:

  • Cohorts. "Users who had 3+ task failures in the last 30 days." "Users with low task completion scores." Target them with Guides, measure churn impact.
  • Funnels. "AI session about charts -> Chart Created." "Sign Up -> First AI Session -> Conversion." Measure whether AI drives feature adoption and onboarding.
  • Retention. Do users with successful AI sessions retain better than those with failures? Segment retention curves by [Agent] Overall Outcome or task completion score.
  • Agent analytics. Compare quality, cost, and failure rate across agents in one chart. Identify which agent in a multi-agent chain introduced a failure.

How quality measurement works

The SDK captures quality signals at three layers, from most direct to most comprehensive:

1. Explicit user feedback — Instrument thumbs up/down, star ratings, or CSAT scores via trackScore(). Each call produces an [Agent] Score event with source: 'user':

ai.trackScore({
  userId: 'u1', name: 'user-feedback', value: 1,
  targetId: aiMessageId, targetType: 'message', source: 'user',
});

2. Implicit behavioral signals — The SDK auto-tracks behavioral proxies for quality on every turn, with zero additional instrumentation:

| Signal | Property | Event | Interpretation |
| --- | --- | --- | --- |
| Copy | [Agent] Was Copied | [Agent] AI Response | User copied the output (positive) |
| Regeneration | [Agent] Is Regeneration | [Agent] User Message | User asked for a redo (negative) |
| Edit | [Agent] Is Edit | [Agent] User Message | User refined their prompt (friction) |
| Abandonment | [Agent] Abandonment Turn | [Agent] Session End | User left after N turns (potential failure) |

3. Automated server-side evaluation — When contentMode: 'full', Amplitude's enrichment pipeline runs LLM-as-judge evaluators on every session after it closes. No eval code to write or maintain:

| Rubric | What it measures | Scale |
| --- | --- | --- |
| task_completion | Did the agent accomplish what the user asked? | 0–2 |
| response_quality | Was the response clear, accurate, and helpful? | 0–2 |
| user_satisfaction | Did the user seem satisfied based on conversation signals? | 0–2 |
| agent_confusion | Did the agent misunderstand or go off track? | 0–2 |

Plus boolean detectors: negative_feedback (frustration phrases), task_failure (agent failed to deliver), data_quality_issues, and behavioral_patterns (clarification loops, topic drift). All results are emitted as [Agent] Score events with source: 'ai'.

All three layers use the same [Agent] Score event type, differentiated by [Agent] Evaluation Source ('user', 'ai', or 'reviewer'). One chart shows user feedback alongside automated evals. No joins, no separate tables.

What You Set vs What You Get

| You set | Where it comes from | What you unlock |
| --- | --- | --- |
| API key | Amplitude project settings | Events reach Amplitude |
| userId | Your auth layer (JWT, session cookie, API token) | Per-user analytics, cohorts, retention |
| agentId | Your choice (e.g. 'chat-handler') | Per-agent cost, latency, quality dashboards |
| sessionId | Your conversation/thread/ticket ID | Multi-turn analysis, session enrichment, quality scores |
| description | Your choice (e.g. 'Handles support queries via GPT-4o') | Human-readable agent registry from event streams |
| contentMode + redactPii | Config (defaults work) | Server enrichment (automatic), PII scrubbing |
| model, tokens, cost | Auto-captured by provider wrappers | Cost analytics, latency monitoring |
| parentAgentId | Auto via child()/runAs() | Multi-agent hierarchy |
| env, agentVersion, context | Your deploy pipeline | Segmentation, regression detection |

Italicized rows require zero developer effort — they're automatic or have sensible defaults.

The minimum viable setup is 4 fields: API key, userId, agentId, sessionId. Everything else is either automatic or a progressive enhancement.

What You Get at Each Level

The coding agent workflow defaults to full instrumentation — the top row below. Lower levels exist as fallbacks, not as recommended starting points.

| Level | Events you get | What it unlocks in Amplitude |
| --- | --- | --- |
| Full (agents + sessions + wrappers) | User Message, AI Response, Tool Call, Session Start/End, Score, Enrichments | Per-user funnels, cohorts, retention, session replay linking, quality scoring |
| Wrappers only (no sessions) | AI Response (with cost, tokens, latency) | Aggregate cost monitoring, model comparison |
| patch() only (no wrappers, no sessions) | AI Response (basic) | Aggregate call counts (useful for verification only) |

Support matrix

  • Fully supported in Node.js: OpenAI chat completions, OpenAI Responses API, Azure OpenAI chat completions, Anthropic messages, Gemini, Mistral, Bedrock, LangChain, OpenTelemetry, LlamaIndex.
  • Partial support: zero-code patch() is best-effort and depends on which provider SDK is installed and on the provider surface; OpenAI Agents tracing depends on the shape of the incoming span payload from the host SDK.
  • Not currently supported in Node.js:
    • AmplitudeCrewAIHooks is Python-only and throws in Node.js.

Parity and runtime limitations

This section is the source of truth for behavior that is intentionally different from Python due to runtime constraints:

  • AmplitudeCrewAIHooks is unsupported in Node.js (CrewAI is Python-only).
  • tool() does not auto-generate JSON Schema from runtime type hints; pass inputSchema explicitly.
  • Tool timeout behavior is async Promise.race based and cannot preempt synchronous CPU-bound code.
  • Auto-instrument bootstrap differs by runtime (node --import in Node vs sitecustomize in Python).
  • Request middleware differs by runtime (Express-compatible in Node vs ASGI middleware in Python).
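The Promise.race limitation on tool timeouts can be illustrated with a small standalone sketch. `withTimeout` and `busyWait` are hypothetical helpers written for this illustration, not exports of @amplitude/ai, and the SDK's internal timeout logic may differ.

```typescript
// Sketch only: `withTimeout` is a hypothetical helper, not an @amplitude/ai export.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// A synchronous busy loop blocks the event loop, so the timer callback
// cannot run until the loop has already finished.
function busyWait(ms: number): string {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    /* spin */
  }
  return 'done';
}
```

With these helpers, an awaited I/O-style promise that takes 200 ms is rejected by a 20 ms timeout, whereas `withTimeout(Promise.resolve().then(() => busyWait(60)), 20)` still resolves, because the timer never gets a chance to fire while the loop is spinning. CPU-bound tools therefore need a different isolation mechanism (for example a worker thread) if hard timeouts matter.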

Zero-code (for verification or legacy codebases)

patch() monkey-patches provider SDKs so existing LLM calls are tracked without code changes. This is useful for verifying the SDK works or for legacy codebases where you can't modify call sites. It only captures [Agent] AI Response without user identity — for the full event model, use agents + sessions (see Quick Start).

import { AmplitudeAI, patch } from '@amplitude/ai';
// OpenAI/Azure OpenAI chat completions (+ parse), OpenAI Responses, Anthropic, Gemini, Mistral,
// and Bedrock Converse calls are tracked when patching succeeds.
// No changes to your existing code needed.
import OpenAI from 'openai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
patch({ amplitudeAI: ai });

const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});
// ^ automatically tracked as [Agent] AI Response

Warning: Patched calls that fire outside an active session context are silently dropped — no event is emitted and no error is thrown. If you instrument with patch() but see no events, this is the most likely cause. Wrap your LLM calls in session.run(), use the Express middleware, or pass context explicitly. See Session and Middleware.

Or use the CLI to auto-patch at process start without touching application code:

AMPLITUDE_AI_API_KEY=xxx AMPLITUDE_AI_AUTO_PATCH=true amplitude-ai-instrument node app.js

Replace the provider constructor with the Amplitude-instrumented version for automatic tracking with full control over options per call:

import { AmplitudeAI, OpenAI } from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const openai = new OpenAI({
  amplitude: ai,
  apiKey: process.env.OPENAI_API_KEY,
});

const agent = ai.agent('my-agent', { userId: 'user-123' });
const session = agent.session();

await session.run(async () => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello' }],
  });
  // AI response tracked automatically via wrapper

  const responseV2 = await openai.responses.create({
    model: 'gpt-4.1',
    instructions: 'You are concise.',
    input: [{ role: 'user', content: 'Summarize this in one sentence.' }],
  });
  // OpenAI Responses API is also tracked automatically
});

Or wrap an existing client instance (supports OpenAI, Azure OpenAI, and Anthropic):

import { wrap } from '@amplitude/ai';
import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const instrumented = wrap(client, ai);

All provider constructors and wrap() accept either an AmplitudeAI instance or a raw Amplitude client — both work:

new OpenAI({ amplitude: ai }); // AmplitudeAI instance
new OpenAI({ amplitude: ai.amplitude }); // raw Amplitude client
wrap(client, ai); // AmplitudeAI instance
wrap(client, ai.amplitude); // raw Amplitude client

Note: wrap() only supports OpenAI, Azure OpenAI, and Anthropic clients. For Gemini, Mistral, and Bedrock, use the SDK's provider classes directly (e.g., new Gemini({ amplitude: ai })).

Full control

Call tracking methods directly for maximum flexibility. Works with any LLM provider, including custom or self-hosted models:

import { AmplitudeAI } from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const agent = ai.agent('my-agent', { userId: 'user-123' });
const session = agent.session({ userId: 'user-123' });

await session.run(async (s) => {
  s.trackUserMessage('Summarize this document');

  const start = performance.now();
  const response = await myCustomLLM.generate('Summarize this document');
  const latencyMs = performance.now() - start;

  s.trackAiMessage(response.text, 'my-model-v2', 'custom', latencyMs, {
    inputTokens: response.usage.input,
    outputTokens: response.usage.output,
  });
});

Core Concepts

AmplitudeAI

Main client that wraps Amplitude analytics-node. Create it with an API key or an existing Amplitude instance:

const ai = new AmplitudeAI({ apiKey: 'YOUR_API_KEY' });
// Or with an existing Amplitude client (use one form or the other):
const aiFromClient = new AmplitudeAI({ amplitude: existingAmplitudeClient });

BoundAgent

Agent with pre-bound defaults (agentId, description, userId, env, etc.). Use agent() to create:

const agent = ai.agent('support-bot', {
  description: 'Handles customer support queries via OpenAI GPT-4o',
  userId: 'user-123',
  env: 'production',
  customerOrgId: 'org-456',
});

Child agents inherit context from their parent and automatically set parentAgentId (note: description is agent-specific and is not inherited — pass it explicitly if needed):

const orchestrator = ai.agent('orchestrator', {
  description: 'Routes queries to specialized child agents',
  userId: 'user-123',
});
const researcher = orchestrator.child('researcher');
const writer = orchestrator.child('writer', {
  description: 'Drafts responses using retrieved context',
});
// researcher.parentAgentId === 'orchestrator'
// researcher has no description (it is not inherited); writer has its own

TenantHandle

Multi-tenant helper that pre-binds customerOrgId for all agents created from it:

const tenant = ai.tenant('org-456', { env: 'production' });
const agent = tenant.agent('support-bot', { userId: 'user-123' });

User Identity

User identity flows through the session, per-call arguments, or middleware, not through agent creation or patch time. This keeps the agent reusable across users.

Via sessions (recommended): pass userId when opening a session:

const agent = ai.agent('support-bot', { env: 'production' });
const session = agent.session({ userId: 'user-42' });

await session.run(async (s) => {
  s.trackUserMessage('Hello');
  // userId inherited from session context
});

Per-call: pass userId on each tracking call (useful with the zero-code tier):

agent.trackUserMessage('Hello', {
  userId: 'user-42',
  sessionId: 'sess-1',
});

Via middleware: createAmplitudeAIMiddleware extracts user identity from the request (see Middleware):

app.use(
  createAmplitudeAIMiddleware({
    amplitudeAI: ai,
    userIdResolver: (req) => req.headers['x-user-id'] ?? null,
  }),
);

Session

Async context manager using AsyncLocalStorage. Use session.run() to execute a callback within session context; session end is tracked automatically on exit:

const session = agent.session({ userId: 'user-123' });
await session.run(async (s) => {
  s.trackUserMessage('Hello');
  s.trackAiMessage(response.content, 'gpt-4', 'openai', latencyMs);
});
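To make the AsyncLocalStorage mechanism concrete, here is a minimal standalone sketch of how a session context could be propagated through async calls. The names `sessionStore`, `track`, and `runSession` are hypothetical and are not the SDK's internals; emitting the end event from a `finally` block is also an assumption about what "on exit" means.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

interface SessionContext {
  userId: string;
  sessionId: string;
}

// Hypothetical stand-ins for the SDK's internal context store and event sink.
const sessionStore = new AsyncLocalStorage<SessionContext>();
const emitted: string[] = [];

// Returns false when no session is active, so the caller can see the event was dropped.
function track(eventType: string): boolean {
  const ctx = sessionStore.getStore();
  if (!ctx) return false;
  emitted.push(`${eventType} user=${ctx.userId} session=${ctx.sessionId}`);
  return true;
}

// Runs fn inside the session context and records a session-end event on exit,
// whether fn resolves or throws.
async function runSession<T>(ctx: SessionContext, fn: () => Promise<T>): Promise<T> {
  return sessionStore.run(ctx, async () => {
    try {
      return await fn();
    } finally {
      track('[Agent] Session End');
    }
  });
}
```

Any `track()` call made inside `fn`, including after an `await`, sees the same context without it being passed as an argument. A call made outside `runSession` finds no store, which is the situation in which the patched calls described earlier produce no event.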

Start a new trace within an ongoing session to group related operations:

await session.run(async (s) => {
  const traceId = s.newTrace();
  s.trackUserMessage('Follow-up question');
  s.trackAiMessage(response.content, 'gpt-4o', 'openai', latencyMs);
});

For sessions where gaps between messages may exceed 30 minutes (e.g., coding assistants, support agents waiting on customer replies), pass idleTimeoutMinutes so Amplitude knows the session is still active:

const session = agent.session({
  userId: 'user-123',
  idleTimeoutMinutes: 240, // expect up to 4-hour gaps
});

Without this, sessions with long idle periods may be closed and enrichment may run earlier than expected. The default is 30 minutes.

Session lifecycle and enrichment. You do not need to call trackSessionEnd() for sessions to work. Amplitude's server automatically closes sessions after 30 minutes of inactivity and queues them for enrichment (topic classification, quality scoring, session evaluation) at that point. The only reason to call trackSessionEnd() is to trigger enrichment sooner — for example, if you know the conversation is over and want evaluation results immediately rather than waiting for the idle timeout.

"Closed" is a server-side concept meaning "queued for enrichment" — it does not prevent new events from flowing into the same session. If the user resumes a conversation after session end, new messages with the same sessionId are still associated with that session.

If you use session.run(), session end is tracked automatically when the callback completes. For long-lived conversations (chatbots, support agents), you can skip explicit session end entirely and let the server handle it.

Link to Session Replay: If your frontend uses Amplitude's Session Replay, pass the browser's deviceId and browserSessionId to link AI sessions to browser recordings:

const session = agent.session({
  userId: 'user-123',
  deviceId: req.headers['x-amp-device-id'],
  browserSessionId: req.headers['x-amp-session-id'],
});

await session.run(async (s) => {
  s.trackUserMessage('What is retention?');
  // All events now carry [Amplitude] Session Replay ID = deviceId/browserSessionId
});

tool()

Higher-order function that wraps a function so each call is tracked automatically as an [Agent] Tool Call event:

import { tool } from '@amplitude/ai';

const searchDb = tool(
  async (query: { q: string }) => {
    return await db.search(query.q);
  },
  {
    name: 'search_db',
    inputSchema: { type: 'object', properties: { q: { type: 'string' } } },
  },
);

Note on inputSchema: Unlike the Python SDK, which accepts a Pydantic model class and extracts the JSON Schema automatically, the TypeScript SDK accepts a raw JSON Schema object. For type-safe schema generation, consider using Zod with zod-to-json-schema:

import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

const QuerySchema = z.object({ q: z.string(), limit: z.number().optional() });
const searchDb = tool(mySearchFn, {
  name: 'search_db',
  inputSchema: zodToJsonSchema(QuerySchema),
});

observe()

Higher-order function that wraps a function so each call is tracked automatically as an [Agent] Span event:

import { observe } from '@amplitude/ai';

const processRequest = observe(
  async (input: Request) => {
    return await handleRequest(input);
  },
  { name: 'process_request' },
);

Configuration

import { AIConfig, AmplitudeAI, ContentMode } from '@amplitude/ai';

const config = new AIConfig({
  contentMode: ContentMode.FULL, // FULL | METADATA_ONLY | CUSTOMER_ENRICHED — both ContentMode.FULL and 'full' work
  redactPii: true,
  customRedactionPatterns: ['sensitive-\\d+'],
  debug: false,
  dryRun: false,
});

const ai = new AmplitudeAI({ apiKey: 'YOUR_API_KEY', config });

| Option | Description |
| --- | --- |
| contentMode | 'full' (default), 'metadata_only', or 'customer_enriched'. Both ContentMode.FULL and 'full' work. |
| redactPii | Redact email, phone, SSN, credit card patterns |
| customRedactionPatterns | Additional regex patterns for redaction |
| debug | Log events to stderr |
| dryRun | Log without sending to Amplitude |
| validate | Enable strict validation of required fields |
| onEventCallback | Callback invoked after every tracked event: (event, statusCode, message) => void |
| propagateContext | Enable cross-service context propagation |

Context Dict Conventions

The context parameter on ai.agent() accepts an arbitrary Record<string, unknown> that is JSON-serialized and attached to every event as [Agent] Context. This is the recommended way to add segmentation dimensions without requiring new global properties.

Recommended keys:

| Key | Example Values | Use Case |
| --- | --- | --- |
| agent_type | "planner", "executor", "retriever", "router" | Filter/group analytics by agent role in multi-agent systems. |
| experiment_variant | "control", "treatment-v2", "prompt-rewrite-a" | Segment AI sessions by A/B test variant. Compare quality scores, abandonment rates, or cost across experiment arms. |
| feature_flag | "new-rag-pipeline", "reasoning-model-enabled" | Track which feature flags were active during the session. |
| surface | "chat", "search", "copilot", "email-draft" | Identify which UI surface or product area triggered the AI interaction. |
| prompt_revision | "v7", "abc123", "2026-02-15" | Track which prompt version was used. Detect prompt regression when combined with agentVersion. |
| deployment_region | "us-east-1", "eu-west-1" | Segment by deployment region for latency analysis or compliance tracking. |
| canary_group | "canary", "stable" | Identify canary vs. stable deployments for progressive rollout monitoring. |

Example:

const agent = ai.agent('support-bot', {
  userId: 'u1',
  description: 'Handles customer support queries via OpenAI GPT-4o',
  agentVersion: '4.2.0',
  context: {
    agent_type: 'executor',
    experiment_variant: 'reasoning-enabled',
    surface: 'chat',
    feature_flag: 'new-rag-pipeline',
    prompt_revision: 'v7',
  },
});

// All events from this agent (and its sessions, child agents, and provider
// wrappers) will include [Agent] Context with these keys.

Context merging in child agents:

const parent = ai.agent('orchestrator', {
  context: { experiment_variant: 'treatment', surface: 'chat' },
});
const child = parent.child('researcher', {
  context: { agent_type: 'retriever' },
});
// child context = { experiment_variant: 'treatment', surface: 'chat', agent_type: 'retriever' }
// Child keys override parent keys; parent keys absent from the child are preserved.
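The merge rule described in the comments above behaves like an object spread. The sketch below assumes a shallow merge; `mergeContext` is a hypothetical helper for illustration, and the source does not say how nested objects are handled.

```typescript
type AgentContext = Record<string, unknown>;

// Shallow merge: parent keys are kept unless the child supplies the same key.
function mergeContext(parent: AgentContext, child: AgentContext): AgentContext {
  return { ...parent, ...child };
}

const merged = mergeContext(
  { experiment_variant: 'treatment', surface: 'chat' },
  { agent_type: 'retriever', surface: 'search' },
);
// merged.surface is 'search' (child wins); experiment_variant is preserved.
```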

Querying in Amplitude: The [Agent] Context property is a JSON string. Use Amplitude's JSON property parsing to extract individual keys for charts, cohorts, and funnels. For example, group by [Agent] Context.agent_type to see metrics by agent role.

Note on experiment_variant and server-generated events: Context keys appear on all SDK-emitted events ([Agent] User Message, [Agent] AI Response, etc.). Server-generated events ([Agent] Session Evaluation, [Agent] Score with source="ai") do not yet inherit context keys. To segment server-generated quality scores by experiment arm, use Amplitude Derived Properties to extract from [Agent] Context on SDK events.

Privacy & Content Control

Three content modes control what data is sent to Amplitude:

| Mode | Message Content | Token/Cost/Latency | Session Grouping | Server Enrichments |
| --- | --- | --- | --- | --- |
| FULL | Sent (with PII redaction) | Yes | Yes | Yes (auto) |
| METADATA_ONLY | Not sent | Yes | Yes | No |
| CUSTOMER_ENRICHED | Not sent | Yes | Yes | Yes (you provide) |

FULL mode (default)

Message content is captured and sent to Amplitude. When you opt in with redactPii: true, built-in PII redaction patterns scrub emails, phone numbers, SSNs, credit card numbers, and base64 image data before the event leaves your process:

const config = new AIConfig({
  contentMode: ContentMode.FULL,
  redactPii: true,
});

With redactPii: true, a message like "Contact me at john@example.com or 555-123-4567" is sanitized to "Contact me at [email] or [phone]" before being sent.

Built-in phone and SSN detection are currently tuned for common US formats. If you need broader international coverage, add explicit customRedactionPatterns for your locales.

Add custom redaction patterns for domain-specific PII:

const config = new AIConfig({
  contentMode: ContentMode.FULL,
  redactPii: true,
  customRedactionPatterns: ['ACCT-\\d{6,}', 'internal-key-[a-f0-9]+'],
});

Custom redaction patterns are your responsibility: avoid expensive or catastrophic regexes in performance-sensitive paths.
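As a rough illustration of pattern-based redaction, the sketch below applies two simplified built-in patterns and then any custom patterns. The regexes, the `redact` helper, and the `[redacted]` token for custom matches are assumptions made for this example; the SDK's actual patterns and replacement tokens are not shown in this document.

```typescript
// Simplified stand-ins for built-in patterns (US-style phone numbers only).
const BUILT_IN_PATTERNS: Array<[RegExp, string]> = [
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, '[email]'],
  [/\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g, '[phone]'],
];

// Applies built-in patterns first, then each custom pattern (given as regex source text).
function redact(text: string, customPatterns: string[] = []): string {
  let out = text;
  for (const [pattern, token] of BUILT_IN_PATTERNS) {
    out = out.replace(pattern, token);
  }
  for (const source of customPatterns) {
    out = out.replace(new RegExp(source, 'g'), '[redacted]');
  }
  return out;
}
```

Because every pattern runs over every message, a pattern with nested quantifiers can dominate request latency, which is why the caution about expensive regexes applies.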

Message content is stored at full length with no truncation or size limits. The $llm_message property is whitelisted server-side, and the Node SDK does not apply per-property string truncation.

METADATA_ONLY mode

No message content is sent. You still get token counts, cost, latency, model name, and session grouping — everything needed for cost analytics and performance monitoring:

const config = new AIConfig({
  contentMode: ContentMode.METADATA_ONLY,
});

Use this when you cannot send user content to a third-party analytics service (e.g., regulated industries, sensitive data).

CUSTOMER_ENRICHED mode

Like METADATA_ONLY (no content sent), but designed for workflows where you enrich sessions with your own classifications, quality scores, and topic labels via the SessionEnrichments API:

const config = new AIConfig({
  contentMode: ContentMode.CUSTOMER_ENRICHED,
});

// Later, after running your own classification pipeline:
const enrichments = new SessionEnrichments({
  qualityScore: 0.85,
  overallOutcome: 'resolved',
});
session.setEnrichments(enrichments);

PrivacyConfig (advanced)

PrivacyConfig is derived from AIConfig via config.toPrivacyConfig(). For advanced use, create one directly:

import { PrivacyConfig } from '@amplitude/ai';

const privacy = new PrivacyConfig({
  privacyMode: true,
  redactPii: true,
  customRedactionPatterns: ['sensitive-\\d+'],
});

When to use which mode

  • FULL: You want to see actual conversation content in Amplitude, debug individual sessions, and leverage server-side enrichment pipelines. Best for development, internal tools, and applications where data sharing agreements permit it.
  • METADATA_ONLY: You want cost/performance analytics without exposing any message content. Best for regulated environments (healthcare, finance) or when content contains proprietary data.
  • CUSTOMER_ENRICHED: You want the privacy of METADATA_ONLY but also want structured analytics (topic classification, quality scores) that you compute on your own infrastructure before sending to Amplitude.
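The practical difference between the modes is whether message content leaves your process. The following sketch shows that gating in isolation; `buildEventProperties` and the property names are hypothetical and do not reflect the SDK's actual event schema.

```typescript
type ContentModeName = 'full' | 'metadata_only' | 'customer_enriched';

interface TurnMetadata {
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

// Metadata is always included; content is attached only in 'full' mode.
function buildEventProperties(
  mode: ContentModeName,
  content: string,
  meta: TurnMetadata,
): Record<string, unknown> {
  const props: Record<string, unknown> = { ...meta };
  if (mode === 'full') {
    props.content = content;
  }
  return props;
}
```

In this sketch, 'metadata_only' and 'customer_enriched' produce identical payloads; they differ only in that the latter expects you to attach your own enrichments afterwards.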

Cache-Aware Cost Tracking

When using provider prompt caching (Anthropic's cache, OpenAI's cached completions, etc.), pass cache token breakdowns for accurate cost calculation:

s.trackAiMessage(
  response.content,
  'claude-3.5-sonnet',
  'anthropic',
  latencyMs,
  {
    inputTokens: response.usage.input_tokens,
    outputTokens: response.usage.output_tokens,
    cacheReadTokens: response.usage.cache_read_input_tokens,
    cacheCreationTokens: response.usage.cache_creation_input_tokens,
  },
);

Without cache breakdowns, cost calculation treats all input tokens at the standard rate. With caching enabled, cache-read tokens are typically 10x cheaper than standard input tokens and cache-creation tokens are ~25% more expensive. Naive cost calculation without this breakdown can overestimate costs by 2-5x for cache-heavy workloads.
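A worked example makes the overestimate concrete. The sketch below uses the multipliers quoted above (cache reads at one tenth of the standard rate, cache writes at 1.25x) and a hypothetical input price of $3.00 per million tokens; actual provider pricing varies, and these helpers are not part of the SDK.

```typescript
interface InputUsage {
  inputTokens: number; // standard (non-cached) input tokens
  cacheReadTokens: number;
  cacheCreationTokens: number;
}

const CACHE_READ_MULTIPLIER = 0.1; // "10x cheaper", per the text above
const CACHE_CREATION_MULTIPLIER = 1.25; // "~25% more expensive"

// Naive: every input-side token is billed at the standard rate.
function naiveInputCost(u: InputUsage, usdPerMillionTokens: number): number {
  const total = u.inputTokens + u.cacheReadTokens + u.cacheCreationTokens;
  return (total / 1_000_000) * usdPerMillionTokens;
}

// Cache-aware: each token category is weighted by its own multiplier.
function cacheAwareInputCost(u: InputUsage, usdPerMillionTokens: number): number {
  const weighted =
    u.inputTokens +
    u.cacheReadTokens * CACHE_READ_MULTIPLIER +
    u.cacheCreationTokens * CACHE_CREATION_MULTIPLIER;
  return (weighted / 1_000_000) * usdPerMillionTokens;
}

// 20k fresh tokens plus 80k tokens served from cache, at a hypothetical $3 per million.
const usage: InputUsage = { inputTokens: 20_000, cacheReadTokens: 80_000, cacheCreationTokens: 0 };
const naive = naiveInputCost(usage, 3); // about $0.30
const aware = cacheAwareInputCost(usage, 3); // about $0.084
```

For this workload the naive figure is roughly 3.6 times the cache-aware one, which falls inside the 2-5x range mentioned above.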

The SDK tracks four token categories:

  • [Agent] Input Tokens — standard (non-cached) input tokens
  • [Agent] Output Tokens — generated output tokens
  • [Agent] Cache Read Tokens — tokens read from provider cache (cheap)
  • [Agent] Cache Creation Tokens — tokens written to provider cache (slightly expensive)

Cost is auto-calculated when token counts are provided and the @pydantic/genai-prices package is installed. When genai-prices is not available, calculateCost() returns 0 (never null). You can also pass totalCostUsd directly if you compute cost yourself:

s.trackAiMessage(response.content, 'gpt-4o', 'openai', latencyMs, {
  totalCostUsd: 0.0034,
});

Note — pricing data freshness. Cost calculation relies on pricing data bundled in the installed @pydantic/genai-prices package. Newly released models may return $0 until the package is updated. To get the latest pricing between package releases, opt in to live updates at startup:

import { enableLivePriceUpdates } from '@amplitude/ai';
enableLivePriceUpdates(); // fetches latest prices from genai-prices GitHub repo hourly

This makes periodic HTTPS requests to raw.githubusercontent.com (~26 KB each). Only enable in environments where outbound network access is permitted.

Semantic Cache Tracking

Track full-response semantic cache hits (distinct from token-level prompt caching above):

s.trackAiMessage(cachedResponse.content, 'gpt-4o', 'openai', latencyMs, {
  wasCached: true, // served from Redis/semantic cache
});

Maps to [Agent] Was Cached. Enables "cache hit rate" charts and cost optimization analysis. Only emitted when true; omitted (not false) when the response was not cached.
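
Because the property is omitted rather than set to false, a hit-rate calculation has to treat a missing key as a miss. The following is a small sketch of that rule over a simplified, hypothetical event shape; `CacheEventProps`, `cacheProps`, and `cacheHitRate` are illustrative names, not SDK exports.

```typescript
// Illustrative event shape: only the one property discussed above is modeled.
interface CacheEventProps {
  "[Agent] Was Cached"?: boolean;
}

// Builds properties as described above: the key is present only when true.
function cacheProps(wasCached?: boolean): CacheEventProps {
  return wasCached ? { "[Agent] Was Cached": true } : {};
}

// Hit rate over a batch of events: an absent key counts as a miss.
function cacheHitRate(events: CacheEventProps[]): number {
  if (events.length === 0) return 0;
  const hits = events.filter((e) => e["[Agent] Was Cached"] === true).length;
  return hits / events.length;
}

const cacheEvents = [cacheProps(true), cacheProps(false), cacheProps(), cacheProps(true)];
const hitRate = cacheHitRate(cacheEvents); // 2 hits out of 4 events
```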

Model Tier Classification

Models are automatically classified into tiers for cost/performance analysis:

| Tier | Examples | When to Use |
| --- | --- | --- |
| fast | gpt-4o-mini, claude-3-haiku, gemini-flash, gpt-3.5-turbo | High-volume, latency-sensitive |
| standard | gpt-4o, claude-3.5-sonnet, gemini-pro, llama, command | General purpose |
| reasoning | o1, o3-mini, deepseek-r1, claude with extended thinking | Complex reasoning tasks |

The tier is inferred automatically from the model name and attached as [Agent] Model Tier on every [Agent] AI Response event:

import {
  inferModelTier,
  TIER_FAST,
  TIER_REASONING,
  TIER_STANDARD,
} from '@amplitude/ai';

inferModelTier('gpt-4o-mini'); // 'fast'
inferModelTier('claude-3.5-sonnet'); // 'standard'
inferModelTier('o1-preview'); // 'reasoning'
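
For intuition, name-based tier inference can be sketched as an ordered list of pattern checks. The patterns below are assumptions loosely based on the tier examples listed earlier, and `sketchInferTier` is a hypothetical stand-in; the SDK's actual `inferModelTier` rules may differ.

```typescript
type ModelTier = "fast" | "standard" | "reasoning";

// Hypothetical name patterns, not the SDK's real rule set.
const REASONING_PATTERNS: RegExp[] = [/^o\d/, /deepseek-r1/];
const FAST_PATTERNS: RegExp[] = [/mini/, /haiku/, /flash/, /gpt-3\.5/];

function sketchInferTier(model: string): ModelTier {
  const name = model.toLowerCase();
  // Check reasoning first so that "o3-mini" is not caught by the "mini" pattern.
  if (REASONING_PATTERNS.some((p) => p.test(name))) return "reasoning";
  if (FAST_PATTERNS.some((p) => p.test(name))) return "fast";
  // Anything unrecognized falls back to the general-purpose tier.
  return "standard";
}

sketchInferTier("gpt-4o-mini"); // "fast"
sketchInferTier("o3-mini"); // "reasoning"
sketchInferTier("claude-3.5-sonnet"); // "standard"
```

The ordering matters more than the individual patterns: a model such as o3-mini matches both a reasoning and a fast pattern, so the more specific check has to run first.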

Override the auto-inferred tier for custom or fine-tuned models:

s.trackAiMessage(
  response.content,
  'ft:gpt-4o:my-org:custom',
  'openai',
  latencyMs,
  {
    modelTier: 'standard',
    inputTokens: response.usage.prompt_tokens,
    outputTokens: response.usage.completion_tokens,
  },
);

Provider Wrappers

Use instrumented provider wrappers for automatic tracking:

| Provider | Class | Package |
| --- | --- | --- |
| OpenAI | OpenAI | openai |
| Anthropic | Anthropic | @anthropic-ai/sdk |
| Gemini | Gemini | @google/generative-ai |
| AzureOpenAI | AzureOpenAI | openai |
| Bedrock | Bedrock | @aws-sdk/client-bedrock-runtime |
| Mistral | Mistral | @mistralai/mistralai |

Feature coverage by provider:

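| Feature | OpenAI | Anthropic | Gemini | AzureOpenAI | Bedrock | Mistral |
| --- | --- | --- | --- | --- | --- | --- |
| Streaming | Yes | Yes | Yes | Yes | Yes | Yes |
| Tool call tracking | Yes | Yes | No | Yes | Yes | No |
| TTFB measurement | Yes | Yes | No | Yes | No | No |
| Cache token stats | Yes | Yes | No | No | No | No |
| Responses API | Yes | - | - | - | - | - |
| Reasoning content | Yes | Yes | No | Yes | No | No |
| System prompt capture | Yes | Yes | Yes | Yes | Yes | Yes |
| Cost estimation | Yes | Yes | Yes | Yes | Yes | Yes |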

Provider wrappers use injected TrackFn callbacks instead of class hierarchy casts, enabling easier composition and custom tracking logic.

Bedrock model IDs like us.anthropic.claude-3-5-sonnet are automatically normalized for price lookup (e.g., to claude-3-5-sonnet).
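
As a rough illustration of that normalization, the sketch below strips region and vendor prefixes and a trailing version suffix. It assumes IDs shaped like `[region.]vendor.model[-vN[:M]]`; `sketchNormalizeBedrockModelId` is a hypothetical helper, and the SDK's real normalization may handle more cases.

```typescript
// Assumption: the model name is the segment after the last dot, so an ID whose
// model name itself contains a dot would not be handled correctly here.
function sketchNormalizeBedrockModelId(modelId: string): string {
  // Keep only the segment after the last dot, dropping region and vendor prefixes.
  const lastDot = modelId.lastIndexOf(".");
  const name = lastDot >= 0 ? modelId.slice(lastDot + 1) : modelId;
  // Drop a trailing version suffix such as "-v1:0" or "-v2".
  return name.replace(/-v\d+(:\d+)?$/, "");
}

sketchNormalizeBedrockModelId("us.anthropic.claude-3-5-sonnet"); // "claude-3-5-sonnet"
```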

OpenAI example:

import { AmplitudeAI, OpenAI } from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const openai = new OpenAI({
  amplitude: ai,
  apiKey: process.env.OPENAI_API_KEY,
});

const agent = ai.agent('my-agent', { userId: 'user-123' });
const session = agent.session();

await session.run(async (s) => {
  const resp = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello' }],
  });
  // AI response tracked automatically via wrapper
});

Or wrap an existing client:

import { wrap } from '@amplitude/ai';
import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const instrumented = wrap(client, ai);

Streaming Tracking

Automatic streaming (provider wrappers)

Provider wrappers (OpenAI, AzureOpenAI, Anthropic, Gemini, Mistral, Bedrock) automatically detect supported streaming responses and track them transparently. The wrapper intercepts the AsyncIterable, accumulates chunks, measures TTFB, and emits an [Agent] AI Response event after the stream is fully consumed:

const openai = new OpenAI({ amplitude: ai, apiKey: '...' });

// Streaming is handled automatically — just iterate the result
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
// ^ AI Response event emitted automatically after loop ends
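
The interception described above can be pictured as an async generator that passes chunks through while recording timing. This is a simplified sketch under stated assumptions, not the wrapper's real implementation: chunks are plain strings, and `trackStream`, `StreamSummary`, and the `onDone` callback are hypothetical names.

```typescript
interface StreamSummary {
  content: string;
  ttfbMs: number | null; // null when the stream produced no chunks
  latencyMs: number;
}

// Wraps any async iterable of text chunks. Chunks pass through unchanged;
// onDone fires once, after the consumer has read the whole stream.
async function* trackStream(
  source: AsyncIterable<string>,
  onDone: (summary: StreamSummary) => void,
  now: () => number = Date.now,
): AsyncGenerator<string> {
  const start = now();
  let ttfbMs: number | null = null;
  let content = "";
  for await (const chunk of source) {
    // Record time-to-first-byte on the first chunk only.
    if (ttfbMs === null) ttfbMs = now() - start;
    content += chunk;
    yield chunk;
  }
  onDone({ content, ttfbMs, latencyMs: now() - start });
}
```

Note that in this sketch `onDone` never fires if the consumer stops iterating early, which is consistent with the event being emitted only after the stream is fully consumed.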

Manual streaming

Track streaming responses manually with time-to-first-byte (TTFB) for latency analysis:

s.trackAiMessage(fullContent, 'gpt-4o', 'openai', totalMs, {
  isStreaming: true,
  ttfbMs: timeToFirstByte,
  inputTokens: usage.prompt_tokens,
  outputTokens: usage.completion_tokens,
});

The SDK tracks two timing properties for streaming:

  • [Agent] Latency Ms — total wall-clock time from request to final chunk
  • [Agent] TTFB Ms — time-to-first-byte, the delay before the first token arrives

StreamingAccumulator

For manual streaming, use StreamingAccumulator to collect chunks and automatically measure TTFB:

import { StreamingAccumulator } from '@amplitude/ai';

const accumulator = new StreamingAccumulator();

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    accumulator.addContent(content);
  }
}

accumulator.setUsage({
  inputTokens: finalUsage.prompt_tokens,
  outputTokens: finalUsage.completion_tokens,
});

s.trackAiMessage(
  accumulator.content,
  'gpt-4o',
  'openai',
  accumulator.elapsedMs,
  {
    isStreaming: true,
    ttfbMs: accumulator.ttfbMs,
    inputTokens: accumulator.inputTokens,
    outputTokens: accumulator.outputTokens,
    finishReason: accumulator.finishReason,
  },
);

The accumulator automatically records TTFB when addContent() is called for the first time, and tracks total elapsed time via elapsedMs. For streaming errors, call setError(message) to set isError and errorMessage, which are included on the tracked AI Response event.
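
The timing behavior described above can be sketched with a small class and an injectable clock. This is a simplified stand-in written for illustration, not the SDK's `StreamingAccumulator`; it omits usage and finish-reason handling, and `SketchAccumulator` is a hypothetical name.

```typescript
class SketchAccumulator {
  content = "";
  ttfbMs: number | null = null;
  isError = false;
  errorMessage: string | null = null;
  private readonly now: () => number;
  private readonly startedAt: number;

  // The clock is injectable so the timing logic can be tested deterministically.
  constructor(now: () => number = Date.now) {
    this.now = now;
    this.startedAt = now();
  }

  addContent(chunk: string): void {
    // TTFB is captured once, on the first chunk only.
    if (this.ttfbMs === null) this.ttfbMs = this.now() - this.startedAt;
    this.content += chunk;
  }

  setError(message: string): void {
    this.isError = true;
    this.errorMessage = message;
  }

  // Total elapsed time since construction, evaluated at read time.
  get elapsedMs(): number {
    return this.now() - this.startedAt;
  }
}
```

Measuring TTFB inside `addContent()` rather than in the calling loop keeps the timing correct even when early chunks carry no content and are skipped by the caller.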

Attachment Tracking

Track files sent with user messages (images, PDFs, URLs):

s.trackUserMessage('Analyze this document', {
  attachments: [
    { type: 'image', name: 'chart.png', size_bytes: 102400 },
    { type: 'pdf', name: 'report.pdf', size_bytes: 2048576 },
  ],
});

The SDK automatically derives aggregate properties from the attachment array:

  • [Agent] Has Attachments — boolean, true when attachments are present
  • [Agent] Attachment Count — number of attachments
  • [Agent] Attachment Types — deduplicated list of attachment types (e.g., ["image", "pdf"])
  • [Agent] Total Attachment Size Bytes — sum of all size_bytes values
  • [Agent] Attachments — serialized JSON of the full attachment metadata
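
The derivation of those aggregates is straightforward to sketch. The function below is illustrative only: `Attachment`, `AttachmentSummary`, and `summarizeAttachments` are hypothetical names, and treating a missing `size_bytes` as 0 is an assumption rather than documented SDK behavior.

```typescript
interface Attachment {
  type: string;
  name: string;
  size_bytes?: number;
}

interface AttachmentSummary {
  hasAttachments: boolean;
  count: number;
  types: string[]; // deduplicated, in first-seen order
  totalSizeBytes: number;
  serialized: string; // JSON of the full attachment metadata
}

function summarizeAttachments(attachments: Attachment[]): AttachmentSummary {
  return {
    hasAttachments: attachments.length > 0,
    count: attachments.length,
    // A Set preserves insertion order, so types appear in first-seen order.
    types: Array.from(new Set(attachments.map((a) => a.type))),
    // Assumption: attachments without a size contribute 0 bytes.
    totalSizeBytes: attachments.reduce((sum, a) => sum + (a.size_bytes ?? 0), 0),
    serialized: JSON.stringify(attachments),
  };
}

const attachmentSummary = summarizeAttachments([
  { type: "image", name: "chart.png", size_bytes: 102400 },
  { type: "pdf", name: "report.pdf", size_bytes: 2048576 },
]);
// attachmentSummary.totalSizeBytes is 2150976; attachmentSummary.types is ["image", "pdf"]
```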

Attachments can also be tracked on AI responses (e.g., when the model generates images or files):

s.trackAiMessage(response.content, 'gpt-4o', 'openai', latencyMs, {
  attachments: [{ type: 'image', name: 'generated.png', size_bytes: 204800 }],
});

Implicit Feedback

Track behavioral signals that indicate whether a response met the user's need, without requiring explicit ratings:

// User asks a question
s.trackUserMessage('How do I create a funnel?');

// AI responds — user copies the answer (positive signal)
s.trackAiMessage('To create a funnel, go to...', 'gpt-4o', 'openai', latencyMs, {
  wasCopied: true,
});

// User regenerates (negative signal — first response wasn't good enough)
s.trackUserMessage('How do I create a funnel?', {
  isRegeneration: true,
});

// User edits their question (refining intent)
s.trackUserMessage('How do I create a conversion funnel for signups?', {
  isEdit: true,
  editedMessageId: originalMsgId, // links the edit to the original
});

Track abandonment at session end — a low abandonmentTurn (e.g., 1) strongly signals first-response dissatisfaction:

agent.trackSessionEnd({
  sessionId: 'sess-1',
  abandonmentTurn: 1, // user left after first AI response
});

These signals map to [Agent] Was Copied, [Agent] Is Regeneration, [Agent] Is Edit, [Agent] Edited Message ID, and [Agent] Abandonment Turn. Use them in Amplitude to build quality dashboards without requiring user surveys.

tool() and observe() HOFs

tool()

Wraps an async function to track as [Agent] Tool Call:

import { tool, ToolCallTracker } from '@amplitude/ai';

ToolCallTracker.setAmplitude(ai.amplitude, 'user-123', {
  sessionId: 'sess-1',
  traceId: 'trace-1',
  agentId: 'my-agent',
  privacyConfig: ai.config.toPrivacyConfig(),
});

const fetchWeather = tool(
  async (args: { city: string }) => {
    return await weatherApi.get(args.city);
  },
  {
    name: 'fetch_weather',
    inputSchema: { type: 'object', properties: { city: { type: 'string' } } },
    timeoutMs: 5000,
    onError: (err, name) => console.error(`Tool ${name} failed:`, err),
  },
);

observe()

Wraps a function to track as [Agent] Span:

import { observe } from '@amplitude/ai';

const enrichData = observe(async (data: unknown) => transform(data), {
  name: 'enrich_data',
  agentId: 'enricher',
});
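
Both helpers follow the same higher-order-function pattern: take a function, return a function with the same signature, and report each call as a side effect. The sketch below shows that pattern in isolation. It is not the SDK's implementation, it omits timeouts and input schemas, and `tracked`, `CallRecord`, and `sink` are hypothetical names.

```typescript
interface CallRecord {
  name: string;
  success: boolean;
  durationMs: number;
  error?: string;
}

// Wraps an async function so every call is reported to `sink`, whether it
// resolves or rejects. Errors are re-thrown so callers see normal behavior.
function tracked<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  opts: { name: string; sink: (record: CallRecord) => void; now?: () => number },
): (...args: A) => Promise<R> {
  const now = opts.now ?? Date.now;
  return async (...args: A): Promise<R> => {
    const start = now();
    try {
      const result = await fn(...args);
      opts.sink({ name: opts.name, success: true, durationMs: now() - start });
      return result;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      opts.sink({ name: opts.name, success: false, durationMs: now() - start, error: message });
      throw err;
    }
  };
}
```

Because the wrapper preserves the argument and return types, call sites do not change when a function is wrapped.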

Scoring Patterns

Track quality feedback from multiple sources using the score() method. Scores are emitted as [Agent] Score events.

User Feedback (thumbs up/down)

s.score('thumbs-up', 1, messageId, { source: 'user' });
s.score('thumbs-down', 0, messageId, { source: 'user' });

Numeric Rating

s.score('rating', 4, messageId, {
  source: 'user',
  comment: 'Very helpful but slightly verbose',
});

LLM-as-Judge

s.score('quality', 0.85, messageId, {
  source: 'ai',
  comment: 'Clear and accurate response with proper citations',
});

Session-Level Scoring

Score an entire session rather than a single message by setting targetType to 'session':

s.score('session-quality', 0.9, session.sessionId, {
  targetType: 'session',
  source: 'ai',
});

Score Properties

Each [Agent] Score event includes:

  • [Agent] Score Name — the name you provide (e.g., "thumbs-up", "quality")
  • [Agent] Score Value — numeric value
  • [Agent] Target ID — the message ID or session ID being scored
  • [Agent] Target Type — "message" (default) or "session"
  • [Agent] Evaluation Source — "user" (default) or "ai"
  • [Agent] Comment — optional free-text comment (respects content mode)
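
The defaults in that list can be summarized in a short sketch that assembles the property map for one score. This illustrates the mapping only and is not the SDK's `score()` implementation: `buildScoreProps` and `ScoreOptions` are hypothetical names, and content-mode handling of the comment is not modeled.

```typescript
interface ScoreOptions {
  targetType?: "message" | "session";
  source?: "user" | "ai";
  comment?: string;
}

function buildScoreProps(
  name: string,
  value: number,
  targetId: string,
  opts: ScoreOptions = {},
): Record<string, string | number> {
  const props: Record<string, string | number> = {
    "[Agent] Score Name": name,
    "[Agent] Score Value": value,
    "[Agent] Target ID": targetId,
    // Defaults follow the property list above.
    "[Agent] Target Type": opts.targetType ?? "message",
    "[Agent] Evaluation Source": opts.source ?? "user",
  };
  // The comment is optional, so it is only added when provided.
  if (opts.comment !== undefined) props["[Agent] Comment"] = opts.comment;
  return props;
}

const thumbsUp = buildScoreProps("thumbs-up", 1, "msg-1");
// thumbsUp["[Agent] Target Type"] is "message"; thumbsUp["[Agent] Evaluation Source"] is "user"
```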

Enrichments

Session Enrichments

Attach structured metadata to sessions for analytics. Enrichments are included when the session auto-ends:

import {
  RubricScore,
  SessionEnrichments,
  TopicClassification,
} from '@amplitude/ai';

const enrichments = new SessionEnrichments({
  qualityScore: 0.85,
  sentimentScore: 0.7,
  overallOutcome: 'resolved',
  topicClassifications: {
    intent: new TopicClassification({
      l1: 'billing',
      primary: 'billing',
      values: ['billing', 'refund'],
      subcategories: ['REFUND_REQUEST', 'PRICING_QUESTION'],
    }),
  },
  rubrics: [
    new RubricScore({
      name: 'helpfulness',
      score: 4,
      rationale: 'Provided clear step-by-step instructions',
    }),
    new RubricScore({
      name: 'accuracy',
      score: 5,
      rationale: 'All information was factually correct',
    }),
  ],
  agentChain: ['orchestrator', 'researcher', 'writer'],
  rootAgentName: 'orchestrator',
  requestComplexity: 'medium',
});

session.setEnrichments(enrichments);
// Enrichments are included automatically when session.run() completes

Track Enrichments Separately

Send enrichments as a standalone event without ending the session:

agent.trackSessionEnrichment(enrichments, {
  sessionId: 'sess-abc123',
});

End-to-End Example: customer_enriched Mode

This mode is for teams that run their own evaluation pipeline (or can't send message content to Amplitude) but still want rich session-level analytics. Here's a complete workflow:

import {
  AIConfig,
  AmplitudeAI,
  ContentMode,
  MessageLabel,
  RubricScore,
  SessionEnrichments,
  TopicClassification,
} from '@amplitude/ai';

// 1. Configure: no content sent to Amplitude
const ai = new AmplitudeAI({
  apiKey: process.env.AMPLITUDE_AI_API_KEY!,
  config: new AIConfig({
    contentMode: ContentMode.CUSTOMER_ENRICHED,
  }),
});

const agent = ai.agent('support-bot', {
  description: 'Handles support conversations in metadata-only mode',
  agentVersion: '2.1.0',
});

// 2. Run the conversation — content is NOT sent (metadata only)
const session = agent.session({ userId: 'user-42' });
const { sessionId, messageIds } = await session.run(async (s) => {
  const msgIds: string[] = [];
  msgIds.push(s.trackUserMessage('Why was I charged twice?'));
  msgIds.push(
    s.trackAiMessage(
      aiResponse.content,
      'gpt-4o',
      'openai',
      latencyMs,
    ),
  );
  return { sessionId: s.sessionId, messageIds: msgIds };
});

// 3. Run your eval pipeline on the raw messages (e.g., your own LLM judge)
const evalResults = await myEvalPipeline(conversationHistory);

// 4. Ship enrichments back to Amplitude
const enrichments = new SessionEnrichments({
  qualityScore: evalResults.quality,
  sentimentScore: evalResults.sentiment,
  overallOutcome: evalResults.outcome,
  topicClassifications: {
    'billing': new TopicClassification({
      topic: 'billing-dispute',
      confidence: 0.92,
    }),
  },
  rubrics: [
    new RubricScore({ name: 'accuracy', score: 4, maxScore: 5 }),
    new RubricScore({ name: 'helpfulness', score: 5, maxScore: 5 }),
  ],
  messageLabels: {
    [messageIds[0]]: [
      new MessageLabel({ key: 'intent', value: 'billing-dispute', confidence: 0.94 }),
    ],
  },
  customMetadata: { eval_model: 'gpt-4o-judge-v2' },
});

agent.trackSessionEnrichment(enrichments, { sessionId });

This produces the same Amplitude event properties as Amplitude's built-in server-side enrichment (topics, rubrics, outcomes, message labels), but sourced from your pipeline. Use it when compliance requires zero-content transmission, or when you need custom evaluation logic beyond what the built-in enrichment provides.

Available Enrichment Fields

  • Quality & Sentiment: qualityScore, sentimentScore
  • Outcome: overallOutcome, hasTaskFailure, taskFailureType, taskFailureReason
  • Topics: topicClassifications — a map of taxonomy name to TopicClassification
  • Rubrics: rubrics — array of RubricScore with name, score, rationale, and evidence
  • Failure Signals: hasNegativeFeedback, hasDataQualityIssues, hasTechnicalFailure
  • Error Analysis: errorCategories, technicalErrorCount
  • Behavioral: behavioralPatterns, negativeFeedbackPhrases, dataQualityIssues
  • Agent Topology: agentChain, rootAgentName
  • Complexity: requestComplexity
  • Labels: messageLabels — per-message labels keyed by message ID
  • Custom: customMetadata — arbitrary key/value data for your own analytics

Message Labels

Attach classification labels to individual messages within a session. Labels are flexible key-value pairs for filtering and segmentation in Amplitude.

Common use cases: routing tags (flow, surface), classifier output (intent, sentiment, toxicity), business context (tier, plan).

Inline labels (at tracking time):

import { MessageLabel } from '@amplitude/ai';

s.trackUserMessage('I want to cancel my subscription', {
  labels: [
    new MessageLabel({
      key: 'intent',
      value: 'cancellation',
      confidence: 0.95,
    }),
    new MessageLabel({
      key: 'sentiment',
      value: 'frustrated',
      confidence: 0.8,
    }),
  ],
});

Retrospective labels (after the session, from a background pipeline):

When classifier results arrive after the session ends, attach them via SessionEnrichments.messageLabels, keyed by the messageId returned from tracking calls:

import { MessageLabel, SessionEnrichments } from '@amplitude/ai';

const enrichments = new SessionEnrichments({
  messageLabels: {
    [userMsgId]: [
      new MessageLabel({ key: 'intent', value: 'cancellation', confidence: 0.94 }),
    ],
    [aiMsgId]: [
      new MessageLabel({ key: 'quality', value: 'good', confidence: 0.91 }),
    ],
  },
});

agent.trackSessionEnrichment(enrichments, { sessionId: 'sess-abc123' });

Labels are emitted as [Agent] Message Labels on the event. In Amplitude, filter or group by label key/value to build charts like "messages by intent" or "sessions where flow=onboarding".

Debug and Dry-Run Modes

Debug Mode

Prints a colored (ANSI) summary of every tracked event to stderr. All 8 event types (User Message, AI Response, Tool Call, Embedding, Span, Session End, Session Enrichment, Score) are formatted. Events are still sent to Amplitude:

const ai = new AmplitudeAI({
  apiKey: 'xxx',
  config: new AIConfig({ debug: true }),
});

// stderr output for each event:
// [amplitude-ai] [Agent] AI Response | user=user-123 session=sess-abc agent=my-agent model=gpt-4o latency=1203ms tokens=150→847 cost=$0.0042
// [amplitude-ai] [Agent] Tool Call | user=user-123 session=sess-abc agent=my-agent tool=search_db success=true latency=340ms
// [amplitude-ai] [Agent] User Message | user=user-123 session=sess-abc agent=my-agent

Dry-Run Mode

Logs the full event JSON to stderr WITHOUT sending to Amplitude. Events are never transmitted:

const ai = new AmplitudeAI({
  apiKey: 'xxx',
  config: new AIConfig({ dryRun: true }),
});

// stderr: full JSON of each event
// Useful for local development, CI pipelines, and validating event shape

Environment Variable Configuration

Both modes can be enabled via environment variables when using auto-instrumentation:

AMPLITUDE_AI_DEBUG=true amplitude-ai-instrument node app.js

Patching

Monkey-patch provider SDKs to auto-track without changing call sites. This is useful for quick verification that the SDK is connected, or for legacy codebases where modifying call sites is impractical. For the full event model (user messages, sessions, scoring, enrichments), use agents + sessions as shown in Quick Start.

import {
  AmplitudeAI,
  patch,
  patchOpenAI,
  unpatch,
  unpatchOpenAI,
} from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });

// Patch installed/available providers (OpenAI, Anthropic, Gemini, Mistral, Bedrock)
patch({ amplitudeAI: ai });

// Or patch specific provider
patchOpenAI({ amplitudeAI: ai });

// Unpatch
unpatch();
unpatchOpenAI();

Available patch functions: patchOpenAI, patchAnthropic, patchAzureOpenAI, patchGemini, patchMistral, patchBedrock. Corresponding unpatch for each: unpatchOpenAI, unpatchAnthropic, unpatchAzureOpenAI, unpatchGemini, unpatchMistral, unpatchBedrock.

patch() returns a string[] of providers where at least one supported surface was successfully patched (e.g., ['openai', 'anthropic']), matching the Python SDK's return signature.
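
For readers unfamiliar with monkey-patching, the general mechanism looks roughly like the sketch below: replace a method on a live object, remember the original, and restore it on unpatch. This is a generic illustration, not the SDK's patching code. All names (`patchMethod`, `patchProviders`, `unpatchAll`, `Surface`) are hypothetical, and the provider objects are plain records standing in for real SDK clients.

```typescript
type AnyFn = (...args: unknown[]) => unknown;

interface Surface {
  target: Record<string, AnyFn>;
  method: string;
}

interface Restore extends Surface {
  original: AnyFn;
}

const restores: Restore[] = [];

// Replaces target[method] with a wrapper that reports the call, then delegates.
function patchMethod(target: Record<string, AnyFn>, method: string, onCall: (args: unknown[]) => void): boolean {
  const original = target[method];
  if (typeof original !== "function") return false; // surface not present
  target[method] = function (this: unknown, ...args: unknown[]): unknown {
    onCall(args);
    return original.apply(this, args);
  };
  restores.push({ target, method, original });
  return true;
}

// Returns only the providers that were actually patched, mirroring the idea above.
function patchProviders(
  surfaces: Record<string, Surface | undefined>,
  onCall: (provider: string, args: unknown[]) => void,
): string[] {
  const patched: string[] = [];
  for (const provider of Object.keys(surfaces)) {
    const surface = surfaces[provider];
    if (surface && patchMethod(surface.target, surface.method, (args) => onCall(provider, args))) {
      patched.push(provider);
    }
  }
  return patched;
}

// Restores every patched method, most recent first.
function unpatchAll(): void {
  while (restores.length > 0) {
    const r = restores.pop()!;
    r.target[r.method] = r.original;
  }
}
```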

Patch surface notes:

  • OpenAI/Azure OpenAI: chat.completions.create, chat.completions.parse, and Responses APIs are instrumented (including streaming shapes where exposed by the SDK).
  • Bedrock: only ConverseCommand and ConverseStreamCommand are instrumented when patching client.send.

Auto-Instrumentation CLI

Preload the register module to auto-patch providers at process start:

AMPLITUDE_AI_API_KEY=xxx AMPLITUDE_AI_AUTO_PATCH=true amplitude-ai-instrument node app.js

Or directly with Node's ESM preload flag:

AMPLITUDE_AI_API_KEY=xxx AMPLITUDE_AI_AUTO_PATCH=true node --import @amplitude/ai/register app.js

Environment variables:

| Variable | Description |
| --- | --- |
| AMPLITUDE_AI_API_KEY | Required for auto-patch |
| AMPLITUDE_AI_AUTO_PATCH | Must be "true" to enable |
| AMPLITUDE_AI_CONTENT_MODE | full, metadata_only, or customer_enriched |
| AMPLITUDE_AI_DEBUG | "true" for debug output to stderr |
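
The rules in this table can be read as a small parsing function. The sketch below is an interpretation of the table, not the register module's actual code: `readAutoPatchConfig` and `AutoPatchConfig` are hypothetical names, and returning `undefined` for an unrecognized content mode is an assumption.

```typescript
type ContentModeName = "full" | "metadata_only" | "customer_enriched";

interface AutoPatchConfig {
  enabled: boolean;
  apiKey: string | undefined;
  contentMode: ContentModeName | undefined;
  debug: boolean;
}

const CONTENT_MODE_NAMES: ContentModeName[] = ["full", "metadata_only", "customer_enriched"];

// Takes the environment as a plain record so it can be called with process.env
// in Node or with a literal object in tests.
function readAutoPatchConfig(env: Record<string, string | undefined>): AutoPatchConfig {
  const apiKey = env["AMPLITUDE_AI_API_KEY"];
  const rawMode = env["AMPLITUDE_AI_CONTENT_MODE"];
  return {
    // Auto-patch needs the flag set to exactly "true" and an API key.
    enabled: env["AMPLITUDE_AI_AUTO_PATCH"] === "true" && Boolean(apiKey),
    apiKey,
    // Assumption: unrecognized values are ignored rather than rejected.
    contentMode: CONTENT_MODE_NAMES.find((m) => m === rawMode),
    debug: env["AMPLITUDE_AI_DEBUG"] === "true",
  };
}
```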

Doctor CLI

Validate setup (env, provider deps, mock event capture, mock flush path):

amplitude-ai doctor

Useful flags:

  • amplitude-ai doctor --no-mock-check

Status

Show the installed SDK version, detected provider packages, and environment variable configuration at a glance:

amplitude-ai status

Shell Completions

Enable tab-completion for all CLI commands and flags:

# bash
eval "$(amplitude-ai-completions bash)"

# zsh
eval "$(amplitude-ai-completions zsh)"

MCP Server

Run the SDK-local MCP server over stdio:

amplitude-ai mcp

MCP surface:

| Tool | Description |
| --- | --- |
| scan_project | Scan project structure, detect providers, frameworks, and multi-agent patterns |
| validate_file | Analyze a source file to detect uninstrumented LLM call sites |
| instrument_file | Apply instrumentation transforms to a source file |
| generate_verify_test | Generate a dry-run verification test using MockAmplitudeAI |
| get_event_schema | Return the full event schema and property definitions |
| get_integration_pattern | Return canonical instrumentation code patterns |
| validate_setup | Check env vars and dependency presence |
| suggest_instrumentation | Context-aware next steps based on your framework and provider |
| search_docs | Full-text search across SDK documentation (README, llms-full.txt) |

Resources: amplitude-ai://event-schema, amplitude-ai://integration-patterns, amplitude-ai://instrument-guide

Prompt: instrument_app — guided walkthrough for instrumenting an application

Examples and AI Coding Agent Guide

  • amplitude-ai.md — self-contained instrumentation guide for any AI coding agent (Cursor, Claude Code, Windsurf, Copilot, Codex, etc.). Run npx amplitude-ai to see the prompt that points your agent to this file.
  • Mock-based examples demonstrating the event model (also used as CI smoke tests):
    • examples/zero-code.ts
    • examples/wrap-openai.ts
    • examples/multi-agent.ts
    • examples/framework-integration.ts
  • Real provider examples (require API keys):
    • examples/real-openai.ts — end-to-end OpenAI integration with session tracking and flush
    • examples/real-anthropic.ts — end-to-end Anthropic integration with session tracking and flush

Integrations

LangChain

import { AmplitudeAI, AmplitudeCallbackHandler } from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const handler = new AmplitudeCallbackHandler({
  amplitudeAI: ai,
  userId: 'user-123',
  sessionId: 'sess-1',
});

// Pass handler to LangChain callbacks

OpenTelemetry

Two exporters add Amplitude as a destination alongside your existing trace backend (Datadog, Honeycomb, Jaeger, etc.):

import {
  AmplitudeAgentExporter,
  AmplitudeGenAIExporter,
} from '@amplitude/ai';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import {
  BatchSpanProcessor,
  SimpleSpanProcessor,
} from '@opentelemetry/sdk-trace-base';

const provider = new NodeTracerProvider();

// GenAI exporter — converts gen_ai.* spans into Amplitude AI events
provider.addSpanProcessor(
  new BatchSpanProcessor(
    new AmplitudeGenAIExporter({
      apiKey: process.env.AMPLITUDE_AI_API_KEY!,
    }),
  ),
);

// Agent exporter — converts agent.* spans into Amplitude session events
provider.addSpanProcessor(
  new SimpleSpanProcessor(
    new AmplitudeAgentExporter({
      apiKey: process.env.AMPLITUDE_AI_API_KEY!,
    }),
  ),
);

provider.register();

Only spans with gen_ai.provider.name or gen_ai.system attributes are processed; all other spans are silently ignored. This means it's safe to add the exporter to a pipeline that produces mixed (GenAI + HTTP + DB) spans.

Attribute mapping reference:

| OTEL Span Attribute | Amplitude Event Property | Notes |
| --- | --- | --- |
| gen_ai.response.model / gen_ai.request.model | [Agent] Model | Response model preferred |
| gen_ai.system / gen_ai.provider.name | [Agent] Provider | |
| gen_ai.usage.input_tokens | [Agent] Input Tokens | |
| gen_ai.usage.output_tokens | [Agent] Output Tokens | |
| gen_ai.usage.total_tokens | [Agent] Total Tokens | Derived if not present |
| gen_ai.usage.cache_read.input_tokens | [Agent] Cache Read Tokens | |
| gen_ai.usage.cache_creation.input_tokens | [Agent] Cache Creation Tokens | |
| gen_ai.request.temperature | [Agent] Temperature | |
| gen_ai.request.top_p | [Agent] Top P | |
| gen_ai.request.max_output_tokens | [Agent] Max Output Tokens | |
| gen_ai.response.finish_reasons | [Agent] Finish Reason | |
| gen_ai.input.messages | [Agent] LLM Message | Only if content mode allows |
| Span duration | [Agent] Latency Ms | |
| Span status ERROR | [Agent] Is Error, [Agent] Error Message | |
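
A few of these rules (the GenAI span filter, the preference for the response model, and the derived total) can be sketched as a plain mapping function. This is an illustration over a simplified attribute record, not the exporter's code: `mapGenAIAttributes` and `isGenAISpan` are hypothetical names, only a subset of rows is covered, and the provider precedence shown is an assumption.

```typescript
type SpanAttributes = Record<string, string | number | undefined>;

// Only spans carrying one of these attributes are treated as GenAI spans.
function isGenAISpan(attrs: SpanAttributes): boolean {
  return attrs["gen_ai.provider.name"] !== undefined || attrs["gen_ai.system"] !== undefined;
}

function mapGenAIAttributes(attrs: SpanAttributes): Record<string, string | number> | null {
  if (!isGenAISpan(attrs)) return null; // HTTP, DB, and other spans are ignored

  const props: Record<string, string | number> = {};
  const set = (key: string, value: string | number | undefined): void => {
    if (value !== undefined) props[key] = value;
  };

  // Response model is preferred over request model.
  set("[Agent] Model", attrs["gen_ai.response.model"] ?? attrs["gen_ai.request.model"]);
  // Assumption: gen_ai.system wins when both provider attributes are present.
  set("[Agent] Provider", attrs["gen_ai.system"] ?? attrs["gen_ai.provider.name"]);

  const input = attrs["gen_ai.usage.input_tokens"];
  const output = attrs["gen_ai.usage.output_tokens"];
  const total = attrs["gen_ai.usage.total_tokens"];
  set("[Agent] Input Tokens", input);
  set("[Agent] Output Tokens", output);
  if (total !== undefined) {
    set("[Agent] Total Tokens", total);
  } else if (typeof input === "number" && typeof output === "number") {
    // Derived when the span does not report a total.
    set("[Agent] Total Tokens", input + output);
  }
  return props;
}
```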

Not available via OTEL (use native wrappers): reasoning content/tokens, TTFB, streaming detection, implicit feedback, file attachments, event graph linking (parent_message_id).

When to use OTEL vs. native wrappers: If you already have @opentelemetry/instrumentation-openai or similar producing GenAI spans, the OTEL bridge gives you Amplitude analytics with zero code changes. For richer tracking (implicit feedback, streaming metrics, attachments), use the native wrapOpenAI()/wrapAnthropic() wrappers alongside OTEL.

LlamaIndex

import {
  AmplitudeLlamaIndexHandler,
  createAmplitudeLlamaIndexHandler,
} from '@amplitude/ai';

OpenAI Agents SDK

import { AmplitudeTracingProcessor } from '@amplitude/ai';

Anthropic Tool Use

import { AmplitudeToolLoop } from '@amplitude/ai';

CrewAI (Python-only)

import { AmplitudeCrewAIHooks } from '@amplitude/ai';

In Node.js, AmplitudeCrewAIHooks throws a ProviderError by design. Use LangChain or OpenTelemetry integrations instead.

Data Flow

How events flow from your application to Amplitude charts:

Your Application
├── wrapOpenAI() / wrapAnthropic()     ─── auto-emits ──┐
├── session.trackUserMessage()         ─── manual ──────┤
├── session.trackAiMessage()           ─── manual ──────┤
├── agent.trackToolCall()              ─── manual ──────┤
├── agent.trackSessionEnrichment()     ─── manual ──────┤
└── OTEL exporter (AmplitudeGenAI...)  ─── bridge ──────┤
                                                        │
                              AmplitudeAI client ◄──────┘
                                   │
                                   ├── validate (if enabled)
                                   ├── apply middleware chain
                                   ├── batch events
                                   │
                                   ▼
                           Amplitude HTTP API
                                   │
                     ┌─────────────┴──────────────┐
                     │                            │
            Amplitude Charts               LLM Enrichment
            (immediate querying)           Pipeline (async)
                                                  │
                                                  ▼
                                        [Agent] Session Evaluation
                                        [Agent] Score events
                                        (topic, rubric, outcome)

Key points:

  • All paths converge at the AmplitudeAI client, which batches and sends events.
  • Events are available for charting within seconds of ingestion.
  • The LLM Enrichment Pipeline runs asynchronously after session close (only when contentMode: 'full'). It produces server-side events like [Agent] Session Evaluation and [Agent] Score.
  • With contentMode: 'customer_enriched', the enrichment pipeline is skipped — you provide your own enrichments via trackSessionEnrichment().
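
The client-side stages in the diagram (validate, apply middleware, batch, send) can be pictured with a toy pipeline. This is a conceptual sketch only: `SketchPipeline`, `AgentEvent`, and `Middleware` are hypothetical names, the validation rule is a placeholder, and the real client's `flush()` is asynchronous whereas this one sends synchronously.

```typescript
interface AgentEvent {
  eventType: string;
  properties: Record<string, unknown>;
}

// A middleware may transform an event or return null to drop it.
type Middleware = (event: AgentEvent) => AgentEvent | null;

class SketchPipeline {
  private queue: AgentEvent[] = [];
  private readonly send: (batch: AgentEvent[]) => void;
  private readonly middleware: Middleware[];

  constructor(send: (batch: AgentEvent[]) => void, middleware: Middleware[] = []) {
    this.send = send;
    this.middleware = middleware;
  }

  // Returns true when the event was queued, false when rejected or dropped.
  track(event: AgentEvent): boolean {
    if (event.eventType.length === 0) return false; // placeholder validation
    let current: AgentEvent = event;
    for (const m of this.middleware) {
      const next = m(current);
      if (next === null) return false;
      current = next;
    }
    this.queue.push(current);
    return true;
  }

  // Sends everything queued so far as one batch and reports how many events went out.
  flush(): number {
    const batch = this.queue;
    this.queue = [];
    if (batch.length > 0) this.send(batch);
    return batch.length;
  }
}
```

The sketch also shows why the serverless patterns call `flush()` before returning: until a flush happens, tracked events exist only in the in-memory queue.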

Integration Approaches

Start with full instrumentation. Use agents + sessions + provider wrappers. This is the recommended approach for both coding agent and manual workflows — it gives you every event type, per-user analytics, and server-side enrichment.

| Approach | When to use | What you get |
| --- | --- | --- |
| Full control (recommended) | Any project, new or existing | BoundAgent + session.run() + provider wrappers — all event types, per-user funnels, cohorts, retention, quality scoring, enrichments |
| Express/Fastify middleware | Web app, auto-session per request | Same as full control with automatic session lifecycle via createAmplitudeAIMiddleware |
| Swap import | Existing codebase, incremental adoption | new OpenAI({ amplitude: ai }) — auto-tracking per call, add sessions when ready |
| Wrap | You've already created a client | wrap(client, ai) — instruments an existing client instance |
| Zero-code / patch() | Verification or legacy codebases only | patch({ amplitudeAI: ai }) — [Agent] AI Response only, no user identity, no funnels |
| OTEL Bridge | Third-party framework exports OTEL spans | Add exporter to existing OTEL pipeline — limited to OTEL attributes |

The first four approaches all support the full event model. Choose based on how you want to integrate — the analytics capabilities are the same. patch() is the exception: it only captures aggregate [Agent] AI Response events without user identity, useful only for verifying the SDK works or for codebases where you can't modify call sites.

User text, turn-level events, and gateways

These rules match the Python amplitude-ai agent guide and affect how Agent Analytics labels sessions and computes costs:

  • trackUserMessage(content, opts?) — The content string becomes $llm_message.text. Use a short, human-readable line for the real user intent (or a headless summary). Put large JSON, RAG packs, or pipeline state in opts.context or opts.eventProperties, not as the only content, or session titles and segmentation will show raw JSON.
  • Turn-level vs spans — [Agent] User Message and [Agent] AI Response (with session + turn ids) drive turn counts and conversation views. observe() / trackSpan() add trace detail but do not replace those turn events; keep a user + AI pair for each user-visible cycle unless you intentionally document otherwise.
  • Gateways / custom baseURL — If you use stock openai (or another client) against a proxy, the SDK may not auto-wrap that path. Call trackAiMessage with usage token fields from the response (or stream end), pass the actual routed model id as the model argument, and set totalCostUsd if genai-prices cannot resolve the model string. Install @pydantic/genai-prices for automatic USD estimates when model + tokens are known.

Integration Patterns

Pattern A: Single-Request API Endpoint

For serverless functions or API endpoints that handle one request at a time. The key requirement is flushing events before the handler returns:

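import { AmplitudeAI } from '@amplitude/ai';
import OpenAI from 'openai';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
// Stock OpenAI client (not the instrumented wrapper): the AI response is tracked manually below
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });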

app.post('/chat', async (req, res) => {
  const agent = ai.agent('api-handler', { userId: req.userId });
  const session = agent.session({ sessionId: req.sessionId });

  const result = await session.run(async (s) => {
    s.trackUserMessage(req.body.message);

    const start = performance.now();
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: req.body.messages,
    });
    const latencyMs = performance.now() - start;

    s.trackAiMessage(
      response.choices[0].message.content ?? '',
      'gpt-4o',
      'openai',
      latencyMs,
      {
        inputTokens: response.usage?.prompt_tokens,
        outputTokens: response.usage?.completion_tokens,
      },
    );

    return response.choices[0].message.content;
  });

  await ai.flush();
  res.json({ response: result });
});

Pattern B: Long-Lived Session (Chatbot)

For multi-turn conversations where the session spans many request/response cycles. Create the session once and reuse it across turns:

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const agent = ai.agent('chatbot', { userId: 'user-123', env: 'production' });

// Session persists across multiple turns
const session = agent.session({ sessionId: conversationId });

await session.run(async (s) => {
  // Turn 1
  s.trackUserMessage('What is Amplitude?');
  const resp1 = await llm.chat('What is Amplitude?');
  s.trackAiMessage(resp1.content, 'gpt-4o', 'openai', resp1.latencyMs, {
    inputTokens: resp1.usage.input,
    outputTokens: resp1.usage.output,
  });

  // Turn 2
  s.trackUserMessage('How does it track events?');
  const resp2 = await llm.chat('How does it track events?');
  s.trackAiMessage(resp2.content, 'gpt-4o', 'openai', resp2.latencyMs, {
    inputTokens: resp2.usage.input,
    outputTokens: resp2.usage.output,
  });

  // Score the conversation
  s.score('helpfulness', 0.9, session.sessionId, {
    targetType: 'session',
    source: 'ai',
  });
});
// Session auto-ends here with all enrichments

Pattern C: Multi-Agent Orchestration

For architectures where a parent agent delegates to specialized child agents. Use session.runAs() to automatically propagate the child agent's identity to both manual tracking calls and provider wrappers:

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY!, amplitude: ai });

const orchestrator = ai.agent('orchestrator', {
  userId: 'user-123',
  env: 'production',
});
const researcher = orchestrator.child('researcher');
const writer = orchestrator.child('writer');

const session = orchestrator.session({ userId: 'user-123' });

await session.run(async (s) => {
  s.trackUserMessage('Write a blog post about TypeScript generics');

  // Research phase — provider calls automatically tagged with agentId='researcher'
  const researchResult = await s.runAs(researcher, async (rs) => {
    const completion = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Research TypeScript generics' }],
    });
    return completion.choices[0].message.content;
  });

  // Writing phase — provider calls automatically tagged with agentId='writer'
  const draft = await s.runAs(writer, async (ws) => {
    const completion = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: `Write a post using: ${researchResult}` }],
    });
    return completion.choices[0].message.content;
  });

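  // totalLatencyMs, totalInput, and totalOutput are placeholders for aggregates
  // you compute across the child calls above (not shown)
  s.trackAiMessage(draft ?? '', 'gpt-4o', 'openai', totalLatencyMs, {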
    inputTokens: totalInput,
    outputTokens: totalOutput,
  });
});

// Events emitted:
//   [Agent] User Message     → agentId='orchestrator'
//   [Agent] AI Response       → agentId='researcher',  parentAgentId='orchestrator'
//   [Agent] AI Response       → agentId='writer',      parentAgentId='orchestrator'
//   [Agent] AI Response       → agentId='orchestrator'
//   [Agent] Session End       → agentId='orchestrator'  (one session end, not per-child)

How runAs works:

  • Shares the parent session's sessionId, traceId, and turn counter
  • Overrides agentId and parentAgentId in AsyncLocalStorage for the callback's duration
  • Provider wrappers automatically read the child's identity — no amplitudeOverrides needed
  • Does not emit [Agent] Session End (the child operates within the parent session)
  • Restores the parent context when the callback completes, even on errors
  • Supports nesting: s.runAs(child, (cs) => cs.runAs(grandchild, ...))
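The context override described above can be sketched with Node's `AsyncLocalStorage`. This is an illustration of the mechanism only, not the SDK's implementation: `runAsSketch`, `currentIdentity`, and the `AgentContext` shape are hypothetical names.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

interface AgentContext {
  sessionId: string;
  agentId: string;
  parentAgentId: string | null;
}

const storage = new AsyncLocalStorage<AgentContext>();

// Run `fn` under the child's identity; the session id is inherited from the parent.
function runAsSketch<T>(childAgentId: string, fn: () => Promise<T>): Promise<T> {
  const parent = storage.getStore();
  if (!parent) throw new Error('runAsSketch must be called inside a session');
  const child: AgentContext = {
    sessionId: parent.sessionId,
    agentId: childAgentId,
    parentAgentId: parent.agentId,
  };
  // The override applies only while `fn` runs; the outer context is untouched.
  return storage.run(child, fn);
}

// Stand-in for a provider wrapper: reads identity from the ambient context.
function currentIdentity(): string {
  const ctx = storage.getStore();
  return ctx ? `${ctx.agentId}<-${ctx.parentAgentId ?? 'none'}` : 'no-context';
}

async function demo(): Promise<string[]> {
  const seen: string[] = [];
  const root: AgentContext = { sessionId: 's1', agentId: 'orchestrator', parentAgentId: null };
  await storage.run(root, async () => {
    seen.push(currentIdentity());
    await runAsSketch('researcher', async () => {
      await Promise.resolve(); // identity survives async boundaries
      seen.push(currentIdentity());
      await runAsSketch('fact-checker', async () => {
        seen.push(currentIdentity()); // nested child
      });
    });
    seen.push(currentIdentity()); // parent identity is back
  });
  return seen;
}
// demo() resolves to:
// ['orchestrator<-none', 'researcher<-orchestrator',
//  'fact-checker<-researcher', 'orchestrator<-none']
```

Because each `storage.run` call scopes its store to the callback, the parent identity is visible again after the child finishes, including when the child throws.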

Serverless Environments

The SDK auto-detects serverless environments (Vercel, AWS Lambda, Netlify, Google Cloud Functions, Azure Functions, Cloudflare Pages). When detected, session.run() automatically flushes all pending events before the promise resolves — no explicit ai.flush() needed. You can also control this explicitly via the autoFlush option on session():

// Auto-detected: flushes automatically in serverless, skips in long-running servers
agent.session({ userId, sessionId });

// Explicit control:
agent.session({ userId, sessionId, autoFlush: true });   // always flush
agent.session({ userId, sessionId, autoFlush: false });  // never flush
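
One plausible way to think about the auto-detect default is as a check for platform marker variables that an explicit `autoFlush` value overrides. The sketch below is an assumption about the general approach, not the SDK's actual detection logic. The marker list and the function names are illustrative only.

```typescript
type Env = Record<string, string | undefined>;

// Assumed marker variables for this sketch; platforms may set others.
const SERVERLESS_MARKERS = [
  'AWS_LAMBDA_FUNCTION_NAME', // AWS Lambda
  'VERCEL', // Vercel
  'NETLIFY', // Netlify
  'FUNCTION_TARGET', // Google Cloud Functions
  'FUNCTIONS_WORKER_RUNTIME', // Azure Functions
  'CF_PAGES', // Cloudflare Pages
];

// True when any marker variable is set to a non-empty value.
function looksServerless(env: Env): boolean {
  return SERVERLESS_MARKERS.some((name) => Boolean(env[name]));
}

// An explicit autoFlush value wins; otherwise fall back to detection.
function shouldFlushAfterRun(autoFlush: boolean | undefined, env: Env): boolean {
  return autoFlush ?? looksServerless(env);
}
```

In a real process you would pass `process.env` as the `env` argument.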

If you track events outside of session.run(), you still need await ai.flush() before your handler returns:

export async function handler(event: APIGatewayEvent) {
  const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });
  const agent = ai.agent('api-handler', {
    userId: event.requestContext.authorizer?.userId,
  });

  const session = agent.session();

  // Parse the request body once and reuse the message
  const message: string = JSON.parse(event.body ?? '{}').message ?? '';

  const result = await session.run(async (s) => {
    s.trackUserMessage(message);

    const start = performance.now();
    const response = await callLLM(message);
    const latencyMs = performance.now() - start;

    s.trackAiMessage(response.content, response.model, 'openai', latencyMs, {
      inputTokens: response.usage.prompt_tokens,
      outputTokens: response.usage.completion_tokens,
    });

    return response.content;
  });

  await ai.flush(); // Without this, events may be lost
  return { statusCode: 200, body: JSON.stringify({ response: result }) };
}

Flush vs Shutdown

  • ai.flush(): sends all buffered events and returns a promise. Use in serverless handlers and API endpoints where you need to ensure delivery before responding.
  • ai.shutdown(): flushes and closes the underlying Amplitude client. Only needed if you created the client via apiKey (not when passing your own instance). Call on process exit (e.g., SIGTERM handler).

process.on('SIGTERM', () => {
  ai.shutdown();
  process.exit(0);
});
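
If you want to bound how long a shutdown may take, a small timeout helper is one option. The sketch below is generic and does not depend on the SDK: `settleWithin` is a hypothetical helper, and the wiring in the trailing comment assumes `ai.shutdown()` may return a promise, which this README does not state.

```typescript
// Resolve with 'done' once `work` settles (fulfilled or rejected),
// or with 'timeout' if it takes longer than `ms` milliseconds.
async function settleWithin(
  work: Promise<unknown>,
  ms: number,
): Promise<'done' | 'timeout'> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<'timeout'>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), ms);
  });
  try {
    return await Promise.race([
      // Swallow rejections: a failed flush should not block process exit.
      work.then(() => 'done' as const, () => 'done' as const),
      timeout,
    ]);
  } finally {
    clearTimeout(timer);
  }
}

// Hypothetical wiring (assumption: shutdown may be async):
// process.on('SIGTERM', async () => {
//   await settleWithin(Promise.resolve(ai.shutdown()), 5000);
//   process.exit(0);
// });
```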

Error Handling and Reliability

  • Non-throwing: All track* methods catch and log errors internally. Your application code is never interrupted by tracking failures.
  • Buffering: Events are buffered and sent in batches by the underlying @amplitude/analytics-node SDK.
  • Retry: Failed sends are automatically retried by the transport layer.
  • Validation: Enable validate: true in AIConfig to get early validation errors for missing required fields (userId, sessionId, etc.). Validation errors throw ValidationError so you can catch them during development.
  • Graceful degradation: If the Amplitude service is unreachable, events are silently dropped after retries are exhausted. Your LLM application continues operating normally.

import { AIConfig, AmplitudeAI, ValidationError } from '@amplitude/ai';

const ai = new AmplitudeAI({
  apiKey: 'xxx',
  config: new AIConfig({ validate: true }),
});

try {
  ai.trackUserMessage({ userId: '', content: 'Hello', sessionId: 'sess-1' });
} catch (e) {
  if (e instanceof ValidationError) {
    console.error('Invalid tracking call:', e.message);
    // "userId must be a non-empty string, got "
  }
}
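
The non-throwing behavior listed above follows a common pattern: run the tracking call inside a try/catch and log instead of rethrowing. A minimal sketch of that pattern, with `nonThrowing` as a hypothetical helper (the SDK's internals may differ):

```typescript
// Wrap a tracking function so that failures are logged, never thrown.
function nonThrowing<Args extends unknown[]>(
  track: (...args: Args) => void,
  log: (message: string) => void = console.error,
): (...args: Args) => void {
  return (...args: Args) => {
    try {
      track(...args);
    } catch (err) {
      // Report the failure without interrupting the caller.
      log(`tracking failed: ${err instanceof Error ? err.message : String(err)}`);
    }
  };
}
```

The same pattern is useful for your own analytics helpers that sit next to LLM calls, so that a bug in instrumentation code cannot fail a user request.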

Testing

Use MockAmplitudeAI for unit tests:

import { MockAmplitudeAI } from '@amplitude/ai';

const mock = new MockAmplitudeAI();

const agent = mock.agent('test-agent', { userId: 'user-1' });
const session = agent.session({ sessionId: 'sess-1', userId: 'user-1' });

await session.run(async (s) => {
  s.trackUserMessage('Hello');
  s.trackAiMessage('Hi!', 'gpt-4', 'openai', 100);
});

mock.assertEventTracked('[Agent] User Message', { userId: 'user-1' });
mock.assertEventTracked('[Agent] AI Response', { userId: 'user-1' });
mock.assertSessionClosed('sess-1');

mock.reset();

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| No events in Amplitude | API key not set or incorrect | Run amplitude-ai doctor; it checks AMPLITUDE_AI_API_KEY and reports a fix command |
| Events tracked but [Agent] Cost USD is $0 | Model not in the pricing database, or total_cost_usd not passed | Pass totalCostUsd explicitly, or check that @pydantic/genai-prices / genai-prices is installed |
| patch() doesn't instrument calls | patch() called after the provider client was created | Call patch() before importing or instantiating provider clients |
| Session context missing on events | LLM calls made outside session.run() | Wrap your LLM calls inside session.run(async () => { ... }) |
| flush() hangs or times out in serverless | Process exits before flush completes | Use await ai.flush() before returning from your Lambda/Cloud Function handler |
| wrap() TypeScript type errors | Passing a non-supported client type | wrap() only supports OpenAI, AzureOpenAI, and Anthropic clients; use provider classes for others |
| MockAmplitudeAI events are empty | Tracking calls not inside a session context | Use mock.agent(...).session(...).run(...) to wrap tracked calls |
| Cannot find module 'openai' in Turbopack/Webpack | Bundler rewrites import.meta.url, breaking dynamic require() | Pass the provider module directly: new OpenAI({ amplitude: ai, apiKey, openaiModule: OpenAISDK }). Same pattern for Anthropic, Gemini, etc. See each provider's <name>Module option. |

Run amplitude-ai doctor for automated environment diagnostics with fix suggestions.

Context Propagation

For distributed tracing, inject context into outgoing request headers and extract on the receiving side:

import { randomUUID } from 'node:crypto';
import {
  extractContext,
  injectContext,
  runWithContextAsync,
  SessionContext,
} from '@amplitude/ai';

// Outgoing request
const headers = injectContext();
fetch(url, { headers });

// Receiving side
const extracted = extractContext(req.headers);
const ctx = new SessionContext({
  sessionId: extracted.sessionId ?? randomUUID(),
  traceId: extracted.traceId ?? null,
  userId: extracted.userId ?? null,
});

await runWithContextAsync(ctx, async () => {
  // Context available via getActiveContext()
});
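
Conceptually, inject and extract are a pair of functions that map context fields to header values and back. The sketch below shows that round trip under stated assumptions: the header names, `injectSketch`, and `extractSketch` are hypothetical and are not the names the SDK uses on the wire.

```typescript
interface PropagatedContext {
  sessionId?: string;
  traceId?: string;
  userId?: string;
}

// Hypothetical header names, chosen for this sketch only.
const HEADER_NAMES = {
  sessionId: 'x-example-session-id',
  traceId: 'x-example-trace-id',
  userId: 'x-example-user-id',
} as const;

const FIELDS = Object.keys(HEADER_NAMES) as (keyof PropagatedContext)[];

// Outgoing side: copy each defined field into its header.
function injectSketch(ctx: PropagatedContext): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const key of FIELDS) {
    const value = ctx[key];
    if (value !== undefined) headers[HEADER_NAMES[key]] = value;
  }
  return headers;
}

// Receiving side: read each header back; repeated headers use the first value.
function extractSketch(
  headers: Record<string, string | string[] | undefined>,
): PropagatedContext {
  const ctx: PropagatedContext = {};
  for (const key of FIELDS) {
    const raw = headers[HEADER_NAMES[key]];
    const value = Array.isArray(raw) ? raw[0] : raw;
    if (value) ctx[key] = value;
  }
  return ctx;
}
```

Missing fields stay undefined on the receiving side, which is why the example above falls back to `randomUUID()` for the session ID and `null` for the others.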

Middleware

Express-compatible middleware for automatic session tracking:

import { randomUUID } from 'node:crypto';
import { AmplitudeAI, createAmplitudeAIMiddleware } from '@amplitude/ai';
import express from 'express';

const ai = new AmplitudeAI({ apiKey: process.env.AMPLITUDE_AI_API_KEY! });

const app = express();
app.use(
  createAmplitudeAIMiddleware({
    amplitudeAI: ai,
    userIdResolver: (req) =>
      (req as { headers: { 'x-user-id'?: string } }).headers['x-user-id'] ??
      null,
    sessionIdResolver: (req) =>
      (req as { headers: { 'x-session-id'?: string } }).headers[
        'x-session-id'
      ] ?? randomUUID(),
    agentId: 'api-server',
    env: process.env.NODE_ENV ?? 'development',
  }),
);

app.post('/chat', async (req, res) => {
  // Session context available; trackUserMessage/trackAiMessage inherit sessionId, traceId
});

Bulk Conversation Import

Use trackConversation() to import an entire conversation history in one call. Each message in the array is tracked as either an [Agent] User Message or an [Agent] AI Response event, with turn IDs auto-incremented:

import { trackConversation } from '@amplitude/ai';
import * as amplitude from '@amplitude/analytics-node';

trackConversation({
  amplitude,
  userId: 'user-123',
  sessionId: 'sess-abc',
  agentId: 'support-bot',
  messages: [
    { role: 'user', content: 'How do I reset my password?' },
    {
      role: 'assistant',
      content: 'Go to Settings > Security > Reset Password.',
      model: 'gpt-4o',
      provider: 'openai',
      latency_ms: 1200,
      input_tokens: 15,
      output_tokens: 42,
      total_cost_usd: 0.002,
    },
    { role: 'user', content: 'Thanks, that worked!' },
    {
      role: 'assistant',
      content: 'Glad I could help!',
      model: 'gpt-4o',
      provider: 'openai',
      latency_ms: 800,
      input_tokens: 10,
      output_tokens: 8,
    },
  ],
});

This is useful for backfilling historical conversations or importing data from external systems. The function accepts all the same context fields (agentId, env, customerOrgId, etc.) as the individual tracking methods.
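
The role-to-event mapping described above can be pictured as follows. This is a sketch of the idea, not the function's implementation: `planConversationEvents` is a hypothetical name, and numbering turns per message starting at 1 is an assumption.

```typescript
interface ConversationMessage {
  role: 'user' | 'assistant';
  content: string;
}

interface PlannedEvent {
  eventType: '[Agent] User Message' | '[Agent] AI Response';
  turnId: number;
  content: string;
}

// Map each message to its event type and assign an increasing turn id.
function planConversationEvents(
  messages: ConversationMessage[],
  firstTurnId = 1, // assumed starting value
): PlannedEvent[] {
  return messages.map((message, index): PlannedEvent => ({
    eventType: message.role === 'user' ? '[Agent] User Message' : '[Agent] AI Response',
    turnId: firstTurnId + index,
    content: message.content,
  }));
}
```

Running a historical transcript through a function like this before import is a cheap way to check that roles alternate as expected and that no message is empty.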

Event Schema

| Event Type | Source | Description |
| --- | --- | --- |
| [Agent] User Message | SDK | User sent a message |
| [Agent] AI Response | SDK | AI model returned a response |
| [Agent] Tool Call | SDK | Tool/function was invoked |
| [Agent] Embedding | SDK | Embedding was generated |
| [Agent] Span | SDK | Span (e.g. RAG step, transform) |
| [Agent] Session End | SDK | Session ended |
| [Agent] Session Enrichment | SDK | Session-level enrichment data |
| [Agent] Score | Both | Evaluation score (quality, sentiment, etc.) |
| [Agent] Session Evaluation | Server | Session-level summary: outcome, turn count, flags, cost. Emitted automatically. |
| [Agent] Topic Classification | Server | One event per topic model per session. Emitted automatically. |

Event Property Reference

All event properties are prefixed with [Agent] (except [Amplitude] Session Replay ID). This reference is auto-generated and matches what gets registered in Amplitude's data catalog via the amplitude-ai-register-catalog CLI.

Common Properties (present on all SDK events)

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Session ID | string | Yes | Unique session identifier. All events in one conversation share the same session ID. |
| [Agent] Trace ID | string | No | Identifies one user-message-to-AI-response cycle within a session. |
| [Agent] Turn ID | number | No | Monotonically increasing counter for event ordering within a session. |
| [Agent] Agent ID | string | No | Identifies which AI agent handled the interaction (e.g., 'support-bot', 'houston'). |
| [Agent] Parent Agent ID | string | No | For multi-agent orchestration: the agent that delegated to this agent. |
| [Agent] Customer Org ID | string | No | Organization ID for multi-tenant platforms. Enables account-level group analytics. |
| [Agent] Agent Version | string | No | Agent code version (e.g., 'v4.2'). Enables version-over-version quality comparison. |
| [Agent] Agent Description | string | No | Human-readable description of the agent's purpose (e.g., 'Handles user chat requests via OpenAI GPT-4o'). Enables observability-driven agent registry from event streams. |
| [Agent] Context | string | No | Serialized JSON dict of arbitrary segmentation dimensions (experiment_variant, surface, feature_flag, prompt_revision, etc.). |
| [Agent] Env | string | No | Deployment environment: 'production', 'staging', or 'dev'. |
| [Agent] SDK Version | string | Yes | Version of the amplitude-ai SDK that produced this event. |
| [Agent] Runtime | string | Yes | SDK runtime: 'python' or 'node'. |

User Message Properties

Event-specific properties for [Agent] User Message (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Message ID | string | Yes | Unique identifier for this message event (UUID). Used to link scores and tool calls back to specific messages. |
| [Agent] Component Type | string | Yes | Type of component that produced this event: 'user_input', 'llm', 'tool', 'embedding'. |
| [Agent] Locale | string | No | User locale (e.g., 'en-US'). |
| [Amplitude] Session Replay ID | string | No | Links to Amplitude Session Replay (format: device_id/session_id). Enables one-click navigation from AI session to browser replay. |
| [Agent] Is Regeneration | boolean | No | Whether the user requested the AI regenerate a previous response. |
| [Agent] Is Edit | boolean | No | Whether the user edited a previous message and resubmitted. |
| [Agent] Edited Message ID | string | No | The message_id of the original message that was edited (links the edit to the original). |
| [Agent] Has Attachments | boolean | No | Whether this message includes file attachments (uploads, images, etc.). |
| [Agent] Attachment Types | string[] | No | Distinct attachment types (e.g., 'pdf', 'image', 'csv'). Serialized JSON array. |
| [Agent] Attachment Count | number | No | Number of file attachments included with this message. |
| [Agent] Total Attachment Size Bytes | number | No | Total size of all attachments in bytes. |
| [Agent] Attachments | string | No | Serialized JSON array of attachment metadata (type, name, size_bytes, mime_type). Only metadata, never file content. |
| [Agent] Message Labels | string | No | Serialized JSON array of MessageLabel objects (key-value pairs with optional confidence). Used for routing tags, classifier output, business context. |
| [Agent] Message Source | string | No | Origin of the user message: 'user' for real end-user input, 'agent' for inter-agent delegation (parent agent sending instructions to a child agent). Automatically set by provider wrappers based on parent_agent_id context. |

AI Response Properties

Event-specific properties for [Agent] AI Response (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Message ID | string | Yes | Unique identifier for this message event (UUID). Used to link scores and tool calls back to specific messages. |
| [Agent] Component Type | string | Yes | Type of component that produced this event: 'user_input', 'llm', 'tool', 'embedding'. |
| [Agent] Model Name | string | Yes | LLM model identifier (e.g., 'gpt-4o', 'claude-sonnet-4-20250514'). |
| [Agent] Provider | string | Yes | LLM provider name (e.g., 'openai', 'anthropic', 'google', 'mistral', 'bedrock'). |
| [Agent] Latency Ms | number | Yes | Total wall-clock latency in milliseconds for this operation. |
| [Agent] Is Error | boolean | Yes | Whether this event represents an error condition. |
| [Agent] Error Message | string | No | Error message text when Is Error is true. |
| [Agent] Locale | string | No | User locale (e.g., 'en-US'). |
| [Agent] Span Kind | string | No | Classification of the span type for OTEL bridge compatibility. |
| [Amplitude] Session Replay ID | string | No | Links to Amplitude Session Replay (format: device_id/session_id). Enables one-click navigation from AI session to browser replay. |
| [Agent] TTFB Ms | number | No | Time to first byte/token in milliseconds. Measures perceived responsiveness for streaming. |
| [Agent] Input Tokens | number | No | Number of input/prompt tokens consumed by this LLM call. |
| [Agent] Output Tokens | number | No | Number of output/completion tokens generated by this LLM call. |
| [Agent] Total Tokens | number | No | Total tokens consumed (input + output). |
| [Agent] Reasoning Tokens | number | No | Tokens consumed by reasoning/thinking (o1, o3, extended thinking models). |
| [Agent] Cache Read Tokens | number | No | Input tokens served from the provider's prompt cache (cheaper rate). Used for cache-aware cost calculation. |
| [Agent] Cache Creation Tokens | number | No | Input tokens that created new prompt cache entries. |
| [Agent] Cost USD | number | No | Estimated cost in USD for this LLM call. Cache-aware when cache token counts are provided. |
| [Agent] Finish Reason | string | No | Why the model stopped generating: 'stop', 'end_turn', 'tool_use', 'length', 'content_filter', etc. |
| [Agent] Tool Calls | string | No | Serialized JSON array of tool call requests made by the AI in this response. |
| [Agent] Has Reasoning | boolean | No | Whether the AI response included reasoning/thinking content. |
| [Agent] Reasoning Content | string | No | The AI's reasoning/thinking content (when available and content_mode permits). |
| [Agent] System Prompt | string | No | The system prompt used for this LLM call (when content_mode permits). Chunked for long prompts. |
| [Agent] System Prompt Length | number | No | Character length of the system prompt. |
| [Agent] Tool Definitions | string | No | Normalized JSON array of tool definitions sent to the LLM (when content_mode permits). Each entry contains name, description, and parameters schema. |
| [Agent] Tool Definitions Count | number | No | Number of tool definitions in the LLM request. |
| [Agent] Tool Definitions Hash | string | No | Stable SHA-256 hash of the normalized tool definitions. Always present regardless of content_mode; enables toolset change detection without exposing schemas. |
| [Agent] Temperature | number | No | Temperature parameter used for this LLM call. |
| [Agent] Max Output Tokens | number | No | Maximum output tokens configured for this LLM call. |
| [Agent] Top P | number | No | Top-p (nucleus sampling) parameter used for this LLM call. |
| [Agent] Is Streaming | boolean | No | Whether this response was generated via streaming. |
| [Agent] Prompt ID | string | No | Identifier for the prompt template or version used. |
| [Agent] Was Copied | boolean | No | Whether the user copied this AI response content. An implicit positive quality signal. |
| [Agent] Was Cached | boolean | No | Whether this response was served from a semantic/full-response cache (distinct from token-level prompt caching). |
| [Agent] Model Tier | string | No | Model tier classification: 'fast' (GPT-4o-mini, Haiku, Flash), 'standard' (GPT-4o, Sonnet, Pro), or 'reasoning' (o1, o3, DeepSeek-R1). Auto-inferred from model name. |
| [Agent] Has Attachments | boolean | No | Whether this AI response includes generated attachments (images, charts, files). |
| [Agent] Attachment Types | string[] | No | Distinct attachment types in this AI response. Serialized JSON array. |
| [Agent] Attachment Count | number | No | Number of attachments generated by the AI in this response. |
| [Agent] Total Attachment Size Bytes | number | No | Total size of all AI-generated attachments in bytes. |
| [Agent] Attachments | string | No | Serialized JSON array of AI-generated attachment metadata. |
| [Agent] Message Labels | string | No | Serialized JSON array of MessageLabel objects attached to this AI response. |
| [Agent] Message Label Map | string | No | Serialized JSON map of label key to value for quick lookup. |

Tool Call Properties

Event-specific properties for [Agent] Tool Call (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Component Type | string | Yes | Type of component that produced this event: 'user_input', 'llm', 'tool', 'embedding'. |
| [Agent] Latency Ms | number | Yes | Total wall-clock latency in milliseconds for this operation. |
| [Agent] Is Error | boolean | Yes | Whether this event represents an error condition. |
| [Agent] Error Message | string | No | Error message text when Is Error is true. |
| [Agent] Locale | string | No | User locale (e.g., 'en-US'). |
| [Agent] Span Kind | string | No | Classification of the span type for OTEL bridge compatibility. |
| [Amplitude] Session Replay ID | string | No | Links to Amplitude Session Replay (format: device_id/session_id). Enables one-click navigation from AI session to browser replay. |
| [Agent] Invocation ID | string | Yes | Unique identifier for this tool invocation (UUID). Used to link tool calls to parent messages. |
| [Agent] Tool Name | string | Yes | Name of the tool/function that was invoked (e.g., 'search_docs', 'web_search'). |
| [Agent] Tool Success | boolean | Yes | Whether the tool call completed successfully. |
| [Agent] Tool Input | string | No | Serialized JSON of the tool's input arguments. Only sent when content_mode='full'. |
| [Agent] Tool Output | string | No | Serialized JSON of the tool's output/return value. Only sent when content_mode='full'. |
| [Agent] Parent Message ID | string | No | The message_id of the user message that triggered this tool call. Links the tool call into the event graph. |

Embedding Properties

Event-specific properties for [Agent] Embedding (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Component Type | string | Yes | Type of component that produced this event: 'user_input', 'llm', 'tool', 'embedding'. |
| [Agent] Model Name | string | Yes | LLM model identifier (e.g., 'gpt-4o', 'claude-sonnet-4-20250514'). |
| [Agent] Provider | string | Yes | LLM provider name (e.g., 'openai', 'anthropic', 'google', 'mistral', 'bedrock'). |
| [Agent] Latency Ms | number | Yes | Total wall-clock latency in milliseconds for this operation. |
| [Agent] Span ID | string | Yes | Unique identifier for this embedding operation (UUID). |
| [Agent] Input Tokens | number | No | Number of input tokens processed by the embedding model. |
| [Agent] Embedding Dimensions | number | No | Dimensionality of the output embedding vector. |
| [Agent] Cost USD | number | No | Estimated cost in USD for this embedding operation. |

Span Properties

Event-specific properties for [Agent] Span (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Latency Ms | number | Yes | Total wall-clock latency in milliseconds for this operation. |
| [Agent] Is Error | boolean | Yes | Whether this event represents an error condition. |
| [Agent] Error Message | string | No | Error message text when Is Error is true. |
| [Agent] Span ID | string | Yes | Unique identifier for this span (UUID). |
| [Agent] Span Name | string | Yes | Name of the operation (e.g., 'rag_pipeline', 'vector_search', 'rerank'). |
| [Agent] Parent Span ID | string | No | Span ID of the parent span for nested pipeline steps. |
| [Agent] Input State | string | No | Serialized JSON of the span's input state. Only sent when content_mode='full'. |
| [Agent] Output State | string | No | Serialized JSON of the span's output state. Only sent when content_mode='full'. |

Session End Properties

Event-specific properties for [Agent] Session End (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Enrichments | string | No | Serialized JSON of SessionEnrichments (topic classifications, rubric scores, outcome, flags). Attached when enrichments are provided at session close. |
| [Agent] Abandonment Turn | number | No | Turn ID of the last user message that received an AI response before the user left. Low values (e.g., 1) strongly signal first-response dissatisfaction. |
| [Agent] Session Idle Timeout Minutes | number | No | Custom idle timeout for this session (default 30 min). Tells the server how long to wait before auto-closing. |

Session Enrichment Properties

Event-specific properties for [Agent] Session Enrichment (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Enrichments | string | Yes | Serialized JSON of SessionEnrichments: topic_classifications, rubrics, overall_outcome, quality_score, sentiment_score, boolean flags, agent chain metadata, and message labels. |

Score Properties

Event-specific properties for [Agent] Score (in addition to common properties above).

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Score Name | string | Yes | Name of the score (e.g., 'user-feedback', 'task_completion', 'accuracy', 'groundedness'). |
| [Agent] Score Value | number | Yes | Numeric score value. Binary (0/1), continuous (0.0-1.0), or rating scale (1-5). |
| [Agent] Target ID | string | Yes | The message_id or session_id being scored. |
| [Agent] Target Type | string | Yes | What is being scored: 'message' or 'session'. |
| [Agent] Evaluation Source | string | Yes | Source of the evaluation: 'user' (end-user feedback), 'ai' (automated/server pipeline), or 'reviewer' (human expert). |
| [Agent] Comment | string | No | Optional text explanation for the score (respects content_mode). |
| [Agent] Taxonomy Version | string | No | Which taxonomy config version produced this enrichment (from ai_category_config.config_version_id). |
| [Agent] Evaluated At | number | No | Epoch milliseconds when this enrichment/evaluation was computed. |
| [Agent] Score Label | string | No | Direction-neutral magnitude label derived from score value. Default 5-tier: very_high (>=0.8), high (>=0.6), moderate (>=0.4), low (>=0.2), very_low (>=0.0). Server-side only. |
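
The default 5-tier thresholds for [Agent] Score Label translate directly into a threshold ladder. The sketch below is only a client-side illustration of those thresholds, since the label itself is computed server-side. `scoreLabel` is a hypothetical name. The sketch assumes a 0.0-1.0 score and treats anything below 0.2 as very_low.

```typescript
type ScoreLabel = 'very_high' | 'high' | 'moderate' | 'low' | 'very_low';

// Map a 0.0-1.0 score to the default 5-tier label.
function scoreLabel(value: number): ScoreLabel {
  if (value >= 0.8) return 'very_high';
  if (value >= 0.6) return 'high';
  if (value >= 0.4) return 'moderate';
  if (value >= 0.2) return 'low';
  return 'very_low';
}
```

Scores on a 1-5 rating scale would need to be normalized to 0.0-1.0 before a mapping like this applies. How the server handles that case is not described here.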

Server-Side: Session Evaluation Properties

[Agent] Session Evaluation is emitted automatically by the server-side enrichment pipeline — do not send this event from your code.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Session ID | string | Yes | Unique session identifier. All events in one conversation share the same session ID. |
| [Agent] Agent ID | string | Yes | Identifies which AI agent handled the interaction (e.g., 'support-bot', 'houston'). |
| [Agent] Customer Org ID | string | Yes | Organization ID for multi-tenant platforms. Enables account-level group analytics. |
| [Agent] Evaluation Source | string | Yes | Source of the evaluation: 'user' (end-user feedback), 'ai' (automated/server pipeline), or 'reviewer' (human expert). |
| [Agent] Taxonomy Version | string | Yes | Which taxonomy config version produced this enrichment (from ai_category_config.config_version_id). |
| [Agent] Evaluated At | number | Yes | Epoch milliseconds when this enrichment/evaluation was computed. |
| [Agent] Overall Outcome | string | Yes | Session outcome classification: 'success', 'partial_success', 'failure', 'abandoned', 'response_provided', etc. |
| [Agent] Turn Count | number | Yes | Number of conversation turns in this session. |
| [Agent] Session Total Tokens | number | No | Total LLM tokens consumed across all turns in this session. |
| [Agent] Session Avg Latency Ms | number | No | Average AI response latency in milliseconds across the session. |
| [Agent] Request Complexity | string | No | Complexity classification of the user's request: 'simple', 'moderate', 'complex', or 'ambiguous'. |
| [Agent] Has Task Failure | boolean | Yes | Whether the agent failed to complete the user's request. |
| [Agent] Has Negative Feedback | boolean | Yes | Whether the user expressed dissatisfaction during the session. |
| [Agent] Has Technical Failure | boolean | Yes | Whether technical errors occurred (tool timeouts, API failures, etc.). |
| [Agent] Has Data Quality Issues | boolean | Yes | Whether the AI output had data quality problems (wrong data, hallucinations, etc.). |
| [Agent] Models Used | string[] | No | LLM models used in this session. JSON array of strings. |
| [Agent] Root Agent Name | string | No | Entry-point agent in multi-agent flows. |
| [Agent] Agent Chain Depth | number | No | Number of agents in the delegation chain. |
| [Agent] Task Failure Type | string | No | Specific failure type when has_task_failure is true (e.g., 'wrong_answer', 'unable_to_complete'). |
| [Agent] Technical Error Count | number | No | Count of technical errors that occurred during the session. |
| [Agent] Error Categories | string[] | No | Categorized error types (e.g., 'chart_not_found', 'timeout'). JSON array of strings. |
| [Agent] Behavioral Patterns | string[] | No | Detected behavioral anti-patterns (e.g., 'retry_storm', 'clarification_loop', 'early_abandonment'). JSON array of strings. |
| [Agent] Session Cost USD | number | No | Total LLM cost in USD for this AI session (aggregated from per-message costs). |
| [Agent] Enrichment Cost USD | number | No | Cost in USD of running the enrichment pipeline's LLM inference for this session. Distinct from the session's own LLM cost. |
| [Agent] Quality Score | number | No | Overall quality score (0.0-1.0) computed by the enrichment pipeline for this session. |
| [Agent] Sentiment Score | number | No | User sentiment score (0.0-1.0) inferred from the conversation by the enrichment pipeline. |
| [Agent] Task Failure Reason | string | No | Explanation of why the task failed when has_task_failure is true (e.g., 'chart data source unavailable'). |
| [Agent] Agent Chain | string[] | No | Serialized JSON array of agent IDs representing the delegation chain in multi-agent flows. |
| [Agent] Project ID | string | No | Amplitude project ID that owns the AI session being evaluated. |
| [Agent] Has User Feedback | boolean | Yes | Whether the session received explicit user feedback (thumbs up/down, rating). |
| [Agent] User Score | number | No | Aggregate user feedback score for the session (0.0-1.0). Present only when has_user_feedback is true. |
| [Agent] Agent Version | string | No | Agent code version (e.g., 'v4.2'). Enables version-over-version quality comparison. |
| [Agent] Agent Description | string | No | Human-readable description of the agent's purpose (e.g., 'Handles user chat requests via OpenAI GPT-4o'). Enables observability-driven agent registry from event streams. |

Server-Side: Topic Classification Properties

[Agent] Topic Classification is emitted automatically by the server-side enrichment pipeline — do not send this event from your code.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| [Agent] Session ID | string | Yes | Unique session identifier. All events in one conversation share the same session ID. |
| [Agent] Agent ID | string | Yes | Identifies which AI agent handled the interaction (e.g., 'support-bot', 'houston'). |
| [Agent] Customer Org ID | string | Yes | Organization ID for multi-tenant platforms. Enables account-level group analytics. |
| [Agent] Evaluation Source | string | Yes | Source of the evaluation: 'user' (end-user feedback), 'ai' (automated/server pipeline), or 'reviewer' (human expert). |
| [Agent] Taxonomy Version | string | Yes | Which taxonomy config version produced this enrichment (from ai_category_config.config_version_id). |
| [Agent] Evaluated At | number | Yes | Epoch milliseconds when this enrichment/evaluation was computed. |
| [Agent] Topic | string | Yes | Which topic model this classification is for (e.g., 'product_area', 'query_intent', 'error_domain'). |
| [Agent] Selection Mode | string | Yes | Whether this topic model uses 'single' (MECE) or 'multiple' (multi-label) selection. |
| [Agent] Primary | string | No | Primary classification value (e.g., 'charts', 'billing_issues'). |
| [Agent] Secondary | string[] | No | Secondary classifications for multi-label topics. JSON array of strings. |
| [Agent] Subcategories | string[] | No | Subcategories for finer classification within the primary topic (e.g., 'TREND_ANALYSIS', 'WRONG_EVENT'). JSON array of strings. |

Event JSON Examples

[Agent] AI Response

A realistic example of what gets sent to Amplitude for an AI response:

{
  "event_type": "[Agent] AI Response",
  "user_id": "user-42",
  "event_properties": {
    "[Agent] Session ID": "sess-abc123",
    "[Agent] Trace ID": "trace-def456",
    "[Agent] Turn ID": 2,
    "[Agent] Message ID": "msg-789xyz",
    "[Agent] Model Name": "gpt-4o",
    "[Agent] Provider": "openai",
    "[Agent] Model Tier": "standard",
    "[Agent] Latency Ms": 1203,
    "[Agent] Input Tokens": 150,
    "[Agent] Output Tokens": 847,
    "[Agent] Total Tokens": 997,
    "[Agent] Cost USD": 0.0042,
    "[Agent] Is Error": false,
    "[Agent] Finish Reason": "stop",
    "[Agent] Is Streaming": false,
    "[Agent] Component Type": "llm",
    "[Agent] Agent ID": "support-bot",
    "[Agent] Env": "production",
    "[Agent] SDK Version": "0.1.0",
    "[Agent] Runtime": "node"
  }
}

[Agent] User Message

{
  "event_type": "[Agent] User Message",
  "user_id": "user-42",
  "event_properties": {
    "[Agent] Session ID": "sess-abc123",
    "[Agent] Turn ID": 1,
    "[Agent] Message ID": "msg-123abc",
    "[Agent] Component Type": "user_input",
    "[Agent] Agent ID": "support-bot",
    "[Agent] Env": "production",
    "[Agent] SDK Version": "0.1.0",
    "[Agent] Runtime": "node",
    "$llm_message": {
      "text": "How do I reset my password?"
    }
  }
}

[Agent] Tool Call

{
  "event_type": "[Agent] Tool Call",
  "user_id": "user-42",
  "event_properties": {
    "[Agent] Session ID": "sess-abc123",
    "[Agent] Turn ID": 3,
    "[Agent] Invocation ID": "inv-456def",
    "[Agent] Tool Name": "search_knowledge_base",
    "[Agent] Tool Success": true,
    "[Agent] Is Error": false,
    "[Agent] Latency Ms": 340,
    "[Agent] Component Type": "tool",
    "[Agent] Agent ID": "support-bot",
    "[Agent] Tool Input": "{\"query\":\"password reset instructions\"}",
    "[Agent] Tool Output": "{\"results\":[{\"title\":\"Password Reset Guide\"}]}",
    "[Agent] SDK Version": "0.1.0",
    "[Agent] Runtime": "node"
  }
}

[Agent] Score

{
  "event_type": "[Agent] Score",
  "user_id": "user-42",
  "event_properties": {
    "[Agent] Score Name": "thumbs-up",
    "[Agent] Score Value": 1,
    "[Agent] Target ID": "msg-789xyz",
    "[Agent] Target Type": "message",
    "[Agent] Evaluation Source": "user",
    "[Agent] Session ID": "sess-abc123",
    "[Agent] Agent ID": "support-bot",
    "[Agent] SDK Version": "0.1.0",
    "[Agent] Runtime": "node"
  }
}

Sending Events Without the SDK

The [Agent] event schema is not tied to this SDK. If your stack doesn't have an Amplitude AI SDK, you can send the same events directly via Amplitude's ingestion APIs.

What the SDK handles for you

When you use this SDK, the following are managed automatically. If you send events directly, you are responsible for these:

| Concern | SDK behavior | DIY equivalent |
| --- | --- | --- |
| Session ID | Generated once per session() and propagated to every event | Generate a UUID per conversation and include it as [Agent] Session ID on every event |
| Deduplication | Automatic insert_id on each event | Set a unique insert_id per event to prevent duplicates on retry |
| Property prefixing | All properties are prefixed with [Agent] | You must include the [Agent] prefix in every property name |
| Cost / token calculation | Auto-computed from model and token counts | Compute and send [Agent] Cost USD, [Agent] Input Tokens, etc. yourself |
| Server-side enrichment | [Agent] Session Evaluation, [Agent] Topic Classification, and [Agent] Score events are emitted automatically by the enrichment pipeline after [Agent] Session End | These fire automatically; you do not need to send them. Just send the SDK-level events and close the session with [Agent] Session End. |
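
The DIY responsibilities above can be sketched in a few lines of TypeScript. This is only an illustration, not the SDK's implementation: `createSession` and `prefixProps` are hypothetical helper names, and the choice to leave keys starting with `$` unprefixed is an assumption based on the `$llm_message` property in the User Message example. Check the Event Schema tables for the properties each event type requires.

```typescript
import { randomUUID } from 'node:crypto';

// Shape of one event in an ingestion request (only the fields used in this README).
interface AgentEvent {
  event_type: string;
  user_id: string;
  insert_id: string;
  event_properties: Record<string, unknown>;
}

// Hypothetical helper: add the "[Agent] " prefix unless the key already has it.
// Assumption: keys starting with "$" (such as "$llm_message") stay unprefixed.
function prefixProps(props: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(props)) {
    const keepAsIs = key.startsWith('[Agent] ') || key.startsWith('$');
    out[keepAsIs ? key : `[Agent] ${key}`] = value;
  }
  return out;
}

// Hypothetical helper: one session ID per conversation, one insert_id per event.
function createSession(userId: string, agentId: string) {
  const sessionId = randomUUID();
  return {
    sessionId,
    event(name: string, props: Record<string, unknown> = {}): AgentEvent {
      return {
        event_type: `[Agent] ${name}`,
        user_id: userId,
        insert_id: randomUUID(), // reuse this same value if the event is retried
        event_properties: prefixProps({
          'Session ID': sessionId,
          'Agent ID': agentId,
          ...props,
        }),
      };
    },
  };
}

const session = createSession('user-42', 'support-bot');
const userMessage = session.event('User Message', {
  'Turn ID': 1,
  'Message ID': 'msg-001',
  $llm_message: { text: 'How do I reset my password?' },
});
const sessionEnd = session.event('Session End');

console.log(JSON.stringify([userMessage, sessionEnd], null, 2));
```

Because `insert_id` is generated when the event object is created, a retry that resends the same object keeps the same ID, which is what lets deduplication work.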

Ingestion methods

| Method | Best for | Docs |
| --- | --- | --- |
| HTTP V2 API | Real-time, low-to-medium volume | HTTP V2 API docs |
| Batch Event Upload API | High volume, backfills | Batch API docs |
| Amazon S3 Import | Bulk historical import, warehouse-first workflows | S3 Import docs |

Minimal HTTP API example

curl -X POST https://api2.amplitude.com/2/httpapi \
  -H 'Content-Type: application/json' \
  -d '{
    "api_key": "YOUR_API_KEY",
    "events": [
      {
        "event_type": "[Agent] User Message",
        "user_id": "user-42",
        "insert_id": "evt-unique-id-1",
        "event_properties": {
          "[Agent] Session ID": "sess-abc123",
          "[Agent] Trace ID": "trace-def456",
          "[Agent] Turn ID": 1,
          "[Agent] Agent ID": "support-bot",
          "[Agent] Message ID": "msg-001"
        }
      },
      {
        "event_type": "[Agent] AI Response",
        "user_id": "user-42",
        "insert_id": "evt-unique-id-2",
        "event_properties": {
          "[Agent] Session ID": "sess-abc123",
          "[Agent] Trace ID": "trace-def456",
          "[Agent] Turn ID": 1,
          "[Agent] Message ID": "msg-002",
          "[Agent] Agent ID": "support-bot",
          "[Agent] Model Name": "gpt-4o",
          "[Agent] Provider": "openai",
          "[Agent] Latency Ms": 1203,
          "[Agent] Input Tokens": 150,
          "[Agent] Output Tokens": 420,
          "[Agent] Cost USD": 0.0042
        }
      }
    ]
  }'
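
For services that would rather send the same request from code, here is a minimal TypeScript sketch of the curl call above. It assumes a runtime with a global `fetch` (Node 18 or later); `buildRequest` and `sendEvents` are hypothetical names, not part of any Amplitude package.

```typescript
// Minimal event shape, mirroring the curl payload above.
interface AgentEvent {
  event_type: string;
  user_id: string;
  insert_id: string;
  event_properties: Record<string, unknown>;
}

const HTTP_V2_ENDPOINT = 'https://api2.amplitude.com/2/httpapi';

// Build the request without sending it, so the payload can be inspected or tested.
function buildRequest(apiKey: string, events: AgentEvent[]) {
  return {
    url: HTTP_V2_ENDPOINT,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ api_key: apiKey, events }),
    },
  };
}

// Hypothetical sender: throws on a non-2xx response so the caller can retry
// the batch with the same insert_id values.
async function sendEvents(apiKey: string, events: AgentEvent[]): Promise<void> {
  const { url, init } = buildRequest(apiKey, events);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Amplitude ingestion failed with HTTP ${response.status}`);
  }
}

const exampleRequest = buildRequest('YOUR_API_KEY', [
  {
    event_type: '[Agent] User Message',
    user_id: 'user-42',
    insert_id: 'evt-unique-id-1',
    event_properties: {
      '[Agent] Session ID': 'sess-abc123',
      '[Agent] Turn ID': 1,
      '[Agent] Agent ID': 'support-bot',
      '[Agent] Message ID': 'msg-001',
    },
  },
]);

console.log(exampleRequest.init.body);
```

Keeping payload construction separate from the network call makes it easy to unit-test the event shape without hitting the ingestion endpoint.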

Refer to the Event Schema and Event Property Reference tables above for required and optional properties per event type.

Register Event Schema in Your Data Catalog

Amplitude's Data Catalog documents events and properties with descriptions, types, and required flags. The @amplitude/ai package includes a tool to generate all the Taxonomy API calls for you.

Prerequisites

  • Amplitude Enterprise plan (Taxonomy API access)
  • Project API key and Secret key from Settings > Projects in your Amplitude org

Option A: Generate curl commands (JS-native, no dependencies)

The bundled CLI reads data/agent_event_catalog.json and prints executable curl commands — it makes no network requests itself.

# Preview the curl commands (uses placeholder keys)
npx amplitude-ai-register-catalog

# Generate with your real keys
npx amplitude-ai-register-catalog --api-key YOUR_KEY --secret-key YOUR_SECRET

# Pipe to bash to execute immediately
npx amplitude-ai-register-catalog --api-key YOUR_KEY --secret-key YOUR_SECRET | bash

# EU data residency
npx amplitude-ai-register-catalog --api-key YOUR_KEY --secret-key YOUR_SECRET --eu | bash

Option B: Python CLI (direct execution)

If you have Python available, the amplitude-ai package provides a CLI that calls the Taxonomy API directly with retry logic and a progress summary:

pip install amplitude-ai
amplitude-ai-register-catalog --api-key YOUR_KEY --secret-key YOUR_SECRET

What gets registered

All 10 [Agent] event types and their properties (see Event Property Reference above), organized under the "Agent Analytics" category. The commands are idempotent — safe to re-run. They create missing events/properties and update existing ones.

Utilities and Type Exports

Token and cost utilities

  • calculateCost() — Returns cost in USD when @pydantic/genai-prices is installed; otherwise returns 0 (never null).
  • countTokens(text, model?) — Uses tiktoken when available. For unknown models, tries o200k_base encoding before falling back to cl100k_base (matching the Python SDK).
  • estimateTokens(text) — Heuristic fallback: ceil(chars/3.5 + words*0.1) (matching the Python SDK).
  • stripProviderPrefix(modelName) — Splits on : (e.g., openai:gpt-4o → gpt-4o). Use for normalizing model IDs before cost lookup. Import from @amplitude/ai/internals.
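
To make the heuristic and the prefix stripping concrete, here is a standalone sketch that re-implements both from the descriptions above. These are not the SDK's functions: the `Sketch` names are hypothetical, counting words as whitespace-separated runs and characters as `text.length` are assumptions, and so is treating everything up to the first `:` as the provider prefix.

```typescript
// Sketch of the documented heuristic: ceil(chars / 3.5 + words * 0.1).
// Assumptions: "chars" is text.length and "words" are whitespace-separated runs.
function estimateTokensSketch(text: string): number {
  const trimmed = text.trim();
  const chars = text.length;
  const words = trimmed === '' ? 0 : trimmed.split(/\s+/).length;
  return Math.ceil(chars / 3.5 + words * 0.1);
}

// Sketch of prefix stripping.
// Assumption: everything up to and including the first ':' is the provider prefix.
function stripProviderPrefixSketch(modelName: string): string {
  const index = modelName.indexOf(':');
  return index === -1 ? modelName : modelName.slice(index + 1);
}

// 27 characters and 6 words: ceil(27 / 3.5 + 6 * 0.1) = ceil(8.31...) = 9
console.log(estimateTokensSketch('How do I reset my password?')); // 9
console.log(stripProviderPrefixSketch('openai:gpt-4o')); // gpt-4o
console.log(stripProviderPrefixSketch('gpt-4o')); // gpt-4o (no prefix, unchanged)
```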

Shared types

The package exports structural interfaces for provider shapes from @amplitude/ai and @amplitude/ai/types: ChatCompletionParams, ChatCompletionResponse, AnthropicParams, AnthropicResponse, BedrockConverseParams, BedrockConverseResponse, MistralChatParams, MistralChatResponse, TrackFn, TrackCallOptions, and related types. Use these for typing provider integrations without depending on the underlying SDK types.

Constants

All PROP_* and EVENT_* constants are exported for advanced use:

import {
  EVENT_AI_RESPONSE,
  EVENT_EMBEDDING,
  EVENT_SCORE,
  EVENT_SESSION_END,
  EVENT_SESSION_ENRICHMENT,
  EVENT_SPAN,
  EVENT_TOOL_CALL,
  EVENT_USER_MESSAGE,
  PROP_MODEL_NAME,
  PROP_SESSION_ID,
  PROP_TRACE_ID,
  // ... etc
} from '@amplitude/ai';

See src/core/tracking.ts and src/core/constants.ts for the full list.

API Reference

For AI Coding Agents

This SDK is designed to be discovered and used by any AI coding agent — Cursor, Claude Code, Windsurf, Copilot, Codex, Cline, or any agent that can read files.

The fastest path:

npm install @amplitude/ai
npx amplitude-ai

The CLI prints a prompt to paste into your agent:

Instrument this app with @amplitude/ai. Follow node_modules/@amplitude/ai/amplitude-ai.md

The agent reads the guide, scans your project, and instruments everything in 4 phases: Detect, Discover, Instrument, Verify.

Files shipped with the package:

| File | Purpose |
| --- | --- |
| amplitude-ai.md | Primary guide: self-contained 4-phase instrumentation workflow and full API reference |
| AGENTS.md | Concise index with canonical patterns, MCP surface, gotchas, and CLI reference |
| llms.txt | Compact discovery file listing tools, resources, and event names |
| llms-full.txt | Extended reference with full API signatures, provider coverage matrix, and common error resolutions |
| mcp.schema.json | Structured JSON describing the MCP server's tools, resources, and prompt |

Optional: MCP server for advanced tooling. Run amplitude-ai mcp to start the MCP server (standard stdio protocol). MCP-compatible agents can call tools like scan_project, instrument_file, validate_file, and generate_verify_test for deeper analysis. The MCP server is not required for the core instrumentation workflow — amplitude-ai.md is self-contained.

For Python SDK Migrators

If you're moving from amplitude_ai (Python) to @amplitude/ai (TypeScript/Node), the core event model is the same, but ergonomics differ to match the runtime:

| Area | Python (amplitude_ai) | TypeScript (@amplitude/ai) |
| --- | --- | --- |
| Session scope | with session as s: | await session.run(async (s) => { ... }) |
| Tool/observe wrappers | @tool, @observe decorators | tool(), observe() HOFs |
| Context propagation | contextvars | AsyncLocalStorage |
| Tool input schema | Optional auto-schema from Python type hints | Explicit inputSchema object (recommended: define with Zod, pass JSON Schema) |
| Sync behavior | Native sync + async wrappers | Wrappers return async (Promise<T>) |
| Middleware | Starlette/FastAPI middleware | Express-compatible middleware |
| Bootstrap/preload | sitecustomize.py + PYTHONPATH patterns | NODE_OPTIONS=--import preload patterns |
| Provider patching model | Python class replacement | Prototype patching + Proxy fallback for lazy getters |
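
For readers coming from contextvars, the sketch below shows how Node's AsyncLocalStorage carries a per-session context across awaits without passing it as an argument. It uses only the Node standard library; `SessionContext`, `runInSession`, and `describeCurrentSession` are hypothetical names for illustration and do not reflect the SDK's internal context object.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical context shape; the SDK's real context object is not shown in this README.
interface SessionContext {
  sessionId: string;
  userId: string;
}

const sessionStore = new AsyncLocalStorage<SessionContext>();

// Run a callback with a context that stays visible across awaits inside it.
function runInSession<T>(context: SessionContext, fn: () => Promise<T>): Promise<T> {
  return sessionStore.run(context, fn);
}

// Nested code reads the active context instead of receiving it as a parameter.
async function describeCurrentSession(label: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 5)); // simulate async work
  const context = sessionStore.getStore();
  return `${label} -> ${context?.sessionId ?? 'no session'}`;
}

// Two concurrent sessions do not see each other's context,
// and code outside any session sees none.
async function demo(): Promise<string[]> {
  return Promise.all([
    runInSession({ sessionId: 'sess-a', userId: 'u1' }, () => describeCurrentSession('first')),
    runInSession({ sessionId: 'sess-b', userId: 'u2' }, () => describeCurrentSession('second')),
    describeCurrentSession('outside'),
  ]);
}

demo().then((lines) => console.log(lines));
// [ 'first -> sess-a', 'second -> sess-b', 'outside -> no session' ]
```

This is the same property that `with session as s:` relies on in Python: the context follows the logical flow of execution rather than a global variable.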

Features that do not map 1:1 because of platform/runtime constraints:

  • Auto-generated tool schemas from runtime type introspection
  • Python-style per-call keyword overrides (for example amplitude_user_id=...)
  • Interrupting synchronous tool execution with Python threading primitives
  • CrewAI integration (Python-only; TS package throws a clear error)

Python → TypeScript cheat sheet

# Python
from amplitude_ai import AmplitudeAI, tool, observe

ai = AmplitudeAI(api_key="xxx")
agent = ai.agent("my-agent", user_id="u1")

with agent.session(user_id="u1") as s:
    s.track_user_message("Hello")
    s.track_ai_message("Hi!", model="gpt-4", provider="openai", latency_ms=100)

@tool(name="search")
def search(query: str) -> str:
    return db.search(query)

// TypeScript
import { AmplitudeAI, tool } from '@amplitude/ai';

const ai = new AmplitudeAI({ apiKey: 'xxx' });
const agent = ai.agent('my-agent', { userId: 'u1' });

const session = agent.session({ userId: 'u1' });
await session.run(async (s) => {
  s.trackUserMessage('Hello');
  s.trackAiMessage('Hi!', 'gpt-4', 'openai', 100);
});

const search = tool(async (args: { query: string }) => db.search(args.query), {
  name: 'search',
});
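
The decorator-to-HOF difference can be illustrated with a small standalone wrapper. This is a sketch of the general pattern only, not how `tool()` from @amplitude/ai is implemented: `toolSketch`, `ToolCallRecord`, and the `onCall` callback are hypothetical. It also shows why the wrapped function returns a Promise even when the original is synchronous.

```typescript
// Record produced for each call; field names are illustrative, not the SDK's event schema.
interface ToolCallRecord {
  name: string;
  success: boolean;
  latencyMs: number;
}

// Hypothetical higher-order wrapper: takes a function and returns an async
// function that reports every call, whether it succeeds or throws.
function toolSketch<A, R>(
  fn: (args: A) => R | Promise<R>,
  options: { name: string; onCall: (record: ToolCallRecord) => void },
): (args: A) => Promise<R> {
  return async (args: A): Promise<R> => {
    const start = Date.now();
    try {
      const result = await fn(args);
      options.onCall({ name: options.name, success: true, latencyMs: Date.now() - start });
      return result;
    } catch (error) {
      options.onCall({ name: options.name, success: false, latencyMs: Date.now() - start });
      throw error; // the caller still sees the original failure
    }
  };
}

const callRecords: ToolCallRecord[] = [];

// A synchronous function becomes async once wrapped.
const searchSketch = toolSketch((args: { query: string }) => `results for ${args.query}`, {
  name: 'search',
  onCall: (record) => {
    callRecords.push(record);
  },
});

searchSketch({ query: 'password reset' }).then((result) => {
  console.log(result); // results for password reset
  console.log(callRecords[0].name, callRecords[0].success); // search true
});
```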

Need Help?

Contributing

Contributions are welcome! Please open an issue first to discuss what you'd like to change, then submit a pull request.

  • Fork the repository
  • Create your branch (git checkout -b my-feature)
  • Install dependencies (pnpm install)
  • Make your changes and add tests
  • Ensure all tests pass (pnpm run test:coverage) and TypeScript compiles (pnpm run test:typescript)
  • Submit a pull request

License

MIT

Keywords

amplitude

Package last updated on 03 Apr 2026
