@tessera-llm/vercel-ai

Version: 0.1.3

Version published: 5 days ago

Maintainers: 1

Created: 2 weeks ago

`@tessera-llm/vercel-ai`

Drop-in cost optimization for the Vercel AI SDK. One line of config routes your existing generateText / streamText / generateObject / streamObject calls through the Tessera optimization proxy, which applies auto-routing to cheaper-equivalent models, exact-match and provider prompt-cache hits, prompt compression with a per-stack quality canary, and batch arbitrage on async-tolerant calls. Free Sandbox tier: 60M tokens/month, no card. Paid tiers: flat monthly subscription by token volume, keep 100% of savings.

Companion to tessera-sdk (vanilla provider SDKs), tessera-langchain (LangChain integration), tessera-llamaindex (LlamaIndex integration), tessera-mastra (Mastra Agent framework integration), tessera-pydantic-ai (Pydantic AI integration), tessera-crewai (CrewAI multi-agent integration), and tessera-autogen (AutoGen 0.4+ multi-agent integration). Same proxy, same mechanic stack, Vercel AI SDK-shaped API.

What it looks like

import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { tesseraOpenAIConfig } from "@tessera-llm/vercel-ai";

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  ...tesseraOpenAIConfig({ apiKey: process.env.TESSERA_API_KEY! }),
});

const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "Summarize this customer support ticket in 2 sentences.",
});

Two changes to your code: one added import and one spread line in the createOpenAI call. Your existing generateText / streamText / generateObject calls work unchanged.

Or use the convenience factory (skips the explicit createOpenAI import):

import { generateText } from "ai";
import { tesseraOpenAI } from "@tessera-llm/vercel-ai";

const openai = await tesseraOpenAI({
  openaiApiKey: process.env.OPENAI_API_KEY!,
  tesseraApiKey: process.env.TESSERA_API_KEY!,
});

const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "Summarize this customer support ticket in 2 sentences.",
});

Install

npm install @tessera-llm/vercel-ai
# Plus whichever provider package you use:
npm install @ai-sdk/openai          # or @ai-sdk/anthropic / @ai-sdk/mistral / @ai-sdk/groq / @ai-sdk/cohere

The @ai-sdk/* packages are peer dependencies — install only the providers you actually use. The ai core SDK is whatever version you already have.

Get a free Tessera API key (60M tokens/mo, no card) at tesseraai.io/dev — sign-up takes ~30 seconds and returns an instant tk_… key plus magic-link dashboard access.

Provider support

| Provider | @ai-sdk package | Tessera config function | Convenience factory |
| --- | --- | --- | --- |
| OpenAI | `@ai-sdk/openai` | `tesseraOpenAIConfig` | `tesseraOpenAI` |
| Anthropic | `@ai-sdk/anthropic` | `tesseraAnthropicConfig` | `tesseraAnthropic` |
| Mistral | `@ai-sdk/mistral` | `tesseraMistralConfig` | `tesseraMistral` |
| Groq | `@ai-sdk/groq` | `tesseraGroqConfig` | `tesseraGroq` |
| Cohere | `@ai-sdk/cohere` | `tesseraCohereConfig` | `tesseraCohere` |

Generic dispatcher available too: tesseraConfig("openai", { apiKey: "tk_..." }) — returns the right { baseURL, headers } object regardless of provider. Useful when the provider is parameterized at runtime.
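When the provider name arrives as an untrusted string (an environment variable or a per-tenant setting), it helps to validate it before handing it to the dispatcher. The sketch below shows one way to do that. It uses a local stand-in for `tesseraConfig` so it runs without the package installed; the stand-in's URL path and header name are placeholders and are not the package's real wire format. The helper names `parseProvider` and `configForRuntimeProvider` are hypothetical.

```typescript
// Providers listed in the support table above.
const PROVIDERS = ["openai", "anthropic", "mistral", "groq", "cohere"] as const;
type Provider = (typeof PROVIDERS)[number];

// Shape the README describes for the dispatcher's return value.
interface ProxyConfig {
  baseURL: string;
  headers: Record<string, string>;
}

// Local stand-in for the package's tesseraConfig, so this sketch is runnable.
// ASSUMPTION: the URL path and header name are placeholders, not the real format.
function tesseraConfigStandIn(provider: Provider, opts: { apiKey: string }): ProxyConfig {
  return {
    baseURL: `https://api.tesseraai.io/placeholder/${provider}`,
    headers: { "x-placeholder-tessera-key": opts.apiKey },
  };
}

// Narrow an arbitrary string to a supported provider, or fail loudly.
function parseProvider(raw: string): Provider {
  const match = PROVIDERS.find((p) => p === raw);
  if (match === undefined) {
    throw new Error(`Unsupported provider "${raw}"; expected one of ${PROVIDERS.join(", ")}`);
  }
  return match;
}

// Hypothetical helper: validate first, then ask the dispatcher for the config.
function configForRuntimeProvider(raw: string, tesseraApiKey: string): ProxyConfig {
  return tesseraConfigStandIn(parseProvider(raw), { apiKey: tesseraApiKey });
}
```

In real code the stand-in would be replaced by the `tesseraConfig` import from this package, and the returned object spread into the matching `@ai-sdk/*` factory.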

Worked example

Real customer-support agent on gpt-4o, 5B tokens/month, OpenAI list prices:

| Stage | Cost / mo | Saved |
| --- | --- | --- |
| Baseline: OpenAI direct via Vercel AI SDK | $24,000 | n/a |
| + Tessera (route, cache, prompt-cache headers, compress, M9 ceiling, batch) | $9,400 | $14,600 |
| Tessera subscription (Growth tier, flat) | $999 | n/a |
| Your net cost | $10,399 | $13,601 / mo saved |

Verify the savings math yourself. Every billable line traces back to two immutable cost figures pinned to a multi-source pricing catalog snapshot captured at request time. Two engineers can re-derive any month from raw inputs in about three hours. Full procedure at tesseraai.io/trust.
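The arithmetic behind the worked example can be checked in a few lines. This is only a sketch of the table's top-level math, using the three dollar figures quoted above; the type and function names are hypothetical, and it does not reproduce the per-request ledger procedure.

```typescript
interface MonthlyCosts {
  baselineUsd: number;     // provider bill without the proxy
  optimizedUsd: number;    // provider bill with the mechanics applied
  subscriptionUsd: number; // flat Tessera tier price
}

interface SavingsSummary {
  grossSavedUsd: number; // baseline minus optimized provider bill
  netPaidUsd: number;    // optimized provider bill plus subscription
  netSavedUsd: number;   // baseline minus everything you now pay
  netSavedShare: number; // net savings as a fraction of the baseline
}

function summarizeSavings(costs: MonthlyCosts): SavingsSummary {
  const grossSavedUsd = costs.baselineUsd - costs.optimizedUsd;
  const netPaidUsd = costs.optimizedUsd + costs.subscriptionUsd;
  const netSavedUsd = costs.baselineUsd - netPaidUsd;
  return {
    grossSavedUsd,
    netPaidUsd,
    netSavedUsd,
    netSavedShare: netSavedUsd / costs.baselineUsd,
  };
}

// Figures from the worked example: gross 14,600; net paid 10,399; net saved 13,601.
const workedExample = summarizeSavings({
  baselineUsd: 24000,
  optimizedUsd: 9400,
  subscriptionUsd: 999,
});
```

The net saving is roughly 57% of the baseline once the subscription is counted.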

Quality canary across the full mechanic stack: mean-score 0.96 (floor 0.95) — 0.95 SLA held all 30 days. Full breakdown: /blog/cut-openai-bill-48-percent-without-quality-regression.

What Tessera does on every request

Same mechanic stack as the main tessera-sdk. Each mechanic is opt-in per workload, observable per request, and bypassed when its quality canary drops below the per-stack 0.95 floor.

| Mechanic | What it does | Typical savings |
| --- | --- | --- |
| Auto-route (m1) | Route to a cheaper-equivalent model, gated by a daily promptfoo canary on your eval set | 15–35% on routed calls |
| Auto-cache (m2) | sha256 cache on the canonical request body, 7-day TTL, Cloudflare edge KV | 5–40% depending on prompt repetition |
| Auto-compress (m3) | Per-role heuristic compression (system and user toggles are independent). Preserves code fences and JSON shapes. | 5–15% on prompt tokens |
| Prompt cache (m6) | Inject provider-native cache headers: OpenAI cached-input (50% off), Anthropic `cache_control: ephemeral` (90% off cache reads) | 50–90% on cached prefixes |
| Context prune (m7) | Conservative trim on long conversations (system + last 8 turns; TF-IDF rerank on RAG attachments) | 5–25% on multi-turn workloads |
| Output-length ceiling (m9) | A daily job computes the p90 completion length per workload and injects `maxTokens = p90 × 1.3` | 5–15% on completion cost |
| Batch arbitrage (m10) | Route async-tolerant calls to provider Batch APIs (OpenAI Batch and Anthropic Message Batches, both 50% off) | 50% on batch-eligible traffic |
| Per-provider circuit breaker | Reliability primitive that sits above the mechanics. Rolling 5xx-rate state machine per upstream: when a provider degrades, auto-route skips its intra-provider alternative mappings until the half-open probe succeeds. | n/a (keeps the savings stack honest) |
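Two of the rows above reduce to small, checkable computations. The sketch below illustrates them under stated assumptions; the real implementations live in the closed-source proxy and may differ. For m2 it shows only a canonicalization step (recursively sorted object keys) that makes equivalent request bodies serialize identically, so a sha256 digest of the string would match; the hashing itself is omitted. For m9 it uses a nearest-rank p90 and rounds the ceiling up, both of which are assumptions. All names are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// m2 sketch: serialize with object keys sorted at every level, so key order in
// the request body does not change the string a cache key would be hashed from.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`;
  const keys = Object.keys(value).sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`).join(",")}}`;
}

// m9 sketch: nearest-rank 90th percentile of observed completion lengths.
function p90(lengths: number[]): number {
  if (lengths.length === 0) throw new Error("need at least one observed completion length");
  const sorted = [...lengths].sort((a, b) => a - b);
  const rank = Math.ceil((9 * sorted.length) / 10); // 1-based rank, integer math
  return sorted[rank - 1];
}

// maxTokens = p90 × 1.3, computed as ×13/10 and rounded up to a whole token.
function outputLengthCeiling(lengths: number[]): number {
  return Math.ceil((p90(lengths) * 13) / 10);
}

// Same body, different key order: both serialize to the same string.
const bodyA = canonicalize({ model: "gpt-4o", messages: [{ role: "user", content: "hi" }] });
const bodyB = canonicalize({ messages: [{ content: "hi", role: "user" }], model: "gpt-4o" });

// Ten observed lengths from 10 to 100 tokens give p90 = 90 and a ceiling of 117.
const ceiling = outputLengthCeiling([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]);
```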

Pricing

Free Sandbox — 60M tokens/month, 30 requests/minute, observability-only mechanics, no card. Forever.
Paid tiers — flat monthly subscription by token volume: Starter $199 (≤1B), Growth $999 (≤5B), Scale $3,999 (≤20B), Enterprise custom (20B+). You keep 100% of measured savings.
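For capacity planning, the paid tiers above can be expressed as a lookup from monthly token volume to the smallest tier that covers it. This is a sketch built only from the prices and caps listed here; the names are hypothetical, and the Free Sandbox is left out because it runs observability-only mechanics.

```typescript
interface Tier {
  name: string;
  monthlyUsd: number | null; // null means custom pricing
  maxTokensPerMonth: number;
}

// Capped paid tiers, smallest first, as listed in the pricing section.
const CAPPED_TIERS: Tier[] = [
  { name: "Starter", monthlyUsd: 199, maxTokensPerMonth: 1e9 },
  { name: "Growth", monthlyUsd: 999, maxTokensPerMonth: 5e9 },
  { name: "Scale", monthlyUsd: 3999, maxTokensPerMonth: 20e9 },
];

const ENTERPRISE: Tier = {
  name: "Enterprise",
  monthlyUsd: null,
  maxTokensPerMonth: Number.POSITIVE_INFINITY,
};

// Return the cheapest paid tier whose cap covers the given monthly volume.
function smallestPaidTier(tokensPerMonth: number): Tier {
  if (!Number.isFinite(tokensPerMonth) || tokensPerMonth < 0) {
    throw new RangeError("tokensPerMonth must be a non-negative finite number");
  }
  const tier = CAPPED_TIERS.find((t) => tokensPerMonth <= t.maxTokensPerMonth);
  return tier ?? ENTERPRISE;
}
```

A 5B tokens/month workload, as in the worked example, resolves to Growth at $999.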

Existing customers of tessera-sdk and tessera-langchain keep their rate_locked_pct (if any) on this package too — same tk_… key, same billing record.

FAQ

Q: How is this different from `tessera-langchain` and `tessera-sdk`?

Same proxy. Same mechanics. Same billing. The three packages target different code surfaces:

tessera-sdk — patches the underlying provider client constructors (OpenAI, Anthropic, etc.) directly via tessera.activate(key). Use when calling provider SDKs without a framework.
tessera-langchain — wires into LangChain ChatModel constructors. Use when you're on LangChain.
@tessera-llm/vercel-ai (this package) — wires into the Vercel AI SDK provider factories (createOpenAI, createAnthropic, etc.). Use when you're on ai core + @ai-sdk/*.

Pick whichever fits your codebase. Side-by-side install is supported — all three resolve to the same proxy and same billing record.

Q: Does this break my eval / structured output / tool calling / streaming?

No. The Vercel AI SDK provider object behaves identically — generateText, streamText, generateObject, streamObject all work unchanged. Schema-constrained outputs pass through. Tools pass through (auto-route gates on tool-calling capability). Streaming streams.

Q: What happens if Tessera's proxy is down?

Your application gets HTTP errors instead of LLM responses. On the proxy side, a per-provider circuit breaker tracks rolling 5xx rates and skips degraded providers in auto-route decisions. Cross-provider failover (re-routing to a different provider entirely when an upstream is down) is on the roadmap, not shipped yet.
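The breaker runs inside the closed-source proxy, so its actual window, threshold, and cooldown are not documented here. As an illustration of the general pattern only, the following is a minimal rolling 5xx-rate breaker with a half-open probe. The 20-request window, 50% error threshold, and 30-second cooldown are assumptions, and the class and method names are hypothetical.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class ProviderBreaker {
  private readonly windowSize: number;
  private readonly maxErrorRate: number;
  private readonly cooldownMs: number;
  private outcomes: boolean[] = []; // true marks a 5xx response
  private state: BreakerState = "closed";
  private openedAt = 0;

  constructor(windowSize = 20, maxErrorRate = 0.5, cooldownMs = 30000) {
    this.windowSize = windowSize;
    this.maxErrorRate = maxErrorRate;
    this.cooldownMs = cooldownMs;
  }

  // Should routing consider this provider at time `now` (milliseconds)?
  allows(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: let a probe through
    }
    return this.state !== "open";
  }

  // Record an upstream HTTP status observed at time `now`.
  record(status: number, now: number): void {
    const failed = status >= 500 && status <= 599;
    if (this.state === "half-open") {
      // The probe result alone decides whether to close or re-open.
      if (failed) {
        this.state = "open";
        this.openedAt = now;
      } else {
        this.state = "closed";
        this.outcomes = [];
      }
      return;
    }
    this.outcomes.push(failed);
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
    const errors = this.outcomes.filter((f) => f).length;
    const windowFull = this.outcomes.length === this.windowSize;
    if (windowFull && errors / this.windowSize >= this.maxErrorRate) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  current(): BreakerState {
    return this.state;
  }
}
```

This simplification lets every caller through while half-open; a production breaker would usually admit a single probe. Passing `now` explicitly keeps the state machine deterministic in tests.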

Q: What happens to my OpenAI / Anthropic rate limits?

They pass through. Tessera does not aggregate quotas across customers. Your provider rate limits apply normally; the proxy enforces only the Tessera tier limits (30 rpm Free Sandbox, 60 rpm Production by default — higher on request).

Q: Are you storing my prompts and completions?

No. We log only token counts, cost deltas, mechanics_stack, and provider response status. Prompts and completions are never persisted. Full data handling on tesseraai.io/security.

Q: Why are there two API surfaces (`tesseraOpenAIConfig` vs `tesseraOpenAI`)?

The config function returns the kwargs object you spread into createOpenAI(...) — explicit, easy to combine with other settings (organization, custom fetch, etc.). The convenience factory imports createOpenAI for you and pre-merges. Use whichever you find more readable. Both ship in the same package.
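One thing to watch when combining the spread config with your own settings: object spread is shallow, so a `headers` key written after the spread replaces the whole header map rather than merging into it. The sketch below demonstrates this with plain objects. The `tesseraPart` value is a stand-in for what a config function returns, and its URL and header name are placeholders, not the package's real values.

```typescript
interface ProxyConfig {
  baseURL: string;
  headers: Record<string, string>;
}

// Stand-in for the object a Tessera config function returns.
// ASSUMPTION: placeholder URL and header name, for illustration only.
const tesseraPart: ProxyConfig = {
  baseURL: "https://api.tesseraai.io/placeholder",
  headers: { "x-placeholder-tessera-key": "tk_example" },
};

// Pitfall: the later `headers` key replaces the spread one entirely,
// so the proxy header is silently dropped.
const naiveSettings = {
  ...tesseraPart,
  headers: { "x-request-source": "billing-service" },
};

// Safer: merge the header maps explicitly, then add other settings.
const mergedSettings = {
  ...tesseraPart,
  organization: "org_example",
  headers: {
    ...tesseraPart.headers,
    "x-request-source": "billing-service",
  },
};
```

The same applies to any nested option: spread the nested object as well whenever you need to add to it.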

Q: Can I use this with the Next.js App Router / Server Actions / Edge Runtime?

Yes. @tessera-llm/vercel-ai is a thin ESM/CJS dual package with no runtime dependencies of its own, so it has the same compatibility as the Vercel AI SDK itself.

Architecture

Open-source SDK ↔ closed-source proxy. This package is a thin client that adds one HTTP hop. The actual mechanic decisions (route, cache, compress, etc.) run inside the Tessera Cloudflare Worker proxy at api.tesseraai.io. The split is intentional: the wire format is open so you can audit what we send; the mechanic implementations are closed because that's the asymmetric IP. See the tessera-sdk README's "Architecture" note for the longer explanation.

License

Apache-2.0. See LICENSE.

Contributing

We accept PRs that:

Add support for a new @ai-sdk/* provider package (paste-and-mirror the existing config function shape)
Improve typing precision (TypeScript strict)
Add concrete example scripts under examples/ showing a real Vercel AI SDK pipeline
Improve tests or test infrastructure

We do not accept PRs that change the proxy's HTTP contract — that lives in the closed-source worker.

Versioning

Semver. Wire-format compatibility is guaranteed across minor releases; breaking changes land only on major bumps.

Security

See SECURITY.md. Coordinated disclosure address: security@tesseraai.io.

About Tessera

Tessera is the substrate layer for LLM cost optimization, also called the Optimize Layer in our product surface. It is a thin proxy that sits in your application's request path, applies a conservative cascade of optimization mechanics, and measures every saved dollar against an audit-immutable baseline. We bill a flat monthly subscription by token volume (Starter $199, Growth $999, Scale $3,999, Enterprise custom); you keep 100% of measured savings. There is no per-token gateway fee. The category we operate in is "LLM cost optimizer," distinct from per-token AI gateways and observability dashboards.

Where observability tools tell you what you spent and AI gateways re-shape the request without measuring the cost delta, Tessera is the layer that does both, and proves the measured savings line by line. The verified-savings ledger at ledger.tesseraai.io shows every original-vs-actual cost pair, snapshot-pinned to a pricing_catalog version captured at request time. Mid-contract price changes don't retroactively alter past savings. This is the FinOps-friendly model for AI inference: every line of the bill traces to a code-enforced rule.

Operated by Fintechagency OÜ (Tallinn, Estonia, registry code 16638667).

Developer entry: tesseraai.io/dev
Mechanic reference: tesseraai.io/how-it-works
Dashboard: ledger.tesseraai.io
Engineering blog: tesseraai.io/blog

Package last updated on 28 May 2026
