cost-katana
The simplest way to use AI with automatic cost tracking and optimization. Native SDK support for OpenAI and Google Gemini with automatic AWS Bedrock fallback.
Cut your AI costs in half. Without cutting corners.
Cost Katana is a drop-in SDK that wraps your AI calls with automatic cost tracking, smart caching, and optimization—all in one line of code.
TypeScript / Node
npm install cost-katana
Python — published on PyPI as cost-katana (install name uses a hyphen; import uses an underscore).
pip install cost-katana
import cost_katana as ck # package import: cost_katana
Requires Node.js 18+ for the npm package and Python 3.8+ for the PyPI package.
Set COST_KATANA_API_KEY. PROJECT_ID is optional (recommended for per-project analytics in the dashboard).
Use this when you want a drop-in proxy: change base URL and send Authorization: Bearer, or use gateway() in TypeScript with no extra config (reads COST_KATANA_API_KEY, same behavior as createGatewayClientFromEnv()).
cURL (no SDK; OpenAI-compatible JSON):
curl -s https://api.costkatana.com/api/gateway/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $COST_KATANA_API_KEY" \
-d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'
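The same route also works with the official openai npm client by overriding its base URL. A minimal sketch, assuming the OpenAI-compatible path from the cURL example above (the client appends /chat/completions itself):
import OpenAI from 'openai';
// Stock OpenAI client pointed at the Cost Katana gateway.
const client = new OpenAI({
  baseURL: 'https://api.costkatana.com/api/gateway/v1',
  apiKey: process.env.COST_KATANA_API_KEY, // sent as Authorization: Bearer
});
const completion = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(completion.choices[0].message.content);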
TypeScript
import { gateway, OPENAI } from 'cost-katana';
const res = await gateway().openai({
model: OPENAI.GPT_4O,
messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(res.data);
ai() (simple API, cost on the response)
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4O, 'Hello');
console.log(response.text, response.cost);
Install cost-katana from PyPI, set COST_KATANA_API_KEY (and optionally PROJECT_ID), then:
import cost_katana as ck
from cost_katana import openai
response = ck.ai(openai.gpt_4o, "Hello")
print(response.text, response.cost)
The Python SDK talks to the same hosted backend as TypeScript (https://api.costkatana.com by default). For HTTP gateway usage (OpenAI- or Anthropic-shaped JSON), see the package README on PyPI.
| If you want… | Use |
|---|---|
| Drop-in HTTP proxy (existing OpenAI clients / cURL) | Gateway URL + Authorization: Bearer, or gateway() in TypeScript |
| Simple AI calls with cost on the response | ai() / chat() |
| Session replay, advanced analytics, or manual trackUsage | AICostTracker (advanced) |
For most apps, COST_KATANA_API_KEY plus either gateway() (proxy) or ai() (SDK) is enough. For optional direct provider keys, add them to your environment as shown in Configuration.
Start here: COST_KATANA_API_KEY unlocks routing, tracking, and dashboard features. PROJECT_ID is optional (scopes usage to a project in the dashboard).
Create a .env in your project (or export in your shell) with the variables you need:
# Required for hosted Cost Katana
COST_KATANA_API_KEY=dak_your_key_here
# Optional — per-project analytics
PROJECT_ID=your_project_id
# Optional — direct provider keys (bring your own keys)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
# Optional — AWS Bedrock
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
There is no .env.example file in this repository; copy the block above into your own .env and fill in values.
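If you rely on a .env file rather than exported shell variables, load it before using the SDK. A minimal sketch assuming the dotenv package (the SDK itself reads process.env):
import 'dotenv/config'; // populates process.env from .env
import { ai, OPENAI } from 'cost-katana';

const response = await ai(OPENAI.GPT_4O, 'Hello');
console.log(response.cost);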
import { configure } from 'cost-katana';
await configure({
apiKey: 'dak_your_key',
cortex: true, // 40–75% cost savings (when enabled on requests)
cache: true, // Smart caching (when enabled on requests)
firewall: true, // Block prompt injections
});
Common request options (ai())
| Option | Description |
|---|---|
| temperature | Creativity (0–2), default 0.7 |
| maxTokens | Max response tokens, default 1000 |
| systemMessage | System prompt |
| cache | Enable caching |
| cortex | Enable optimization (Cortex) |
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4O, 'Your prompt', {
temperature: 0.7,
maxTokens: 500,
systemMessage: 'You are a helpful AI',
cache: true,
cortex: true,
});
ai()
The simplest way to make AI requests with automatic cost tracking.
Signature
await ai(model, prompt, options?);
- model — Use type-safe constants (e.g. OPENAI.GPT_4O). String model IDs still work but are deprecated.
- prompt — User prompt text.
- options — See Common request options.
Returns: text, cost, tokens, model, provider, and optionally cached, optimized, templateUsed when applicable.
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4O, 'Explain quantum computing', {
temperature: 0.7,
maxTokens: 500,
});
console.log(response.text);
console.log(`Cost: $${response.cost}`);
chat()
Create a session with conversation history and cost tracking.
Signature
const session = chat(model, options?);
Session API
| Member | Description |
|---|---|
| send(message) | Send a message and append assistant reply |
| messages | Full conversation history |
| totalCost | Running total cost (USD) |
| totalTokens | Running token count |
| clear() | Reset conversation (keeps system message if set) |
import { chat, OPENAI } from 'cost-katana';
const session = chat(OPENAI.GPT_4O, {
systemMessage: 'You are a helpful AI assistant.',
temperature: 0.7,
});
await session.send('Hello! What can you help me with?');
await session.send('Tell me a programming joke');
await session.send('Now explain it');
console.log(`Total cost: $${session.totalCost.toFixed(4)}`);
console.log(`Messages: ${session.messages.length}`);
console.log(`Tokens used: ${session.totalTokens}`);
gateway()
Zero extra config for the hosted gateway: COST_KATANA_API_KEY is read from the environment. Use the same OpenAI-shaped request bodies you would send upstream.
For advanced gateway features (headers, proxy keys, firewall), see docs/GATEWAY.md and docs/API.md.
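For Anthropic-shaped requests through the same client, a minimal sketch: the anthropic() method is assumed to mirror the openai() example above (it is referenced as gateway.anthropic(...) in the gateway notes below), and the body follows Anthropic's Messages API:
import { gateway, ANTHROPIC } from 'cost-katana';
// Assumed to mirror gateway().openai(); body is Anthropic Messages-shaped.
const res = await gateway().anthropic({
  model: ANTHROPIC.CLAUDE_3_5_SONNET_20241022,
  max_tokens: 1024, // required by the Anthropic Messages API
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(res.data);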
Cost Katana is provider-agnostic: the same ai() API works across OpenAI, Anthropic, Google, and more—pick a model constant per provider.
import { ai, OPENAI, ANTHROPIC, GOOGLE } from 'cost-katana';
const a = await ai(OPENAI.GPT_4O, 'Hello');
const b = await ai(ANTHROPIC.CLAUDE_3_5_SONNET_20241022, 'Hello');
const c = await ai(GOOGLE.GEMINI_2_5_PRO, 'Hello');
Benefits: the same call signature works across providers, so you can swap models without rewriting call sites.
For deeper routing patterns (capabilities, load balancing, multi-provider setups), see the Provider-Agnostic Guide.
Stop guessing model names: use namespaces for autocomplete and typo safety.
import { OPENAI, ANTHROPIC, GOOGLE, AWS_BEDROCK, XAI, DEEPSEEK } from 'cost-katana';
// OpenAI
OPENAI.GPT_5;
OPENAI.GPT_4;
OPENAI.GPT_4O;
OPENAI.GPT_3_5_TURBO;
OPENAI.O1;
OPENAI.O3;
// Anthropic
ANTHROPIC.CLAUDE_SONNET_4_5;
ANTHROPIC.CLAUDE_3_5_SONNET_20241022;
ANTHROPIC.CLAUDE_3_5_HAIKU_20241022;
// Google
GOOGLE.GEMINI_2_5_PRO;
GOOGLE.GEMINI_2_5_FLASH;
GOOGLE.GEMINI_1_5_PRO;
// AWS Bedrock
AWS_BEDROCK.NOVA_PRO;
AWS_BEDROCK.NOVA_LITE;
AWS_BEDROCK.CLAUDE_SONNET_4_5;
// Others
XAI.GROK_2_1212;
DEEPSEEK.DEEPSEEK_CHAT;
Prefer constants over raw strings — they give IDE autocomplete, catch typos early, refactor safely, and document which provider you intended.
| Strategy | Typical savings | When to use |
|---|---|---|
| Use a smaller/faster model (e.g. GPT-3.5 vs GPT-4) | Large on simple tasks | Trivial Q&A, classification, translation |
| Caching | 100% on cache hits | Repeated queries, FAQs |
| Cortex | 40–75% on eligible workloads | Long-form generation |
| Chat sessions | 10–20% | Related multi-turn work |
| Gemini Flash (vs heavy flagship models) | Very high $/token delta | High volume, cost-sensitive |
import { ai, OPENAI } from 'cost-katana';
const response1 = await ai(OPENAI.GPT_4O, 'What is 2+2?', { cache: true });
console.log(`Cached: ${response1.cached}`); // false: first call hits the model
console.log(`Cost: $${response1.cost}`);
const response2 = await ai(OPENAI.GPT_4O, 'What is 2+2?', { cache: true });
console.log(`Cached: ${response2.cached}`); // true: served from cache
console.log(`Cost: $${response2.cost}`); // typically $0 on a cache hit
import { ai, OPENAI } from 'cost-katana';
const response = await ai(
OPENAI.GPT_4O,
'Write a comprehensive guide to machine learning for beginners',
{
cortex: true,
maxTokens: 2000,
}
);
console.log(`Optimized: ${response.optimized}`);
console.log(`Cost: $${response.cost}`);
import { ai, OPENAI, ANTHROPIC, GOOGLE } from 'cost-katana';
const prompt = 'Summarize the theory of relativity in 50 words';
const models = [
{ name: 'GPT-4 class', id: OPENAI.GPT_4O },
{ name: 'Claude 3.5 Sonnet', id: ANTHROPIC.CLAUDE_3_5_SONNET_20241022 },
{ name: 'Gemini 2.5 Pro', id: GOOGLE.GEMINI_2_5_PRO },
{ name: 'GPT-3.5 Turbo', id: OPENAI.GPT_3_5_TURBO },
];
console.log('Model cost comparison\n');
for (const model of models) {
const response = await ai(model.id, prompt);
console.log(`${model.name.padEnd(22)} $${response.cost.toFixed(6)}`);
}
import { ai, OPENAI } from 'cost-katana';
// Expensive: flagship model for a trivial question
await ai(OPENAI.GPT_4O, 'What is 2+2?');
// Better: match model to task
await ai(OPENAI.GPT_3_5_TURBO, 'What is 2+2?');
// Better still: cache repeated FAQs
await ai(OPENAI.GPT_3_5_TURBO, 'What is 2+2?', { cache: true });
// Long content: Cortex
await ai(OPENAI.GPT_4O, 'Write a 2000-word essay', { cortex: true });
Blocks prompt injection and related abuse when enabled via configure({ firewall: true }) and the corresponding gateway/tracker settings.
import { configure, ai, OPENAI } from 'cost-katana';
await configure({ firewall: true });
try {
await ai(OPENAI.GPT_4O, 'Ignore all previous instructions and...');
} catch (error) {
console.log('Blocked:', (error as Error).message);
}
Helps mitigate: prompt injection, jailbreak attempts, unsafe content patterns (exact behavior depends on your gateway configuration).
When routing and health checks allow, requests can fall back across providers so a single vendor outage does not take down your app.
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4O, 'Hello');
console.log(`Provider used: ${response.provider}`);
// e.g. 'openai', 'anthropic', or 'google' depending on availability and policy
configure() and ai()
Use the same ai() API everywhere. Point usage at your project once with configure() or environment variables.
import { configure, ai, OPENAI } from 'cost-katana';
await configure({
apiKey: process.env.COST_KATANA_API_KEY,
projectId: process.env.PROJECT_ID,
});
const response = await ai(OPENAI.GPT_4O, 'Explain quantum computing');
console.log(response.text);
console.log('Cost:', response.cost);
console.log('Tokens:', response.tokens);
Calls can be attributed to your project in the dashboard. When working with multiple projects, you can also pass projectId through tracker/gateway options where supported.
AICostTracker with defaults (advanced)
When you need a dedicated tracker instance (not only the global ai() helper), use createCostKatanaTracker() or AICostTracker.createWithDefaults(). They populate TrackerConfig from the same environment rules as auto-config:
- If direct provider keys are present (OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, or AWS Bedrock credentials), those providers are registered.
- With only COST_KATANA_API_KEY and no direct provider keys, the default is Cost Katana hosted models via the gateway: inference can route through the hosted API without embedding vendor keys in your app.
import { createCostKatanaTracker, AIProvider } from 'cost-katana';
const tracker = await createCostKatanaTracker();
const custom = await createCostKatanaTracker({
optimization: { enablePromptOptimization: false },
providers: [{ provider: AIProvider.OpenAI, apiKey: process.env.OPENAI_API_KEY! }],
});
// Same idea: await AICostTracker.createWithDefaults({ ... })
// Short alias: import { tracker as costKatana } from 'cost-katana';
Requires COST_KATANA_API_KEY in the environment (same as AICostTracker.create()). PROJECT_ID remains optional.
For a small complete()-style API on top of AICostTracker, use createOpenAITracker, createAnthropicTracker, etc.
import { createOpenAITracker, OPENAI } from 'cost-katana';
const t = await createOpenAITracker({ model: OPENAI.GPT_4O });
const response = await t.complete({ prompt: 'Explain quantum computing' });
console.log(response.text);
console.log('Total cost (USD):', response.cost.totalCost);
console.log('Response time (ms):', response.responseTime);
For gateway proxying, manual trackUsage, or a fully custom AICostTracker, see docs/API.md and examples/.
With tracking enabled, you can record detailed usage events manually:
import { createCostKatanaTracker } from 'cost-katana';
const tracker = await createCostKatanaTracker();
await tracker.trackUsage({
model: 'gpt-4o',
provider: 'openai',
prompt: 'Hello, world!',
completion: 'Hello! How can I help you today?',
promptTokens: 3,
completionTokens: 9,
totalTokens: 12,
cost: 0.00036,
responseTime: 850,
userId: 'user_123',
sessionId: 'session_abc',
tags: ['chat', 'greeting'],
requestMetadata: {
userAgent: typeof navigator !== 'undefined' ? navigator.userAgent : undefined,
clientIP: await fetch('https://api.ipify.org').then((r) => r.text()),
feature: 'chat-interface',
},
});
The trace submodule provides session graphs, spans, and middleware. See src/trace/README.md for exports such as TraceClient, LocalTraceService, and createTraceMiddleware.
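A hedged sketch of wiring tracing into Express; the import path and zero-argument call are assumptions, so check src/trace/README.md for the actual exports and options:
import express from 'express';
// Assumed top-level export; the trace README may document a subpath instead.
import { createTraceMiddleware } from 'cost-katana';

const app = express();
app.use(createTraceMiddleware()); // hypothetical zero-config usage
app.listen(3000);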
Next.js (App Router)
// app/api/chat/route.ts
import { ai, OPENAI } from 'cost-katana';
export async function POST(request: Request) {
const { prompt } = await request.json();
const response = await ai(OPENAI.GPT_4O, prompt);
return Response.json(response);
}
Express
import express from 'express';
import { ai, OPENAI } from 'cost-katana';
const app = express();
app.use(express.json());
app.post('/api/chat', async (req, res) => {
const response = await ai(OPENAI.GPT_4O, req.body.prompt);
res.json(response);
});
app.listen(3000);
Fastify
import fastify from 'fastify';
import { ai, OPENAI } from 'cost-katana';
const app = fastify();
app.post('/api/chat', async (request) => {
const { prompt } = request.body as { prompt: string };
return await ai(OPENAI.GPT_4O, prompt);
});
app.listen({ port: 3000 });
NestJS
import { Controller, Post, Body } from '@nestjs/common';
import { ai, OPENAI } from 'cost-katana';
@Controller('api')
export class ChatController {
@Post('chat')
async chat(@Body() body: { prompt: string }) {
return await ai(OPENAI.GPT_4O, body.prompt);
}
}
import { ai, OPENAI } from 'cost-katana';
try {
const response = await ai(OPENAI.GPT_4O, 'Hello');
console.log(response.text);
} catch (error) {
const err = error as Error & { code?: string; availableModels?: string[] };
switch (err.code) {
case 'NO_API_KEY':
console.log('Set COST_KATANA_API_KEY or a provider API key');
break;
case 'RATE_LIMIT':
console.log('Rate limited. Retry with backoff.');
break;
case 'INVALID_MODEL':
console.log('Model not found. Available:', err.availableModels);
break;
default:
console.log('Error:', err.message);
}
}
Exact code values depend on the failure path (gateway vs direct provider). Always log message for support.
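For the RATE_LIMIT case, a minimal retry-with-backoff wrapper around ai(); this is a generic pattern sketch, not a built-in SDK helper:
import { ai, OPENAI } from 'cost-katana';

// Generic exponential backoff around ai(); not an SDK built-in.
type Model = Parameters<typeof ai>[0];

async function aiWithRetry(model: Model, prompt: string, retries = 3) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await ai(model, prompt);
    } catch (error) {
      const err = error as Error & { code?: string };
      if (err.code !== 'RATE_LIMIT' || attempt >= retries) throw err;
      // Exponential backoff: 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 1000));
    }
  }
}

const response = await aiWithRetry(OPENAI.GPT_4O, 'Hello');
console.log(response.text);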
The gateway is an HTTP proxy: call Cost Katana’s URL with your API key; the service forwards to OpenAI, Anthropic, Google, Cohere, and others, and can attach caching, retries, firewall, and tracking.
- Works from your existing clients or plain HTTP (gateway() or cURL).
- CostKatana-Target-Url: use for non-default upstream URLs (Azure OpenAI, private endpoints); a hedged sketch follows this list. For standard routes (/v1/chat/completions, /v1/messages, …), gateway() often uses inferTargetUrl: true and omits it.
- gateway.anthropic(...) / /v1/messages may not require an Anthropic key in your app; the service may use Bedrock when no server ANTHROPIC_API_KEY is set (see docs for streaming limitations).
- AICostTracker / trackUsage supports custom structured logging. For multi-turn and token nuances, see examples/GATEWAY_USAGE_AND_TRACKING.md and costkatana-examples 2-gateway.
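A hedged sketch of the CostKatana-Target-Url header for a non-default upstream; the Azure URL below is a placeholder, and exact header semantics are documented in docs/GATEWAY.md:
// Node 18+ global fetch; the target URL is a placeholder, not a real deployment.
const res = await fetch('https://api.costkatana.com/api/gateway/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.COST_KATANA_API_KEY}`,
    'CostKatana-Target-Url':
      'https://my-resource.openai.azure.com/openai/deployments/gpt-4o/chat/completions',
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
console.log(await res.json());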
The Cost Katana backend (costkatana-backend-nest) exposes experimentation REST endpoints under /api/experimentation on the hosted API (same origin as the gateway, e.g. https://api.costkatana.com). The dashboard Experimentation UI uses these APIs; you can also integrate them directly.
What it covers
- Model comparison (POST /api/experimentation/model-comparison).
- Real-time comparison (POST /api/experimentation/real-time-comparison); stream progress over SSE at GET /api/experimentation/comparison-progress/:sessionId (session token validated). Poll or reconnect via GET /api/experimentation/comparison-job/:sessionId when authenticated.
- GET /api/experimentation/available-models returns router-registered models (active/inactive) for picking candidates.
- POST /api/experimentation/estimate-cost (public) for experiment cost estimates before you run (see the sketch after this list).
- What-if scenarios (/api/experimentation/what-if-scenarios, .../:scenarioName/analyze, lifecycle updates).
- POST /api/experimentation/real-time-simulation (public) for what-if style simulations.
- GET /api/experimentation/history, GET /api/experimentation/recommendations, GET /api/experimentation/fine-tuning-analysis.
- GET /api/experimentation/:experimentId/export?format=json|csv for results.
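A sketch of the public cost-estimate call; the request body fields here are hypothetical, so check experimentation.controller.ts for the actual schema:
// Public endpoint (no JWT). Body fields below are hypothetical placeholders.
const res = await fetch('https://api.costkatana.com/api/experimentation/estimate-cost', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: ['gpt-4o', 'gemini-2.5-flash'], // hypothetical field
    promptCount: 100, // hypothetical field
  }),
});
console.log(await res.json());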
Auth
- Most endpoints require an authenticated session (JwtAuthGuard); routes marked public above do not.
- For exact routes and guards, see experimentation.controller.ts in costkatana-backend-nest.
Server configuration
- Set ENABLE_REAL_MODEL_COMPARISON=true where your deployment enables live API calls to providers.
In this repo
| Resource | Description |
|---|---|
| docs/API.md | API reference |
| docs/EXAMPLES.md | Examples index |
| docs/GATEWAY.md | Gateway |
| docs/PROMPT_OPTIMIZATION.md | Prompt optimization |
| docs/WEBHOOKS.md | Webhooks |
| examples/ | Runnable TypeScript examples |
External examples repo — 45+ complete examples:
github.com/Hypothesize-Tech/costkatana-examples
| Category | Topics |
|---|---|
| Cost tracking | Budgets, alerts |
| Gateway | Routing, load balancing, failover |
| Optimization | Cortex, caching, compression |
| Observability | OpenTelemetry, tracing, metrics |
| Security | Firewall, rate limiting, moderation |
| Workflows | Multi-step orchestration |
| Frameworks | Express, Next.js, Fastify, NestJS, FastAPI |
Migrating from the OpenAI SDK
// Before
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: 'sk-...' });
const completion = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello' }],
});
console.log(completion.choices[0].message.content);
// After
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4, 'Hello');
console.log(response.text);
console.log(`Cost: $${response.cost}`);
Migrating from the Anthropic SDK
// Before
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ apiKey: 'sk-ant-...' });
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024, // required by the Anthropic Messages API
  messages: [{ role: 'user', content: 'Hello' }],
});
// After
import { ai, ANTHROPIC } from 'cost-katana';
const response = await ai(ANTHROPIC.CLAUDE_3_5_SONNET_20241022, 'Hello');
Migrating from LangChain
// Before
import { ChatOpenAI } from 'langchain/chat_models/openai'; // legacy import path
import { HumanMessage } from 'langchain/schema';
const model = new ChatOpenAI({ modelName: 'gpt-4' });
const response = await model.call([new HumanMessage('Hello')]);
// After
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4, 'Hello');
We welcome contributions. See Contributing Guide.
git clone https://github.com/Hypothesize-Tech/costkatana-core.git
cd costkatana-core
npm install
npm run lint # Check code style
npm run lint:fix # Auto-fix issues
npm run format # Format code
npm test # Run tests
npm run build # Build
| Channel | Link |
|---|---|
| Dashboard | costkatana.com |
| Documentation | docs.costkatana.com |
| GitHub | github.com/Hypothesize-Tech |
| Discord | discord.gg/D8nDArmKbY |
| Email | support@costkatana.com |
MIT © Cost Katana
Start cutting AI costs today
npm install cost-katana
import { ai, OPENAI } from 'cost-katana';
await ai(OPENAI.GPT_4, 'Hello, world!');