cost-katana (npm, latest: 2.4.2)

The simplest way to use AI with automatic cost tracking and optimization. Native SDK support for OpenAI and Google Gemini with automatic AWS Bedrock fallback.
Cost Katana


Cut your AI costs in half. Without cutting corners.

Cost Katana is a drop-in SDK that wraps your AI calls with automatic cost tracking, smart caching, and optimization—all in one line of code.


Installation

TypeScript / Node

npm install cost-katana

Python — published on PyPI as cost-katana (install name uses a hyphen; import uses an underscore).

pip install cost-katana
import cost_katana as ck  # package import: cost_katana

Requires Node.js 18+ for the npm package and Python 3.8+ for the PyPI package.

Quick start

Set COST_KATANA_API_KEY. PROJECT_ID is optional (recommended for per-project analytics in the dashboard).

Path A — Gateway (HTTP proxy)

Use this when you want a drop-in proxy: change base URL and send Authorization: Bearer, or use gateway() in TypeScript with no extra config (reads COST_KATANA_API_KEY, same behavior as createGatewayClientFromEnv()).

cURL (no SDK; OpenAI-compatible JSON):

curl -s https://api.costkatana.com/api/gateway/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $COST_KATANA_API_KEY" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'

TypeScript

import { gateway, OPENAI } from 'cost-katana';

const res = await gateway().openai({
  model: OPENAI.GPT_4O,
  messages: [{ role: 'user', content: 'Hello!' }],
});

console.log(res.data);

Path B — ai() (simple API, cost on the response)

import { ai, OPENAI } from 'cost-katana';

const response = await ai(OPENAI.GPT_4O, 'Hello');

console.log(response.text, response.cost);

Path C — Python

Install cost-katana from PyPI, set COST_KATANA_API_KEY (and optionally PROJECT_ID), then:

import cost_katana as ck
from cost_katana import openai

response = ck.ai(openai.gpt_4o, "Hello")
print(response.text, response.cost)

The Python SDK talks to the same hosted backend as TypeScript (https://api.costkatana.com by default). For HTTP gateway usage (OpenAI- or Anthropic-shaped JSON), see the package README on PyPI.

Which API should I use?

| If you want… | Use |
| --- | --- |
| Drop-in HTTP proxy (existing OpenAI clients / cURL) | Gateway URL + Authorization: Bearer, or gateway() in TypeScript |
| Simple AI calls with cost on the response | ai() / chat() |
| Session replay, advanced analytics, or manual trackUsage | AICostTracker (advanced) |

For most apps, COST_KATANA_API_KEY plus either gateway() (proxy) or ai() (SDK) is enough. For optional direct provider keys, add them to your environment as shown in Configuration.

Configuration

Environment variables

Start here: COST_KATANA_API_KEY unlocks routing, tracking, and dashboard features. PROJECT_ID is optional (scopes usage to a project in the dashboard).

Create a .env in your project (or export in your shell) with the variables you need:

# Required for hosted Cost Katana
COST_KATANA_API_KEY=dak_your_key_here

# Optional — per-project analytics
PROJECT_ID=your_project_id

# Optional — direct provider keys (bring your own keys)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

# Optional — AWS Bedrock
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1

There is no .env.example file in this repository; copy the block above into your own .env and fill in values.
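The environment contract above can be checked up front before calling configure(). A minimal sketch, assuming only that COST_KATANA_API_KEY is mandatory for hosted use and PROJECT_ID is optional (the error message is illustrative; configure() does its own validation):

```typescript
// Sketch: validate the variables from the .env block above.
// Pass process.env (or any plain object) in at the call site.
function readConfig(env: Record<string, string | undefined>) {
  const apiKey = env.COST_KATANA_API_KEY;
  if (!apiKey) {
    throw new Error('COST_KATANA_API_KEY is required for hosted Cost Katana');
  }
  return { apiKey, projectId: env.PROJECT_ID }; // PROJECT_ID stays optional
}
```

Call it as readConfig(process.env) at startup so a missing key fails fast rather than on the first AI request.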

Programmatic configuration

import { configure } from 'cost-katana';

await configure({
  apiKey: 'dak_your_key',
  cortex: true, // 40–75% cost savings (when enabled on requests)
  cache: true, // Smart caching (when enabled on requests)
  firewall: true, // Block prompt injections
});

Common request options (ai())

| Option | Description |
| --- | --- |
| temperature | Creativity (0–2), default 0.7 |
| maxTokens | Max response tokens, default 1000 |
| systemMessage | System prompt |
| cache | Enable caching |
| cortex | Enable optimization (Cortex) |

import { ai, OPENAI } from 'cost-katana';

const response = await ai(OPENAI.GPT_4O, 'Your prompt', {
  temperature: 0.7,
  maxTokens: 500,
  systemMessage: 'You are a helpful AI',
  cache: true,
  cortex: true,
});

Core APIs

ai()

The simplest way to make AI requests with automatic cost tracking.

Signature

await ai(model, prompt, options?);
  • model — Use type-safe constants (e.g. OPENAI.GPT_4O). String model IDs still work but are deprecated.
  • prompt — User prompt text.
  • options — See Common request options.

Returns: text, cost, tokens, model, provider, and optionally cached, optimized, templateUsed when applicable.

import { ai, OPENAI } from 'cost-katana';

const response = await ai(OPENAI.GPT_4O, 'Explain quantum computing', {
  temperature: 0.7,
  maxTokens: 500,
});

console.log(response.text);
console.log(`Cost: $${response.cost}`);

chat()

Create a session with conversation history and cost tracking.

Signature

const session = chat(model, options?);

Session API

| Member | Description |
| --- | --- |
| send(message) | Send a message and append the assistant reply |
| messages | Full conversation history |
| totalCost | Running total cost (USD) |
| totalTokens | Running token count |
| clear() | Reset conversation (keeps system message if set) |

import { chat, OPENAI } from 'cost-katana';

const session = chat(OPENAI.GPT_4O, {
  systemMessage: 'You are a helpful AI assistant.',
  temperature: 0.7,
});

await session.send('Hello! What can you help me with?');
await session.send('Tell me a programming joke');
await session.send('Now explain it');

console.log(`Total cost: $${session.totalCost.toFixed(4)}`);
console.log(`Messages: ${session.messages.length}`);
console.log(`Tokens used: ${session.totalTokens}`);

gateway()

Zero extra config for the hosted gateway: COST_KATANA_API_KEY is read from the environment. Use the same OpenAI-shaped request bodies you would send upstream.

For advanced gateway features (headers, proxy keys, firewall), see docs/GATEWAY.md and docs/API.md.

Provider-independent design

Cost Katana is provider-agnostic: the same ai() API works across OpenAI, Anthropic, Google, and more—pick a model constant per provider.

import { ai, OPENAI, ANTHROPIC, GOOGLE } from 'cost-katana';

const a = await ai(OPENAI.GPT_4O, 'Hello');
const b = await ai(ANTHROPIC.CLAUDE_3_5_SONNET_20241022, 'Hello');
const c = await ai(GOOGLE.GEMINI_2_5_PRO, 'Hello');

Benefits

  • Automatic failover — Seamlessly switch providers when configured (see Security and reliability).
  • Cost optimization — Choose cheaper models with constants and the cost optimization patterns below.
  • Future-proof — New providers and models are added to the registry without changing your mental model.
  • Zero lock-in — Swap model constants as your stack evolves.

For deeper routing patterns (capabilities, load balancing, multi-provider setups), see the Provider-Agnostic Guide.

Type-safe model constants

Stop guessing model names: use namespaces for autocomplete and typo safety.

import { OPENAI, ANTHROPIC, GOOGLE, AWS_BEDROCK, XAI, DEEPSEEK } from 'cost-katana';

// OpenAI
OPENAI.GPT_5;
OPENAI.GPT_4;
OPENAI.GPT_4O;
OPENAI.GPT_3_5_TURBO;
OPENAI.O1;
OPENAI.O3;

// Anthropic
ANTHROPIC.CLAUDE_SONNET_4_5;
ANTHROPIC.CLAUDE_3_5_SONNET_20241022;
ANTHROPIC.CLAUDE_3_5_HAIKU_20241022;

// Google
GOOGLE.GEMINI_2_5_PRO;
GOOGLE.GEMINI_2_5_FLASH;
GOOGLE.GEMINI_1_5_PRO;

// AWS Bedrock
AWS_BEDROCK.NOVA_PRO;
AWS_BEDROCK.NOVA_LITE;
AWS_BEDROCK.CLAUDE_SONNET_4_5;

// Others
XAI.GROK_2_1212;
DEEPSEEK.DEEPSEEK_CHAT;

Prefer constants over raw strings — They give IDE autocomplete, catch typos early, refactor safely, and document which provider you intended.

Cost optimization

Cheatsheet

| Strategy | Typical savings | When to use |
| --- | --- | --- |
| Use a smaller/faster model (e.g. GPT-3.5 vs GPT-4) | Large on simple tasks | Trivial Q&A, classification, translation |
| Caching | 100% on cache hits | Repeated queries, FAQs |
| Cortex | 40–75% on eligible workloads | Long-form generation |
| Chat sessions | 10–20% | Related multi-turn work |
| Gemini Flash (vs heavy flagship models) | Very high $/token delta | High volume, cost-sensitive |
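The first cheatsheet row can be automated with a tiny router. A sketch, assuming a naive length-and-keyword heuristic (the threshold and regex are illustrative only, not a library feature):

```typescript
// Sketch: route trivial prompts to a cheaper model, per the cheatsheet above.
// The heuristic (length threshold, keyword regex) is an illustrative assumption.
function pickModel(prompt: string): string {
  const looksTrivial =
    prompt.length < 80 && !/\b(essay|guide|report|analy[sz]e)\b/i.test(prompt);
  return looksTrivial ? 'gpt-3.5-turbo' : 'gpt-4o';
}
```

With the SDK you would map the returned ID to the matching model constant and pass it straight into ai().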

Caching

import { ai, OPENAI } from 'cost-katana';

const response1 = await ai(OPENAI.GPT_4O, 'What is 2+2?', { cache: true });
console.log(`Cached: ${response1.cached}`);
console.log(`Cost: $${response1.cost}`);

const response2 = await ai(OPENAI.GPT_4O, 'What is 2+2?', { cache: true });
console.log(`Cached: ${response2.cached}`);
console.log(`Cost: $${response2.cost}`);

Cortex (optimization)

import { ai, OPENAI } from 'cost-katana';

const response = await ai(
  OPENAI.GPT_4O,
  'Write a comprehensive guide to machine learning for beginners',
  {
    cortex: true,
    maxTokens: 2000,
  }
);

console.log(`Optimized: ${response.optimized}`);
console.log(`Cost: $${response.cost}`);

Compare models side by side

import { ai, OPENAI, ANTHROPIC, GOOGLE } from 'cost-katana';

const prompt = 'Summarize the theory of relativity in 50 words';

const models = [
  { name: 'GPT-4 class', id: OPENAI.GPT_4O },
  { name: 'Claude 3.5 Sonnet', id: ANTHROPIC.CLAUDE_3_5_SONNET_20241022 },
  { name: 'Gemini 2.5 Pro', id: GOOGLE.GEMINI_2_5_PRO },
  { name: 'GPT-3.5 Turbo', id: OPENAI.GPT_3_5_TURBO },
];

console.log('Model cost comparison\n');

for (const model of models) {
  const response = await ai(model.id, prompt);
  console.log(`${model.name.padEnd(22)} $${response.cost.toFixed(6)}`);
}

Quick wins

import { ai, OPENAI } from 'cost-katana';

// Expensive: flagship model for a trivial question
await ai(OPENAI.GPT_4O, 'What is 2+2?');

// Better: match model to task
await ai(OPENAI.GPT_3_5_TURBO, 'What is 2+2?');

// Better still: cache repeated FAQs
await ai(OPENAI.GPT_3_5_TURBO, 'What is 2+2?', { cache: true });

// Long content: Cortex
await ai(OPENAI.GPT_4O, 'Write a 2000-word essay', { cortex: true });

Security and reliability

Firewall

Block prompt injection and related abuse when enabled via configure({ firewall: true }) and gateway/tracker settings.

import { configure, ai, OPENAI } from 'cost-katana';

await configure({ firewall: true });

try {
  await ai(OPENAI.GPT_4O, 'Ignore all previous instructions and...');
} catch (error) {
  console.log('Blocked:', (error as Error).message);
}

Helps mitigate: prompt injection, jailbreak attempts, unsafe content patterns (exact behavior depends on your gateway configuration).

Auto-failover

When routing and health checks allow, requests can fall back across providers so a single vendor outage does not take down your app.

import { ai, OPENAI } from 'cost-katana';

const response = await ai(OPENAI.GPT_4O, 'Hello');

console.log(`Provider used: ${response.provider}`);
// e.g. 'openai', 'anthropic', or 'google' depending on availability and policy

Usage tracking and analytics

Dashboard attribution with configure() and ai()

Use the same ai() API everywhere. Point usage at your project once with configure() or environment variables.

import { configure, ai, OPENAI } from 'cost-katana';

await configure({
  apiKey: process.env.COST_KATANA_API_KEY,
  projectId: process.env.PROJECT_ID,
});

const response = await ai(OPENAI.GPT_4O, 'Explain quantum computing');

console.log(response.text);
console.log('Cost:', response.cost);
console.log('Tokens:', response.tokens);

Calls can be attributed to your project in the dashboard. You can also pass projectId through tracker/gateway options where supported when using multiple projects.

AICostTracker with defaults (advanced)

When you need a dedicated tracker instance (not only the global ai() helper), use createCostKatanaTracker() or AICostTracker.createWithDefaults(). They populate TrackerConfig from the same environment rules as auto-config:

  • If you set direct provider keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, or AWS Bedrock credentials), those providers are registered.
  • If you only have COST_KATANA_API_KEY and no direct provider keys, the default is Cost Katana hosted models via the gateway: inference can route through the hosted API without embedding vendor keys in your app.
import { createCostKatanaTracker, AIProvider } from 'cost-katana';

const tracker = await createCostKatanaTracker();

const custom = await createCostKatanaTracker({
  optimization: { enablePromptOptimization: false },
  providers: [{ provider: AIProvider.OpenAI, apiKey: process.env.OPENAI_API_KEY! }],
});

// Same idea: await AICostTracker.createWithDefaults({ ... })
// Short alias: import { tracker as costKatana } from 'cost-katana';

Requires COST_KATANA_API_KEY in the environment (same as AICostTracker.create()). PROJECT_ID remains optional.

Dedicated per-provider trackers

For a small complete()-style API on top of AICostTracker, use createOpenAITracker, createAnthropicTracker, etc.

import { createOpenAITracker, OPENAI } from 'cost-katana';

const t = await createOpenAITracker({ model: OPENAI.GPT_4O });
const response = await t.complete({ prompt: 'Explain quantum computing' });

console.log(response.text);
console.log('Total cost (USD):', response.cost.totalCost);
console.log('Response time (ms):', response.responseTime);

For gateway proxying, manual trackUsage, or a fully custom AICostTracker, see docs/API.md and examples/.

View analytics in the dashboard

With tracking enabled, you can inspect:

  • Network performance — DNS, TCP, total response time
  • Client environment — User agent, platform, IP geolocation (where collected)
  • Request/response data — Payloads (sanitized)
  • Optimization opportunities — Suggestions to reduce cost
  • Performance metrics — Monitoring and anomaly signals

Manual usage tracking

import { createCostKatanaTracker } from 'cost-katana';

const tracker = await createCostKatanaTracker();

await tracker.trackUsage({
  model: 'gpt-4o',
  provider: 'openai',
  prompt: 'Hello, world!',
  completion: 'Hello! How can I help you today?',
  promptTokens: 3,
  completionTokens: 9,
  totalTokens: 12,
  cost: 0.00036,
  responseTime: 850,
  userId: 'user_123',
  sessionId: 'session_abc',
  tags: ['chat', 'greeting'],
  requestMetadata: {
    userAgent: typeof navigator !== 'undefined' ? navigator.userAgent : undefined,
    // Example only: resolves the public IP via a third-party service; omit if you don't need it.
    clientIP: await fetch('https://api.ipify.org').then((r) => r.text()),
    feature: 'chat-interface',
  },
});
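When tracking manually, the cost field must be supplied by you. A minimal sketch of deriving it from token counts, with illustrative placeholder rates (not live pricing):

```typescript
// Sketch: compute the `cost` field for trackUsage from token counts.
// The per-1K-token rates are placeholder assumptions, not current pricing.
function computeCost(
  promptTokens: number,
  completionTokens: number,
  inputPer1K: number,
  outputPer1K: number
): number {
  return (promptTokens / 1000) * inputPer1K + (completionTokens / 1000) * outputPer1K;
}
```

For example, computeCost(3, 9, 0.01, 0.03) comes out to roughly $0.0003 at those placeholder rates; substitute your provider's actual rates.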

Session replay and distributed tracing

The trace submodule provides session graphs, spans, and middleware. See src/trace/README.md for exports such as TraceClient, LocalTraceService, and createTraceMiddleware.

Framework integration

Next.js App Router

// app/api/chat/route.ts
import { ai, OPENAI } from 'cost-katana';

export async function POST(request: Request) {
  const { prompt } = await request.json();
  const response = await ai(OPENAI.GPT_4O, prompt);
  return Response.json(response);
}

Express.js

import express from 'express';
import { ai, OPENAI } from 'cost-katana';

const app = express();
app.use(express.json());

app.post('/api/chat', async (req, res) => {
  const response = await ai(OPENAI.GPT_4O, req.body.prompt);
  res.json(response);
});

app.listen(3000);

Fastify

import fastify from 'fastify';
import { ai, OPENAI } from 'cost-katana';

const app = fastify();

app.post('/api/chat', async (request) => {
  const { prompt } = request.body as { prompt: string };
  return await ai(OPENAI.GPT_4O, prompt);
});

app.listen({ port: 3000 });

NestJS

import { Controller, Post, Body } from '@nestjs/common';
import { ai, OPENAI } from 'cost-katana';

@Controller('api')
export class ChatController {
  @Post('chat')
  async chat(@Body() body: { prompt: string }) {
    return await ai(OPENAI.GPT_4O, body.prompt);
  }
}

Error handling

import { ai, OPENAI } from 'cost-katana';

try {
  const response = await ai(OPENAI.GPT_4O, 'Hello');
  console.log(response.text);
} catch (error) {
  const err = error as Error & { code?: string; availableModels?: string[] };
  switch (err.code) {
    case 'NO_API_KEY':
      console.log('Set COST_KATANA_API_KEY or a provider API key');
      break;
    case 'RATE_LIMIT':
      console.log('Rate limited. Retry with backoff.');
      break;
    case 'INVALID_MODEL':
      console.log('Model not found. Available:', err.availableModels);
      break;
    default:
      console.log('Error:', err.message);
  }
}

Exact code values depend on the failure path (gateway vs direct provider). Always log err.message when reporting issues to support.
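For the RATE_LIMIT branch above, a retry wrapper is often enough. A minimal sketch, assuming only the error shape from the switch statement (the delay schedule and retry cap are illustrative):

```typescript
// Sketch: retry with exponential backoff on RATE_LIMIT errors.
// `callAI` is any async function, e.g. () => ai(OPENAI.GPT_4O, prompt).
async function withBackoff<T>(
  callAI: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await callAI();
    } catch (error) {
      const code = (error as { code?: string }).code;
      if (code !== 'RATE_LIMIT' || attempt >= maxRetries) throw error;
      // Exponential backoff: 500 ms, 1 s, 2 s, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Non-retryable codes (NO_API_KEY, INVALID_MODEL) are rethrown immediately, since retrying them cannot succeed.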

AI gateway (details)

The gateway is an HTTP proxy: call Cost Katana’s URL with your API key; the service forwards to OpenAI, Anthropic, Google, Cohere, and others, and can attach caching, retries, firewall, and tracking.

  • Quick start: Quick start — Path A (gateway() or cURL).
  • CostKatana-Target-Url: Use for non-default upstream URLs (Azure OpenAI, private endpoints). For standard routes (/v1/chat/completions, /v1/messages, …), gateway() often uses inferTargetUrl: true and omits it.
  • Anthropic on hosted gateway: gateway.anthropic(...) / /v1/messages may not require an Anthropic key in your app; the service may use Bedrock when no server ANTHROPIC_API_KEY is set (see docs for streaming limitations).
  • Dashboard vs custom tracking: Gateway traffic reflects proxied bodies; AICostTracker / trackUsage supports custom structured logging. For multi-turn and token nuances, see examples/GATEWAY_USAGE_AND_TRACKING.md and costkatana-examples 2-gateway.
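The CostKatana-Target-Url mechanics above can be sketched as a plain request builder, with no SDK involved. The upstream URL you would pass for the non-default case (Azure OpenAI, a private endpoint) is your own; nothing here is a confirmed API beyond the header name and gateway URL already documented:

```typescript
// Sketch: assemble a gateway request by hand. For standard routes,
// omit targetUrl and let the gateway infer the upstream.
function buildGatewayRequest(apiKey: string, targetUrl?: string) {
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}`,
  };
  if (targetUrl) headers['CostKatana-Target-Url'] = targetUrl;
  return {
    url: 'https://api.costkatana.com/api/gateway/v1/chat/completions',
    headers,
    body: JSON.stringify({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Hello!' }],
    }),
  };
}
```

Send the result with fetch(url, { method: 'POST', headers, body }); the response is the usual OpenAI-compatible JSON.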

Experimentation (hosted API)

The Cost Katana backend (costkatana-backend-nest) exposes experimentation REST endpoints under /api/experimentation on the hosted API (same origin as the gateway, e.g. https://api.costkatana.com). The dashboard Experimentation UI uses these APIs; you can also integrate them directly.

What it covers

  • Model comparison — Run side-by-side comparisons across providers (POST /api/experimentation/model-comparison).
  • Real-time comparison — Start a comparison job (POST /api/experimentation/real-time-comparison) and stream progress over SSE at GET /api/experimentation/comparison-progress/:sessionId (session token validated). Poll or reconnect via GET /api/experimentation/comparison-job/:sessionId when authenticated.
  • Catalog — GET /api/experimentation/available-models returns router-registered models (active/inactive) for picking candidates.
  • Cost estimate — POST /api/experimentation/estimate-cost (public) for experiment cost estimates before you run.
  • What-if scenarios — List/create/analyze/delete scenarios (/api/experimentation/what-if-scenarios, .../:scenarioName/analyze, lifecycle updates).
  • Real-time simulation — POST /api/experimentation/real-time-simulation (public) for what-if style simulations.
  • History and insights — GET /api/experimentation/history, GET /api/experimentation/recommendations, GET /api/experimentation/fine-tuning-analysis.
  • Exports — GET /api/experimentation/:experimentId/export?format=json|csv for results.
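Of these routes, the public cost-estimate endpoint is a natural first call. A minimal request-builder sketch; the body fields (models, prompt) are assumptions about the schema, not confirmed — check experimentation.controller.ts in costkatana-backend-nest for the real shape:

```typescript
// Sketch: build the POST body for the public cost-estimate route.
// Field names in the JSON body are illustrative assumptions.
function buildEstimateRequest(models: string[], prompt: string) {
  return {
    url: 'https://api.costkatana.com/api/experimentation/estimate-cost',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ models, prompt }),
    },
  };
}
```

Send it with fetch(url, init); since the route is public, no dashboard JWT is attached.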

Auth

  • Most write/read routes require a dashboard user JWT (JwtAuthGuard).
  • Several routes are marked public (estimate cost, available models, real-time simulation, SSE progress with a valid session id). See the controller for the exact list: experimentation.controller.ts in costkatana-backend-nest.

Server configuration

  • Real model execution for comparisons may require backend flags such as ENABLE_REAL_MODEL_COMPARISON=true where your deployment enables live API calls to providers.

Examples and documentation

In this repo

| Resource | Description |
| --- | --- |
| docs/API.md | API reference |
| docs/EXAMPLES.md | Examples index |
| docs/GATEWAY.md | Gateway |
| docs/PROMPT_OPTIMIZATION.md | Prompt optimization |
| docs/WEBHOOKS.md | Webhooks |
| examples/ | Runnable TypeScript examples |

External examples repo — 45+ complete examples:

github.com/Hypothesize-Tech/costkatana-examples

| Category | Topics |
| --- | --- |
| Cost tracking | Budgets, alerts |
| Gateway | Routing, load balancing, failover |
| Optimization | Cortex, caching, compression |
| Observability | OpenTelemetry, tracing, metrics |
| Security | Firewall, rate limiting, moderation |
| Workflows | Multi-step orchestration |
| Frameworks | Express, Next.js, Fastify, NestJS, FastAPI |

Migration guides

From OpenAI SDK

// Before
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: 'sk-...' });
const completion = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(completion.choices[0].message.content);

// After
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4, 'Hello');
console.log(response.text);
console.log(`Cost: $${response.cost}`);

From Anthropic SDK

// Before
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ apiKey: 'sk-ant-...' });
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  messages: [{ role: 'user', content: 'Hello' }],
});

// After
import { ai, ANTHROPIC } from 'cost-katana';
const response = await ai(ANTHROPIC.CLAUDE_3_5_SONNET_20241022, 'Hello');

From LangChain

// Before
import { ChatOpenAI } from '@langchain/openai';
const model = new ChatOpenAI({ model: 'gpt-4' });
const response = await model.invoke('Hello');

// After
import { ai, OPENAI } from 'cost-katana';
const response = await ai(OPENAI.GPT_4, 'Hello');

Contributing

We welcome contributions. See Contributing Guide.

git clone https://github.com/Hypothesize-Tech/costkatana-core.git
cd costkatana-core
npm install

npm run lint        # Check code style
npm run lint:fix    # Auto-fix issues
npm run format      # Format code
npm test            # Run tests
npm run build       # Build

Support

| Channel | Link |
| --- | --- |
| Dashboard | costkatana.com |
| Documentation | docs.costkatana.com |
| GitHub | github.com/Hypothesize-Tech |
| Discord | discord.gg/D8nDArmKbY |
| Email | support@costkatana.com |

License

MIT © Cost Katana

Start cutting AI costs today

npm install cost-katana
import { ai, OPENAI } from 'cost-katana';
await ai(OPENAI.GPT_4, 'Hello, world!');

Keywords: ai

Package last updated on 29 Mar 2026
