Vendor-agnostic AI integration package with full RAG pipeline for astro-minimax blogs. Supports OpenAI-compatible APIs, Cloudflare Workers AI, and mock fallback.

Architecture

┌─────────────────────────────────────────────────────────┐
│  Components (ChatPanel / AIChatWidget / AIChatContainer) │
│  → useChat + DefaultChatTransport                        │
└──────────────────────────┬──────────────────────────────┘
                           │ POST /api/chat
┌──────────────────────────▼──────────────────────────────┐
│  Server (chat-handler.ts)                                │
│  Rate Limit → Validate → Search → Evidence → Prompt →   │
│  Provider Manager → streamText → SSE Response            │
└──────────────────────────┬──────────────────────────────┘
                           │
     ┌─────────────────────┼──────────────────────┐
     │                     │                      │
 ┌───▼───┐          ┌─────▼─────┐          ┌─────▼────┐
 │OpenAI │          │Workers AI │          │   Mock   │
 │Compat │          │ Binding   │          │ Fallback │
 └───────┘          └───────────┘          └──────────┘

Modules

Module	Purpose
`server/`	Reusable API handlers (`handleChatRequest`, `initializeMetadata`)
`provider-manager/`	Multi-provider management with priority, failover, health tracking
`search/`	In-memory article/project search with session caching
`intelligence/`	Keyword extraction, evidence analysis, citation guard, answer mode, dynamic evidence budget
`prompt/`	Three-layer system prompt builder (static → semi-static → dynamic)
`data/`	Bundle-backed runtime metadata loading and shared data types
`components/`	Preact UI components (ChatPanel, AIChatWidget, AIChatContainer)
`extensions/`	Search/prompt extensions and semantic fallback rules
`structured-output/`	Schema-validated structured generation helpers
`cache/`	Response/session/injection cache utilities
`fact-registry/`	Verified facts used for grounded prompt assembly
`tools/`	Runtime tool registry and built-in action/search tools

Features

Dynamic Evidence Budget

The system dynamically adjusts retrieval and analysis resources based on query complexity:

Complexity	Max Articles	Summary Length	Key Points	Deep Content
`simple`	4	48 chars	2	No
`moderate`	6	56 chars	3	Yes
`complex`	8	64 chars	4	Yes

The budget is then adjusted by answer mode (count, list, opinion, recommendation), so the values returned can differ from the base row in the table above:

import { getEvidenceBudget, applyBudgetToArticles } from '@astro-minimax/ai/intelligence';

const budget = getEvidenceBudget('moderate', 'list');
// → { maxArticles: 8, summaryMaxLength: 80, ... }

const trimmedArticles = applyBudgetToArticles(articles, budget);
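The base budgets in the complexity table can be sketched as a plain lookup plus a trimming step. This is a minimal illustration, not the package's implementation: `BASE_BUDGETS`, `ArticleLike`, and `trimToBudget` are hypothetical names, and the answer-mode adjustment is left out because its rules are not documented here.

```typescript
type Complexity = "simple" | "moderate" | "complex";

interface BaseBudget {
  maxArticles: number;
  summaryMaxLength: number; // characters
  keyPoints: number;
  deepContent: boolean;
}

// Values copied from the complexity table; answer-mode adjustments are not modeled.
const BASE_BUDGETS: Record<Complexity, BaseBudget> = {
  simple: { maxArticles: 4, summaryMaxLength: 48, keyPoints: 2, deepContent: false },
  moderate: { maxArticles: 6, summaryMaxLength: 56, keyPoints: 3, deepContent: true },
  complex: { maxArticles: 8, summaryMaxLength: 64, keyPoints: 4, deepContent: true },
};

// Hypothetical article shape, used only for this sketch.
interface ArticleLike {
  title: string;
  summary: string;
  keyPoints: string[];
}

// Keep the first N articles and cut each summary and key-point list to the budget.
function trimToBudget(articles: ArticleLike[], budget: BaseBudget): ArticleLike[] {
  return articles.slice(0, budget.maxArticles).map((a) => ({
    ...a,
    summary: a.summary.slice(0, budget.summaryMaxLength),
    keyPoints: a.keyPoints.slice(0, budget.keyPoints),
  }));
}
```

A caller would pick the row by complexity, for example `trimToBudget(articles, BASE_BUDGETS.simple)`.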

Answer Mode Detection

Automatically detects the expected response format from user queries:

Mode	Trigger Patterns	Response Style
`fact`	"是什么", "what is"	Conclusion first, then evidence
`count`	"多少", "how many"	Number in first sentence
`list`	"哪些", "what are"	2-6 items directly
`opinion`	"怎么看", "what do you think"	"I think..." + 2-3 points
`recommendation`	"推荐", "suggest"	2-4 recommendations + reasons

Answer mode hints are injected into the dynamic prompt layer, guiding the LLM toward the appropriate format.
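One simple way to picture the detection step is substring matching against the trigger phrases in the table. The sketch below assumes that approach; the package may use different patterns or ordering, and `TRIGGERS` and `detectAnswerMode` are hypothetical names. It returns `undefined` when nothing matches, since the package's default mode is not stated here.

```typescript
type AnswerMode = "fact" | "count" | "list" | "opinion" | "recommendation";

// Trigger phrases taken from the table. The first matching entry wins,
// so the order of this array is an assumption of the sketch.
const TRIGGERS: Array<[AnswerMode, string[]]> = [
  ["count", ["多少", "how many"]],
  ["recommendation", ["推荐", "suggest"]],
  ["opinion", ["怎么看", "what do you think"]],
  ["list", ["哪些", "what are"]],
  ["fact", ["是什么", "what is"]],
];

function detectAnswerMode(query: string): AnswerMode | undefined {
  const q = query.toLowerCase();
  for (const [mode, phrases] of TRIGGERS) {
    if (phrases.some((p) => q.includes(p))) return mode;
  }
  return undefined; // no trigger matched
}
```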

Reading Time Display

Each article entry in the dynamic prompt layer includes its estimated reading time (the 阅读时间 line, "reading time: about 5 minutes") above its summary (the 摘要 line):

**[Article Title](/posts/article)**
阅读时间：约 5 分钟
摘要：Article summary...

Enhanced Citation Guard

URL validation in the citation guard rejects hallucinated or unsafe links:

Scheme whitelist: Only http:// and https:// allowed
Domain validation: Blocks localhost, private IPs, internal networks
XSS prevention: Sanitizes dangerous URL patterns

import { createCitationGuardTransform } from '@astro-minimax/ai/intelligence';

const guard = createCitationGuardTransform({
  articles,
  projects,
  siteUrl: 'https://example.com',
  onApplied: ({ actions }) => console.log('Rewrote:', actions),
});
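The scheme and domain checks listed above can be approximated with the standard `URL` parser. This is a sketch under stated assumptions, not the guard's actual logic: `isAllowedCitationUrl` and `isPrivateIPv4` are hypothetical helpers, and the set of blocked hosts is illustrative rather than exhaustive.

```typescript
// True for dotted-quad hosts in loopback, private, or link-local IPv4 ranges.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isAllowedCitationUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw); // throws on anything that is not an absolute URL
  } catch {
    return false;
  }
  // Scheme whitelist: this also rejects javascript: and data: URLs.
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  // Illustrative internal-host rules (assumption, not the package's list).
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.endsWith(".local") || host.endsWith(".internal")) return false;
  if (host === "[::1]") return false; // IPv6 loopback
  return !isPrivateIPv4(host);
}
```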

Installation

pnpm add @astro-minimax/ai

The @astro-minimax/core integration auto-detects this package and renders the AI chat widget.

Configuration

In src/config.ts:

export const SITE = {
  ai: {
    enabled: true,
    mockMode: false,
    apiEndpoint: "/api/chat",
    welcomeMessage: undefined, // auto-generated
    placeholder: undefined,
  },
};

Environment Variables

Variable	Required	Description
`AI_BASE_URL`	For OpenAI	Base URL of OpenAI-compatible API
`AI_API_KEY`	For OpenAI	API key
`AI_MODEL`	Recommended	Model name for OpenAI provider (default: `gpt-4o-mini`)
`AI_KEYWORD_MODEL`	Optional	Model for keyword extraction (defaults to `AI_MODEL`)
`AI_EVIDENCE_MODEL`	Optional	Model for evidence analysis (defaults to keyword model)
`AI_BINDING_NAME`	For Workers	Cloudflare AI binding name (default: `minimaxAI`)
`AI_WORKERS_MODEL`	For Workers	Model for Workers AI (default: `@cf/zai-org/glm-4.7-flash`)
`SITE_AUTHOR`	Recommended	Author name for prompts
`SITE_URL`	Recommended	Site URL for article links
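The default chain in the table (evidence model falls back to the keyword model, which falls back to `AI_MODEL`) can be sketched as follows. `resolveAiEnv` and `ResolvedAiEnv` are hypothetical names for illustration; the package's own resolution code may differ.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedAiEnv {
  baseUrl?: string;
  apiKey?: string;
  model: string;
  keywordModel: string;
  evidenceModel: string;
  bindingName: string;
  workersModel: string;
}

// Applies the defaults documented in the table; empty strings count as unset.
function resolveAiEnv(env: Env): ResolvedAiEnv {
  const model = env["AI_MODEL"] || "gpt-4o-mini";
  const keywordModel = env["AI_KEYWORD_MODEL"] || model;
  const evidenceModel = env["AI_EVIDENCE_MODEL"] || keywordModel;
  return {
    baseUrl: env["AI_BASE_URL"],
    apiKey: env["AI_API_KEY"],
    model,
    keywordModel,
    evidenceModel,
    bindingName: env["AI_BINDING_NAME"] || "minimaxAI",
    workersModel: env["AI_WORKERS_MODEL"] || "@cf/zai-org/glm-4.7-flash",
  };
}
```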

Response Cache Configuration

Variable	Default	Description
`AI_CACHE_ENABLED`	`false`	Enable AI response caching
`AI_CACHE_TTL`	`3600`	Cache TTL in seconds (1 hour)
`AI_CACHE_PLAYBACK_DELAY`	`20`	Delay between chunks during playback (ms)
`AI_CACHE_CHUNK_SIZE`	`15`	Characters per chunk during playback
`AI_CACHE_THINKING_DELAY`	`5`	Delay for thinking content playback (ms)

When enabled, the system caches complete AI responses (including thinking/reasoning content) for public questions like "What tech stack does this blog use?". Subsequent identical queries are served from cache with simulated streaming playback, reducing API costs and response time.
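Simulated streaming playback amounts to cutting the cached text into fixed-size chunks and emitting them with a short pause. The sketch below assumes the documented defaults (15 characters, 20 ms); `splitIntoChunks` and `replayCached` are hypothetical names, and counting characters by code point is an assumption of this example.

```typescript
interface PlaybackOptions {
  chunkSize: number; // characters per chunk (AI_CACHE_CHUNK_SIZE)
  delayMs: number; // pause between chunks (AI_CACHE_PLAYBACK_DELAY)
}

const DEFAULT_PLAYBACK: PlaybackOptions = { chunkSize: 15, delayMs: 20 };

// Splits by code point so multi-byte characters are never cut in half.
function splitIntoChunks(text: string, chunkSize: number): string[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const chars = Array.from(text);
  const chunks: string[] = [];
  for (let i = 0; i < chars.length; i += chunkSize) {
    chunks.push(chars.slice(i, i + chunkSize).join(""));
  }
  return chunks;
}

// Emits each chunk through onChunk, waiting delayMs after every chunk.
async function replayCached(
  text: string,
  onChunk: (chunk: string) => void,
  opts: PlaybackOptions = DEFAULT_PLAYBACK,
): Promise<void> {
  for (const chunk of splitIntoChunks(text, opts.chunkSize)) {
    onChunk(chunk);
    await new Promise<void>((resolve) => setTimeout(resolve, opts.delayMs));
  }
}
```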

Server Module

The server module provides reusable request handlers, decoupled from any specific runtime (Cloudflare, Node.js, etc.).

Usage in Cloudflare Pages Functions

// functions/api/chat.ts
import { handleChatRequest, initializeMetadata } from '@astro-minimax/ai/server';
import knowledgeBundle from '../../datas/knowledge/runtime/knowledge-bundle.json';

export const onRequest: PagesFunction = async (context) => {
  initializeMetadata({ knowledgeBundle }, context.env);

  return handleChatRequest({ env: context.env, request: context.request });
};

Chat API Contract

Request: POST /api/chat

{
  "context": {
    "scope": "article",
    "article": {
      "slug": "my-post",
      "title": "My Post Title",
      "summary": "Brief summary...",
      "keyPoints": ["Point 1", "Point 2"],
      "categories": ["tech"]
    }
  },
  "id": "article:my-post",
  "messages": [...]
}
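For client code, the request shape above can be captured in types. This is a sketch inferred from the example only: which article fields are optional is an assumption, `messages` is left generic because its shape is elided above, and `buildArticleChatBody` is a hypothetical helper that follows the `article:<slug>` pattern seen in the example `id`.

```typescript
interface ArticleContext {
  slug: string;
  title: string;
  summary?: string; // optionality is an assumption
  keyPoints?: string[];
  categories?: string[];
}

// The contract defines two scopes: "global" and "article".
type ChatContext =
  | { scope: "global" }
  | { scope: "article"; article: ArticleContext };

interface ChatRequestBody<M = unknown> {
  context: ChatContext;
  id: string;
  messages: M[];
}

// Builds an article-scoped body; the id format mirrors the example request.
function buildArticleChatBody<M>(
  article: ArticleContext,
  messages: M[],
): ChatRequestBody<M> {
  return {
    context: { scope: "article", article },
    id: `article:${article.slug}`,
    messages,
  };
}
```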

context.scope values:

"global" — General blog chat (default)
"article" — Reading companion mode, focused on a specific article

Response: UI Message Stream Protocol (SSE)

text-start / text-delta / text-end — Streaming text content
source — RAG article references
message-metadata — Processing status updates
finish — Stream completion
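A consumer of the stream typically folds these parts into a final message. The sketch below shows the idea with a simplified part union; the field names (`delta`, `url`, `metadata`) are assumptions for illustration and should be checked against the actual protocol before use.

```typescript
// Simplified model of the parts listed above (field names are assumed).
type StreamPart =
  | { type: "text-start"; id: string }
  | { type: "text-delta"; id: string; delta: string }
  | { type: "text-end"; id: string }
  | { type: "source"; url: string; title?: string }
  | { type: "message-metadata"; metadata: unknown }
  | { type: "finish" };

interface Collected {
  text: string;
  sources: string[];
  finished: boolean;
}

// Concatenates text deltas, gathers source URLs, and records completion.
function collectStream(parts: StreamPart[]): Collected {
  const out: Collected = { text: "", sources: [], finished: false };
  for (const part of parts) {
    switch (part.type) {
      case "text-delta":
        out.text += part.delta;
        break;
      case "source":
        out.sources.push(part.url);
        break;
      case "finish":
        out.finished = true;
        break;
      default:
        break; // start/end markers and metadata carry no text
    }
  }
  return out;
}
```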

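Error Response (the `error` field carries the user-facing message; in the example it reads "Requests are too frequent, please try again later"):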

{
  "error": "请求太频繁，请稍后再试",
  "code": "RATE_LIMITED",
  "retryable": true,
  "retryAfter": 10
}

Code	Status	Retryable	Description
`RATE_LIMITED`	429	Yes	Too many requests
`PROVIDER_UNAVAILABLE`	503	Yes	All providers failed
`TIMEOUT`	504	Yes	Request timeout
`INPUT_TOO_LONG`	400	No	Message exceeds limit
`INVALID_REQUEST`	400	No	Malformed request
`INTERNAL_ERROR`	500	Yes	Server error
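A client can use `retryable` and `retryAfter` to decide whether and when to resend. The following sketch assumes `retryAfter` is expressed in seconds, which the contract above does not state explicitly; `retryDelayMs` and `STATUS_BY_CODE` are hypothetical names.

```typescript
type ErrorCode =
  | "RATE_LIMITED"
  | "PROVIDER_UNAVAILABLE"
  | "TIMEOUT"
  | "INPUT_TOO_LONG"
  | "INVALID_REQUEST"
  | "INTERNAL_ERROR";

interface ChatErrorBody {
  error: string;
  code: ErrorCode;
  retryable: boolean;
  retryAfter?: number; // assumed to be seconds
}

// HTTP status per code, copied from the table above.
const STATUS_BY_CODE: Record<ErrorCode, number> = {
  RATE_LIMITED: 429,
  PROVIDER_UNAVAILABLE: 503,
  TIMEOUT: 504,
  INPUT_TOO_LONG: 400,
  INVALID_REQUEST: 400,
  INTERNAL_ERROR: 500,
};

// Returns the wait before retrying in milliseconds, or null if retrying is pointless.
function retryDelayMs(err: ChatErrorBody, defaultMs = 1000): number | null {
  if (!err.retryable) return null;
  return err.retryAfter !== undefined ? err.retryAfter * 1000 : defaultMs;
}
```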

Provider System

Priority & Failover

Workers AI (weight: 100) → OpenAI Compatible (weight: 90) → Mock (weight: 0)

When a provider fails, the next one is tried automatically. Mock fallback ensures users always get a response.
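The priority chain can be pictured as "sort by weight, then try each provider until one succeeds". This is a minimal sketch of that idea with a hypothetical `Provider` interface; it omits the health tracking that the provider manager also performs.

```typescript
// Hypothetical provider shape for this sketch.
interface Provider {
  name: string;
  weight: number; // higher weight is tried first
  generate(prompt: string): Promise<string>;
}

// Returns a new array ordered from highest to lowest weight.
function orderByWeight(providers: Provider[]): Provider[] {
  return [...providers].sort((a, b) => b.weight - a.weight);
}

// Tries providers in priority order and returns the first successful result.
async function generateWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of orderByWeight(providers)) {
    try {
      return { provider: p.name, text: await p.generate(prompt) };
    } catch (err) {
      errors.push(`${p.name}: ${String(err)}`); // remember why this one failed
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

With a mock provider at weight 0 that never throws, the final `throw` is unreachable in practice, which matches the guarantee described above.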

Timeout Budget (per request: 45s total)

Stage	Timeout	Behavior on timeout
Keyword extraction	5s	Falls back to local search query
Evidence analysis	8s	Skipped
LLM streaming	30s	Tries next provider, then mock
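Per-stage timeouts with a fallback are commonly built on `Promise.race`. The sketch below assumes that approach and is not taken from the package; `STAGE_TIMEOUT_MS` and `withTimeout` are hypothetical names. The three stage limits add up to 43 s, which fits inside the 45 s total.

```typescript
// Stage limits from the table, in milliseconds.
const STAGE_TIMEOUT_MS = {
  keywordExtraction: 5_000,
  evidenceAnalysis: 8_000,
  llmStreaming: 30_000,
} as const;

const TOTAL_BUDGET_MS = 45_000;

// Resolves with the work's result, or with `fallback` once `ms` has elapsed.
// A rejection from `work` before the timeout still propagates to the caller.
async function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // avoid leaving a pending timer behind
  }
}
```

For the keyword stage, a caller might write `withTimeout(extractKeywords(q), STAGE_TIMEOUT_MS.keywordExtraction, [q])`, where `extractKeywords` stands in for whatever extraction function is in use and the fallback is the raw query.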

"Read & Chat" (边读边聊)

When a user opens the AI chat on an article page, the system enters reading companion mode:

Article context flows from PostDetails.astro → Layout.astro → AIChatWidget → ChatPanel
Welcome message references the current article title
Quick prompts are article-specific (summarize, explain, related topics)
API request includes context: { scope: "article", article: {...} }
Server enhances the prompt with article summary, key points, and reading companion instructions

Components

AIChatWidget.astro

Astro entry point. Accepts lang and optional articleContext props. Renders AIChatContainer with client:idle.

AIChatContainer.tsx

Manages open/close state. Exposes window.__aiChatToggle for the floating action button.

ChatPanel.tsx

Core chat UI built on useChat from @ai-sdk/react:

DefaultChatTransport with prepareSendMessagesRequest for context injection
Parts-based message rendering (text, source, custom data parts)
Error display with retry button (regenerate())
Status indicators from message metadata
Mock mode with character-by-character streaming simulation

Exports

Path	Contents
`.`	All modules
`./server`	`handleChatRequest`, `initializeMetadata`, error helpers, types
`./middleware`	Rate limiting
`./search`	Article/project search, session cache
`./intelligence`	Keyword extraction, evidence analysis, citation guard, answer mode, evidence budget
`./prompt`	System prompt builder
`./cache`	Cache adapters and response/session cache utilities
`./data`	Metadata loading
`./fact-registry`	Verified facts registry
`./extensions`	Extension registry, loader, and injector
`./structured-output`	Structured output helpers
`./tools`	Tool registry and built-in AI tools
`./components/ChatPanel`	Preact chat panel component
`./components/AIChatContainer`	Preact chat container component
`./components/AIChatWidget.astro`	Astro chat widget entry point

Testing

The package includes unit tests written with Vitest:

cd packages/ai
pnpm test

Test coverage includes:

Citation guard (10 tests)
Intent detection (7 tests)
Keyword extraction (7 tests)
Evidence analysis (7 tests)
Evidence budget (5 tests)


Package last updated on 28 Mar 2026
