BitBuffet TypeScript/JavaScript SDK

A TypeScript/JavaScript SDK for the BitBuffet API. It extracts structured data from web content using Zod schemas, or returns the content as raw markdown, in under two seconds.
🚀 Features
- Universal: Works with any website or web content (URL, image, video, audio, YouTube, PDF, etc.)
- Type-safe: Built with TypeScript and Zod for complete type safety
- Fast: Extract structured data in under 2 seconds
- Flexible: Support for custom prompts and reasoning levels
- Dual Output: Extract structured JSON data or raw markdown content
- Easy to use: Simple, intuitive API
- Well-tested: Comprehensive test suite with integration tests
📦 Installation
npm install bitbuffet
# or
yarn add bitbuffet
# or
pnpm add bitbuffet
🏃‍♂️ Quick Start
import { BitBuffet } from 'bitbuffet';
import { z } from 'zod';

// Define the shape of the data to extract
const ArticleSchema = z.object({
  title: z.string().describe('The main title of the article'),
  author: z.string().describe('The author of the article'),
  publishDate: z.string().describe('When the article was published'),
  content: z.string().describe('The main content/body of the article'),
  tags: z.array(z.string()).describe('Article tags or categories'),
  summary: z.string().describe('A brief summary of the article')
});

// Derive the TypeScript type from the schema
type Article = z.infer<typeof ArticleSchema>;

const client = new BitBuffet('your-api-key-here');

try {
  // Passing a schema returns data typed to match it
  const result: Article = await client.extract(
    'https://example.com/article',
    ArticleSchema
  );
  console.log(`Title: ${result.title}`);
  console.log(`Author: ${result.author}`);
  console.log(`Published: ${result.publishDate}`);
  console.log(`Tags: ${result.tags.join(', ')}`);
} catch (error) {
  console.error('Extraction failed:', error);
}
To get raw markdown instead of structured data, omit the schema and set the format in the config object:

import { BitBuffet } from 'bitbuffet';

const client = new BitBuffet('your-api-key-here');

try {
  // No schema: the result is a plain markdown string
  const markdown: string = await client.extract(
    'https://example.com/article',
    { format: 'markdown' }
  );
  console.log('Raw markdown content:');
  console.log(markdown);
} catch (error) {
  console.error('Extraction failed:', error);
}
⚙️ Output Methods
Choose between structured JSON extraction or raw markdown content:
JSON Format (Default)
Extracts structured data according to your Zod schema:
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  description: z.string()
});

const product = await client.extract(
  'https://example.com/product',
  ProductSchema,
  { format: 'json' }
);
Markdown Format
Returns the raw markdown content of the webpage:
const markdown = await client.extract(
  'https://example.com/article',
  { format: 'markdown' }
);
Note: When using format: 'markdown', do not provide a schema as the second parameter.
Format Usage Patterns
The SDK supports two extraction methods via the format parameter:
// JSON with an explicit format and extra options
const explicitJson = await client.extract(
  'https://example.com/article',
  ArticleSchema,
  {
    format: 'json',
    reasoning_effort: 'high'
  }
);

// JSON with the format omitted (it is the default)
const defaultJson = await client.extract(
  'https://example.com/article',
  ArticleSchema
);

// Markdown with extra options
const markdownWithOptions = await client.extract(
  'https://example.com/article',
  {
    format: 'markdown',
    reasoning_effort: 'medium',
    prompt: 'Focus on main content'
  }
);

// Markdown with the minimal config
const plainMarkdown = await client.extract(
  'https://example.com/article',
  { format: 'markdown' }
);
Important Format Rules:
- JSON Format Requirements:
  - A Zod schema MUST be provided as the second parameter
  - Returns typed data matching your schema
  - format: 'json' is optional (default behavior)
- Markdown Format Requirements:
  - NO schema should be provided
  - format: 'markdown' MUST be specified in the config object
  - Returns raw markdown string
  - Schema and markdown format cannot be used together
- Format Parameter Location:
  - Always specified in the configuration object
  - Never passed as a separate parameter
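The rules above can be sketched as a small pre-flight check. This is a hypothetical helper, not part of the BitBuffet SDK: the names `resolveFormat`, `SchemaLike`, and `ConfigLike` are invented for illustration, and it assumes a schema can be recognized by the `safeParse` method that Zod schemas expose.

```typescript
// Hypothetical helper (not part of the SDK): mirrors the documented format
// rules so a bad argument combination can be caught before a request is made.
type Format = 'json' | 'markdown';

interface SchemaLike {
  // Zod schemas expose safeParse; it is used here only for duck typing.
  safeParse: (data: unknown) => unknown;
}

interface ConfigLike {
  format?: Format;
  [key: string]: unknown;
}

function isSchemaLike(value: unknown): value is SchemaLike {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { safeParse?: unknown }).safeParse === 'function'
  );
}

// Returns the format implied by the arguments, or throws if they break a rule.
function resolveFormat(second: SchemaLike | ConfigLike, third?: ConfigLike): Format {
  if (isSchemaLike(second)) {
    // Schema given: JSON is the only valid format, and it may be omitted.
    if (third?.format === 'markdown') {
      throw new Error('A schema cannot be combined with format: "markdown"');
    }
    return 'json';
  }
  // No schema: the config object must explicitly request markdown.
  if (second.format !== 'markdown') {
    throw new Error('Without a schema, format: "markdown" must be specified');
  }
  return 'markdown';
}
```

Calling `resolveFormat(ArticleSchema)` would yield `'json'`, while `resolveFormat({ format: 'markdown' })` would yield `'markdown'`; a schema paired with a markdown config throws.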
⚙️ Configuration Options
Customize the extraction process with various options:
// JSON extraction with custom options
const result = await client.extract(
  'https://example.com/complex-page',
  ArticleSchema,
  {
    format: 'json',
    reasoning_effort: 'high',
    prompt: 'Focus on extracting the main article content, ignoring ads and navigation',
    temperature: 0.1,
  },
  30000 // request timeout in milliseconds
);

// Markdown extraction with custom options
const markdown = await client.extract(
  'https://example.com/article',
  {
    format: 'markdown',
    reasoning_effort: 'medium',
    prompt: 'Focus on the main content, ignore navigation and ads'
  },
  30000 // request timeout in milliseconds
);
📚 Advanced Examples
E-commerce Product
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  currency: z.string(),
  description: z.string(),
  images: z.array(z.string().url()),
  inStock: z.boolean(),
  rating: z.number().min(0).max(5).optional(),
  reviews: z.number().optional()
});

const product = await client.extract(
  'https://shop.example.com/product/123',
  ProductSchema,
  { reasoning_effort: 'high' }
);
News Article with Metadata
const NewsSchema = z.object({
  headline: z.string(),
  subheadline: z.string().optional(),
  author: z.object({
    name: z.string(),
    bio: z.string().optional()
  }),
  publishedAt: z.string(),
  category: z.string(),
  content: z.string(),
  relatedArticles: z.array(z.object({
    title: z.string(),
    url: z.string().url()
  })).optional()
});

const article = await client.extract(
  'https://news.example.com/breaking-news',
  NewsSchema
);
Raw Content for Processing
const rawContent = await client.extract(
  'https://blog.example.com/post/123',
  { format: 'markdown' }
);

// Split on any run of whitespace so newlines and repeated spaces
// do not inflate or hide words
const wordCount = rawContent.split(/\s+/).filter(Boolean).length;
const hasCodeBlocks = rawContent.includes('```');

console.log(`Content has ${wordCount} words`);
console.log(`Contains code blocks: ${hasCodeBlocks}`);
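For slightly richer post-processing, the markdown string can be scanned line by line. The sketch below is an illustration only: `analyzeMarkdown` and `MarkdownStats` are hypothetical names, and it assumes the returned markdown uses backtick fences and `#`-style headings.

```typescript
// Hypothetical post-processing helper; it works on any markdown string
// and does not depend on the SDK.
interface MarkdownStats {
  wordCount: number;      // words outside fenced code blocks
  codeBlockCount: number; // number of fenced code blocks opened
  headings: string[];     // heading text without the leading # markers
}

function analyzeMarkdown(markdown: string): MarkdownStats {
  let insideFence = false;
  let codeBlockCount = 0;
  const headings: string[] = [];
  const proseLines: string[] = [];

  for (const line of markdown.split('\n')) {
    // A line starting with three or more backticks opens or closes a fence.
    if (/^\s*`{3,}/.test(line)) {
      if (!insideFence) codeBlockCount += 1;
      insideFence = !insideFence;
      continue;
    }
    if (insideFence) continue; // skip code when counting words

    const heading = /^(#{1,6})\s+(.*)$/.exec(line);
    if (heading) {
      const text = heading[2].trim();
      headings.push(text);
      proseLines.push(text); // count the heading text, not the # markers
    } else {
      proseLines.push(line);
    }
  }

  const words = proseLines.join(' ').split(/\s+/).filter((w) => w.length > 0);
  return { wordCount: words.length, codeBlockCount, headings };
}
```

Passing the `rawContent` string to `analyzeMarkdown` would return the three statistics in one object, which keeps the counting logic out of the calling code.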
🔧 API Reference
BitBuffet Class
Constructor
new BitBuffet(apiKey: string)
Methods
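// JSON extraction: schema required, returns typed data
extract<T>(
  url: string,
  schema: ZodSchema<T>,
  config?: JsonConfig,
  timeout?: number
): Promise<T>

// Markdown extraction: no schema, returns a string
extract(
  url: string,
  config: MarkdownConfig,
  timeout?: number
): Promise<string>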
Parameters:
- url: The URL to extract data from
- schema: Zod schema defining the expected data structure (JSON format only)
- config: Extraction configuration options
- timeout: Request timeout in milliseconds (default: 30000)
Returns:
- JSON format: Promise resolving to the extracted data matching your schema
- Markdown format: Promise resolving to raw markdown content as string
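Because both overloads return a promise that can reject, callers may want to retry transient failures. The following is a minimal sketch under stated assumptions: `withRetry`, `RetryOptions`, and `sleep` are hypothetical names that are not part of the SDK, and whether the SDK retries internally is not documented here.

```typescript
// Hypothetical wrapper: retries any promise-returning task with a delay
// that doubles after each failed attempt.
type Task<T> = () => Promise<T>;

interface RetryOptions {
  attempts: number;    // total tries, including the first one
  baseDelayMs: number; // wait before the second try; doubles afterwards
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function withRetry<T>(task: Task<T>, options: RetryOptions): Promise<T> {
  if (options.attempts < 1) {
    throw new RangeError('attempts must be at least 1');
  }
  let lastError: unknown;
  for (let attempt = 0; attempt < options.attempts; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // Only wait if another attempt will follow.
      if (attempt < options.attempts - 1) {
        await sleep(options.baseDelayMs * 2 ** attempt);
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

A call might look like `withRetry(() => client.extract(url, ArticleSchema), { attempts: 3, baseDelayMs: 500 })`. Wrapping the call in a function keeps the helper independent of the SDK, so it works for either overload.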
Types
interface JsonConfig {
  format?: 'json';
  reasoning_effort?: 'medium' | 'high';
  prompt?: string;
  temperature?: number;
  top_p?: number;
}

interface MarkdownConfig {
  format: 'markdown';
  reasoning_effort?: 'medium' | 'high';
  prompt?: string;
  temperature?: number;
  top_p?: number;
}

type ExtractionMethod = 'json' | 'markdown';
🛠️ Development
yarn install
yarn test
yarn test:watch
yarn test:integration
yarn build
yarn test:manual
📋 Requirements
- Node.js >= 16.0.0
- TypeScript >= 4.5.0 (for TypeScript projects)
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
💡 Need Help?
For detailed documentation, examples, and API reference, visit our complete documentation.
If you encounter any issues or have questions, please open an issue on GitHub.