GPT Research


🔍 GPT Research is an autonomous AI research agent that conducts comprehensive research on any topic, searches the web for real-time information, and generates detailed reports with proper citations.

Built with TypeScript and optimized for both local development and serverless deployment (Vercel, AWS Lambda, etc.).

✨ Features

  • 🔍 Multi-source Research: Integrates multiple search providers:
    • Tavily - AI-optimized search engine
    • Serper - Google Search API (2,500 free searches/month)
    • Google Custom Search - Direct Google integration
    • DuckDuckGo - Privacy-focused search
  • 🌐 Smart Web Scraping: Cheerio and Puppeteer for content extraction
  • 🤖 Multiple LLM Support: OpenAI, Anthropic, Google AI, Groq, and more
  • 🔌 MCP Integration: Model Context Protocol for external tool connections
  • 📊 Various Report Types: Research, Detailed, Summary, Resource, Outline
  • 🔄 Streaming Support: Real-time updates via Server-Sent Events
  • ⚡ Vercel Optimized: Built for serverless deployment
  • 💾 Memory Management: Tracks research context and history
  • 💰 Cost Tracking: Monitor LLM usage and costs

🚀 Quick Start

Installation

npm install gpt-research
# or
yarn add gpt-research
# or
pnpm add gpt-research

Configuration

Create a .env file in the root directory:

# Required
OPENAI_API_KEY=your-openai-api-key

# Optional Search Providers (at least one recommended)
TAVILY_API_KEY=your-tavily-api-key        # https://tavily.com (best for AI research)
SERPER_API_KEY=your-serper-api-key        # https://serper.dev (Google search, 2,500 free/month)
GOOGLE_API_KEY=your-google-api-key        # Google Custom Search
GOOGLE_CX=your-google-custom-search-engine-id

# Optional LLM Providers
ANTHROPIC_API_KEY=your-anthropic-api-key
GOOGLE_AI_API_KEY=your-google-ai-api-key
GROQ_API_KEY=your-groq-api-key
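
Before kicking off a research run, it can help to fail fast when required keys are absent. A minimal sketch of that check (the `missingKeys` helper is ours for illustration, not part of the package's API):

```javascript
// Return the names of required environment variables that are unset or blank.
function missingKeys(env, required) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

// OPENAI_API_KEY is required; at least one search provider key is recommended.
const missing = missingKeys(process.env, ['OPENAI_API_KEY']);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```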

Basic Usage

const { GPTResearch } = require('gpt-research');
// or for TypeScript/ES modules:
// import { GPTResearch } from 'gpt-research';

async function main() {
  const researcher = new GPTResearch({
    query: 'What are the latest developments in quantum computing?',
    reportType: 'research_report',
    llmProvider: 'openai',
    apiKeys: {
      openai: process.env.OPENAI_API_KEY,
      tavily: process.env.TAVILY_API_KEY
    }
  });

  // Conduct research
  const result = await researcher.conductResearch();
  
  console.log(result.report);
  console.log(`Sources used: ${result.sources.length}`);
  console.log(`Cost: $${result.costs.total.toFixed(4)}`);
}

main().catch(console.error);

Streaming Research

const researcher = new GPTResearch(config);

// Stream research updates in real-time
for await (const update of researcher.streamResearch()) {
  switch (update.type) {
    case 'progress':
      console.log(`[${update.progress}%] ${update.message}`);
      break;
    case 'data':
      if (update.data?.reportChunk) {
        process.stdout.write(update.data.reportChunk);
      }
      break;
    case 'complete':
      console.log('\nResearch complete!');
      break;
  }
}

🔧 Configuration Options

interface ResearchConfig {
  // Required
  query: string;                    // Research query
  
  // Report Configuration
  reportType?: ReportType;          // Type of report to generate
  reportFormat?: ReportFormat;      // Output format (markdown, pdf, docx)
  tone?: Tone;                      // Writing tone
  
  // LLM Configuration
  llmProvider?: string;             // LLM provider (openai, anthropic, etc.)
  smartLLMModel?: string;           // Model for complex tasks
  fastLLMModel?: string;            // Model for simple tasks
  temperature?: number;             // Generation temperature
  maxTokens?: number;               // Max tokens per generation
  
  // Search Configuration
  defaultRetriever?: string;        // Default search provider
  maxSearchResults?: number;        // Max results per search
  
  // Scraping Configuration
  defaultScraper?: string;          // Default scraper (cheerio, puppeteer)
  scrapingConcurrency?: number;     // Concurrent scraping operations
  
  // API Keys
  apiKeys?: {
    openai?: string;
    tavily?: string;
    serper?: string;
    google?: string;
    anthropic?: string;
    groq?: string;
  };
}

📋 Report Types

  • ResearchReport: Comprehensive research with citations
  • DetailedReport: In-depth analysis with extensive coverage
  • QuickSummary: Concise overview of key points
  • ResourceReport: Curated list of resources and references
  • OutlineReport: Structured outline for further research

🔍 Search Providers

Available Providers

| Provider   | Best For               | Free Tier   | API Key Required |
|------------|------------------------|-------------|------------------|
| Tavily     | AI-optimized research  | 1,000/month | Yes              |
| Serper     | Google search results  | 2,500/month | Yes              |
| Google     | Custom search          | 100/day     | Yes              |
| DuckDuckGo | Privacy-focused search | Unlimited   | No               |

Choosing the Right Provider

  • Tavily: Best for AI research, academic papers, technical topics
  • Serper: Best for current events, general web search, Google quality
  • Google Custom Search: Best for specific domains, controlled results
  • DuckDuckGo: Best for privacy-sensitive research, no API needed

Using Multiple Providers

// Configure multiple providers for redundancy
const researcher = new GPTResearch({
  query: 'Your research topic',
  retrievers: ['tavily', 'serper'], // Falls back if one fails
  apiKeys: {
    tavily: process.env.TAVILY_API_KEY,
    serper: process.env.SERPER_API_KEY
  }
});
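
The fallback behavior can be pictured as trying each configured retriever in order until one returns results. A simplified sketch of that idea (the `searchWithFallback` helper and the mock providers below are illustrative, not the package's internals):

```javascript
// Try each retriever in order; return the first non-empty result set.
async function searchWithFallback(retrievers, query) {
  const errors = [];
  for (const [name, search] of retrievers) {
    try {
      const results = await search(query);
      if (results.length > 0) return { provider: name, results };
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`All retrievers failed: ${errors.join('; ')}`);
}

// Mock providers: tavily is rate limited, so serper answers.
const retrievers = [
  ['tavily', async () => { throw new Error('rate limited'); }],
  ['serper', async () => [{ url: 'https://example.com', title: 'Result' }]],
];

searchWithFallback(retrievers, 'quantum computing')
  .then((r) => console.log(`Answered by ${r.provider}`));
```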

🔌 MCP (Model Context Protocol) Support

GPT Research now supports MCP for connecting to external tools and services!

What is MCP?

MCP (Model Context Protocol) is a standardized protocol for connecting AI systems to external tools and data sources. It enables seamless integration with various services through a unified interface.

MCP Features

  • Stdio MCP Servers - Local process spawning for NPX/binary tools (Node.js/Docker/VPS)
  • HTTP MCP Servers - RESTful API connections (works everywhere including Vercel)
  • WebSocket MCP - Real-time bidirectional communication (works everywhere)
  • Tool Discovery - Automatic discovery of available tools from all server types
  • Smart Selection - AI-powered tool selection based on research query
  • Streaming Updates - Real-time progress tracking via SSE
  • Mixed Mode - Combine stdio, HTTP, and WebSocket in the same application

MCP Usage Examples

HTTP/WebSocket MCP (Works everywhere including Vercel)

const researcher = new GPTResearch({
  query: "Latest AI developments",
  mcpConfigs: [
    {
      name: "research-tools",
      connectionType: "http",
      connectionUrl: "https://mcp.example.com",
      connectionToken: process.env.MCP_TOKEN
    }
  ],
  useMCP: true
});

Stdio MCP (Local tools - Node.js environments)

const researcher = new GPTResearch({
  query: "Analyze this codebase",
  mcpConfigs: [
    {
      name: "filesystem",
      connectionType: "stdio",
      command: "npx",
      args: ["@modelcontextprotocol/filesystem-server"],
      env: { READ_ONLY: "false" }
    },
    {
      name: "git",
      connectionType: "stdio",
      command: "git-mcp",
      args: ["--repo", "."]
    }
  ]
});

Mixed Mode (Combine all connection types)

const researcher = new GPTResearch({
  query: "Research topic",
  mcpConfigs: [
    // Local tools via stdio
    { name: "local-fs", connectionType: "stdio", command: "npx", args: ["fs-mcp"] },
    // Remote API via HTTP
    { name: "api", connectionType: "http", connectionUrl: "https://api.example.com/mcp" },
    // Real-time via WebSocket
    { name: "stream", connectionType: "websocket", connectionUrl: "wss://realtime.example.com" }
  ]
});

MCP Deployment Compatibility

| MCP Type     | Local/Node.js | Vercel           | Docker  | VPS/Cloud |
|--------------|---------------|------------------|---------|-----------|
| HTTP Servers | ✅ Full       | ✅ Full          | ✅ Full | ✅ Full   |
| WebSocket    | ✅ Full       | ✅ Full          | ✅ Full | ✅ Full   |
| Stdio        | ✅ Full       | ❌ Not Supported | ✅ Full | ✅ Full   |

Stdio MCP Notes:

  • Works perfectly in Node.js, Docker, VPS, and self-hosted environments
  • Not supported on Vercel, AWS Lambda, or other serverless platforms
  • For serverless deployments, use HTTP/WebSocket MCP or deploy a proxy server
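
Since stdio transport is unavailable on serverless platforms, one approach is to detect the runtime and choose a compatible connection type. A sketch of that heuristic, relying on the VERCEL and AWS_LAMBDA_FUNCTION_NAME variables those platforms set (the MCP_HTTP_URL variable and config shapes here are illustrative):

```javascript
// Heuristic: these env vars are set automatically on Vercel / AWS Lambda.
function isServerless(env) {
  return Boolean(env.VERCEL || env.AWS_LAMBDA_FUNCTION_NAME);
}

// Prefer local stdio tools; fall back to an HTTP MCP server when serverless.
function pickMcpConfig(env) {
  if (isServerless(env)) {
    return {
      name: 'research-tools',
      connectionType: 'http',
      connectionUrl: env.MCP_HTTP_URL, // hypothetical variable for your HTTP MCP server
    };
  }
  return {
    name: 'filesystem',
    connectionType: 'stdio',
    command: 'npx',
    args: ['@modelcontextprotocol/filesystem-server'],
  };
}

console.log(pickMcpConfig(process.env).connectionType);
```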

These MCP servers can be run locally via stdio:

# File System Access
npx @modelcontextprotocol/filesystem-server

# Git Repository Tools  
npx @modelcontextprotocol/git-server

# Database Query Execution
npm install -g mcp-database
mcp-database

# Custom Python MCP Server
python -m mcp.server

# Shell Command Execution
cargo install mcp-shell
mcp-shell


🌐 Vercel Deployment

API Routes

Create API routes in your Next.js/Vercel project:

// app/api/research/route.js
import { GPTResearch } from 'gpt-research';

export async function POST(request) {
  const { query, reportType } = await request.json();
  
  const researcher = new GPTResearch({
    query,
    reportType,
    apiKeys: {
      openai: process.env.OPENAI_API_KEY,
      tavily: process.env.TAVILY_API_KEY
    }
  });
  
  const result = await researcher.conductResearch();
  
  return Response.json(result);
}

Streaming API

// app/api/research/stream/route.js
import { GPTResearch } from 'gpt-research';

export async function POST(request) {
  const { query } = await request.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      const researcher = new GPTResearch({ query });

      for await (const update of researcher.streamResearch()) {
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify(update)}\n\n`)
        );
      }

      controller.close();
    }
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    }
  });
}
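
On the client side, each update arrives as a "data: {...}" line followed by a blank line, per the Server-Sent Events framing. A minimal parser for one chunk of that text (independent of the package itself):

```javascript
// Parse a chunk of Server-Sent Events text into JSON update objects.
function parseSSEChunk(chunk) {
  return chunk
    .split('\n\n')
    .filter((block) => block.startsWith('data: '))
    .map((block) => JSON.parse(block.slice('data: '.length)));
}

const chunk = 'data: {"type":"progress","progress":40}\n\ndata: {"type":"complete"}\n\n';
console.log(parseSSEChunk(chunk).map((u) => u.type)); // [ 'progress', 'complete' ]
```

In a browser you would read the response body with a ReadableStream reader (or use EventSource for GET endpoints) and feed decoded chunks through a parser like this.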

Environment Variables

Add to your Vercel project settings:

OPENAI_API_KEY=your-key
TAVILY_API_KEY=your-key
SERPER_API_KEY=your-key

🧪 Examples

# Basic example
npm run example

# OpenAI-only example (no web search)
npm run example:simple

# Full research with Tavily web search
npm run example:tavily

# Research using Serper (Google Search API)
npm run example:serper

Check the examples/ directory for more detailed usage examples.


🎯 Use Cases

  • Market Research: Analyze competitors, trends, and market opportunities
  • Academic Research: Gather and synthesize information for papers and studies
  • Content Creation: Research topics thoroughly for articles and blog posts
  • Technical Documentation: Research technical topics and generate comprehensive guides
  • Due Diligence: Conduct thorough research on companies, people, or topics
  • News Aggregation: Gather and summarize news from multiple sources

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📝 License

MIT License - see LICENSE file for details.

📊 Performance Considerations

  • Token Limits: Automatically manages context within token limits
  • Concurrent Operations: Configurable concurrency for searches and scraping
  • Cost Optimization: Uses appropriate models for different tasks
  • Caching: Caches scraped content to avoid redundant operations
  • Memory Management: Efficient in-memory storage with export/import capabilities
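
The scraped-content cache can be pictured as a URL-keyed map with a time-to-live. A simplified sketch of that idea (not the package's actual implementation):

```javascript
// URL-keyed cache with TTL, mirroring the "cache scraped content" idea above.
class ScrapeCache {
  constructor(ttlMs = 10 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // Return cached content, or undefined if absent or expired.
  get(url) {
    const entry = this.entries.get(url);
    if (!entry || Date.now() - entry.storedAt > this.ttlMs) return undefined;
    return entry.content;
  }

  set(url, content) {
    this.entries.set(url, { content, storedAt: Date.now() });
  }
}

const cache = new ScrapeCache();
cache.set('https://example.com', '<html>…</html>');
console.log(cache.get('https://example.com') !== undefined); // true
```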

🔐 Security

  • API Key Management: Never commit API keys to version control
  • Input Validation: All URLs and inputs are validated
  • Rate Limiting: Built-in rate limiting for API calls
  • Error Handling: Comprehensive error handling and recovery

🎯 Roadmap

  • Add multi-language support
  • Add more LLM providers (Cohere, Together AI)
  • Implement research templates
  • Add PDF and DOCX report export

💡 Tips

  • Use Tavily for best results - It's specifically designed for AI research
  • Configure multiple search providers - Automatic fallback ensures reliability
  • Adjust concurrency based on your limits - Prevent rate limiting
  • Use streaming for long research - Better user experience
  • Monitor costs - Track LLM usage to manage expenses

🆘 Troubleshooting

Common Issues

Build Errors: Make sure you have Node.js 18+ and run npm install

API Key Errors: Verify your API keys are correct in .env

Rate Limiting: Reduce scrapingConcurrency and maxSearchResults

Memory Issues: For large research runs, increase the Node.js heap size:

node --max-old-space-size=4096 your-script.js


⭐ Show Your Support

If you find GPT Research helpful, please consider:

  • Giving us a star on GitHub
  • Sharing with your network
  • Contributing to the project

Built with ❤️ by Pablo Schaffner
Autonomous research for everyone

Keywords

gpt

Package last updated on 16 Sep 2025