scoopi

CLI tool to scoop documentation websites and convert them to local Markdown files for LLM consumption

Version: 1.0.0 (latest on npm)
Maintainers: 1


Installation

# Install dependencies (Chrome will be installed automatically)
npm install

# If Chrome installation failed, run manually:
npm run setup

# Make the CLI globally available (optional)
npm install -g .

Requirements

  • Node.js 18+
  • Chrome browser (automatically installed via Puppeteer)

Usage

Basic scooping

scoopi https://docs.example.com

Advanced options

# Specify maximum depth and output directory
scoopi https://docs.example.com --depth 2 --output ./my-docs

# Include/exclude URL patterns
scoopi https://docs.example.com --include "**/api/**,**/guide/**" --exclude "**/legacy/**"

# Add delay between requests (in milliseconds)
scoopi https://docs.example.com --delay 2000

# Enable verbose logging
scoopi https://docs.example.com --verbose
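
The `--include` and `--exclude` flags take comma-separated glob patterns. How scoopi matches them internally is not documented here, so the following TypeScript is only a minimal sketch of one plausible interpretation. It assumes that `**` matches any characters including `/`, that `*` stays within one path segment, and that an exclude match wins over an include match. The helper names `globToRegExp`, `parsePatterns`, and `shouldVisit` are hypothetical and not part of scoopi.

```typescript
// Hypothetical helpers for illustration; scoopi's real matcher may behave differently.

// Convert a glob to a RegExp.
// Assumption: "**" matches anything (including "/"), "*" matches anything except "/".
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        source += ".*";
        i++; // consume the second star
      } else {
        source += "[^/]*";
      }
    } else {
      // Escape characters that are special in regular expressions.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp("^" + source + "$");
}

// Split a comma-separated flag value into compiled patterns.
function parsePatterns(list: string): RegExp[] {
  return list
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map(globToRegExp);
}

// Assumption: exclude takes precedence; an empty include list allows everything.
function shouldVisit(url: string, include: RegExp[], exclude: RegExp[]): boolean {
  if (exclude.some((re) => re.test(url))) return false;
  return include.length === 0 || include.some((re) => re.test(url));
}

const include = parsePatterns("**/api/**,**/guide/**");
const exclude = parsePatterns("**/legacy/**");
console.log(shouldVisit("https://docs.example.com/api/users", include, exclude)); // true
console.log(shouldVisit("https://docs.example.com/legacy/api/v1", include, exclude)); // false
console.log(shouldVisit("https://docs.example.com/blog/post", include, exclude)); // false
```

Under these assumptions a pattern such as `**/api/**` requires a `/api/` segment followed by a slash, so `https://docs.example.com/api` itself would not match.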

Configuration

# Show current configuration
scoopi config --show

# Reset configuration to defaults
scoopi config --reset

Options

  • --depth, -d <number>: Maximum scooping depth (default: 3)
  • --output, -o <path>: Output directory (default: ./docs)
  • --include <patterns>: URL patterns to include (comma-separated)
  • --exclude <patterns>: URL patterns to exclude (comma-separated)
  • --delay <ms>: Delay between requests in milliseconds (default: 1000)
  • --verbose: Enable verbose logging
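
To make the interaction between `--depth` and `--delay` concrete, here is a minimal TypeScript sketch of a breadth-first crawl with a depth limit and a pause between requests. It is not scoopi's implementation: the traversal order, the convention that the start URL is depth 0, and the names `crawl`, `CrawlOptions`, and `fetchLinks` are all assumptions for illustration. `fetchLinks` is a hypothetical callback standing in for loading a page and extracting its links.

```typescript
// Hypothetical callback: loads a page and returns the URLs it links to.
type FetchLinks = (url: string) => Promise<string[]>;

interface CrawlOptions {
  depth: number;   // maximum link distance from the start URL (README default: 3)
  delayMs: number; // pause between requests in milliseconds (README default: 1000)
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Breadth-first crawl. Assumption: the start URL counts as depth 0.
async function crawl(
  start: string,
  fetchLinks: FetchLinks,
  opts: CrawlOptions
): Promise<string[]> {
  const seen = new Set<string>([start]);
  const visited: string[] = [];
  const queue: Array<{ url: string; depth: number }> = [{ url: start, depth: 0 }];

  while (queue.length > 0) {
    const { url, depth } = queue.shift()!;
    // Wait before every request except the first one.
    if (visited.length > 0) await sleep(opts.delayMs);
    const links = await fetchLinks(url);
    visited.push(url);
    // Pages at the depth limit are still loaded, but their links are not followed.
    if (depth >= opts.depth) continue;
    for (const link of links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push({ url: link, depth: depth + 1 });
      }
    }
  }
  return visited;
}

// Usage with an in-memory link graph instead of real network requests.
const graph: Record<string, string[]> = { a: ["b", "c"], b: ["d"], c: ["a"], d: ["e"] };
const fakeFetch: FetchLinks = async (url) => graph[url] ?? [];
crawl("a", fakeFetch, { depth: 1, delayMs: 0 }).then((order) => console.log(order));
// logs ["a", "b", "c"]
```

The `seen` set prevents revisiting pages that link back to each other, which matters on documentation sites where every page links to the index.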

Features

  • 🥄 Smart scooping: Automatically detects and follows documentation links

  • 📝 Clean conversion: Converts HTML to clean, readable Markdown

  • 🗂️ Organized output: Creates hierarchical directory structure based on URLs

  • 🎯 Pattern matching: Include/exclude URLs using glob patterns

  • ⚡ Performance: Configurable delays and depth limits

  • 🔍 Content filtering: Removes navigation, ads, and other non-content elements

  • 📊 Progress tracking: Real-time progress indicators and detailed logging
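
The "Organized output" feature says the directory structure follows the URLs, but the exact mapping is not specified here. The TypeScript sketch below shows one plausible scheme under stated assumptions: each URL path segment becomes a directory, the last segment becomes a `.md` file, a trailing slash or empty path maps to `index.md`, and characters outside a conservative set are replaced with `_`. The function name `urlToMarkdownPath` is hypothetical; only the `./docs` default comes from the options list.

```typescript
// Hypothetical mapping from a page URL to a local Markdown path.
// scoopi's actual naming rules may differ.
function urlToMarkdownPath(pageUrl: string, outputDir: string = "./docs"): string {
  // The URL parser normalizes the path, so "." and ".." segments are already resolved.
  const { pathname } = new URL(pageUrl);
  const parts = pathname.split("/").filter((s) => s.length > 0);

  // Assumption: directory-style URLs are stored as index.md.
  if (parts.length === 0 || pathname.endsWith("/")) {
    parts.push("index");
  }

  const safe = parts.map((segment, i) => {
    // Drop a trailing .html/.htm from the final segment only.
    const name = i === parts.length - 1 ? segment.replace(/\.html?$/i, "") : segment;
    // Replace anything that could be awkward in a file name.
    return name.replace(/[^A-Za-z0-9._-]/g, "_");
  });

  const root = outputDir.replace(/\/+$/, "");
  return [root, ...safe].join("/") + ".md";
}

console.log(urlToMarkdownPath("https://docs.example.com/guide/getting-started.html", "./my-docs"));
// ./my-docs/guide/getting-started.md
console.log(urlToMarkdownPath("https://docs.example.com/api/"));
// ./docs/api/index.md
```

Query strings and fragments are ignored in this sketch, so two URLs that differ only in their query would map to the same file.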

Development

# Run tests
npm test

# Start in development mode
npm run dev

# Lint code
npm run lint

License

MIT

Keywords

documentation


Package last updated on 29 Sep 2025
