mcp-server-image-extractor


MCP server for extracting and categorizing images from web pages with intelligent classification

Version: 1.0.8 (latest, npm)
Weekly downloads: 3 (-25%)
Maintainers: 1

Image Extractor MCP Server

An MCP (Model Context Protocol) server that extracts and categorizes images from web pages using intelligent heuristics.

Features

  • Smart Image Extraction: Extracts images from various sources including:

    • <img> tags
    • CSS background images
    • Meta tags (og:image, twitter:image)
    • Favicons and touch icons
  • Intelligent Classification: Categorizes images into three types:

    • Icons: Logos, favicons, small brand images
    • Products: E-commerce product images
    • Other: Banners, article images, decorative content
  • Dual Extraction Modes:

    • Static Mode: Fast extraction using axios and cheerio
    • JavaScript Mode: Full rendering with Puppeteer for dynamic sites
  • Rich Metadata: Returns comprehensive information for each image:

    • Absolute URL
    • Dimensions (width/height)
    • Alt text and title
    • Position on page (header/main/footer)
    • Surrounding context
    • Classification confidence score
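As an illustration of two of the features above (CSS background images and absolute URLs), here is a minimal sketch. It is not the package's code: the package is described as using cheerio, while this sketch uses a plain regular expression to stay dependency-free, and the function name extractBackgroundImageUrls is hypothetical.

```typescript
// Hypothetical helper, not part of the package's documented API.
// Finds url(...) references in a CSS string and resolves them
// against the page URL so every result is absolute.
function extractBackgroundImageUrls(css: string, pageUrl: string): string[] {
  // Matches url("..."), url('...') and unquoted url(...) forms.
  const pattern = /url\(\s*(['"]?)(.*?)\1\s*\)/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(css)) !== null) {
    const raw = (match[2] ?? "").trim();
    // Skip empty values and inline data URLs.
    if (raw === "" || raw.startsWith("data:")) continue;
    try {
      found.add(new URL(raw, pageUrl).href);
    } catch {
      // Ignore values that cannot be parsed as a URL.
    }
  }
  return Array.from(found);
}
```

Resolving through the standard URL constructor handles root-relative paths ("/img/hero.jpg") and document-relative paths ("banner.png") the same way a browser would.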

Installation

As an MCP Server

npm install -g mcp-server-image-extractor

For Development

# Download and extract the source code
cd image-extractor
npm install
npm run build

MCP Configuration

Add the server to your MCP settings:

{
  "mcpServers": {
    "image-extractor": {
      "command": "npx",
      "args": ["-y", "mcp-server-image-extractor"],
      "timeout": 120
    }
  }
}

Note: The first run with npx can take longer because it downloads the package. The higher timeout shown above (120 seconds) leaves room for that.

Using global installation (faster startup)

First install globally:

npm install -g mcp-server-image-extractor

Then configure:

{
  "mcpServers": {
    "image-extractor": {
      "command": "mcp-server-image-extractor"
    }
  }
}

Using local installation

For development or local testing:

{
  "mcpServers": {
    "image-extractor": {
      "command": "node",
      "args": ["C:/path/to/image-extractor/build/index.js"]
    }
  }
}

Alternative: Using npx with cache

To avoid timeout issues, you can pre-cache the package:

npx mcp-server-image-extractor --version

Then use the standard npx configuration.

Usage

Once connected, the extract_images tool is available to the client.

Tool Parameters

  • url (required): The URL to extract images from
  • useJavaScript (optional): Use Puppeteer for JavaScript-rendered sites (default: false)
  • includeDataUrls (optional): Include base64 data URLs (default: false)
  • minSize (optional): Minimum image size in pixels (default: 0)
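The parameters and defaults listed above can be expressed as a TypeScript type with a small normalizing function. This is a sketch only: the parameter names and default values come from the list, but the names ExtractImagesParams and withDefaults are hypothetical and the package's real input handling may differ.

```typescript
// Parameter names and defaults follow the list above.
// The type and function names are hypothetical.
interface ExtractImagesParams {
  url: string;
  useJavaScript?: boolean;
  includeDataUrls?: boolean;
  minSize?: number;
}

// Fills in the documented defaults for any omitted optional parameter.
function withDefaults(params: ExtractImagesParams): Required<ExtractImagesParams> {
  if (!params.url) {
    throw new Error("url is required");
  }
  return {
    url: params.url,
    useJavaScript: params.useJavaScript ?? false,
    includeDataUrls: params.includeDataUrls ?? false,
    minSize: params.minSize ?? 0,
  };
}
```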

Example Request

{
  "url": "https://example.com",
  "useJavaScript": false,
  "includeDataUrls": false,
  "minSize": 100
}

Example Response

{
  "url": "https://example.com",
  "timestamp": "2024-01-07T12:00:00Z",
  "images": {
    "icons": [
      {
        "url": "https://example.com/logo.png",
        "alt": "Company Logo",
        "dimensions": { "width": 150, "height": 50 },
        "confidence": 0.95,
        "position": "header",
        "context": "Main navigation area"
      }
    ],
    "products": [
      {
        "url": "https://example.com/product1.jpg",
        "alt": "Product Image",
        "dimensions": { "width": 500, "height": 500 },
        "confidence": 0.88,
        "position": "main",
        "context": "Product gallery, near price $29.99"
      }
    ],
    "other": [
      {
        "url": "https://example.com/banner.jpg",
        "alt": "Hero Banner",
        "dimensions": { "width": 1200, "height": 400 },
        "confidence": 0.75,
        "position": "main",
        "context": "Hero section"
      }
    ]
  },
  "summary": {
    "total": 25,
    "icons": 5,
    "products": 10,
    "other": 10
  }
}
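For clients that consume this response, the shape above can be described with TypeScript types. The field names mirror the example. Which fields are optional is an assumption, and the type names and the summarize helper are hypothetical rather than exports of the package.

```typescript
type Position = "header" | "main" | "footer";

// Field names follow the example response; optionality is assumed.
interface ExtractedImage {
  url: string;
  alt?: string;
  title?: string;
  dimensions?: { width: number; height: number };
  confidence: number;
  position?: Position;
  context?: string;
}

interface CategorizedImages {
  icons: ExtractedImage[];
  products: ExtractedImage[];
  other: ExtractedImage[];
}

interface Summary {
  total: number;
  icons: number;
  products: number;
  other: number;
}

// Recomputes the per-category counts reported in the "summary" field.
function summarize(images: CategorizedImages): Summary {
  const icons = images.icons.length;
  const products = images.products.length;
  const other = images.other.length;
  return { total: icons + products + other, icons, products, other };
}
```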

Classification Heuristics

The server uses multiple factors to classify images:

Icon Detection

  • Small dimensions (< 200x200px)
  • Located in header/navigation
  • Filename contains: logo, icon, favicon, brand
  • Alt text with company/brand names
  • Meta favicon tags

Product Detection

  • Medium to large size (> 300x300px)
  • Square aspect ratio
  • Located near price/cart elements
  • Product-related keywords in alt text
  • E-commerce context patterns

Context Analysis

  • Examines surrounding HTML elements
  • Checks for e-commerce patterns
  • Analyzes parent container classes
  • Detects proximity to price elements
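The heuristics above can be combined into a simple additive score. The sketch below is one possible reading, not the package's classifier: the weights, the 50-point threshold, the aspect-ratio band, the product keyword list, and all names (ImageSignals, classify) are assumptions chosen for illustration.

```typescript
// Hypothetical input: signals assumed to be gathered during extraction.
interface ImageSignals {
  url: string;
  alt: string;
  width: number;
  height: number;
  position: "header" | "main" | "footer";
  nearPrice: boolean; // true when a price or cart element is close by
}

type Category = "icons" | "products" | "other";

function classify(img: ImageSignals): { category: Category; confidence: number } {
  // Only the last path segment counts as the filename.
  const fileName = (img.url.split("/").pop() ?? "").toLowerCase();

  // Icon signals (weights are illustrative, out of 100).
  let icon = 0;
  if (img.width < 200 && img.height < 200) icon += 40;
  if (img.position === "header") icon += 30;
  if (/(logo|icon|favicon|brand)/.test(fileName)) icon += 30;

  // Product signals (weights are illustrative, out of 100).
  let product = 0;
  if (img.width > 300 && img.height > 300) product += 30;
  const ratio = img.width / img.height;
  if (ratio > 0.8 && ratio < 1.25) product += 20; // roughly square
  if (img.nearPrice) product += 30;
  if (/(product|item|buy)/i.test(img.alt)) product += 20;

  const best = Math.max(icon, product);
  // Weak evidence for both categories falls back to "other".
  if (best < 50) return { category: "other", confidence: (100 - best) / 100 };
  return icon >= product
    ? { category: "icons", confidence: icon / 100 }
    : { category: "products", confidence: product / 100 };
}
```

Integer weights keep the arithmetic exact until the final division. A real classifier would likely tune these values against sample pages.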

Development

Project Structure

image-extractor/
├── src/
│   ├── index.ts        # MCP server entry point
│   ├── extractor.ts    # Core extraction logic
│   ├── classifier.ts   # Image classification
│   ├── utils.ts        # Helper functions
│   └── types.ts        # TypeScript types
├── build/              # Compiled JavaScript
├── package.json
└── tsconfig.json

Building

npm run build    # Compile TypeScript
npm run dev      # Watch mode

Testing

npm test         # Run tests (when implemented)

Use Cases

  • E-commerce Analysis: Extract product images from online stores
  • Brand Monitoring: Collect logos and brand images from websites
  • Content Aggregation: Gather images for content curation
  • Web Scraping: Extract visual content for analysis
  • SEO Auditing: Analyze image usage and optimization

Limitations

  • Image dimension detection requires downloading image headers
  • JavaScript mode is slower but more accurate for dynamic sites
  • Classification accuracy depends on page structure and naming conventions
  • Large pages with many images may take longer to process
  • Puppeteer requires additional system dependencies for headless Chrome
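To show why dimension detection needs at least the start of each image file, here is a minimal sketch that reads width and height from a PNG header. It covers PNG only, since other formats store dimensions differently, and it is not taken from the package. The function name readPngDimensions is hypothetical.

```typescript
// Reads width and height from the first 24 bytes of a PNG file.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type ("IHDR"),
// then 4-byte big-endian width and 4-byte big-endian height.
function readPngDimensions(
  bytes: Uint8Array
): { width: number; height: number } | null {
  if (bytes.byteLength < 24) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // PNG signature: 89 50 4E 47 0D 0A 1A 0A.
  if (view.getUint32(0) !== 0x89504e47 || view.getUint32(4) !== 0x0d0a1a0a) {
    return null;
  }
  // The first chunk must be IHDR (ASCII 49 48 44 52).
  if (view.getUint32(12) !== 0x49484452) return null;

  // DataView reads big-endian by default, which matches the PNG format.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}
```

Because only the first 24 bytes matter here, a client could in principle fetch just that prefix instead of the whole file. Whether the package does so is not stated.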

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT

Keywords

mcp

Package last updated on 14 Aug 2025
