n8n-nodes-n8ntools-web-scraper

Version 4.4.1 (latest), published on npm. Weekly downloads: 1.1K (-77.91%). Maintainers: 1.

N8N Tools - Web Scraper

Extract data from websites with AI-powered content recognition and anti-bot detection bypass. This N8N community node provides intelligent web scraping capabilities through the N8N Tools platform.

✨ Features

  • 🕷️ Smart Scraping: AI-powered content recognition and extraction
  • 🔄 Multiple Operations: Single page, multiple pages, and monitoring
  • 🎯 CSS Selectors: Flexible data extraction with attribute support
  • 🤖 JavaScript Support: Handle dynamic content and SPAs
  • 📸 Screenshots: Optional page screenshots for verification
  • 🛡️ Anti-Bot Protection: Built-in detection bypass mechanisms
  • 💰 Cost Tracking: Usage monitoring and budget controls

🚀 Quick Start

Installation

Install this node in your N8N instance:

  • Go to Settings > Community Nodes in your N8N interface
  • Click Install a community node
  • Enter n8n-nodes-n8ntools-web-scraper
  • Click Install

Via npm

npm install n8n-nodes-n8ntools-web-scraper

Setup Credentials

  • Sign up at N8N Tools and get your API key
  • In N8N, create new N8N Tools API credentials
  • Enter your API URL: https://api.n8ntools.io
  • Enter your API key

📖 Usage

Supported Operations

Operation             | Description                  | Use Case
Scrape Single Page    | Extract data from one webpage | Product details, contact info
Scrape Multiple Pages | Batch process multiple URLs  | Catalog scraping, bulk data
Monitor Page Changes  | Track website changes        | Price monitoring, content updates

Example Workflow

[Schedule Trigger] → [N8N Tools Web Scraper] → [Process Data] → [Database]

Configuration Example

E-commerce Product Scraping:

{
  "operation": "scrapePage",
  "url": "https://example-store.com/products/laptop",
  "selectors": [
    {
      "name": "title",
      "selector": "h1.product-title",
      "attribute": "text"
    },
    {
      "name": "price",
      "selector": ".price-current",
      "attribute": "text"
    },
    {
      "name": "images",
      "selector": ".product-gallery img",
      "attribute": "src",
      "multiple": true
    },
    {
      "name": "availability",
      "selector": ".stock-status",
      "attribute": "text"
    }
  ],
  "options": {
    "waitForSelector": ".price-current",
    "waitTime": 3,
    "screenshot": true
  }
}
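
The configuration above can also be assembled programmatically, for example when selectors are generated from a spreadsheet or another workflow step. The following is a minimal Python sketch, not part of the node itself: `make_selector` and `build_scrape_config` are hypothetical helper names, and the only thing taken from the example is the shape of the JSON.

```python
import json

def make_selector(name, selector, attribute="text", multiple=False):
    """Build one selector entry in the shape used by the example above."""
    entry = {"name": name, "selector": selector, "attribute": attribute}
    if multiple:
        entry["multiple"] = True  # only emitted when arrays are wanted
    return entry

def build_scrape_config(url, selectors, **options):
    """Assemble a scrapePage configuration and reject duplicate field names."""
    names = [s["name"] for s in selectors]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate selector names: {duplicates}")
    config = {"operation": "scrapePage", "url": url, "selectors": selectors}
    if options:
        config["options"] = options
    return config

config = build_scrape_config(
    "https://example-store.com/products/laptop",
    [
        make_selector("title", "h1.product-title"),
        make_selector("price", ".price-current"),
        make_selector("images", ".product-gallery img", "src", multiple=True),
    ],
    waitForSelector=".price-current",
    waitTime=3,
    screenshot=True,
)
print(json.dumps(config, indent=2))
```

Rejecting duplicate names early is a design choice in this sketch: two selectors with the same name would otherwise compete for one output field.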

⚙️ Node Parameters

URL Configuration

  • URL: Target webpage URL (for single page and monitoring)
  • URLs: Multiple URLs, one per line (for batch processing)
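
Because the URLs field takes one URL per line, it can help to clean a list before pasting it in. The sketch below is one possible way to do that in Python; `parse_url_list` is a hypothetical helper, and the rule that only absolute http(s) URLs are accepted is an assumption of this example, not documented node behavior.

```python
from urllib.parse import urlparse

def parse_url_list(raw):
    """Split a one-URL-per-line string into a clean, de-duplicated list."""
    urls = []
    seen = set()
    for line in raw.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue  # skip blank lines and repeats
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
        seen.add(url)
        urls.append(url)
    return urls

raw_input = """
https://example-store.com/products/laptop
https://example-store.com/products/mouse

https://example-store.com/products/laptop
"""
urls = parse_url_list(raw_input)
print("\n".join(urls))
```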

Selector Configuration

  • Name: Field name in the output
  • CSS Selector: CSS selector to target elements
  • Attribute: Element attribute to extract (text, href, src, title, etc.)
  • Multiple: Extract multiple elements (returns array)

Advanced Options

  • Wait for Selector: CSS selector to wait for before scraping
  • Wait Time: Seconds to wait before extraction (default: 5)
  • User Agent: Custom user agent string
  • Enable JavaScript: Execute JavaScript on page (default: true)
  • Screenshot: Capture page screenshot (default: false)
  • Follow Redirects: Handle HTTP redirects (default: true)
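
To see how the documented defaults combine with per-request overrides, here is a small Python sketch. The keys `waitTime`, `enableJavaScript`, `screenshot`, and `waitForSelector` appear in the examples in this README; `followRedirects` is an assumed key name for the Follow Redirects option, and `resolve_options` is a hypothetical helper.

```python
# Defaults as listed under Advanced Options.
DEFAULT_OPTIONS = {
    "waitTime": 5,             # seconds
    "enableJavaScript": True,
    "screenshot": False,
    "followRedirects": True,   # assumed key name, not confirmed by the README
}

def resolve_options(overrides=None):
    """Return the defaults with any caller-supplied overrides applied."""
    merged = dict(DEFAULT_OPTIONS)
    merged.update(overrides or {})
    if merged["waitTime"] < 0:
        raise ValueError("waitTime must not be negative")
    return merged

options = resolve_options({"waitForSelector": ".price-current", "waitTime": 3})
print(options)
```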

📤 Output Data

Single Page Result

{
  "url": "https://example-store.com/products/laptop",
  "title": "Gaming Laptop Pro 15\"",
  "price": "$1,299.99",
  "images": [
    "https://example-store.com/img/laptop-1.jpg",
    "https://example-store.com/img/laptop-2.jpg"
  ],
  "availability": "In Stock",
  "success": true,
  "operation": "scrapePage",
  "creditsUsed": 1,
  "creditsRemaining": 99,
  "timestamp": "2024-01-15T10:30:00Z"
}
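
Extracted values arrive as strings, so a downstream step usually has to convert them before comparing or storing them. The Python sketch below parses the `price` and `timestamp` fields from the result above; `parse_price` is a hypothetical helper and assumes US-style formatting (comma as thousands separator, dot as decimal point).

```python
import re
from datetime import datetime
from decimal import Decimal

def parse_price(text):
    """Convert a scraped price string such as "$1,299.99" to a Decimal."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no numeric price found in {text!r}")
    return Decimal(match.group().replace(",", ""))

result = {
    "price": "$1,299.99",
    "availability": "In Stock",
    "timestamp": "2024-01-15T10:30:00Z",
}

price = parse_price(result["price"])
# Replace the trailing "Z" so older Python versions can parse the timestamp.
scraped_at = datetime.fromisoformat(result["timestamp"].replace("Z", "+00:00"))
in_stock = result["availability"].strip().lower() == "in stock"
print(price, scraped_at.isoformat(), in_stock)
```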

Multiple Pages Result

Returns an array with one object per processed URL.
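
A batch run can contain both successful and failed pages, so it is often useful to separate them before further processing. The Python sketch below assumes each array element carries the same `success` flag as the single page result and, on failure, an `error` field like the ones shown under Error Handling; that per-item shape is an assumption, and `partition_results` is a hypothetical helper.

```python
def partition_results(results):
    """Split a batch result into (succeeded, failed) lists by the success flag."""
    succeeded = [r for r in results if r.get("success")]
    failed = [r for r in results if not r.get("success")]
    return succeeded, failed

# Illustrative data only; real responses may contain more fields.
results = [
    {"url": "https://example-store.com/products/laptop", "title": "Gaming Laptop Pro 15\"", "success": True},
    {"url": "https://example-store.com/products/missing", "error": "Page load timeout", "success": False},
]

succeeded, failed = partition_results(results)
for item in failed:
    print(f"failed: {item['url']} ({item.get('error', 'unknown error')})")
```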

🔧 Selector Guide

Basic Selectors

// Text content
{ "selector": "h1", "attribute": "text" }

// Links
{ "selector": "a.product-link", "attribute": "href" }

// Images
{ "selector": "img.thumbnail", "attribute": "src" }

// Data attributes
{ "selector": "[data-price]", "attribute": "data-price" }

Advanced Selectors

// Multiple items
{
  "selector": ".product-item",
  "attribute": "text",
  "multiple": true
}

// Nested selection
{
  "selector": ".product-card .title",
  "attribute": "text"
}

// Attribute extraction
{
  "selector": "meta[property='og:image']",
  "attribute": "content"
}

🤖 JavaScript Support

Handle dynamic content and single-page applications:

{
  "options": {
    "enableJavaScript": true,
    "waitForSelector": ".dynamic-content",
    "waitTime": 5
  }
}

Perfect for:

  • React/Vue/Angular applications
  • AJAX-loaded content
  • Lazy-loaded images
  • Dynamic pricing

📸 Screenshot Feature

Capture page screenshots for verification:

{
  "options": {
    "screenshot": true
  }
}

Screenshots are returned as base64-encoded PNG images in the response.
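
Since the screenshot is a base64-encoded PNG, it has to be decoded before it can be saved or viewed. The following Python sketch decodes such a string and checks the PNG file signature. The name of the response field holding the screenshot is not stated here, so the example works on a plain string; `decode_screenshot` is a hypothetical helper, and the tolerance for a `data:` URI prefix is an assumption.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file

def decode_screenshot(encoded):
    """Decode a base64 screenshot string and verify that it is a PNG."""
    if encoded.startswith("data:"):
        # Strip an optional "data:image/png;base64," prefix.
        encoded = encoded.split(",", 1)[1]
    data = base64.b64decode(encoded, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return data

# Stand-in payload for demonstration; a real response would hold a full image.
sample = base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
png_bytes = decode_screenshot(sample)
print(len(png_bytes), "bytes decoded")
```

The returned bytes can then be written to a `.png` file opened in binary mode.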

🛡️ Anti-Bot Features

Built-in mechanisms for getting past common anti-bot measures:

  • Rotating User Agents: Automatic user agent rotation
  • Request Delays: Human-like request timing
  • Header Spoofing: Realistic browser headers
  • Proxy Support: Optional proxy rotation (premium feature)

💸 Pricing & Limits

  • Single Page: 1 credit per page
  • Multiple Pages: 1 credit per URL
  • Page Monitoring: 1 credit per check
  • Screenshot: +0.5 credits when enabled
  • Rate Limit: Based on your N8N Tools subscription
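
As a worked example of the pricing above, the Python sketch below estimates the credits a run would consume. It assumes the 0.5 credit screenshot surcharge applies to each page or check on which screenshots are enabled, which the list does not state explicitly; `estimate_credits` is a hypothetical helper.

```python
from decimal import Decimal

BASE_CREDIT = Decimal("1")             # per page, URL, or monitoring check
SCREENSHOT_SURCHARGE = Decimal("0.5")  # assumed to apply per page

def estimate_credits(page_count, screenshot=False):
    """Estimate credits for a run over page_count pages."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    per_page = BASE_CREDIT + (SCREENSHOT_SURCHARGE if screenshot else 0)
    return per_page * page_count

# 40 URLs with screenshots: 40 * (1 + 0.5) credits.
batch_cost = estimate_credits(40, screenshot=True)
# One hourly check for 30 days without screenshots: 24 * 30 * 1 credits.
monthly_monitoring_cost = estimate_credits(24 * 30)
print(batch_cost, monthly_monitoring_cost)
```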

🔄 Monitoring Workflows

Price Monitoring Example

[Cron Trigger: Daily] → [Web Scraper] → [Compare Previous] → [Send Alert]

Content Change Detection

[Schedule: Hourly] → [Web Scraper] → [Hash Content] → [Detect Changes] → [Notify]
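
The [Hash Content] and [Detect Changes] steps could be implemented along the following lines. This is a Python sketch under the assumption that only the extracted fields should count as content, so that metadata such as `timestamp` and `creditsRemaining` does not trigger false alarms; `content_fingerprint` and `has_changed` are hypothetical helper names.

```python
import hashlib
import json

def content_fingerprint(result, fields):
    """Hash only the extracted fields, ignoring volatile metadata."""
    payload = {name: result.get(name) for name in fields}
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def has_changed(previous_fingerprint, result, fields):
    """Return True when the tracked fields differ from the stored fingerprint."""
    return previous_fingerprint != content_fingerprint(result, fields)

tracked = ["title", "price", "availability"]
first = {"title": "Gaming Laptop Pro 15\"", "price": "$1,299.99",
         "availability": "In Stock", "timestamp": "2024-01-15T10:30:00Z"}
same_content = dict(first, timestamp="2024-01-15T11:30:00Z")
new_price = dict(first, price="$1,199.99", timestamp="2024-01-15T12:30:00Z")

baseline = content_fingerprint(first, tracked)
print(has_changed(baseline, same_content, tracked))  # timestamp-only difference
print(has_changed(baseline, new_price, tracked))     # price difference
```

Serializing with `sort_keys=True` keeps the hash stable regardless of the order in which fields arrive.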

🛠️ Advanced Use Cases

Product Catalog Scraping

// Scrape product listings
{
  "operation": "scrapePage",
  "url": "https://store.com/category/laptops",
  "selectors": [
    {
      "name": "products",
      "selector": ".product-item a",
      "attribute": "href",
      "multiple": true
    }
  ]
}
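
The `href` values collected from a listing page are often relative, so they need to be resolved against the listing URL before they can be fed into Scrape Multiple Pages. The Python sketch below shows one way to do that; `to_absolute_urls` is a hypothetical helper and the sample links are illustrative.

```python
from urllib.parse import urljoin

def to_absolute_urls(base_url, hrefs):
    """Resolve hrefs against the listing URL and drop duplicates, keeping order."""
    seen = set()
    urls = []
    for href in hrefs:
        url = urljoin(base_url, href)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls

listing_url = "https://store.com/category/laptops"
# Illustrative output of the "products" selector above.
hrefs = [
    "/products/laptop-a",
    "https://store.com/products/laptop-b",
    "/products/laptop-a",
]

product_urls = to_absolute_urls(listing_url, hrefs)
urls_field = "\n".join(product_urls)  # one URL per line, as the URLs field expects
print(urls_field)
```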

Lead Generation

// Extract contact information
{
  "selectors": [
    { "name": "email", "selector": "a[href^='mailto:']", "attribute": "href" },
    { "name": "phone", "selector": ".contact-phone", "attribute": "text" },
    { "name": "address", "selector": ".address", "attribute": "text" }
  ]
}
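
Note that the `email` selector returns the raw `href`, which still carries the `mailto:` scheme and possibly a query string. A post-processing step along these lines can recover the bare address; this is a Python sketch, `email_from_mailto` is a hypothetical helper, and taking only the first address of a comma-separated list is a choice made for this example.

```python
from urllib.parse import unquote, urlsplit

def email_from_mailto(href):
    """Extract the first email address from a mailto: link."""
    parts = urlsplit(href.strip())
    if parts.scheme != "mailto":
        raise ValueError(f"not a mailto link: {href!r}")
    # The path holds the address list; any ?subject=... part lands in the query.
    address = unquote(parts.path).split(",")[0].strip()
    if "@" not in address:
        raise ValueError(f"no email address in {href!r}")
    return address

email = email_from_mailto("mailto:sales@example.com?subject=Hello")
print(email)
```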

🚨 Error Handling

Common errors and solutions:

// Timeout error
{
  "error": "Page load timeout",
  "success": false,
  "suggestion": "Increase waitTime or check URL accessibility"
}

// Selector not found
{
  "error": "Selector not found: .missing-element",
  "success": false,
  "suggestion": "Verify CSS selector or wait for dynamic content"
}
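
A downstream step can use the `success`, `error`, and `suggestion` fields shown above to decide whether to continue, retry, or stop. The Python sketch below illustrates one possible policy: timeouts are treated as transient and worth retrying, while a missing selector is treated as a configuration problem. `ScrapeError`, `check_result`, and `is_retryable` are hypothetical names, and the matching on the word "timeout" is an assumption based on the example messages.

```python
class ScrapeError(Exception):
    """Raised when a scrape result reports success: false."""

def check_result(result):
    """Return the result unchanged if it succeeded, otherwise raise ScrapeError."""
    if result.get("success"):
        return result
    message = result.get("error", "unknown error")
    hint = result.get("suggestion")
    raise ScrapeError(f"{message} ({hint})" if hint else message)

def is_retryable(result):
    """Treat timeouts as transient; other failures likely need a config fix."""
    return "timeout" in result.get("error", "").lower()

timeout_result = {
    "error": "Page load timeout",
    "success": False,
    "suggestion": "Increase waitTime or check URL accessibility",
}
selector_result = {
    "error": "Selector not found: .missing-element",
    "success": False,
    "suggestion": "Verify CSS selector or wait for dynamic content",
}

for outcome in (timeout_result, selector_result):
    try:
        check_result(outcome)
    except ScrapeError as exc:
        action = "retry" if is_retryable(outcome) else "fix configuration"
        print(f"{exc} -> {action}")
```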

📋 Requirements

  • N8N version 0.174.0 or higher
  • N8N Tools account and API key
  • Node.js 18+ (for development)

🆘 Support

📄 License

MIT License - see LICENSE file for details.

Part of the N8N Tools ecosystem

Keywords

n8n-community-node-package

FAQs

Package last updated on 22 Aug 2025
