techcrunch-api

TechCrunch Scraper is a Node.js package for extracting article data from TechCrunch by category or tag.

Version: 1.0.0
Registry: npm

Version published: 7 months ago

Weekly downloads: 8

Maintainers: 1

Created: 7 months ago

TechCrunch Scraper

TechCrunch Scraper is a Node.js package that scrapes articles from TechCrunch by category or tag. It targets Ubuntu and other Debian-based distributions where sudo is available, and it uses Puppeteer to drive a headless Chromium browser that loads and scrapes the pages.

Features

Scrape by Category: Automatically retrieve all articles under a specified category.
Scrape by Tag: Collect articles that are tagged with a specific keyword.
Headless Browser Support: Runs Chromium in headless mode to scrape dynamic content.
Optimized for Ubuntu: The installation instructions are written for Ubuntu. Other Linux distributions can be used, though package names and the package manager may differ.
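
As a rough illustration of the two scraping modes listed above, the sketch below maps a category or tag name to a listing URL. The helper names `toSlug` and `listingUrl` are hypothetical and are not this package's API, and the URL layout (`/category/<slug>/` and `/tag/<slug>/`) is an assumption about how TechCrunch organizes its listing pages.

```javascript
// Hypothetical helpers for illustration only; they are not part of techcrunch-api.
// Assumed URL layout: https://techcrunch.com/category/<slug>/ and /tag/<slug>/.
const BASE_URL = 'https://techcrunch.com';

// Turn a human-readable name such as "Artificial Intelligence" into a URL slug.
function toSlug(name) {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, '');    // drop leading and trailing hyphens
}

// Build the listing URL for a category or a tag.
function listingUrl(kind, name) {
  if (kind !== 'category' && kind !== 'tag') {
    throw new Error(`Unknown listing kind: ${kind}`);
  }
  const slug = toSlug(name);
  if (slug === '') {
    throw new Error('Name must contain at least one letter or digit');
  }
  return `${BASE_URL}/${kind}/${slug}/`;
}

console.log(listingUrl('category', 'Artificial Intelligence'));
// prints https://techcrunch.com/category/artificial-intelligence/
console.log(listingUrl('tag', 'OpenAI'));
// prints https://techcrunch.com/tag/openai/
```

A scraper would then open the resulting URL in the headless browser and collect the article links it finds there.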

Prerequisites

Before installing TechCrunch Scraper, make sure the following are available on your system:

Node.js (Version 14 or later recommended)
Puppeteer
Dependencies required for Puppeteer and headless Chromium
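
The Node.js requirement can be checked from a script before anything else runs. The following is a minimal sketch; the function names `nodeMajor` and `meetsMinimum` are illustrative, and the threshold of 14 simply mirrors the recommendation in the list above.

```javascript
// Illustrative pre-flight check; the function names are assumptions, not package API.
const MIN_NODE_MAJOR = 14; // mirrors the "version 14 or later" recommendation

// Extract the major version from strings such as "18.19.0" or "v18.19.0".
function nodeMajor(versionString) {
  const match = /^v?(\d+)\./.exec(versionString);
  if (!match) {
    throw new Error(`Unrecognized Node.js version: ${versionString}`);
  }
  return Number(match[1]);
}

// True when the given version is at least the minimum major version.
function meetsMinimum(versionString, minMajor = MIN_NODE_MAJOR) {
  return nodeMajor(versionString) >= minMajor;
}

// Warn (rather than exit) if the running interpreter is older than recommended.
if (!meetsMinimum(process.versions.node)) {
  console.warn(
    `Node.js ${process.versions.node} is older than the recommended ${MIN_NODE_MAJOR}.x`
  );
}
```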

Installation

Follow these steps to set up the TechCrunch Scraper package:

Step 1: Install System Dependencies

Open a terminal and run the following commands to install the necessary libraries, start a virtual display with Xvfb, and point the DISPLAY variable at it:

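# Refresh the package index, then install the shared libraries that headless Chromium needs.
# Versioned package names (for example libvpx7, libx264-155) differ between Ubuntu releases; adjust them to match yours.
sudo apt-get update
sudo apt-get install -y libgbm-dev xvfb chromium-browser libvpx7 libevent-2.1-7 libharfbuzz-icu0 libgstgl-1.0-0 libgstcodecparsers-1.0-0 libwebpdemux2 libenchant-2-2 libsecret-1-0 libmanette-0.2-0 libflite1 libx264-155 libgles2-mesa
# Download browser binaries. This invokes Playwright's installer, although the rest of this document refers to Puppeteer.
npx playwright install
# Start a virtual X display in the background and point DISPLAY at it.
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99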
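
The last two commands start a virtual X display and export its name. Before launching a browser from Node.js, it can help to confirm that DISPLAY is set and well formed. The sketch below is a hypothetical pre-flight check, not part of the package; `parseDisplay` is an illustrative name, and it only handles the common `[host]:display[.screen]` form.

```javascript
// Hypothetical pre-flight check for the DISPLAY variable; not part of techcrunch-api.
// Handles the common X display form "[host]:display[.screen]", e.g. ":99" or "localhost:10.0".
function parseDisplay(value) {
  const match = /^([^:]*):(\d+)(?:\.(\d+))?$/.exec(value || '');
  if (!match) {
    return null; // unset or malformed
  }
  return {
    host: match[1], // empty string means the local machine
    display: Number(match[2]),
    screen: match[3] === undefined ? 0 : Number(match[3]), // screen defaults to 0
  };
}

const parsed = parseDisplay(process.env.DISPLAY);
if (parsed === null) {
  console.warn('DISPLAY is not set or is malformed; start Xvfb and run `export DISPLAY=:99`.');
} else {
  console.log(`Using X display ${parsed.display}, screen ${parsed.screen}`);
}
```

With the commands above, `process.env.DISPLAY` would be `:99`, which parses to display 99 on the local machine.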


Package last updated on 04 May 2024
