A crawler extension package written on top of HTTP by a junior developer. Please don't download it casually; it has a lot of pitfalls.
A simple directory crawler DSL.
Crawls job listing websites for jobs requiring security clearance.
Cosmicrawler is a crawler library for Ruby. It provides scalable asynchronous crawling of (http|file|etc) sources using EventMachine.
A simple web crawler for Ruby.
Driller is a command-line Ruby web crawler built on Anemone. Driller can crawl a website, report error pages and slow pages, and generate HTML reports.
Grabs movie information from atmovies.com.
This rubygem does not have a description or summary.
Email crawler: crawls the top ten Google search results looking for email addresses and exports them to CSV.
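As a rough sketch of that kind of workflow (not this gem's actual API), the following stands alone on the standard library: it scans a handful of result pages for e-mail addresses and writes them to CSV. The URL list and the EMAIL_RE pattern are illustrative assumptions.

  require 'net/http'
  require 'uri'
  require 'csv'

  # Simplified e-mail pattern; real-world matching needs more care.
  EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

  urls = %w[https://example.com/result1 https://example.com/result2]

  emails = urls.flat_map do |url|
    body = Net::HTTP.get(URI(url))   # fetch each result page
    body.scan(EMAIL_RE)              # collect every address found in the body
  end.uniq

  CSV.open('emails.csv', 'w') do |csv|
    emails.each { |email| csv << [email] }
  end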
Generic Web crawler with a DSL that parses event-related data from web pages
Dead simple yet powerful Ruby crawler for easy parallel crawling, with support for anonymity.
This rubygem does not have a description or summary.
RegexpCrawler is a Ruby library for crawling data from websites using regular expressions.
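A minimal illustration of regex-based extraction in plain Ruby (not RegexpCrawler's own API): fetch a page and pull out link/text pairs with a capturing pattern. The URL and the pattern are assumptions made for the example.

  require 'net/http'
  require 'uri'

  html = Net::HTTP.get(URI('https://example.com/list'))

  # Each capture group becomes a block argument.
  html.scan(%r{<a href="([^"]+)">([^<]+)</a>}) do |href, text|
    puts "#{text} => #{href}"
  end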
FileCrawler searches and controls files in a local directory.
Simple little website crawler.
Bulbasaur is a helper for crawler operations used in Pread.ly
Gem for crawling data from external sources
Ruby web crawler to access Omelete information.
A little website crawler.
A flexible, modular web crawler
Botch is a DSL for quickly creating web crawlers. Inspired by Sinatra.
An easy way to let the AdSense crawler log in and see private or custom pages in your Rails application. Basically one custom login filter. The gem lets you slightly increase revenue from Google AdSense/AdWords: it enables crawling of private pages, so you get better-targeted ads even on pages behind the login screen.
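A hand-rolled sketch of the idea (not this gem's actual filter): let requests from the AdSense crawler, which identifies itself as "Mediapartners-Google", pass the login check so it can index otherwise private pages. The current_user and login_path names are illustrative stand-ins for whatever authentication the app already uses.

  class ApplicationController < ActionController::Base
    before_action :require_login

    private

    def require_login
      return if adsense_crawler?              # the crawler sees the page without a session
      redirect_to login_path unless current_user
    end

    def adsense_crawler?
      request.user_agent.to_s.include?('Mediapartners-Google')
    end
  end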
This gem helps Crawler Writers to interact with the PromoQui REST API
Add support for ElasticSearch in Polipus crawler
A simple solution to provide on-demand service access (e.g. port 80 on webserver), where a more robust and secure VPN solution is not available. Essentially, it is a more user-friendly form of "port knocking". The original proof-of-concept implementation was run for almost three years by Demotix, to protect development and staging servers from search engine crawlers and other unwanted traffic.
your friendly neighborhood web crawler
Allows your Rails application to be spiderable by crawlers.
Server browser and Crawler for many games (L4D2, TF2, CS:S, KZMOD, The Ship)
Checks a user agent for a web crawler
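A simple stand-in for this kind of check (not the gem's implementation): match the User-Agent string against a list of well-known bot signatures. The signature list is an arbitrary example.

  CRAWLER_RE = /\b(googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot)\b/i

  def crawler?(user_agent)
    !!(user_agent =~ CRAWLER_RE)
  end

  crawler?('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')  # => true
  crawler?('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari/605.1.15')           # => false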
A very simple crawler for RubyGems.org used to demo the power of ElasticSearch at RubyConf 2013
Minimal sharding solution for AR
== Medusa: a ruby crawler framework

{rdoc-image:https://badge.fury.io/rb/medusa-crawler.svg}[https://rubygems.org/gems/medusa-crawler]
rdoc-image:https://github.com/brutuscat/medusa-crawler/workflows/Ruby/badge.svg?event=push

Medusa is a framework for the ruby language to crawl and collect useful information about the pages it visits. It is versatile, allowing you to write your own specialized tasks quickly and easily.

=== Features

* Choose the links to follow on each page with +focus_crawl+
* Multi-threaded design for high performance
* Tracks +301+ HTTP redirects
* Allows exclusion of URLs based on regular expressions
* Records response time for each page
* Obeys _robots.txt_ directives (optional, but recommended)
* In-memory or persistent storage of pages during crawl, provided by Moneta[https://github.com/moneta-rb/moneta]
* Inherits OpenURI behavior (redirects, automatic charset and encoding detection, proxy configuration options).

<b>Do you have an idea or a suggestion? {Open an issue and talk about it}[https://github.com/brutuscat/medusa-crawler/issues/new]</b>

=== Examples

Medusa is versatile and meant to be used programmatically. You can start with one or multiple URIs:

  require 'medusa'

  Medusa.crawl('https://www.example.com', depth_limit: 2)

Or you can pass a block and it will yield the crawler back, to manage configuration or drive its crawling focus:

  require 'medusa'

  Medusa.crawl('https://www.example.com', depth_limit: 2) do |crawler|
    crawler.discard_page_bodies = some_flag

    # Persist all the pages state across crawl-runs.
    crawler.clear_on_startup = false
    crawler.storage = Medusa::Storage.Moneta(:Redis, 'redis://redis.host.name:6379/0')

    crawler.skip_links_like(/private/)

    crawler.on_pages_like(/public/) do |page|
      logger.debug "[public page] #{page.url} took #{page.response_time} found #{page.links.count}"
    end

    # Use an arbitrary logic, page by page, to continue customizing the crawling.
    crawler.focus_crawl(/public/) do |page|
      page.links.first
    end
  end
An easy to use distributed web-crawler framework based on Redis
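One way to picture a Redis-backed distributed crawl (a sketch under assumptions, not this framework's API): workers on any machine pop URLs from a shared Redis list, fetch them, and push newly discovered links back. The key names and REDIS_URL default are made up for the example.

  require 'redis'
  require 'net/http'
  require 'uri'

  redis = Redis.new(url: ENV.fetch('REDIS_URL', 'redis://localhost:6379/0'))

  loop do
    _, url = redis.brpop('crawl:queue')            # block until a URL is available
    next if redis.sismember('crawl:seen', url)     # skip URLs another worker handled
    redis.sadd('crawl:seen', url)

    body = Net::HTTP.get(URI(url))
    body.scan(/href="(https?:[^"]+)"/).flatten.each do |link|
      redis.lpush('crawl:queue', link)             # feed discovered links back in
    end
  end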
Web page crawler.
Fetch books metadata using Amazon Product Advertising API
Rack middleware that executes javascript before serving pages to crawlers.
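A bare-bones sketch of the pattern (not this gem's code): a Rack middleware that, for requests coming from crawlers, serves a pre-rendered HTML snapshot instead of the JavaScript-heavy page. The render_with_headless_browser method is a hypothetical hook standing in for whatever actually executes the JavaScript.

  class PrerenderForCrawlers
    BOT_RE = /googlebot|bingbot|yandex|baiduspider/i

    def initialize(app)
      @app = app
    end

    def call(env)
      if env['HTTP_USER_AGENT'] =~ BOT_RE
        html = render_with_headless_browser(env)    # assumption: JS executed elsewhere
        [200, { 'Content-Type' => 'text/html' }, [html]]
      else
        @app.call(env)                              # normal users get the untouched app
      end
    end

    private

    def render_with_headless_browser(env)
      "<html><body>snapshot of #{env['PATH_INFO']}</body></html>"  # placeholder snapshot
    end
  end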
This rubygem does not have a description or summary.
A gem that collects mangas from websites
This gem provides a Ruby API for scraping HTML pages from the Matricula Web system and returning content that can be more easily processed by a program.
Simple async HTTP crawler based on em-synchrony
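For a general feel of async HTTP crawling on em-synchrony (a sketch, not this particular gem's API), the example below fetches a few pages with bounded concurrency. It needs the em-synchrony and em-http-request gems; the URL list and concurrency level are arbitrary.

  require 'em-synchrony'
  require 'em-synchrony/em-http'

  urls = %w[https://example.com/a https://example.com/b https://example.com/c]

  EM.synchrony do
    # Fetch up to 2 pages concurrently; each fiber blocks only itself.
    EM::Synchrony::FiberIterator.new(urls, 2).each do |url|
      http = EM::HttpRequest.new(url).get
      puts "#{url}: #{http.response.bytesize} bytes"
    end
    EM.stop
  end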
Crawler Engine provides the ability to crawl all news from a customized website.
A periodic crawler that fetches the latest CVE additions, parses them, and filters them
livedoor-feeddiscover performs feed autodiscovery using the livedoor Feed Discover API. The livedoor Feed Discover API finds Atom/RSS feed(s) in the livedoor Reader crawler database, so livedoor-feeddiscover does not access the target URL.
A generic web crawler that doesn't crawl outside URLs.
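A minimal same-host crawl loop illustrating the "don't leave the start domain" rule (not this particular gem's code). It uses only the standard library; extracting links with a crude regex is an intentional simplification.

  require 'net/http'
  require 'uri'
  require 'set'

  start = URI('https://example.com/')
  queue, seen = [start], Set.new

  until queue.empty?
    url = queue.shift
    next if seen.include?(url)
    seen << url

    body = Net::HTTP.get(url) rescue next        # skip pages that fail to fetch
    body.scan(/href="([^"]+)"/).flatten.each do |href|
      link = URI.join(url, href) rescue next     # ignore malformed links
      queue << link if link.host == start.host   # never follow off-site URLs
    end
  end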
Pantopoda is a fast and effective web crawler that visits all links on a given domain.
Uses Paperclip to generate images from sensitive attributes such as e-mail addresses and telephone numbers, in order to reduce crawlers' success.
This rubygem does not have a description or summary.
This rubygem does not have a description or summary.
A web crawler to fetch a recipe from Marmiton.
Crawls the Senegalese web looking for jobs, using the excellent wombat gem.