Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.
This crawler uses my personal scraper, 'RecipeScraper', to download recipe data from Marmiton, 750g, or cuisineaz
A ruby social media stat crawler
A simple wrapper for nokogiri/rest-client that aims to provide a Capybara-like DSL
CIA World Factbook crawler
A web crawler that generates a sitemap in a Neo4j database. It also stores broken links and the total number of pages on the site
Starts a crawler for Middleman sites
Crawler for http://legendas.tv that shows the most downloaded subtitles
A crappy crawler for a crappy bank interface
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-brutefuzzy-service provides a fuzzy simhash lookup index as a distributed service.
This file crawler helps to detect whether there are new files in a directory.
This rubygem does not have a description or summary.
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-brutefuzzy-protobuf gem contains the protocol buffer generated java classes for the iudex-brutefuzzy-service.
A client for the PageMunch web crawler API
A crappy crawler for a crappy bank interface
A Capybara wrapper for crawling
An easy-to-use DSL that helps scrape data from websites. It makes writing web crawlers fast and intuitive: you can traverse HTML nodes and fetch all of their attributes using jQuery-like methods such as parent, children, first, find, and siblings. You can also download images and web pages and store all content in a database. Please visit my Github account for more details.
A web comic crawler for Akihabara otaku.
This gem scours 4chan (or, theoretically, any chan clone) for threads that contain keywords you specify on boards you specify, and downloads all the images, GIFs, and WebMs to a specified folder
A demo of Web Crawler using arb-crawler
Discovery Mission is an easy-to-use website crawler. Use it for generating sitemaps.
Simple crawler using Redis as backend
Crawls the mails in an IMAP folder
This gem allows you to crawl news articles from RSS feeds.
Crawl multiple torrent sites.
A simple crawler that lists the SESC events schedule in the terminal
A Crawler for NewRank
Octocrawler is an intuitive and simple Github API wrapper
Crawler for downloading comics from komiks.gildia.pl
Crawler4J filter plugin for Embulk
Simple Web Crawler
Adds support for Cassandra in the Polipus crawler
A store for carrierwave that uses Swift. The NightcrawlerSwift client is used for authentication, list, and store operations.
Website crawler harvesting e-mails. Uses Sidekiq and Typhoeus.
A set of classes for dealing with options. It includes a crawler for Yahoo!Finance.
Scrawler is a project management framework optimized for programmer happiness and sustainable productivity.
Ruby ptt crawling tool
A web crawler that grabs fine-grained data from your personal Nike+ runs and saves it as XML and JSON files.
Reads a file containing a list of URLs and produces an output file with the domain, page title, and the Twitter, Facebook, and Google Plus handles found on each page
The Baidu Crawler crawls data on demand
Hushes worthless Rails exceptions & logs, such as those caused by bots and crawlers.
A simple news crawler. You can specify the structure of your xml or rss feeds.
Ruby Web crawler
SuperCrawler allows you to easily crawl full web sites or web pages (extracting internal links and assets) in a few seconds.
Mobile App Review Crawler
This gem is web crawler sample code, so I don't recommend that you use it.
Friendly, neighborhood web crawler for quality assurance.
Stupid crawler that looks for URLs on a given site. Results are saved as two CSV files: one with found URLs and another with failed URLs.
This gem offers: classes to subclass to create a manga site crawler; a downloader to use with these classes; and some site-specific scripts.
The application crawls a URL and extracts links, tags, and sequences. These features are written to an output file.
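Many of the entries above boil down to the same core step: fetching a page and extracting its links (the sitemap generators, URL checkers, and simple crawlers). A minimal sketch of that step in Ruby using only the standard library — `extract_links` and its regex are illustrative assumptions, not the API of any gem listed here:

```ruby
require 'uri'

# Hypothetical helper: pull href values out of an HTML string and
# resolve them against the page's base URL. Not taken from any gem above.
def extract_links(html, base_url)
  base = URI(base_url)
  html.scan(/href\s*=\s*["']([^"']+)["']/i).flatten.map do |href|
    # Resolve relative paths like "/about" to absolute URLs;
    # skip anything URI cannot parse.
    URI.join(base, href).to_s rescue nil
  end.compact.uniq
end

html = '<a href="/about">About</a> <a href="https://example.com/contact">Contact</a>'
puts extract_links(html, 'https://example.com/')
```

A real crawler would fetch pages with net/http (or Typhoeus, as some gems above do), parse with a proper HTML parser such as Nokogiri instead of a regex, and keep a visited set to avoid revisiting URLs.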