# EmailCrawler

Email crawler: crawls the top N Google search results looking for email addresses and exports them to CSV.
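The core idea — scan page text for email addresses and emit CSV rows — can be sketched in plain Ruby. This is a hedged illustration only, not the gem's actual implementation: `EMAIL_REGEX`, `extract_emails`, and `to_csv` are hypothetical names, and the real crawler additionally fetches the Google results and follows internal links over HTTP.

```ruby
require "csv"

# Illustrative sketch (not the gem's internals): find email-like strings
# in a page body and pair each one with the URL it was found on.
EMAIL_REGEX = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

def extract_emails(html)
  # scan returns every match; uniq drops duplicates within one page
  html.scan(EMAIL_REGEX).uniq
end

def to_csv(url, emails)
  CSV.generate do |csv|
    emails.each { |email| csv << [url, email] }
  end
end

html = "<p>Contact us at info@example.com or sales@example.com</p>"
puts to_csv("https://example.com", extract_emails(html))
```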
## Installation

```shell
$ gem install email_crawler
```
## Usage

- Show all available options:

  ```shell
  email-crawler --help
  ```

- Run a basic query:

  ```shell
  email-crawler --query "berlin walks"
  ```
- Select which Google website to use (defaults to google.com.br):

  ```shell
  email-crawler --query "berlin walks" --google-website google.de
  ```
- Specify how many search result URLs to collect (defaults to 100):

  ```shell
  email-crawler --query "berlin walks" --max-results 250
  ```
- Specify how many internal links to scan for email addresses (defaults to 100):

  ```shell
  email-crawler --query "berlin walks" --max-links 250
  ```
- Specify how many threads to use when searching for links and email addresses (defaults to 50):

  ```shell
  email-crawler --query "berlin walks" --concurrency 25
  ```
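A setting like `--concurrency` typically means a fixed pool of worker threads draining a shared queue of URLs. The sketch below illustrates that pattern with Ruby's thread-safe `Queue`; the names and the "scanned" placeholder are hypothetical, not the gem's actual code, which would fetch and parse each page inside the worker.

```ruby
require "thread"

# Illustrative worker-pool sketch: `concurrency` threads pull URLs from a
# queue until it is empty, collecting results thread-safely in another queue.
def scan_concurrently(urls, concurrency: 4)
  queue = Queue.new
  urls.each { |u| queue << u }
  results = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      until queue.empty?
        url = queue.pop(true) rescue break # non-blocking pop; stop when drained
        results << "scanned #{url}"        # a real worker would fetch and parse here
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

puts scan_concurrently(%w[https://a.example https://b.example], concurrency: 2)
```

Raising `--concurrency` speeds up IO-bound crawling but increases load on the scanned sites, so lower values (as in the example above) are friendlier.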
- Exclude certain domains from pages scanned for email addresses:

  ```shell
  email-crawler --query "berlin walks" --blacklist berlin.de --blacklist berlin.com
  ```
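A blacklist check of this kind usually compares each link's host against the excluded domains before the page is scanned. The sketch below shows one plausible matching rule (exact host or subdomain); it is an assumption for illustration, and the gem's actual matching may differ.

```ruby
require "uri"

# Hypothetical blacklist check: true when the URL's host is an excluded
# domain or a subdomain of one (e.g. visit.berlin.com matches berlin.com).
def blacklisted?(url, blacklist)
  host = URI.parse(url).host
  return false unless host
  blacklist.any? { |domain| host == domain || host.end_with?(".#{domain}") }
end

links = %w[https://berlin.de/tours https://visit.berlin.com/a https://example.org]
puts links.reject { |link| blacklisted?(link, %w[berlin.de berlin.com]) }
```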
- Redirect output to a file:

  ```shell
  email-crawler --query "berlin walks" > ~/Desktop/berlin-walks-emails.csv
  ```
## Contributing

1. Fork it ( http://github.com/wecodeio/email_crawler/fork )
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request