eval-spider -- Programmable spidering of web sites with Node.js
Install
From npm:
npm install eval-spider
(How to use the) API
Creating a Spider
var Spider = require('eval-spider');
var spider = new Spider(options);
Spider(options)
The options object can have the following fields:
maxPages - Integer. Maximum number of pages to crawl. Default: 10
requestThrottle - Integer. Number of requests made at a time. Default: 5
url - String. URL of the website to crawl. Default: 'https://medium.com'
fileName - String. Output file name. Default: 'output.csv'
connect - Map. Aerospike DB connection details. Default: { host: 'localhost', port: 3000, namespace: 'test', set: 'webcrawler', metadata: {} }
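As an illustration of how these fields fit together, the sketch below writes the documented defaults out as a plain object and overlays a few custom values. The `buildOptions` helper and the example values (`https://example.com`, `10.0.0.5`) are hypothetical and not part of eval-spider; whether the library itself merges a partial `connect` map with its defaults is not stated here, so the helper does that merge explicitly.

```javascript
// Documented defaults, written out as a plain object.
const defaults = {
  maxPages: 10,
  requestThrottle: 5,
  url: 'https://medium.com',
  fileName: 'output.csv',
  connect: {
    host: 'localhost',
    port: 3000,
    namespace: 'test',
    set: 'webcrawler',
    metadata: {},
  },
};

// Hypothetical helper (not part of eval-spider): overlay caller-supplied
// fields on the documented defaults, merging `connect` one level deep.
function buildOptions(overrides = {}) {
  return {
    ...defaults,
    ...overrides,
    connect: { ...defaults.connect, ...(overrides.connect || {}) },
  };
}

const options = buildOptions({
  maxPages: 25,
  url: 'https://example.com',
  connect: { host: '10.0.0.5' },
});

console.log(options);

// With the package installed, the object would be passed to the constructor:
//   var Spider = require('eval-spider');
//   var spider = new Spider(options);
```

Spelling out every field makes the configuration explicit even if the library's own defaults change.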
Queuing a URL for the spider to fetch
spider.crawler()
Returns a Promise.
Response: the value the promise resolves with:
{
response : Array, // Result set
crawledUrls : Map, // Crawled Urls
count : Integer // Number of Crawled Urls
}
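A minimal sketch of consuming a value of this shape follows. To keep it runnable without the package or network access, `fakeCrawler` stands in for `spider.crawler()`; it and the `summarize` helper are hypothetical. The exact type behind "Map" is not specified above, so the helper accepts either a JavaScript `Map` or a plain object, and the contents of each `response` entry are assumed for illustration only.

```javascript
// Hypothetical helper: reduce the resolved value to a short summary.
function summarize(result) {
  // "Map" may be a JavaScript Map or a plain object; handle both.
  const urls = result.crawledUrls instanceof Map
    ? Array.from(result.crawledUrls.keys())
    : Object.keys(result.crawledUrls || {});
  return { rows: result.response.length, count: result.count, urls: urls };
}

// Stand-in for spider.crawler(): resolves with the documented shape.
function fakeCrawler() {
  return Promise.resolve({
    response: [['https://example.com/', 'Example']], // assumed row layout
    crawledUrls: { 'https://example.com/': true },
    count: 1,
  });
}

fakeCrawler()
  .then(function (result) {
    console.log(summarize(result));
  })
  .catch(function (err) {
    // A rejected promise would surface here.
    console.error('crawl failed:', err);
  });
```

With a real spider instance, `fakeCrawler()` would be replaced by `spider.crawler()`.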
Writing the response to a CSV file
spider.writeToFile(name,data)
- Writes data to a CSV file. name (String, optional): result file name. data (Array): the data to write.
Writing the response to Aerospike
spider.aerospike(data)
- Writes data to Aerospike. data (Array): the data to write.
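The three calls above can be chained into one flow, sketched below. `crawlAndStore` is a hypothetical helper, not part of eval-spider; it accepts any object exposing the three documented methods. Whether `writeToFile` and `aerospike` return promises is not stated here, so the sketch awaits both, which is harmless if they return plain values.

```javascript
// Hypothetical helper: crawl, then persist the result set to CSV and Aerospike.
// `spider` is any object exposing crawler(), writeToFile() and aerospike().
async function crawlAndStore(spider, fileName) {
  const result = await spider.crawler();

  // Awaiting a non-promise return value simply yields that value.
  await spider.writeToFile(fileName, result.response);
  await spider.aerospike(result.response);

  return result.count;
}

// Usage with a real instance (requires the package and a reachable Aerospike):
//   var Spider = require('eval-spider');
//   crawlAndStore(new Spider({ url: 'https://example.com' }), 'output.csv')
//     .then(function (count) { console.log('crawled', count, 'pages'); })
//     .catch(function (err) { console.error(err); });
```

Writing the CSV before the database call means a file is still produced if the Aerospike write fails; the order can be swapped if the database is the primary store.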