pop-api-scraper

Features
The pop-api-scraper project aims to provide the core modules for the popcorn-api scraper, but can also be used for other purposes by using middleware.
- Strategy pattern with providers
- Cronjobs
- Scraper wrapper class
- HttpService with got
Installation
$ npm install --save pop-api-scraper pop-api
Documentation
Usage
For the basic setup you need to create a Provider (strategy) that the PopApiScraper instance can use. The PopApiScraper implements the strategy pattern, where the providers are the strategies.
The example below makes an HTTP GET request to a web service or website. From there on you are free to decide what data to extract from the response and how.
import { AbstractProvider, HttpService } from 'pop-api-scraper'

export default class ExampleProvider extends AbstractProvider {

  constructor(PopApiScraper, {name, configs, maxWebRequests = 2}) {
    // Pass the options through to AbstractProvider.
    super(PopApiScraper, {name, configs, maxWebRequests})
  }

  // Scrape a single config: request '/posts' relative to the config's baseUrl.
  scrapeConfig(config) {
    this.httpService = new HttpService({
      baseUrl: config.baseUrl
    })

    return this.httpService.get('/posts', config.httpOptions)
      .then(res => res.data)
  }

}
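To make the strategy pattern mentioned above more concrete, here is a minimal, self-contained sketch in plain JavaScript. It does not use pop-api-scraper at all: `SketchProvider`, `SketchScraper`, `UpperCaseProvider`, `LengthProvider` and the `text` config key are hypothetical names invented for illustration, and the real library's internals may well differ. The point is only that the context (the scraper) calls one shared method, `scrapeConfig`, while each provider decides what that method does.

```javascript
// Hypothetical stand-ins for illustration only; these are NOT the
// pop-api-scraper classes and make no network requests.
class SketchProvider {
  constructor({ name, configs }) {
    this.name = name
    this.configs = configs
  }

  // Each concrete strategy decides how a single config is scraped.
  scrapeConfig(config) {
    throw new Error('scrapeConfig must be implemented by a subclass')
  }

  // Run every config of this provider, one after another.
  async scrapeConfigs() {
    const results = []
    for (const config of this.configs) {
      results.push(await this.scrapeConfig(config))
    }
    return results
  }
}

// Strategy 1: upper-cases the text of a config.
class UpperCaseProvider extends SketchProvider {
  scrapeConfig(config) {
    return Promise.resolve(config.text.toUpperCase())
  }
}

// Strategy 2: returns the length of the text of a config.
class LengthProvider extends SketchProvider {
  scrapeConfig(config) {
    return Promise.resolve(config.text.length)
  }
}

// The context: it knows only the shared interface, not the concrete strategies.
class SketchScraper {
  constructor() {
    this.providers = []
  }

  // Register a provider class together with its options.
  use(Provider, options) {
    this.providers.push(new Provider(options))
    return this
  }

  // Ask every registered provider to scrape all of its configs.
  async scrape() {
    const results = []
    for (const provider of this.providers) {
      results.push({
        name: provider.name,
        data: await provider.scrapeConfigs()
      })
    }
    return results
  }
}

const scraper = new SketchScraper()
  .use(UpperCaseProvider, { name: 'upper', configs: [{ text: 'foo' }, { text: 'bar' }] })
  .use(LengthProvider, { name: 'length', configs: [{ text: 'foo' }] })

scraper.scrape().then(res => console.log(JSON.stringify(res)))
```

Adding another data source in this sketch means writing one more subclass and registering it with `use`; the scraper class itself stays unchanged.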
Bundle it all up together with pop-api:
import os from 'os'
import { PopApi } from 'pop-api'
import { join } from 'path'
import { Cron, PopApiScraper } from 'pop-api-scraper'

import ExampleProvider from './ExampleProvider'

(async () => {
  try {
    // Register the provider (strategy) with the scraper.
    PopApiScraper.use(ExampleProvider, {
      name: 'example-provider',
      configs: [{
        baseUrl: 'https://jsonplaceholder.typicode.com',
        httpOptions: {
          query: {
            foo: 'bar'
          }
        }
      }],
      maxWebRequests: 2
    })

    // Register the scraper with PopApi; both files go in the OS temp directory.
    PopApi.use(PopApiScraper, {
      statusPath: join(os.tmpdir(), 'status.json'),
      updatedPath: join(os.tmpdir(), 'updated.json')
    })

    // Register the cron job middleware.
    PopApi.use(Cron, {
      cronTime: '0 0 */6 * * *',
      start: false
    })

    // Run one scrape manually and log the first result.
    const res = await PopApi.scraper.scrape()
    console.info(res[0])
  } catch (err) {
    console.error(err)
  }
})()
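The `cronTime` value in the example above has six fields, which suggests the cron syntax with a leading seconds field; that reading is an assumption here and is not confirmed by this README. Under that assumption, the sketch below splits the expression into named fields and expands the `*/6` hour step. `describeCron` and `expandStep` are hypothetical helpers written for this illustration and are not part of pop-api-scraper.

```javascript
// Assumed field order for a six-field cron expression (seconds first).
const FIELD_NAMES = ['second', 'minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek']

// Split a six-field cron expression into an object of named fields.
function describeCron(expression) {
  const parts = expression.trim().split(/\s+/)
  if (parts.length !== FIELD_NAMES.length) {
    throw new Error(`expected ${FIELD_NAMES.length} fields, got ${parts.length}`)
  }
  const fields = {}
  FIELD_NAMES.forEach((name, i) => {
    fields[name] = parts[i]
  })
  return fields
}

// Expand a step field such as "*/6" into the values it matches in [0, max].
function expandStep(field, max) {
  const match = /^\*\/(\d+)$/.exec(field)
  if (!match) throw new Error(`not a step field: ${field}`)
  const step = Number(match[1])
  if (step < 1) throw new Error('step must be at least 1')
  const values = []
  for (let v = 0; v <= max; v += step) values.push(v)
  return values
}

const fields = describeCron('0 0 */6 * * *')
console.log(fields)
console.log(expandStep(fields.hour, 23)) // [ 0, 6, 12, 18 ]
```

Read this way, the expression fires at second 0 and minute 0 of hours 0, 6, 12 and 18, that is, every six hours.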
License
MIT License