data-sourcer
Get (and filter) data from multiple different data sources quickly and efficiently.
Installation
Add data-sourcer to your existing node application like this:
npm install data-sourcer --save
This will install data-sourcer and add it to your application's package.json file.
API
Public methods of this module are listed here.
getData
getData([options])
Gets data from all sources.
Usage:
var DataSourcer = require('data-sourcer');
var myDataSourcer = new DataSourcer({
  sourcesDir: 'path-to-your-sources-directory'
});
myDataSourcer.getData({
  series: true,
  filter: {
    mode: 'strict',
    include: {
      someField: ['1']
    }
  }
})
  .on('data', function(data) {
    console.log(data);
  })
  .on('error', function(error) {
    console.error(error);
  })
  .once('end', function() {
    console.log('Done!');
  });
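Because getData reports results through events, it can be handy to buffer everything until 'end' fires. The following is a minimal sketch, not part of data-sourcer: collectData is a hypothetical helper, and it assumes that a 'data' event may carry either a single item or an array of items and that an 'error' event does not stop the stream. A plain Node EventEmitter stands in for the emitter returned by getData so the snippet runs on its own.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: buffers everything an emitter reports until 'end'.
function collectData(emitter, done) {
  var items = [];
  var errors = [];
  emitter
    .on('data', function(data) {
      // Accept either a single item or an array of items per event.
      items = items.concat(data);
    })
    .on('error', function(error) {
      // Assumption: errors are not fatal, so keep listening for more data.
      errors.push(error);
    })
    .once('end', function() {
      done(errors, items);
    });
  return emitter;
}

// Stand-in emitter so the sketch runs without any sources configured.
var fakeEmitter = new EventEmitter();
collectData(fakeEmitter, function(errors, items) {
  console.log('Collected ' + items.length + ' item(s), ' + errors.length + ' error(s)');
});
fakeEmitter.emit('data', [{ someField: '1' }, { someField: '1' }]);
fakeEmitter.emit('error', new Error('One source failed'));
fakeEmitter.emit('end');
```

In an application, the emitter returned by myDataSourcer.getData(...) would be passed to collectData in place of fakeEmitter.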
All available options:
var options = {
  browser: {
    headless: true,
    slowMo: 0,
    timeout: 10000,
  },
  defaultRequestOptions: null,
  filter: {
    mode: 'strict',
    include: {
    },
    exclude: {
    }
  },
  getDataMethodName: 'getData',
  requestQueue: {
    concurrency: 10,
    delay: 0,
  },
  series: false,
  sourcesBlackList: null,
  sourcesDir: null,
  sourcesWhiteList: null,
};
listSources
listSources([options])
Gets a list of all data sources.
Usage:
var DataSourcer = require('data-sourcer');
var myDataSourcer = new DataSourcer({
  sourcesDir: 'path-to-your-sources-directory'
});
console.log(myDataSourcer.listSources());
Sample sources:
[
  {
    name: 'somewhere',
    homeUrl: 'http://somewhere.com',
    requiredOptions: {}
  },
  {
    name: 'somewhere-else',
    homeUrl: 'http://www.somewhere-else.com',
    requiredOptions: {}
  }
]
All available options:
var options = {
  sourcesBlackList: null,
  sourcesWhiteList: null,
};
Defining Sources
Each of your data sources should be a separate JavaScript file that can be loaded via node's require() function. You are only required to define a getData(options) method, which should return an event emitter. See the following sample for more details:
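// _.defer is provided by underscore (lodash offers the same function).
var _ = require('underscore');
module.exports = {
  homeUrl: 'https://somewhere.com',
  requiredOptions: {
    apiKey: 'You can get an API key for this service by creating an account at https://somewhere.com'
  },
  getData: function(options) {
    var emitter = options.newEventEmitter();
    // Defer so the caller can attach listeners before any event is emitted.
    _.defer(function() {
      // Placeholder for whatever your source has collected.
      var data = [{ someField: '1' }];
      // When an error occurs:
      emitter.emit('error', new Error('Something bad happened!'));
      // When you've got some data:
      emitter.emit('data', data);
      // When you're done:
      emitter.emit('end');
    });
    return emitter;
  }
};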
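To try a source outside of data-sourcer, one option is to call its getData method with a stand-in options object. The sketch below is an illustration under assumptions rather than the module's own test setup: stubOptions is hypothetical and only stubs newEventEmitter with Node's built-in EventEmitter, the sample items are made up, and setImmediate is used in place of _.defer to avoid the extra dependency.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical source, shaped like the sample above.
var source = {
  homeUrl: 'https://somewhere.com',
  getData: function(options) {
    var emitter = options.newEventEmitter();
    // Defer so the caller can attach listeners before anything is emitted.
    setImmediate(function() {
      emitter.emit('data', [{ someField: '1' }, { someField: '2' }]);
      emitter.emit('end');
    });
    return emitter;
  }
};

// Stand-in for the options object; only newEventEmitter is stubbed here.
var stubOptions = {
  newEventEmitter: function() {
    return new EventEmitter();
  }
};

var received = [];
source.getData(stubOptions)
  .on('data', function(data) {
    received = received.concat(data);
  })
  .on('error', function(error) {
    console.error(error);
  })
  .once('end', function() {
    console.log('Received ' + received.length + ' item(s)');
  });
```

A source that uses options.request or options.newPage would need those stubbed as well.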
Options that are passed to your sources:
- filter -
object
- Passed through from the options that you provide the getData
function. - newPage -
function
with signature newPage(cb)
- Get a new puppeteer page instance. See the puppeteer docs for more details. Use as follows: - request -
function
- Wrapper function for the request module with the default options you provided via defaultRequestOptions
. Requests made via the options.request
instance are queued if using the requestQueue
option. - series -
boolean
- Passed through from the options that you provide the getData
function. - sourceOptions
object
- These are custom source options which are passed through to your source by name. You can use the requiredOptions
source attribute to define which options are required for your source to run properly. Some example of a required option would be an API key or secret for some third-party web API.
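As an illustration of how requiredOptions and sourceOptions could fit together, here is a small sketch of a check a source might run before doing any work. It rests on an assumption that is not confirmed above: that "by name" means the options for a source called somewhere arrive as options.sourceOptions.somewhere. The helper findMissingOptions is hypothetical and not part of data-sourcer.

```javascript
// Hypothetical helper: returns the names of required options that were not provided.
// Assumes options for a source are keyed by the source's name under sourceOptions.
function findMissingOptions(sourceName, source, options) {
  var provided = (options.sourceOptions && options.sourceOptions[sourceName]) || {};
  return Object.keys(source.requiredOptions || {}).filter(function(key) {
    var value = provided[key];
    return value === undefined || value === null || value === '';
  });
}

// Sample source definition with one required option.
var source = {
  requiredOptions: {
    apiKey: 'You can get an API key for this service by creating an account at https://somewhere.com'
  }
};

var missing = findMissingOptions('somewhere', source, { sourceOptions: {} });
console.log(missing); // [ 'apiKey' ]
```

A source could emit an 'error' event naming the missing options instead of attempting requests that are bound to fail.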
Contributing
There are a number of ways you can contribute:
- Improve or correct the documentation - All the documentation is in this readme file. If you see a mistake, or think something should be clarified or expanded upon, please submit a pull request.
- Report a bug - Please review existing issues before submitting a new one, to avoid duplicates. If you can't find an issue that relates to the bug you've found, please create a new one.
- Request a feature - Again, please review the existing issues before posting a feature request. If you can't find an existing one that covers your feature idea, please create a new one.
- Fix a bug - Have a look at the existing issues for the project. If there's a bug in there that you'd like to tackle, please feel free to do so. When fixing a bug, please first create a failing test that proves the bug, then fix the bug so that the test passes. This should help ensure that the bug never creeps back into the project. After you've done all that, you can submit a pull request with your changes.
Before you contribute code, please read through at least some of the source code for the project. I would appreciate it if any pull requests for source code changes follow the coding style of the rest of the project.
Now if you're still interested, you'll need to get your local environment configured.
Configure Local Environment
Step 1: Get the Code
First, you'll need to pull down the code from GitHub:
git clone https://github.com/chill117/data-sourcer.git
Step 2: Install Dependencies
Second, you'll need to install the project dependencies as well as the dev dependencies. To do this, simply run the following from the directory you created in step 1:
npm install
Tests
This project includes an automated regression test suite. To run the tests:
npm test
Changelog
See changelog.md
License
This software is MIT licensed:
A short, permissive software license. Basically, you can do whatever you want as long as you include the original copyright and license notice in any copy of the software/source. There are many variations of this license in use.
Funding
This project is free and open-source. If you would like to show your appreciation by helping to fund the project's continued development and maintenance, you can find available options here.