scrappers
A set of utility classes for Node.js that make scraping the web easier.
There is support for custom browser headers, encodings and compression.
npm install --save scrappers
####url
The URL of the target page.
####parser
An object with a public "parse" method.
######Example:
var hnParser = {
//$ is cheerio (jquery) instance of the parsed page
parse:function($){
//get the text of the link at index 3 (eq is zero-based)
return $('a').eq(3).text();
}
};
####encoding
The encoding of the target HTML page. This parameter is optional and defaults to "utf-8".
####headers
An object containing key-value pairs of headers. Defaults to:
{
'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36"
}
####gzip
A flag to enable or disable gzip compression. It is enabled (set to true) by default.
You will probably not want to disable this: if the page is not compressed, it will still be parsed correctly (see request).
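One way to see why leaving gzip enabled is harmless: a client can check the response's Content-Encoding header and decompress only when the body really is gzipped. The sketch below illustrates that idea with Node's built-in zlib module. It is not the code scrappers or request actually runs, and `decodeBody` is a hypothetical helper name.

```javascript
const zlib = require('zlib');

// Hypothetical helper, not part of scrappers: turns a raw response body into text.
function decodeBody(body, responseHeaders, encoding) {
  const contentEncoding = String(responseHeaders['content-encoding'] || '').toLowerCase();
  // Decompress only when the server reports a gzip-compressed body.
  const raw = contentEncoding === 'gzip' ? zlib.gunzipSync(body) : body;
  // Fall back to utf-8, the documented default encoding.
  return raw.toString(encoding || 'utf-8');
}

const html = '<a href="/">home</a>';
// Same page, once gzipped and once sent as-is.
const compressed = decodeBody(zlib.gzipSync(Buffer.from(html)), { 'content-encoding': 'gzip' });
const plain = decodeBody(Buffer.from(html), {});
console.log(compressed === plain);
```

Both calls produce the same HTML string, so a parser downstream does not need to know whether the server compressed the page.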
####Options can be passed on instantiation:
var scrapper = new PageScrapper({
url: HACKER_NEWS_HOME,
parser: hnParser
});
####Or on the get request:
scrapper.get(options, done);
Options passed in the get request extend the options passed on instantiation for the duration of that request.
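The per-request behaviour described here can be pictured as a shallow merge into a fresh object, so the instance options are left untouched once the request finishes. This is a minimal sketch under that assumption. `OptionsHolder` and `optionsFor` are hypothetical names, and the library's real merging code may differ.

```javascript
// Hypothetical stand-in for a scraper class; only the option handling is shown.
class OptionsHolder {
  constructor(options) {
    // Defaults mirror the documented ones: utf-8 encoding, gzip enabled.
    this.options = Object.assign({ encoding: 'utf-8', gzip: true }, options);
  }

  // Returns the options that would apply to a single get call.
  optionsFor(requestOptions) {
    // Copy into a new object so the instance options are not modified.
    return Object.assign({}, this.options, requestOptions);
  }
}

const holder = new OptionsHolder({ url: 'https://news.ycombinator.com/' });
const perRequest = holder.optionsFor({ encoding: 'latin1' });
console.log(perRequest.encoding);     // latin1
console.log(holder.options.encoding); // utf-8
```

Merging into an empty object is what keeps the override scoped to one call: the next request starts again from the instantiation options.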
PageScrapper is a base class for scraping a web page.
####Example:
Get the third link from the Hacker News home page.
#####Import scrapper object
var PageScrapper = require('scrappers').PageScrapper;
The parse function will receive a cheerio instance loaded with the Hacker News HTML.
var hnParser = {
//$ is cheerio (jquery) instance of the parsed page
parse:function($){
//get the text of the link at index 3 (eq is zero-based)
return $('a').eq(3).text();
}
};
#####Instantiate a scraper object
var HACKER_NEWS_HOME = "https://news.ycombinator.com/";
var scrapper = new PageScrapper({
url: HACKER_NEWS_HOME,
parser: hnParser
});
#####Parse!
scrapper.get(function(err,parsed){
console.log("Third link on hacker news page is:", parsed);
});
Third link on hacker news page is: comments
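Because a parser is just an object with a parse method, it can be exercised without any network access by handing it a stand-in for the cheerio instance. The sketch below assumes the parser only ever calls `$(selector).eq(index).text()`. `fakeCheerio` is a hypothetical stub written for this illustration, not something cheerio or scrappers provides.

```javascript
const hnParser = {
  // $ is expected to behave like a cheerio instance of the parsed page
  parse: function ($) {
    return $('a').eq(3).text();
  }
};

// Hypothetical stub that mimics only the three cheerio calls the parser uses.
function fakeCheerio(linkTexts) {
  return function $(selector) {
    return {
      eq: function (index) {
        return {
          text: function () { return linkTexts[index] || ''; }
        };
      }
    };
  };
}

// Made-up link texts; index 3 is the one the parser selects.
const result = hnParser.parse(fakeCheerio(['', 'Hacker News', 'new', 'comments']));
console.log(result); // comments
```

Note that `eq` is zero-based, so index 3 picks the fourth entry of the list passed to the stub.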
RssScrapper is a base class for scraping an RSS feed.
####Example:
Get a list of article titles from the Ask HN RSS feed.
#####Import scrapper object
var RssScrapper = require('scrappers').RssScrapper;
The parse function will receive a JavaScript object representing a single RSS article.
var hnParser = {
//gets a parsed rss article as an object
parse:function(article){
return article.title;
}
};
#####Instantiate a scraper object
var HACKER_NEWS_RSS = "http://hnrss.org/ask";
var scrapper = new RssScrapper({
url: HACKER_NEWS_RSS,
parser: hnParser
});
#####Parse!
scrapper.get(function(err,parsed){
//print the parsed titles of all articles in the feed
console.log("Ask:HN titles:", parsed);
});
Ask:HN titles:
[
'Ask HN: Do you like the idea of social network and learning?★',
'Ask HN: How does Saved stories feature work?',
'Ask HN: AGPL on a Code Generator App',
'Ask HN: How do you read your programming books?',
'Ask HN: Is OpenGL Worth Learning?',
'Ask HN: How to produce vnc like Browserling?',
'Ask HN: How do I solve problems/code outside of the book I used to learn python?',
'Ask HN: Self Study Learning Path',
'Ask HN: How to build quality software in a fast paced startup enviorment?',
'Ask HN: Is Agar.io currently making or losing money?',
'Ask HN: Any success with Toastmasters?',
'Ask HN: Has anyone else found Angular to be destroying their productivity?',
'Ask HN: How to survive a horrible tech job while looking for a new one?',
'Ask HN: How can a successful startup adopt a strong testing workflow?',
'Ask HN: What kind of software will be used to develop VR applications?',
'Ask HN: How do you prepare for a Technical Interview',
'Ask HN: Recommend one Business/Startup book',
'Ask HN: Should I branch off my startup\'s technology into a separate company?',
'Ask HN: Test/Play with 3D Printing Library',
'Ask HN: What database storage engine do you use, and why?'
]
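The array printed above suggests that parse is called once per article and the return values are collected in order. Here is a minimal sketch of that flow, assuming the feed has already been fetched and converted into plain objects. `parseArticles` and the sample articles are hypothetical and stand in for whatever RssScrapper does internally.

```javascript
const hnParser = {
  // receives one parsed RSS article as an object
  parse: function (article) {
    return article.title;
  }
};

// Hypothetical helper: applies parser.parse to each article and collects the results.
function parseArticles(articles, parser) {
  return articles.map(function (article) {
    return parser.parse(article);
  });
}

// Hand-written sample articles; a real feed item carries more fields.
const sampleArticles = [
  { title: 'Ask HN: Is OpenGL Worth Learning?', link: 'https://example.com/1' },
  { title: 'Ask HN: Self Study Learning Path', link: 'https://example.com/2' }
];

const titles = parseArticles(sampleArticles, hnParser);
console.log(titles);
```

Returning a different field from parse, or an object built from several fields, changes the shape of each array entry without touching the rest of the flow.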
To run tests use:
npm test
FAQs
The npm package scrappers receives a total of 4 weekly downloads. As such, its popularity is classified as not popular.
scrappers shows an unhealthy version release cadence and project activity: the last version was released a year ago. It has 1 open source maintainer collaborating on the project.