A little module that makes scraping websites a little easier. Uses node.js and jQuery.
Via npm:
$ npm install scraper
The first argument is a URL string. The second is a callback that receives a jQuery object with your scraped site as its "body" and, as its third argument, an object from the request containing information about the URL.
var scraper = require('scraper');
scraper('http://search.twitter.com/search?q=javascript', function(err, jQuery) {
    if (err) {throw err}
    jQuery('.msg').each(function() {
        console.log(jQuery(this).text().trim()+'\n');
    });
});
The first argument is an object containing settings for the "request" instance used internally. The second is a callback that receives a jQuery object with your scraped site as its "body" and, as its third argument, an object from the request containing information about the URL.
var scraper = require('scraper');
scraper(
    {
        'uri': 'http://search.twitter.com/search?q=nodejs'
        , 'headers': {
            'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)'
        }
    }
    , function(err, $) {
        if (err) {throw err}
        $('.msg').each(function() {
            console.log($(this).text().trim()+'\n');
        });
    }
);
The first argument is an array containing URL strings, request objects, or a mix of both. The second is a callback that receives a jQuery object with your scraped site as its "body" and, as its third argument, an object from the request containing information about the URL.
You can also rate-limit the fetcher by passing an options object as the third argument with a 'reqPerSec' property (a float giving the number of requests per second).
var scraper = require('scraper');
scraper(
    [
        'http://search.twitter.com/search?q=javascript'
        , 'http://search.twitter.com/search?q=css'
        , {
            'uri': 'http://search.twitter.com/search?q=nodejs'
            , 'headers': {
                'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)'
            }
        }
        , 'http://search.twitter.com/search?q=html5'
    ]
    , function(err, $) {
        if (err) {throw err;}
        $('.msg').each(function() {
            console.log($(this).text().trim()+'\n');
        });
    }
    , {
        'reqPerSec': 0.2 // Wait 5sec between each external request
    }
);
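The comment in the example above implies that a 'reqPerSec' of 0.2 means one request every 5 seconds. A minimal sketch of that arithmetic follows; `delayBetweenRequestsMs` is a hypothetical helper written for illustration, not a function exported by scraper, and it assumes requests are spaced evenly at 1 / reqPerSec seconds.

```javascript
// Hypothetical helper, not part of scraper: converts a 'reqPerSec' rate
// into the pause (in milliseconds) between consecutive requests,
// assuming requests are spaced evenly.
function delayBetweenRequestsMs(reqPerSec) {
  if (typeof reqPerSec !== 'number' || !(reqPerSec > 0)) {
    throw new RangeError("'reqPerSec' must be a positive number");
  }
  // Round to whole milliseconds, which is what timers work with.
  return Math.round(1000 / reqPerSec);
}

console.log(delayBetweenRequestsMs(0.2)); // 5000
console.log(delayBetweenRequestsMs(2));   // 500
```

Values below 1 slow the fetcher down to less than one request per second, which is usually what you want when scraping a single host.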
The first argument contains the information about which page or pages will be scraped. It can be a URL string:
'http://www.nodejs.org'
or
{
    'uri': 'http://search.twitter.com/search?q=nodejs'
    , 'headers': {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)'
    }
}
or
[
    urlString
    , urlString
    , requestObject
    , urlString
]
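To make the three accepted forms concrete, here is a small sketch of how they could be reduced to one common shape. `normalizeTargets` is a hypothetical helper for illustration only; it is not part of scraper, and scraper's internal handling may differ.

```javascript
// Hypothetical helper, not part of scraper's API: turns any of the
// accepted first-argument forms (string, request object, or array of
// either) into a flat array of request objects.
function normalizeTargets(target) {
  var list = Array.isArray(target) ? target : [target];
  return list.map(function (item) {
    if (typeof item === 'string') {
      // A bare URL string becomes a request object with only 'uri' set.
      return { uri: item };
    }
    if (item && typeof item === 'object' && typeof item.uri === 'string') {
      // Request objects are passed through unchanged.
      return item;
    }
    throw new TypeError('Expected a URL string or an object with a "uri" property');
  });
}

console.log(normalizeTargets('http://www.nodejs.org'));
// [ { uri: 'http://www.nodejs.org' } ]
```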
The second argument is the callback that lets you use the data retrieved by the fetch.
function(err, $) {
    if (err) {throw err;}
    $('.msg').each(function() {
        console.log($(this).text().trim()+'\n');
    });
}
The third argument is an optional object containing settings for the fetcher as a whole, such as 'reqPerSec'.