sitemapper
Parse a sitemap's XML to get all of the URLs for your crawler.
npm install sitemapper --save
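// CommonJS usage
const Sitemapper = require('sitemapper');

const sitemap = new Sitemapper();
// fetch() resolves to an object; the list of URLs is on its `sites` property
sitemap.fetch('https://wp.seantburke.com/sitemap.xml').then(function (data) {
  console.log(data.sites);
});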

// ES module usage
import Sitemapper from 'sitemapper';

(async () => {
  const Google = new Sitemapper({
    url: 'https://www.google.com/work/sitemap.xml',
    timeout: 15000, // 15 seconds
  });

  try {
    const { sites } = await Google.fetch();
    console.log(sites);
  } catch (error) {
    console.log(error);
  }
})();

// or

const sitemapper = new Sitemapper();
sitemapper.timeout = 5000;

sitemapper.fetch('https://wp.seantburke.com/sitemap.xml')
  .then(({ url, sites }) => console.log(`url:${url}`, 'sites:', sites))
  .catch(error => console.log(error));
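
The examples above resolve to the URLs found in a sitemap. As a rough picture of what that parsing step involves, here is a minimal sketch that pulls the `<loc>` values out of a sitemap XML string. The `extractLocs` helper and the sample XML are hypothetical and are not part of sitemapper's API. A regular expression is used only to keep the sketch dependency-free, so it does not decode XML entities or CDATA sections the way a real XML parser would.

```javascript
// Illustrative only: a hypothetical helper, not part of sitemapper's API.
// Collects the trimmed text of every <loc> element in a sitemap XML string.
function extractLocs(xml) {
  const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  const urls = [];
  let match;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Made-up sample data following the sitemaps.org <urlset> layout.
const sampleXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/about </loc></url>
</urlset>`;

console.log(extractLocs(sampleXml));
```

In practice the library does this work for you, including the network request, so a helper like this is only useful for understanding the shape of the data.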
You can add options on the initial Sitemapper object when instantiating it.
requestHeaders: (Object) - Additional Request Headers (e.g. User-Agent)
timeout: (Number) - Maximum timeout in ms for a single URL. Default: 15000 (15 seconds)
url: (String) - Sitemap URL to crawl
debug: (Boolean) - Enables/Disables debug console logging. Default: False
concurrency: (Number) - Sets the maximum number of concurrent sitemap crawling threads. Default: 10
retries: (Number) - Sets the maximum number of retries to attempt in case of an error response (e.g. 404 or Timeout). Default: 0
rejectUnauthorized: (Boolean) - If true, it will throw on invalid certificates, such as expired or self-signed ones. Default: True
lastmod: (Number) - Timestamp of the minimum lastmod value allowed for returned urls
field: (Object) - An object of fields to be returned from the sitemap. For Example: { loc: true, lastmod: true, changefreq: true, priority: true }. Leaving a field out has the same effect as field: false. If not specified sitemapper defaults to returning the 'classic' array of urls.
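
To make the lastmod and field descriptions above more concrete, the following sketch mimics their described behavior on in-memory data. The helpers `filterByLastmod` and `pickFields` are hypothetical and this is not sitemapper's implementation. It also assumes that the timestamp is in milliseconds since the epoch, that the comparison is inclusive, and that entries without a lastmod value are dropped; none of these details are stated in the option descriptions.

```javascript
// Illustrative only: hypothetical helpers mimicking the described semantics of
// the `lastmod` and `field` options. This is not sitemapper's implementation.

// Keep entries whose lastmod is at or after the minimum timestamp.
// Assumption: timestamps are milliseconds since the epoch, and entries with a
// missing or unparsable lastmod are dropped (Date.parse yields NaN for them).
function filterByLastmod(entries, minTimestamp) {
  return entries.filter((entry) => Date.parse(entry.lastmod) >= minTimestamp);
}

// Return only the fields flagged true; omitted fields behave like `false`.
function pickFields(entry, field) {
  const picked = {};
  for (const [name, enabled] of Object.entries(field)) {
    if (enabled && name in entry) picked[name] = entry[name];
  }
  return picked;
}

// Made-up sample entries for the sketch.
const entries = [
  { loc: 'https://example.com/old', lastmod: '2020-01-01T00:00:00Z', priority: '0.5' },
  { loc: 'https://example.com/new', lastmod: '2024-06-01T00:00:00Z', priority: '0.8' },
];

const minLastmod = Date.parse('2023-01-01T00:00:00Z');
const result = filterByLastmod(entries, minLastmod).map((entry) =>
  pickFields(entry, { loc: true, lastmod: true })
);
console.log(result);
```

Only the newer entry survives the lastmod filter in this sketch, and its priority is left out because that field was not requested.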
const sitemapper = new Sitemapper({
  url: 'https://art-works.community/sitemap.xml',
  rejectUnauthorized: true,
  timeout: 15000,
  requestHeaders: {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0'
  }
});
An example using several of the available options:
const sitemapper = new Sitemapper({
  url: 'https://art-works.community/sitemap.xml',
  timeout: 15000,
  requestHeaders: {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0'
  },
  debug: true,
  concurrency: 2,
  retries: 1,
});

// ES5 usage
var Sitemapper = require('sitemapper');

var Google = new Sitemapper({
  url: 'https://www.google.com/work/sitemap.xml',
  timeout: 15000 // 15 seconds
});

Google.fetch()
  .then(function (data) {
    console.log(data);
  })
  .catch(function (error) {
    console.log(error);
  });

// or

var sitemapper = new Sitemapper();
sitemapper.timeout = 5000;

sitemapper.fetch('https://wp.seantburke.com/sitemap.xml')
  .then(function (data) {
    console.log(data);
  })
  .catch(function (error) {
    console.log(error);
  });

npm install sitemapper@1.1.1 --save
var Sitemapper = require('sitemapper');

// Callback-style API of the sitemapper@1.1.1 release installed above
var sitemapper = new Sitemapper();
sitemapper.getSites('https://wp.seantburke.com/sitemap.xml', function (err, sites) {
  if (!err) {
    console.log(sites);
  }
});