
Note: Version 2 of this package may produce different results from version 1.x, mainly because the parser now uses Cheerio.
npm install mapsite
or
yarn add mapsite
const { SitemapParser } = require("mapsite");
const options = {
  rejectInvalidContentType: true,
  userAgent: "customUA",
  maximumRetries: 1,
  maximumDepth: 5,
  timeout: 3000,
  debug: false,
};
const parser = new SitemapParser(options);
const { SitemapParser } = require("mapsite");
const parser = new SitemapParser({
  proxy: "https://username:password@proxy.host:3000",
});
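Credentials that contain reserved characters such as "@" or ":" need to be percent-encoded before they are embedded in a proxy URL. The following is a minimal sketch of one way to build such a string with the standard WHATWG URL class. The helper name buildProxyUrl is hypothetical and not part of mapsite, and it assumes the proxy option accepts any valid URL string.

```javascript
// Hypothetical helper, not part of mapsite: builds a proxy URL string
// with percent-encoded credentials using the standard URL class.
function buildProxyUrl(host, username, password) {
  const url = new URL(host);
  // The username/password setters percent-encode reserved characters
  // such as "@" and ":" so the URL stays parseable.
  url.username = username;
  url.password = password;
  return url.toString();
}

// Example credentials are made up for illustration.
const proxy = buildProxyUrl("https://proxy.host:3000", "username", "p@ss:word");
console.log(proxy);
```

The resulting string could then be passed as the proxy option when constructing the parser.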
All options are optional; each falls back to a built-in default.
rejectInvalidContentType: boolean;
When true, the response Content-Type header must be one of:
application/xml
application/rss+xml
text/xml
default: true
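As an illustration of the check described above, here is a minimal sketch of a Content-Type test against the three listed media types. The function name isAcceptedContentType is hypothetical, and ignoring parameters such as "; charset=utf-8" is an assumption; the source does not say how mapsite handles them.

```javascript
// The three media types listed in the option description.
const ACCEPTED_TYPES = ["application/xml", "application/rss+xml", "text/xml"];

// Hypothetical helper, not mapsite's implementation.
function isAcceptedContentType(headerValue) {
  if (typeof headerValue !== "string") return false;
  // Assumption: drop parameters such as "; charset=utf-8" and ignore case.
  const mediaType = headerValue.split(";")[0].trim().toLowerCase();
  return ACCEPTED_TYPES.includes(mediaType);
}

console.log(isAcceptedContentType("text/xml; charset=UTF-8"));
console.log(isAcceptedContentType("text/html"));
```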
userAgent: string;
Adds a custom User-Agent string to the requests.
default: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36 mapsite/1.0
maximumRetries: number;
How many times a URL in a <loc> tag of an XML index file should be requested when the response status is 400 or higher.
default: 1
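The retry behaviour can be sketched roughly as follows. This is not mapsite's code: requestWithRetries and the injected request function are hypothetical, and the sketch assumes that maximumRetries counts additional attempts after the first failed request, which the description leaves open.

```javascript
// Hypothetical sketch. `request` is any function that takes a URL and
// returns a Promise resolving to an object with a numeric `status`.
async function requestWithRetries(request, url, maximumRetries = 1) {
  let response = await request(url);
  let retries = 0;
  // Assumption: a status of 400 or higher triggers another attempt,
  // up to `maximumRetries` additional requests.
  while (response.status >= 400 && retries < maximumRetries) {
    retries += 1;
    response = await request(url);
  }
  return { response, retries };
}
```

Injecting the request function keeps the sketch independent of any particular HTTP client.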
maximumDepth: number;
How many levels deep XML index files should be traversed. For example, if index files are nested 3 levels deep and the maximum depth is 2, the URLs in the <loc> tags of the last response are not crawled further.
default: 2
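To make the depth limit concrete, here is a small sketch that walks an in-memory set of nested index files instead of making requests. The files object, the collectUrls function, and the choice to count the root index file as level 1 are all assumptions made for illustration; mapsite may count levels differently.

```javascript
// Hypothetical stand-in for fetched files: three nested index files
// followed by one regular sitemap.
const files = {
  "a.xml": { type: "index", locs: ["b.xml"] },
  "b.xml": { type: "index", locs: ["c.xml"] },
  "c.xml": { type: "index", locs: ["d.xml"] },
  "d.xml": { type: "sitemap", locs: ["https://example.com/"] },
};

function collectUrls(url, maximumDepth, level = 1) {
  const file = files[url];
  if (!file) return [];
  // A regular sitemap contributes its page URLs directly.
  if (file.type === "sitemap") return file.locs;
  // An index file at the depth limit is read, but its <loc> entries
  // are not followed any further.
  if (level >= maximumDepth) return [];
  return file.locs.flatMap((loc) => collectUrls(loc, maximumDepth, level + 1));
}

console.log(collectUrls("a.xml", 2)); // limit reached before any sitemap
console.log(collectUrls("a.xml", 4)); // deep enough to reach d.xml
```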
timeout: number;
The number of milliseconds allowed for a request to complete. The limit applies to both the headers and the body.
debug: boolean;
Logs info, warning and error messages as the parser runs (work in progress).
proxy: string;
The URL of a proxy server to route requests through.
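The fallback behaviour can be pictured as a simple merge of user options over defaults. The sketch below uses only the defaults stated in the option descriptions; resolveOptions and DOCUMENTED_DEFAULTS are hypothetical names, and this is not necessarily how mapsite resolves options internally.

```javascript
// Defaults taken from the option descriptions. Options without a
// documented default (timeout, debug, proxy) are left undefined here.
const DOCUMENTED_DEFAULTS = {
  rejectInvalidContentType: true,
  maximumRetries: 1,
  maximumDepth: 2,
};

// Hypothetical helper: user-supplied values override the defaults.
function resolveOptions(userOptions = {}) {
  return { ...DOCUMENTED_DEFAULTS, ...userOptions };
}

const resolved = resolveOptions({ maximumDepth: 5, timeout: 3000 });
console.log(resolved);
```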
const parser = new SitemapParser();
const result = await parser.run("https://example.com/sitemap.xml");
result: MapsiteResponse;
The result shape looks as follows:
const result = {
  type: "sitemap",
  urls: ["https://example.com"],
  errors: [
    {
      url: "https://example.com/sitemap-index.xml",
      reason: "Brief description of what went wrong",
    },
  ],
};
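Because a result can carry both collected URLs and per-file errors, callers will usually want to inspect both. Here is a minimal sketch of that, using a sample object with the shape shown above. The summarize helper and the sample values are hypothetical, and treating a missing errors array as empty is an assumption.

```javascript
// Sample object with the documented shape; the values are made up.
const sample = {
  type: "sitemap",
  urls: ["https://example.com", "https://example.com", "https://example.com/about"],
  errors: [
    { url: "https://example.com/sitemap-index.xml", reason: "Brief description of what went wrong" },
  ],
};

// Hypothetical helper: de-duplicates URLs and formats errors for logging.
function summarize(result) {
  const errors = result.errors ?? []; // assumption: tolerate a missing array
  return {
    type: result.type,
    uniqueUrls: [...new Set(result.urls)],
    failed: errors.map((e) => `${e.url}: ${e.reason}`),
  };
}

const summary = summarize(sample);
console.log(summary.uniqueUrls.length, "unique URLs,", summary.failed.length, "failed files");
```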
const { readFileSync } = require("fs");
const { SitemapParser } = require("mapsite");
const parser = new SitemapParser();
const buffer = Buffer.from(readFileSync("./sitemap.xml")); // Or a buffer from an uploaded file
const result = await parser.fromBuffer(buffer);
result: MapsiteResponse;
The result shape looks as follows:
const result = {
  type: "sitemap", // or 'index'
  urls: ["https://example.com"],
  errors: [
    {
      url: "buffer",
      reason: "Brief description of what went wrong",
    },
  ],
};
A module to parse URLs from a local or remote sitemap.xml.