Security News
vlt Debuts New JavaScript Package Manager and Serverless Registry at NodeConf EU
vlt introduced its new package manager and a serverless registry this week, innovating in a space where npm has stagnated.
The jsdom npm package is a pure-JavaScript implementation of many web standards, notably the WHATWG DOM and HTML Standards, for use with Node.js. It is designed to create a web browsing environment similar to what you would find in a web browser, allowing for the parsing and manipulation of HTML and XML documents. It can be used for testing web pages and JavaScript libraries in a Node.js environment.
DOM Manipulation
This feature allows you to manipulate the DOM of a document. The code sample demonstrates how to append a new list item to an unordered list.
const { JSDOM } = require('jsdom');
const dom = new JSDOM(`<ul><li>Item 1</li></ul>`);
const ul = dom.window.document.querySelector('ul');
ul.appendChild(dom.window.document.createElement('li')).textContent = 'Item 2';
console.log(dom.window.document.body.innerHTML);
Event Handling
This feature enables you to simulate and handle events. The code sample shows how to simulate a click event on a button and handle it by logging a message.
const { JSDOM } = require('jsdom');
const dom = new JSDOM(`<button id='myButton'>Click me</button>`);
const button = dom.window.document.querySelector('#myButton');
button.addEventListener('click', () => console.log('Button clicked!'));
button.click();
Fetching Remote Content
This feature allows you to fetch remote content and create a JSDOM instance from it. The code sample demonstrates fetching and logging the HTML content of a webpage.
const { JSDOM } = require('jsdom');
JSDOM.fromURL('https://example.com/').then(dom => {
console.log(dom.serialize());
});
Running Scripts
This feature allows you to run scripts within the context of the JSDOM instance. The code sample shows how to execute a script that changes the text content of the document body.
const { JSDOM } = require('jsdom');
const dom = new JSDOM(`<script>document.body.textContent = 'Hello, world!';</script>`);
console.log(dom.window.document.body.textContent);
Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It does not implement a full web browser API as jsdom does, but it is often used for simpler DOM manipulation tasks and is faster due to its focus on a subset of use cases.
Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. It is more powerful for tasks like web scraping, automated testing, and rendering since it operates over a real browser, but it is heavier and more resource-intensive compared to jsdom.
PhantomJS is a headless WebKit scriptable with a JavaScript API. It has similar capabilities to Puppeteer but uses WebKit as the rendering engine instead of Chromium. It is no longer actively maintained, but it was once a popular choice for headless website testing.
A JavaScript implementation of the WHATWG DOM and HTML standards, for use with io.js.
$ npm install jsdom
Note that as of our 4.0.0 release, jsdom no longer works with Node.js™, and instead requires io.js. You are still welcome to install a release in the 3.x series if you use Node.js™.
jsdom.env
jsdom.env
is an API that allows you to throw a bunch of stuff at it, and it will generally do the right thing.
You can use it with a URL
// Count all of the links from the io.js build page
var jsdom = require("jsdom");
jsdom.env(
"https://iojs.org/dist/",
["http://code.jquery.com/jquery.js"],
function (errors, window) {
console.log("there have been", window.$("a").length - 4, "io.js releases!");
}
);
or with raw HTML
// Run some jQuery on a html fragment
var jsdom = require("jsdom");
jsdom.env(
'<p><a class="the-link" href="https://github.com/tmpvar/jsdom">jsdom!</a></p>',
["http://code.jquery.com/jquery.js"],
function (errors, window) {
console.log("contents of a.the-link:", window.$("a.the-link").text());
}
);
or with a configuration object
// Print all of the news items on Hacker News
var jsdom = require("jsdom");
jsdom.env({
url: "http://news.ycombinator.com/",
scripts: ["http://code.jquery.com/jquery.js"],
done: function (errors, window) {
var $ = window.$;
console.log("HN Links");
$("td.title:not(:last) a").each(function() {
console.log(" -", $(this).text());
});
}
});
or with raw JavaScript source
// Print all of the news items on Hacker News
var jsdom = require("jsdom");
var fs = require("fs");
var jquery = fs.readFileSync("./jquery.js", "utf-8");
jsdom.env({
url: "http://news.ycombinator.com/",
src: [jquery],
done: function (errors, window) {
var $ = window.$;
console.log("HN Links");
$("td.title:not(:last) a").each(function () {
console.log(" -", $(this).text());
});
}
});
The do-what-I-mean API is used like so:
jsdom.env(string, [scripts], [config], callback);
string
: may be a URL, file name, or HTML fragmentscripts
: a string or array of strings, containing file names or URLs that will be inserted as <script>
tagsconfig
: see belowcallback
: takes two arguments
errors
: either null
, if nothing goes wrong, or an array of errorswindow
: a brand new window
, if there were no loading errorsExample:
jsdom.env(html, function (errors, window) {
// free memory associated with the window
window.close();
});
If you would like to specify a configuration object only:
jsdom.env(config);
config.html
: a HTML fragmentconfig.file
: a file which jsdom will load HTML from; the resulting window's location.href
will be a file://
URL.config.url
: sets the resulting window's location.href
; if config.html
and config.file
are not provided, jsdom will load HTML from this URL.config.scripts
: see scripts
above.config.src
: an array of JavaScript strings that will be evaluated against the resulting document. Similar to scripts
, but it accepts JavaScript instead of paths/URLs.config.cookieJar
: cookie jar which will be used by document and related resource requests. Can be created by jsdom.createCookieJar()
method. Useful to share cookie state among different documents as browsers does.config.parsingMode
: either "auto"
, "html"
, or "xml"
. The default is "auto"
, which uses HTML behavior unless config.url
responds with an XML Content-Type
, or config.file
contains a filename ending in .xml
or .xhtml
. Setting to "xml"
will attempt to parse the document as an XHTML document. (jsdom is currently only OK at doing that.)config.document
:
referrer
: the new document will have this referrer.cookie
: manually set a cookie value, e.g. 'key=value; expires=Wed, Sep 21 2011 12:00:00 GMT; path=/'
. Accepts cookie string or array of cookie strings.config.headers
: an object giving any headers that will be used while loading the HTML from config.url
, if applicableconfig.features
: see Flexibility section below. Note: the default feature set for jsdom.env
does not include fetching remote JavaScript and executing it. This is something that you will need to carefully enable yourself.config.resourceLoader
: a function that intercepts subresource requests and allows you to re-route them, modify, or outright replace them with your own content. More below.config.done
, config.loaded
, config.created
: see below.config.concurrentNodeIterators
: the maximum amount of NodeIterator
s that you can use at the same time. The default is 10
; setting this to a high value will hurt performance.Note that at least one of the callbacks (done
, loaded
, or created
) is required, as is one of html
, file
, or url
.
If you just want to load the document and execute it, the done
callback shown above is the simplest. If anything goes wrong, either while loading the document and creating the window, or while executing any <script>
s, the problem will show up in the errors
array passed as the first argument.
However, if you want more control over or insight into the initialization lifecycle, you'll want to use the created
and/or loaded
callbacks:
created(error, window)
The created
callback is called as soon as the window is created, or if that process fails. You may access all window
properties here; however, window.document
is not ready for use yet, as the HTML has not been parsed.
The primary use-case for created
is to modify the window object (e.g. add new functions on built-in prototypes) before any scripts execute.
You can also set an event handler for 'load'
or other events on the window if you wish. But the loaded
callback, below, can be more useful, since it includes script errors.
If the error
argument is non-null
, it will contain whatever loading error caused the window creation to fail; in that case window
will not be passed.
loaded(errors, window)
The loaded
callback is called along with the window's 'load'
event. This means it will only be called if creation succeeds without error. Note that by the time it has called, any external resources will have been downloaded, and any <script>
s will have finished executing.
If errors
is non-null
, it will contain an array of all JavaScript errors that occured during script execution. window
will still be passed, however.
done(errors, window)
Now that you know about created
and loaded
, you can see that done
is essentially both of them smashed together:
<script>
s cause errors, then errors
will be null, and window
will be usable.errors
will be an array containing those errors, but window
will still be usable.errors
will be an array containing the creation error, and window
will not be passed.If you load scripts asynchronously, e.g. with a module loader like RequireJS, none of the above hooks will really give you what you want. There's nothing, either in jsdom or in browsers, to say "notify me after all asynchronous loads have completed." The solution is to use the mechanisms of the framework you are using to notify about this finishing up. E.g., with RequireJS, you could do
// On the io.js side:
var window = jsdom.jsdom(...).defaultView;
window.onModulesLoaded = function () {
console.log("ready to roll!");
};
<!-- Inside the HTML you supply to jsdom -->
<script>
requirejs(["entry-module"], function () {
window.onModulesLoaded();
});
</script>
For more details, see the discussion in #640, especially @matthewkastor's insightful comment.
By default, jsdom.env
will not process and run external JavaScript, since our sandbox is not foolproof. That is, code running inside the DOM's <script>
s can, if it tries hard enough, get access to the Node environment, and thus to your machine. If you want to (carefully!) enable running JavaScript, you can use jsdom.jsdom
, jsdom.jQueryify
, or modify the defaults passed to jsdom.env
.
jsdom.jsdom
The jsdom.jsdom
method does fewer things automatically; it takes in only HTML source, and it does not allow you to separately supply scripts that it will inject and execute. It just gives you back a document
object, with usable document.defaultView
, and starts asynchronously executing any <script>
s included in the HTML source. You can listen for the 'load'
event to wait until scripts are done loading and executing, just like you would in a normal HTML page.
Usage of the API generally looks like this:
var jsdom = require("jsdom").jsdom;
var doc = jsdom(markup, options);
var window = doc.defaultView;
markup
is a HTML document to be parsed. You can also pass undefined
to get the basic document, equivalent to what a browser will give if you open up an empty .html
file.
options
: see the explanation of the config
object above.
One of the goals of jsdom is to be as minimal and light as possible. This section details how someone can change the behavior of Document
s before they are created. These features are baked into the DOMImplementation
that every Document
has, and may be tweaked in two ways:
Document
, by overriding the configuration:var jsdom = require("jsdom").jsdom;
var doc = jsdom("<html><body></body></html>", {
features: {
FetchExternalResources : ["img"]
}
});
Do note, that this will only affect the document that is currently being created. All other documents will use the defaults specified below (see: Default Features).
require("jsdom").defaultDocumentFeatures = {
FetchExternalResources: ["script"],
ProcessExternalResources: false
};
Default features are extremely important for jsdom as they lower the configuration requirement and present developers a set of consistent default behaviors. The following sections detail the available features, their defaults, and the values that jsdom uses.
FetchExternalResources
["script"]
["script", "img", "css", "frame", "iframe", "link"]
or false
jsdom.env
: false
Enables/disables fetching files over the file system/HTTP
ProcessExternalResources
["script"]
["script"]
or false
jsdom.env
: false
Enables/disables JavaScript execution
SkipExternalResources
false
(allow all)/url to be skipped/
or false
/http:\/\/example.org/js/bad\.js/
Filters resource downloading and processing to disallow those matching the given regular expression
jsdom lets you intercept subresource requests using config.resourceLoader
. config.resourceLoader
expects a function which is called for each subresource request with the following arguments:
resource
: a vanilla JavaScript object with the following properties
url
: a parsed URL object.cookie
: the content of the HTTP cookie header (key=value
pairs separated by semicolons).baseUrl
: the base URL used to resolve relative URLs.defaultFetch(callback)
: a convenience method to fetch the resource online.callback
: a function to be called with two arguments
error
: either null
, if nothing goes wrong, or an Error
object.body
: a string representing the body of the resource.For example, fetching all JS files from a different directory and running them in strict mode:
var jsdom = require("jsdom");
jsdom.env({
url: "http://example.com/",
resourceLoader: function (resource, callback) {
var pathname = resource.url.pathname;
if (/\.js$/.test(pathname)) {
resource.url.pathname = pathname.replace("/js/", "/js/raw/");
resource.defaultFetch(function (err, body) {
if (err) return callback(err);
callback(null, '"use strict";\n' + body);
});
} else {
resource.defaultFetch(callback);
}
},
features: {
FetchExternalResources: ["script"],
ProcessExternalResources: ["script"],
SkipExternalResources: false
}
});
jsdom includes support for using the canvas package to extend any <canvas>
elements with the canvas API. To make this work, you need to include canvas as a dependency in your project, as a peer of jsdom. If jsdom can find the canvas package, it will use it, but if it's not present, then <canvas>
elements will behave like <div>
s.
var jsdom = require("jsdom").jsdom;
var document = jsdom("hello world");
var window = document.defaultView;
console.log(window.document.documentElement.outerHTML);
// output: "<html><head></head><body>hello world</body></html>"
console.log(window.innerWidth);
// output: 1024
console.log(typeof window.document.getElementsByClassName);
// outputs: function
var jsdom = require("jsdom");
var window = jsdom.jsdom().defaultView;
jsdom.jQueryify(window, "http://code.jquery.com/jquery-2.1.1.js", function () {
window.$("body").append('<div class="testing">Hello World, It works</div>');
console.log(window.$(".testing").text());
});
var jsdom = require("jsdom").jsdom;
var window = jsdom().defaultView;
window.__myObject = { foo: "bar" };
var scriptEl = window.document.createElement("script");
scriptEl.src = "anotherScript.js";
window.document.body.appendChild(scriptEl);
// anotherScript.js will have the ability to read `window.__myObject`, even
// though it originated in io.js!
var jsdom = require("jsdom").jsdom;
var serializeDocument = require("jsdom").serializeDocument;
var doc = jsdom("<!DOCTYPE html>hello");
serializeDocument(doc) === "<!DOCTYPE html><html><head></head><body>hello</body></html>";
doc.documentElement.outerHTML === "<html><head></head><body>hello</body></html>";
var jsdom = require("jsdom");
var cookieJar = jsdom.createCookieJar();
jsdom.env({
url: 'http://google.com',
cookieJar: cookieJar,
done: function (err1, window1) {
//...
jsdom.env({
url: 'http://code.google.com',
cookieJar: cookieJar,
done: function (err2, window2) {
//...
}
});
}
});
var jsdom = require("jsdom");
var virtualConsole = jsdom.createVirtualConsole();
virtualConsole.sendTo(console);
var document = jsdom.jsdom(undefined, {
virtualConsole: virtualConsole
});
var jsdom = require("jsdom");
var virtualConsole = jsdom.createVirtualConsole();
virtualConsole.on("log", function (message) {
console.log("console.log called ->", message);
});
var document = jsdom.jsdom(undefined, {
virtualConsole: virtualConsole
});
Post-initialization, if you didn't pass in a virtualConsole
or no longer have a reference to it, you can retreive the virtualConsole
by using:
var virtualConsole = jsdom.getVirtualConsole(window);
Our mission is to get something very close to a headless browser, with emphasis more on the DOM/HTML side of things than the CSS side. As such, our primary goals are supporting The DOM Standard and The HTML Standard. We only support some subset of these so far; in particular we have the subset covered by the outdated DOM 2 spec family down pretty well. We're slowly including more and more from the modern DOM and HTML specs, including some Node
APIs, querySelector(All)
, attribute semantics, the history and URL APIs, and the HTML parsing algorithm.
We also support some subset of the CSSOM, largely via @chad3814's excellent cssstyle package. In general we want to make webpages run headlessly as best we can, and if there are other specs we should be incorporating, let us know.
FAQs
A JavaScript implementation of many web standards
The npm package jsdom receives a total of 19,503,257 weekly downloads. As such, jsdom popularity was classified as popular.
We found that jsdom demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
vlt introduced its new package manager and a serverless registry this week, innovating in a space where npm has stagnated.
Security News
Research
The Socket Research Team uncovered a malicious Python package typosquatting the popular 'fabric' SSH library, silently exfiltrating AWS credentials from unsuspecting developers.
Security News
At its inaugural meeting, the JSR Working Group outlined plans for an open governance model and a roadmap to enhance JavaScript package management.