What is linkedom?
Linkedom is a lightweight, fast, and efficient library for working with DOM and HTML in Node.js environments. It provides a comprehensive set of APIs to manipulate and traverse the DOM, similar to what you would find in a browser environment.
What are linkedom's main functionalities?
DOM Manipulation
This feature allows you to parse HTML strings and manipulate the DOM elements. In this example, we parse an HTML string, select a div element by its ID, change its text content, and then output the modified HTML.
const { parseHTML } = require('linkedom');
const { document } = parseHTML('<html><body><div id="app"></div></body></html>');
const appDiv = document.getElementById('app');
appDiv.textContent = 'Hello, World!';
console.log(document.toString());
Event Handling
Linkedom supports event handling similar to browser environments. This example demonstrates how to add an event listener to a button element and dispatch a click event programmatically.
const { parseHTML } = require('linkedom');
const { document, Event } = parseHTML('<html><body><button id="btn">Click me</button></body></html>');
const button = document.getElementById('btn');
button.addEventListener('click', () => console.log('Button clicked!'));
const event = new Event('click');
button.dispatchEvent(event);
CSS Selector Queries
You can use CSS selectors to query elements in the DOM. This example shows how to select all list items with a specific class and log their text content.
const { parseHTML } = require('linkedom');
const { document } = parseHTML('<html><body><ul><li class="item">Item 1</li><li class="item">Item 2</li></ul></body></html>');
const items = document.querySelectorAll('.item');
items.forEach(item => console.log(item.textContent));
Other packages similar to linkedom
jsdom
jsdom is a popular library that simulates a browser environment in Node.js. It provides a full-featured DOM and HTML parser, making it suitable for testing and server-side rendering. Compared to linkedom, jsdom is more comprehensive but also heavier and slower.
cheerio
Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It parses HTML and XML documents and provides a jQuery-like API for DOM manipulation. Cheerio is lighter and faster than jsdom but, unlike linkedom, it does not provide a full DOM environment.
node-html-parser
node-html-parser is a fast HTML parser that can parse and manipulate HTML documents. It is lightweight and efficient but, unlike linkedom, it does not provide a full DOM API. It is suitable for simple HTML parsing and manipulation tasks.
🔗 linkedom
Social Media Photo by JJ Ying on Unsplash
A triple-linked list based DOM with the following goals:
- avoid maximum callstack/recursion or crashes, even under heaviest conditions.
- guarantee linear performance from small to big documents.
- be close to the current DOM standard, but not too close.
- fully replace basicHTML (but ... it's already much better).
import {DOMParser, parseHTML} from 'linkedom';
const {
window, document, customElements,
HTMLElement,
Event, CustomEvent
} = parseHTML(`
<!doctype html>
<html lang="en">
<head>
<title>Hello SSR</title>
</head>
<body>
<form>
<input name="user">
<button>
Submit
</button>
</form>
</body>
</html>
`);
customElements.define('custom-element', class extends HTMLElement {
connectedCallback() {
console.log('it works 🥳');
}
});
document.body.appendChild(
document.createElement('custom-element')
);
document.toString();
document.querySelectorAll('form, input[name], button');
Simulating JSDOM Bootstrap
This module is based on DOMParser API, hence it creates a new document each time new DOMParser().parseFromString(...) is invoked.
As there's no global pollution whatsoever, to retrieve classes and features associated to the document returned by parseFromString, you need to access its defaultView property, which is a special proxy that lets you get pseudo-global-but-not-global properties and classes.
Accordingly, to simulate new JSDOM(html).window behavior, you can use a tiny helper like the following one:
import {parseHTML} from 'linkedom';
// parseHTML already returns the window-like object that exposes window, document, and the classes
function JSDOM(html) { return parseHTML(html); }
const {document} = new JSDOM('<h1>Hello LinkeDOM 👋</h1>').window;
How does it work?
All nodes are linked on both sides, and all elements consist of 2 nodes, also linked in between.
Attributes are always at the beginning of an element, while zero or more extra nodes can be found before the end.
A fragment is a special element without boundaries, or parent node.
Node: ← node →
Attr<Node>: ← attr → ↑ ownerElement?
Text<Node>: ← text → ↑ parentNode?
Comment<Node>: ← comment → ↑ parentNode?
Element<Node>: ← start ↔ end → ↑ parentNode?
Fragment<Element>: start ↔ end
Element example:
     parentNode?
          ↑
          ├────────────────────────────────────────────┐
          │                                            ↓
node? ← start → attr* → text* → comment* → element* → end → node?
          ↑                                            │
          └────────────────────────────────────────────┘
Fragment example:
  ┌────────────────────────────────────────────┐
  │                                            ↓
start → attr* → text* → comment* → element* → end
  ↑                                            │
  └────────────────────────────────────────────┘
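The layout above can be approximated with a few plain objects. This is only a minimal sketch: the names `createNode`, `createElement`, `setAttribute`, `appendChild`, and `serialize` are hypothetical helpers written for illustration, not linkedom's actual internals. It assumes each element is a start node paired with an end node, with attributes kept right after the start.

```javascript
// Hypothetical, simplified node shapes: not linkedom's real classes.
const ELEMENT = 'element', ATTR = 'attr', TEXT = 'text', COMMENT = 'comment', END = 'end';

// Link two nodes side by side.
const link = (left, right) => {
  if (left) left.next = right;
  if (right) right.prev = left;
};

// A plain node only knows its left and right neighbours, plus its parent.
const createNode = (type, value) => ({type, value, prev: null, next: null, parentNode: null});

// An element is a start node plus an end node, linked to each other.
const createElement = name => {
  const start = createNode(ELEMENT, name);
  const end = createNode(END, name);
  start.end = end;
  end.start = start;
  link(start, end);
  return start;
};

// Attributes live right after the start node, before any child node.
const setAttribute = (element, name, value) => {
  let last = element;
  while (last.next.type === ATTR) last = last.next;
  const attr = createNode(ATTR, `${name}="${value}"`);
  attr.ownerElement = element;
  const after = last.next;
  link(last, attr);
  link(attr, after);
};

// Children are inserted right before the parent's end node.
const appendChild = (parent, child) => {
  const tail = child.end || child; // an element brings its own end node along
  link(parent.end.prev, child);
  link(tail, parent.end);
  child.parentNode = parent;
};

// Walk the flat list from start to end, with no recursion involved.
const serialize = element => {
  const out = [];
  for (let node = element; node; node = node === element.end ? null : node.next)
    out.push(`${node.type}:${node.value}`);
  return out.join(' → ');
};

const div = createElement('div');
appendChild(div, createNode(TEXT, 'hello'));
setAttribute(div, 'id', 'app');
appendChild(div, createElement('span'));
appendChild(div, createNode(COMMENT, 'note'));

const flat = serialize(div);
console.log(flat);
// element:div → attr:id="app" → text:hello → element:span → end:span → comment:note → end:div
```

Even though the attribute is set after the text node is appended, it still lands before it, which mirrors the "attributes are always at the beginning" rule.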
Why is this better?
Moving N nodes from a container, be it either an Element or a Fragment, requires the following steps:
- update the first left link of the moved segment
- update the last right link of the moved segment
- connect the left side, if any, of the moved node at the beginning of the segment, with the right side, if any, of the node at the end of such segment
- update the parentNode of the segment to either null, or the new parentNode
As a result, there are no array operations and no memory operations, and everything is kept in sync by updating a few properties, so that removing 3714 sparse <div> elements in a 12M document, as an example, takes as little as 3ms, while appending a whole fragment takes close to 0ms.
Try npm run benchmark:html to see it yourself.
This structure also allows programs to avoid issues such as "Maximum call stack size exceeded" (basicHTML), or "JavaScript heap out of memory" crashes (JSDOM), thanks to its reduced usage of memory and zero stacks involved, hence scaling better from small to very big documents.
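The move steps listed above can be sketched on a simplified list. This is an illustration under stated assumptions, not linkedom's code: `createContainer`, `append`, `moveSegment`, and `values` are hypothetical names, and children are flat nodes with no nested elements.

```javascript
// Hypothetical minimal shapes, for illustration only: not linkedom's internals.
const createNode = value => ({value, prev: null, next: null, parentNode: null});

const link = (left, right) => {
  if (left) left.next = right;
  if (right) right.prev = left;
};

// A container is a start node paired with an end node.
const createContainer = name => {
  const start = createNode(`<${name}>`);
  start.end = createNode(`</${name}>`);
  link(start, start.end);
  return start;
};

const append = (container, node) => {
  link(container.end.prev, node);
  link(node, container.end);
  node.parentNode = container;
};

// Move the already linked segment first..last to the end of target.
const moveSegment = (first, last, target) => {
  const left = first.prev;
  const right = last.next;
  // connect the left side with the right side of the moved segment
  link(left, right);
  // update the first left link and the last right link of the segment
  link(target.end.prev, first);
  link(last, target.end);
  // update the parentNode of the moved nodes
  for (let node = first; node !== target.end; node = node.next)
    node.parentNode = target;
};

// List the values between a container's start and end nodes.
const values = container => {
  const out = [];
  for (let node = container.next; node !== container.end; node = node.next)
    out.push(node.value);
  return out;
};

const source = createContainer('ul');
const target = createContainer('ol');
const items = ['a', 'b', 'c', 'd'].map(value => createNode(value));
items.forEach(item => append(source, item));

moveSegment(items[1], items[2], target);
console.log(values(source)); // [ 'a', 'd' ]
console.log(values(target)); // [ 'b', 'c' ]
```

No array is spliced or copied while moving: only the links at the two edges of the segment and the `parentNode` references change.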
Are childNodes and children always computed?
At this point, yes. This module is ready to cache results when no mutations happen, but repeated crawling is not a very common pattern and results can always be cached in user-land, so the core always crawls left to right, or right to left, which guarantees the result is always in sync with the current DOM state.
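As a rough idea of what "always computed" means, the following sketch derives `childNodes` and `children` by crawling a simplified list on every call. The shapes and helper names are hypothetical, and the `END` marker is an assumption of this sketch rather than a real nodeType.

```javascript
// Hypothetical simplified shapes: not linkedom's actual implementation.
const ELEMENT = 1, ATTR = 2, TEXT = 3;
const END = -1; // marker for an element's closing node, assumed for this sketch

const createNode = (nodeType, value) => ({nodeType, value, prev: null, next: null});
const link = (left, right) => { left.next = right; right.prev = left; };

const createElement = name => {
  const start = createNode(ELEMENT, name);
  start.end = createNode(END, name);
  link(start, start.end);
  return start;
};

const append = (parent, child) => {
  link(parent.end.prev, child);
  link(child.end || child, parent.end);
};

// Computed on every call: crawl left to right from start to end,
// skipping attributes and jumping over each child element's own content.
const childNodes = element => {
  const result = [];
  let node = element.next;
  while (node !== element.end) {
    if (node.nodeType !== ATTR) result.push(node);
    // an element child spans start..end, so continue after its end node
    node = (node.end || node).next;
  }
  return result;
};

const children = element =>
  childNodes(element).filter(node => node.nodeType === ELEMENT);

const ul = createElement('ul');
append(ul, createNode(ATTR, 'class="list"'));
const li = createElement('li');
append(li, createNode(TEXT, 'one'));
append(ul, li);
append(ul, createNode(TEXT, 'two'));

const before = childNodes(ul).map(node => node.value);
console.log(before); // [ 'li', 'two' ]

// A later mutation is visible right away, as nothing was cached.
append(ul, createElement('li'));
console.log(children(ul).length); // 2
```

Caching in user-land then amounts to storing the returned array for as long as the tree is known not to change.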
Parsing VS Node Types
This module parses, and works, only with the following nodeType:
- ELEMENT_NODE
- ATTRIBUTE_NODE
- TEXT_NODE
- COMMENT_NODE
- DOCUMENT_NODE
- DOCUMENT_FRAGMENT_NODE
Everything else, at least for the time being, is considered YAGNI, and it will likely never land in this project, as there's no goal to replicate deprecated features of this aged Web.
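For reference, these names map to numeric values defined by the DOM standard's Node interface. The snippet below only mirrors the list above and does not query the library itself; `isSupportedNodeType` is a hypothetical helper, not part of linkedom's API.

```javascript
// Numeric values as defined by the DOM standard's Node interface.
const SUPPORTED_NODE_TYPES = {
  ELEMENT_NODE: 1,
  ATTRIBUTE_NODE: 2,
  TEXT_NODE: 3,
  COMMENT_NODE: 8,
  DOCUMENT_NODE: 9,
  DOCUMENT_FRAGMENT_NODE: 11
};

// Hypothetical helper: checks a nodeType against the list above.
const isSupportedNodeType = nodeType =>
  Object.values(SUPPORTED_NODE_TYPES).includes(nodeType);

console.log(isSupportedNodeType(3)); // true: TEXT_NODE
console.log(isSupportedNodeType(7)); // false: PROCESSING_INSTRUCTION_NODE is not in the list
```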
Benchmarks
To run the benchmark locally, please follow these commands:
git clone https://github.com/WebReflection/linkedom.git
cd linkedom/test
npm i
cd ..
npm i
npm run benchmark
The following are a couple of (outdated) example benchmark results.
benchmark:dom
(image: benchmark output example)
benchmark:html
(image: benchmark output example)