@xmldom/xmldom
A pure JavaScript W3C standard-based (XML DOM Level 2 Core) DOMParser and XMLSerializer module.
The @xmldom/xmldom package is a pure JavaScript W3C standard-based (XML DOM Level 2 Core) DOMParser and XMLSerializer module. It allows users to parse XML strings into a DOM tree and serialize DOM objects back into XML strings. This package is useful for server-side and client-side manipulation of XML data.
Parsing XML to DOM
This feature allows you to parse a string of XML and create a DOM tree, which can then be manipulated or queried.
const { DOMParser } = require('@xmldom/xmldom');
const xmlString = '<root><child>Hello World</child></root>';
const doc = new DOMParser().parseFromString(xmlString, 'text/xml');
console.log(doc.documentElement.nodeName); // 'root'
Serializing DOM to XML
This feature enables you to take a DOM tree and convert it back into a string of XML, which can be stored or transmitted.
const { DOMParser, XMLSerializer } = require('@xmldom/xmldom');
const doc = new DOMParser().parseFromString('<root></root>', 'text/xml');
doc.documentElement.appendChild(doc.createElement('child'));
const xmlString = new XMLSerializer().serializeToString(doc);
console.log(xmlString); // '<root><child/></root>'
Manipulating DOM
This feature demonstrates how to manipulate the DOM tree by changing the text content of an element.
const { DOMParser, XMLSerializer } = require('@xmldom/xmldom');
const xmlString = '<root><child>Old Value</child></root>';
const doc = new DOMParser().parseFromString(xmlString, 'text/xml');
doc.getElementsByTagName('child')[0].textContent = 'New Value';
const newXmlString = new XMLSerializer().serializeToString(doc);
console.log(newXmlString); // '<root><child>New Value</child></root>'
libxmljs is a Node.js package that provides bindings to the libxml C library. It allows for parsing and serializing XML, and it is known for its high performance. However, it requires compiling native code, which can be a disadvantage compared to the pure JavaScript implementation of @xmldom/xmldom.
Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It can parse markup and provides an API for manipulating the resulting data structure, similar to jQuery. While it is not a full XML DOM parser, it can handle a subset of XML and is often used for web scraping and server-side DOM manipulation.
jsdom is a pure-JavaScript implementation of many web standards, notably the WHATWG DOM and HTML Standards, for use with Node.js. It is designed to simulate a web browser's environment and can be used to test and scrape web applications. jsdom is more comprehensive than @xmldom/xmldom as it includes support for HTML and the DOM Level 3, but it is also heavier and more complex.
Since version 0.7.0 this package is published to npm as @xmldom/xmldom and no longer as xmldom, because we are no longer able to publish xmldom.
For better readability in the docs, we will continue to talk about this library as "xmldom".
xmldom is a JavaScript ponyfill that provides the following APIs, which are present in modern browsers, to other runtimes:
new DOMParser().parseFromString(xml, mimeType) => Document
new DOMImplementation().createDocument(...) => Document
new XMLSerializer().serializeToString(node) => string
The target runtimes xmldom supports are currently Node >= v14.6 (and very likely any other ES5 compatible runtime).
When deciding how to fix bugs or implement features, xmldom tries to stay as close as possible to the various related specifications/standards.
As indicated by the version starting with 0., this implementation is not feature complete and some implemented features differ from what the specifications describe.
Issues and PRs for such differences are always welcome, even when they only provide a failing test case.
This project was forked from its original source in 2019. More details about that transition can be found in the CHANGELOG.
npm install @xmldom/xmldom
const { DOMParser, XMLSerializer } = require('@xmldom/xmldom')
const source = `<xml xmlns="a">
<child>test</child>
<child/>
</xml>`
const doc = new DOMParser().parseFromString(source, 'text/xml')
const serialized = new XMLSerializer().serializeToString(doc)
Note: in TypeScript and ES6 (see #316) you can use the import approach, as follows:
import { DOMParser } from '@xmldom/xmldom'
DOMParser:
parseFromString(xmlsource, mimeType)
// the options argument can be used to modify behavior
// for more details check the documentation on the code or type definition
new DOMParser(options)
XMLSerializer:
serializeToString(node)
Node
readonly class properties (aka NodeType), these can be accessed from any Node instance node:
if (node.nodeType === node.ELEMENT_NODE) {...
ELEMENT_NODE (1)
ATTRIBUTE_NODE (2)
TEXT_NODE (3)
CDATA_SECTION_NODE (4)
ENTITY_REFERENCE_NODE (5)
ENTITY_NODE (6)
PROCESSING_INSTRUCTION_NODE (7)
COMMENT_NODE (8)
DOCUMENT_NODE (9)
DOCUMENT_TYPE_NODE (10)
DOCUMENT_FRAGMENT_NODE (11)
NOTATION_NODE (12)
attribute:
nodeValue | prefix | textContent
readonly attribute:
nodeName | nodeType | parentNode | parentElement | childNodes | firstChild | lastChild | previousSibling | nextSibling | attributes | ownerDocument | namespaceURI | localName | isConnected | baseURI
method:
insertBefore(newChild, refChild)
replaceChild(newChild, oldChild)
removeChild(oldChild)
appendChild(newChild)
hasChildNodes()
cloneNode(deep)
normalize()
contains(otherNode)
getRootNode()
isEqualNode(otherNode)
isSameNode(otherNode)
isSupported(feature, version)
hasAttributes()
DOMException
extends the Error type thrown as part of DOM API.
readonly class properties:
INDEX_SIZE_ERR (1)
DOMSTRING_SIZE_ERR (2)
HIERARCHY_REQUEST_ERR (3)
WRONG_DOCUMENT_ERR (4)
INVALID_CHARACTER_ERR (5)
NO_DATA_ALLOWED_ERR (6)
NO_MODIFICATION_ALLOWED_ERR (7)
NOT_FOUND_ERR (8)
NOT_SUPPORTED_ERR (9)
INUSE_ATTRIBUTE_ERR (10)
INVALID_STATE_ERR (11)
SYNTAX_ERR (12)
INVALID_MODIFICATION_ERR (13)
NAMESPACE_ERR (14)
INVALID_ACCESS_ERR (15)
attributes:
code with a value matching one of the above constants.
DOMImplementation
method:
hasFeature(feature, version) (deprecated)
createDocumentType(qualifiedName, publicId, systemId)
createDocument(namespaceURI, qualifiedName, doctype)
Document : Node
readonly attribute:
doctype | implementation | documentElement
method:
createElement(tagName)
createDocumentFragment()
createTextNode(data)
createComment(data)
createCDATASection(data)
createProcessingInstruction(target, data)
createAttribute(name)
createEntityReference(name)
getElementsByTagName(tagname)
importNode(importedNode, deep)
createElementNS(namespaceURI, qualifiedName)
createAttributeNS(namespaceURI, qualifiedName)
getElementsByTagNameNS(namespaceURI, localName)
getElementById(elementId)
DocumentFragment : Node
Element : Node
readonly attribute:
tagName
method:
getAttribute(name)
setAttribute(name, value)
removeAttribute(name)
getAttributeNode(name)
setAttributeNode(newAttr)
removeAttributeNode(oldAttr)
getElementsByTagName(name)
getAttributeNS(namespaceURI, localName)
setAttributeNS(namespaceURI, qualifiedName, value)
removeAttributeNS(namespaceURI, localName)
getAttributeNodeNS(namespaceURI, localName)
setAttributeNodeNS(newAttr)
getElementsByTagNameNS(namespaceURI, localName)
hasAttribute(name)
hasAttributeNS(namespaceURI, localName)
Attr : Node
attribute:
value
readonly attribute:
name | specified | ownerElement
NodeList
readonly attribute:
length
method:
item(index)
NamedNodeMap
readonly attribute:
length
method:
getNamedItem(name)
setNamedItem(arg)
removeNamedItem(name)
item(index)
getNamedItemNS(namespaceURI, localName)
setNamedItemNS(arg)
removeNamedItemNS(namespaceURI, localName)
CharacterData : Node
method:
substringData(offset, count)
appendData(arg)
insertData(offset, arg)
deleteData(offset, count)
replaceData(offset, count, arg)
Text : CharacterData
method:
splitText(offset)
Comment : CharacterData
DocumentType
readonly attribute:
name | entities | notations | publicId | systemId | internalSubset
Notation : Node
readonly attribute:
publicId | systemId
Entity : Node
readonly attribute:
publicId | systemId | notationName
EntityReference : Node
ProcessingInstruction : Node
attribute:
data
readonly attribute:
target
Node
attribute:
textContent
method:
isDefaultNamespace(namespaceURI)
lookupNamespaceURI(prefix)
[Node] Source position extension;
attribute:
lineNumber // number starting from 1
columnNumber // number starting from 1
The implementation is based on several specifications:
From the W3C DOM Parsing and Serialization (WD 2016) xmldom provides an implementation for the interfaces:
DOMParser
XMLSerializer
Note that there are some known deviations between this implementation and the W3 specifications.
Note: The latest version of this spec has the status "Editor's Draft", since it is under active development. One major change is that the definition of the DOMParser interface has been moved to the HTML spec.
The original author claims that xmldom implements DOM Level 2 in a "fully compatible" way and some parts of DOM Level 3, but there are not enough tests to prove this. Both specifications are now superseded by DOM Level 4 (aka the Living Standard), which has a much broader scope than xmldom. In the past, there have been multiple (even breaking) changes to align xmldom with the living standard, so if you find a difference that is not documented, any contribution to resolve the difference is very welcome (even just reporting it as an issue).
xmldom implements the following interfaces:
Attr
CDATASection
CharacterData
Comment
Document
DocumentFragment
DocumentType
DOMException
DOMImplementation
Element
Entity
EntityReference
LiveNodeList
NamedNodeMap
Node
NodeList
Notation
ProcessingInstruction
Text
More details are available in the (incomplete) API Reference section.
xmldom does not have any goal of supporting the full HTML spec, but it has some capability to parse, report and serialize things differently when it is told to parse HTML (by passing the HTML namespace).
xmldom has its own SAX parser implementation to do the actual parsing, which implements some interfaces in alignment with the Java interfaces SAX defines:
XMLReader
DOMHandler
There is an idea/proposal to make it possible to replace it with something else in https://github.com/xmldom/xmldom/issues/55
FAQs
The npm package @xmldom/xmldom receives a total of 7,140,134 weekly downloads, which places it in the "popular" category.
We found that @xmldom/xmldom demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.