
untemplate
A node.js package that uses templates to extract structured info from HTML.
This node module provides a way to scrape structured information from websites based on HTML templates.
Templating engines like handlebars/jade:
{ structured: 'info' } + {{ template }} = <html>
This module:
<html> - {{ template }} = { structured: 'info' }
Alternatively, you can think of untemplate.js as a declarative DSL for web scraping.
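import { untemplate } from 'untemplate';
// Assumption: xmldom is used here as one way to get a DOM element;
// an element taken from the browser works as well.
import { DOMParser } from 'xmldom';
const getDomFromHtml = (html) =>
  new DOMParser().parseFromString(html, 'text/html').documentElement;
// obtain a DOM element from somewhere
let element = getDomFromHtml(`
<div>
  <div>
    <span> alberta </span>
  </div>
  <div>
    <span> bc </span>
    <span> canada </span>
  </div>
</div>
`);
// the trailing ? marks the second span as optional
let template = `
<div>
  <span> {{ region }} </span>
  <span ?> {{ country }} </span>
</div>
`;
let data = untemplate(template, element);
// data: [{ region: 'alberta' }, { region: 'bc', country: 'canada' }]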
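To make the matching in the example above more concrete, here is a toy sketch of the same idea on plain JavaScript objects instead of real DOM nodes. It is not untemplate's implementation: the tree shape, the `optional` flag on template nodes, and the names `matchNode`, `matchChildren`, and `toyUntemplate` are all assumptions made for illustration.

```javascript
// Toy tree shapes (hypothetical, not untemplate's internals):
//   element:  { tag: 'div', children: [ ...elements or text strings ] }
//   template: same shape, plus `optional: true` and '{{ name }}' text captures
const CAPTURE = /^\{\{\s*(\w+)\s*\}\}$/;

// Match one template node against one element node.
// Returns an object of captured values, or null when the shapes differ.
function matchNode(tpl, el) {
  if (typeof tpl === 'string') {
    if (typeof el !== 'string') return null;
    const capture = CAPTURE.exec(tpl.trim());
    if (capture) return { [capture[1]]: el.trim() };
    return tpl.trim() === el.trim() ? {} : null;
  }
  if (typeof el === 'string' || tpl.tag !== el.tag) return null;
  return matchChildren(tpl.children, el.children, 0, 0);
}

// Match template children against element children in order.
// Optional template nodes may be skipped (with backtracking).
function matchChildren(tpls, els, i, j) {
  if (i === tpls.length) return j === els.length ? {} : null;
  const tpl = tpls[i];
  if (j < els.length) {
    const head = matchNode(tpl, els[j]);
    if (head) {
      const rest = matchChildren(tpls, els, i + 1, j + 1);
      if (rest) return { ...head, ...rest };
    }
  }
  if (typeof tpl !== 'string' && tpl.optional) {
    return matchChildren(tpls, els, i + 1, j);
  }
  return null;
}

// Walk the whole tree and collect one result per matching subtree.
function toyUntemplate(tpl, root) {
  const results = [];
  const visit = (el) => {
    const hit = matchNode(tpl, el);
    if (hit) results.push(hit);
    if (typeof el !== 'string') el.children.forEach(visit);
  };
  visit(root);
  return results;
}

const span = (text) => ({ tag: 'span', children: [text] });
const element = {
  tag: 'div',
  children: [
    { tag: 'div', children: [span(' alberta ')] },
    { tag: 'div', children: [span(' bc '), span(' canada ')] },
  ],
};
const template = {
  tag: 'div',
  children: [span('{{ region }}'), { ...span('{{ country }}'), optional: true }],
};

console.log(toyUntemplate(template, element));
// [ { region: 'alberta' }, { region: 'bc', country: 'canada' } ]
```

The outer div does not match because its children are divs rather than spans, so only the two inner divs produce results. The real library works on DOM elements and template strings, and its matching rules may differ from this sketch.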
How do you make templates, you ask? See the API section below for details on the deduceTemplate function.
Refer to the specs file at spec/untemplateSpec.js for examples of each of these features.
Given HTML like <div> hello </div> and a template like <div> {{ greeting }} </div>, untemplate produces the associative array { greeting: 'hello' }.
{{ property }} captures are optional to simplify template creation.

#untemplate(dsl, element[, cb, rate])
dsl: the template as a string. Valid templates are valid HTML, sans attributes, with one notable exception: optional="true", which makes the node it is attached to optional in the template. Some sugar: <div optional="true"></div> <=> <div?></div>.
element: the root DOM element to search for the template in. This must be a proper DOM element, either output from a library like xmldom or taken from the browser.
cb: (optional) a progress callback that is periodically called with the approximate completion percentage (first argument) and a stop function (second argument) that causes untemplate to terminate early if called.
rate: (optional) the approximate percent of progress that occurs between each call of cb.
EarlyStopException: thrown if the cb function calls its stop argument.

#precomputeNeedles(dsl[, cb, rate])
dsl: the template as a string, in the same format as the dsl argument of #untemplate.
cb: (optional) a progress callback that is periodically called with the precomputation completion percentage (first argument) and a stop function (second argument) that causes precomputeNeedles to terminate early if called.
rate: (optional) the approximate percent of progress that occurs between each call of cb.
Use the precomputed needles with #untemplateWithNeedles instead of #untemplate for a drastic performance increase.
EarlyStopException: thrown if the cb function calls its stop argument.

#deduceTemplate(elements[, cb, rate])
examples: a list of HTML nodes as strings. These nodes should all represent the same "type of thing" in the page you'd like to apply the template to, for instance the HTML for each search result in a long list of search results. These nodes must all share the same outermost tag.
cb: (optional) a progress callback that is periodically called with the approximate completion percentage (first argument) and a stop function (second argument) that causes deduceTemplate to terminate early if called.
rate: (optional) the approximate percent of progress that occurs between each call of cb.
UnresolveableExamplesError: thrown if the input examples cannot be reconciled for any reason (usually because they do not share a common outermost tag).
EarlyStopException: thrown if the cb function calls its stop argument.

#deduceTemplateVerbose(elements[, prefix, cb, rate])
examples: see the arguments for #deduceTemplate.
prefix: (optional) a string to prefix all of the generated property selectors with.
cb: (optional) a progress callback that is periodically called with the approximate completion percentage (first argument) and a stop function (second argument) that causes deduceTemplateVerbose to terminate early if called.
rate: (optional) the approximate percent of progress that occurs between each call of cb.
maximalDsl: a template to match the examples with property selectors in all possible positions.
consolidatedValues: an object literal mapping the property selectors in maximalDsl to the literal values in dslWithLiterals.
dslWithLiterals: the same minimum template returned by #deduceTemplate.
UnresolveableExamplesError: thrown in the same cases as #deduceTemplate.
EarlyStopException: thrown if the cb function calls its stop argument.

This project uses Webpack to generate a bundle file and Flow to check types. The following commands use yarn, but you can use npm or npm run interchangeably.
yarn install to grab the dependencies.
yarn flow to check the types.
yarn test to run tests (this project uses jasmine).
yarn build to create a bundle file in lib/.

For a front-end interface to the #deduceTemplate function, open the index.html file in a browser. This will allow you to paste in HTML and deduce the template that minimally matches those HTML examples.
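The cb, rate, and stop arguments shared by the API functions follow a common progress-callback contract. As a rough sketch of that contract only, the snippet below uses a hypothetical `runWithProgress` function and a stand-in `EarlyStopException` class. Neither is taken from untemplate's source, and the default rate of 5 is an assumption.

```javascript
// Stand-in for the exception named in the API docs (hypothetical definition).
class EarlyStopException extends Error {}

// Runs `steps` units of work and calls cb(percent, stop) roughly every
// `rate` percent of progress. The default rate is an assumption.
function runWithProgress(steps, work, cb, rate = 5) {
  let stopped = false;
  const stop = () => { stopped = true; };
  let nextReport = rate;
  for (let i = 0; i < steps; i++) {
    work(i);
    const percent = ((i + 1) * 100) / steps;
    if (cb && percent >= nextReport) {
      cb(percent, stop);
      // Calling stop() inside cb aborts the run with an exception.
      if (stopped) throw new EarlyStopException('stopped by progress callback');
      while (nextReport <= percent) nextReport += rate;
    }
  }
}

const seen = [];
let stoppedEarly = false;
try {
  runWithProgress(10, () => {}, (percent, stop) => {
    seen.push(percent);
    if (percent >= 50) stop(); // give up once half the work is done
  }, 25);
} catch (err) {
  if (!(err instanceof EarlyStopException)) throw err;
  stoppedEarly = true;
}

console.log(seen, stoppedEarly);
// [ 30, 50 ] true
```

With a rate of 25 and ten steps, the callback fires at 30% and 50% rather than exactly 25% and 50%, which mirrors the "approximate" wording in the argument descriptions.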
MIT © fin ventures