html-select
match a tokenized html stream with css selectors
var select = require('html-select');
var tokenize = require('html-tokenize');
var fs = require('fs');

fs.createReadStream(__dirname + '/page.html')
    .pipe(tokenize())
    .pipe(select('.content span', function (e) {
        console.log('matched:', e);
    }))
;
with this html input:
<html>
  <body>
    <h1>whoa</h1>
    <div class="content">
      <span class="greeting">beep boop</span>
      <span class="name">robot</span>
    </div>
  </body>
</html>
produces this output:
matched: { name: 'span', attributes: { class: 'greeting' } }
matched: { name: 'span', attributes: { class: 'name' } }
var select = require('html-select')
select(selector, cb) returns a writable stream w that expects an object stream of html-tokenize records as input.
cb(tag) fires on the same tick for each tag matching the CSS selector given by selector. A tag looks like:
{ name: 'input', attributes: { type: 'text', 'name': 'user', value: 'beep' } }
The html-tokenize records are of the form:
$ echo -e '<html><body><h1>beep boop</h1></body></html>' | html-tokenize
["open","<html>"]
["open","<body>"]
["open","<h1>"]
["text","beep boop"]
["close","</h1>"]
["close","</body>"]
["close","</html>"]
["text","\n"]
except the second item in each record will be a Buffer if you get the results from html-tokenize directly.
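Because the second item may be a Buffer, code that consumes records straight from html-tokenize may want to convert it to a string before printing or comparing. A minimal sketch, assuming UTF-8 input; normalizeRecord is a hypothetical helper for illustration and is not part of html-select or html-tokenize:

```javascript
// Hypothetical helper: turn a [type, Buffer|string] record into [type, string].
// Assumes the underlying HTML is UTF-8 encoded.
function normalizeRecord(record) {
    var type = record[0];
    var body = record[1];
    return [type, Buffer.isBuffer(body) ? body.toString('utf8') : String(body)];
}

// Works the same whether the body arrives as a Buffer or as a string.
console.log(normalizeRecord(['open', Buffer.from('<h1>')]));
console.log(normalizeRecord(['text', 'beep boop']));
```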
In addition to tag.name and tag.attributes, you can call tag.createReadStream(opts) to create a readable stream with all the contents nested under tag.
When opts.outer is true, the outerHTML content of the currently selected tag is included. For example, taking the selector and opts from process.argv:
var select = require('html-select');
var tokenize = require('html-tokenize');
var fs = require('fs');
var minimist = require('minimist');

var argv = minimist(process.argv.slice(2), { boolean: [ 'outer' ] });
var selector = argv._.join(' ');

process.stdin.pipe(tokenize())
    .pipe(select(selector, function (e) {
        e.createReadStream(argv).pipe(process.stdout);
    }))
;
Running this program normally gives:
$ node read.js .content < page.html
<span class="greeting">beep boop</span>
<span class="name">robot</span>
but running the program with opts.outer as true produces:
$ node read.js .content --outer < page.html
<div class="content">
<span class="greeting">beep boop</span>
<span class="name">robot</span>
</div>
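The difference between the two modes can be sketched on plain records. This is an illustration only, not how html-select implements createReadStream: it assumes records is a hypothetical list of tokens running from the selected tag's opening record through its closing record, and contents is a made-up helper name.

```javascript
// Hypothetical helper: join the bodies of the records for one selected tag.
// With opts.outer the tag's own open/close records are kept; otherwise
// the first (open) and last (close) records are dropped.
function contents(records, opts) {
    var outer = Boolean(opts && opts.outer);
    var picked = outer ? records : records.slice(1, -1);
    return picked.map(function (rec) { return String(rec[1]); }).join('');
}

var records = [
    ['open', '<div class="content">'],
    ['open', '<span class="greeting">'],
    ['text', 'beep boop'],
    ['close', '</span>'],
    ['close', '</div>']
];

console.log(contents(records, {}));
console.log(contents(records, { outer: true }));
```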
usage: html-select SELECTOR
Given a newline-separated json stream of html tokenize output on stdin,
print matching tags as newline-separated json on stdout.
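The newline-separated JSON format that the command reads can be illustrated without the library. A minimal sketch of parsing such a stream from a string; parseRecords is a hypothetical name, and skipping blank lines is an assumption of this sketch rather than documented behaviour of the html-select command:

```javascript
// Hypothetical helper: parse newline-separated JSON into an array of records.
// Blank lines are skipped (an assumption made for this sketch).
function parseRecords(text) {
    return text.split('\n')
        .filter(function (line) { return line.trim() !== ''; })
        .map(function (line) { return JSON.parse(line); });
}

var input = '["open","<h1>"]\n["text","beep boop"]\n["close","</h1>"]\n';

parseRecords(input).forEach(function (rec) {
    console.log(rec[0] + ': ' + rec[1]);
});
```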
Selector syntax includes the child and adjacent-sibling combinators:
E > F
E + F
With npm do:
npm install html-select
to get the library or
npm install -g html-select
to get the command-line program.
MIT
FAQs
The npm package html-select receives a total of 4,235 weekly downloads. On that basis, its popularity is classified as popular.
We found that html-select has an unhealthy version release cadence and level of project activity, because the last version was released a year ago. It has 2 open source maintainers collaborating on the project.