
dom-collector
It transforms the page at a given URL into key-value JSON according to a rule specification.
npm install --save dom-collector
Under the hood, it performs these steps:
Validates the rule specification you pass.
Loads the web page with the well-known request library.
Parses the page and selects elements with cheerio, a proven DOM selector library; it might be better than jsdom.
Filters values and fills in the configured default values.
Places the collected values into a JSON object; iterated elements become a JSON array.
Returns a thenable Promise that resolves asynchronously.
For this HTML body:
<ul id="content-list">
<li data-id="1">
<a href="#"> aaa </a>
</li>
<li data-id="2">
<a href="#"> bbb </a>
</li>
<li data-id="3">
<a href="#"></a>
</li>
</ul>
add a rule like the following:
collector = require 'dom-collector'

rule =
  url: 'https://gist.githubusercontent.com/eces/f8d377992a12f64dc353/raw/75fd1607925e12bb82fdc7890514a3899781531d/test-01.html'
  timeout: 15000
  encoding: 'utf8'
  params: []
  headers:
    'User-Agent': 'Mozilla/5.0(iPad; U; CPU iPhone OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B314 Safari/531.21.10'
  selector: [
    {
      key: 'items[]'
      value: '#content-list li'
      type: 'array'
      default: []
    }
    {
      key: 'items[].label'
      value: 'a'
      type: 'string'
      filter: 'trim'
      default: 'default'
    }
    {
      key: 'items[].src'
      value: '[data-id]'
      type: 'number'
    }
  ]

task = collector.fetch_json rule
task.then (result) ->
  console.log result
It then returns this result:
{
  "items": [
    { "label": "aaa", "src": 1 },
    { "label": "bbb", "src": 2 },
    { "label": "default", "src": 3 }
  ]
}
fetch_json(rule: Object)
require('dom-collector').fetch_json(rule);
The value field is the DOM selector used to find values for the key. It supports querySelector-style and jQuery-like selectors. Where you would write $('#content'), the value should be #content.
The key field is exposed and created in the result JSON. If a key uses [] array notation, it becomes a parent key, and every key that starts with parent[] (for example items[].label under items[]) becomes a child of that parent. If the parent key matches no elements, its children cannot be resolved from the empty array.
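The parent and child relationship between keys can be sketched as below. This is a minimal illustration, not dom-collector's implementation: groupSelectors is a hypothetical helper, and the title entry is an invented example of a key without array notation.

```javascript
// Hypothetical helper for illustration; not part of dom-collector's API.
// Groups selector entries into parents (keys ending in "[]") and their
// children (keys that start with "<parent>[].").
function groupSelectors(selectors) {
  const parents = new Map();
  const plain = [];

  // First pass: register every parent key such as "items[]".
  for (const entry of selectors) {
    if (entry.key.endsWith('[]')) {
      parents.set(entry.key, { parent: entry, children: [] });
    }
  }

  // Second pass: attach "items[].label" to "items[]"; keep other keys as plain.
  for (const entry of selectors) {
    if (entry.key.endsWith('[]')) continue;
    const dot = entry.key.indexOf('[].');
    const parentKey = dot === -1 ? null : entry.key.slice(0, dot + 2);
    if (parentKey !== null && parents.has(parentKey)) {
      parents.get(parentKey).children.push({ name: entry.key.slice(dot + 3), entry });
    } else {
      plain.push(entry);
    }
  }
  return { parents, plain };
}

const grouped = groupSelectors([
  { key: 'items[]', value: '#content-list li', type: 'array' },
  { key: 'items[].label', value: 'a', type: 'string' },
  { key: 'items[].src', value: '[data-id]', type: 'number' },
  { key: 'title', value: 'h1', type: 'string' }, // invented example key
]);

console.log(grouped.parents.get('items[]').children.map((c) => c.name)); // [ 'label', 'src' ]
console.log(grouped.plain.map((e) => e.key)); // [ 'title' ]
```

Registering parents in a first pass means a child entry can appear before its parent in the selector list and still be attached.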
The type field accepts string, number, or boolean.
Please note that the default value is used if type casting fails.
The default value replaces the collected value if no element is found, and also when:
the type is string and the string length is zero;
the type is number and the value fails isFinite (NaN, Infinity, undefined).
The match regular expression is evaluated and the first matched value is returned. For example, 100 can be found from <li onclick="contentView(100, 3);"></li> with the matcher below:
match: "contentView\\(([0-9]+)\\,"
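The default and match rules above can be sketched in plain JavaScript. Both helpers are hypothetical and only illustrate the described behavior. In particular, treating "the first value" as the first capture group is an assumption, and the library's real logic may differ.

```javascript
// Hypothetical helpers for illustration; not dom-collector's actual code.

// Returns the first capture group of the first match, or undefined if
// the pattern does not match. Assumption: "first value" = capture group 1.
function firstMatch(text, pattern) {
  const found = new RegExp(pattern).exec(text);
  return found && found[1] !== undefined ? found[1] : undefined;
}

// Applies the default rules: missing value, empty string, or a
// non-finite number all fall back to the configured default.
function applyDefault(value, type, fallback) {
  if (value === undefined || value === null) return fallback; // no element found
  if (type === 'string' && value.length === 0) return fallback;
  if (type === 'number' && !isFinite(value)) return fallback;
  return value;
}

const html = '<li onclick="contentView(100, 3);"></li>';
const raw = firstMatch(html, 'contentView\\(([0-9]+)\\,');

console.log(raw); // 100 (as a string)
console.log(applyDefault(Number(raw), 'number', 0)); // 100
console.log(applyDefault('', 'string', 'default')); // default
console.log(applyDefault(Number('abc'), 'number', -1)); // -1
```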
Reference: eces/dom-collector/src/filter.coffee
70.5M to 70500
1,000,000 to 1000000
"\r\n hello. " to "hello."
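Two of the conversions above can be approximated as follows. This is a sketch only: trim is the filter name used in the rule example, while stripCommas is a hypothetical name. The real filters live in src/filter.coffee and may behave differently. The unit-suffix conversion is left out because its exact rules are defined there.

```javascript
// Hypothetical filter sketches; not copied from dom-collector's filter.coffee.
const filters = {
  // Removes thousands separators, then converts the text to a number.
  stripCommas: (text) => Number(String(text).replace(/,/g, '')),
  // Removes leading and trailing whitespace, including \r and \n.
  trim: (text) => String(text).trim(),
};

console.log(filters.stripCommas('1,000,000')); // 1000000
console.log(JSON.stringify(filters.trim('\r\n hello. '))); // "hello."
```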
value to String(value)
value to Number(value)
value to Boolean(value)
Be aware of unintended boolean conversions. From MDN - Boolean:
The value passed as the first parameter is converted to a boolean value, if necessary. If value is omitted or is 0, -0, null, false, NaN, undefined, or the empty string (""), the object has an initial value of false. All other values, including any object or the string "false", create an object with an initial value of true.
Do not confuse the primitive Boolean values true and false with the true and false values of the Boolean object.
Any object whose value is not undefined or null, including a Boolean object whose value is false, evaluates to true when passed to a conditional statement.
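The pitfalls quoted above are easy to reproduce with scraped text, where every value starts as a string. The snippet below shows the standard Boolean conversions, followed by parseBooleanText, a hypothetical helper (not part of dom-collector) that shows one way to interpret text such as "false" explicitly.

```javascript
// Standard JavaScript conversions: any non-empty string is truthy.
console.log(Boolean('false')); // true
console.log(Boolean('0')); // true
console.log(Boolean('')); // false
console.log(Boolean(0)); // false
console.log(Boolean(new Boolean(false))); // true, because objects are always truthy

// Hypothetical helper: treats only "true" and "1" (case-insensitive) as true.
function parseBooleanText(text) {
  const normalized = String(text).trim().toLowerCase();
  return normalized === 'true' || normalized === '1';
}

console.log(parseBooleanText('false')); // false
console.log(parseBooleanText(' TRUE ')); // true
```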
grunt build
grunt test
Welcome
Under MIT License.
FAQs
A simple DOM crawler based on JSON scheme.
The npm package dom-collector receives a total of 0 weekly downloads. As such, dom-collector popularity was classified as not popular.
We found that dom-collector has an unhealthy version release cadence and level of project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.