
node-sr-crawler
There are two concepts you need to understand: 1. the request instance, and 2. the request scheduling store.
Create a request instance
var req = new Request(config);
Send the request
var p = req.request(); // returns a Promise for the request
Config object (standard mode)
{
url: 'http://www.xxx.com/',
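retry: 0, // number of retries, default 0
retryTimeout: 0, // threshold for treating a request as timed out (ms), default 0
pageMode: false, // enable paginated crawling mode, default false
charSet: 'UTF-8', // character encoding of the requested page, default UTF-8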
}
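The defaults listed above can be illustrated with a small sketch. The `REQUEST_DEFAULTS` object and the `withRequestDefaults` helper are hypothetical names introduced here for illustration; the package itself may apply its defaults differently.

```javascript
// Defaults as listed in the request config object above.
var REQUEST_DEFAULTS = {
  retry: 0,          // number of retries
  retryTimeout: 0,   // timeout threshold in ms
  pageMode: false,   // paginated crawling disabled
  charSet: 'UTF-8'   // page character encoding
};

// Hypothetical helper (not part of the package): fill in any option
// the caller left out, without mutating the caller's object.
function withRequestDefaults(config) {
  return Object.assign({}, REQUEST_DEFAULTS, config);
}

var full = withRequestDefaults({ url: 'http://www.xxx.com/', retry: 2 });
console.log(full.retry);   // 2
console.log(full.charSet); // UTF-8
```

Options supplied by the caller win over the defaults, so only the missing fields are filled in.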
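Once paginated crawling mode is enabled, the url field in the individual request's config object is ignored. The request URL is built by combining the url field of the store's config object with the pageIndex from the request's config object.
Create a store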
var store = new Spider(config);
Add a request to the store
var p = store.queue(req); // returns a Promise for the queued request
Config object (standard mode)
{
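sendRate: 2000, // interval between sent requests (ms), default 2000
retry: 0, // number of retries for requests in the store, default 0
retryTimeout: 0, // timeout threshold for requests in the store (ms), default 0
pageMode: false // whether to enable paging mode, default false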
}
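As a rough illustration of what a send interval such as sendRate implies, the sketch below spaces tasks a fixed number of milliseconds apart. The `scheduleSends` helper is hypothetical, and it assumes the first request goes out immediately; the package's actual scheduler is not documented here and may behave differently.

```javascript
// Hypothetical helper (not part of the package): schedule each task
// `sendRate` ms after the previous one. `setTimer` defaults to
// setTimeout and can be swapped out to inspect the timing.
function scheduleSends(tasks, sendRate, setTimer) {
  var timer = setTimer || setTimeout;
  tasks.forEach(function (task, i) {
    // Assumption: the first task has a delay of 0 ms.
    timer(task, i * sendRate);
  });
}

// Record the delays instead of actually waiting.
var delays = [];
function noop() {}
scheduleSends([noop, noop, noop], 2000, function (task, delay) {
  delays.push(delay);
});
console.log(delays); // [ 0, 2000, 4000 ]
```

Calling `scheduleSends(tasks, 2000)` without the third argument would use real timers and run the tasks two seconds apart.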
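Paging mode must be enabled on both sides: a request only runs in paging mode when both the store and the request have it turned on. To use paging mode, the store must set pageMode; individual requests in the store can then opt in selectively.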
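The two-sided rule can be written as a single check. `isPagingActive` is a hypothetical helper used only to make the rule concrete; it is not a function exported by the package.

```javascript
// Hypothetical helper (not part of the package): paging applies to a
// request only when both the store and the request set pageMode to true.
function isPagingActive(storeConfig, requestConfig) {
  return storeConfig.pageMode === true && requestConfig.pageMode === true;
}

console.log(isPagingActive({ pageMode: true }, { pageMode: true }));  // true
console.log(isPagingActive({ pageMode: true }, { pageMode: false })); // false
console.log(isPagingActive({ pageMode: false }, { pageMode: true })); // false
```

The second case is the selective opt-out: the store allows paging, but that particular request uses its own url instead.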
Store: config options required for paging mode
{
pageMode: true,
url: 'http://www.xxx.com/page=1', // URL of page 1 of the paginated crawl; the URL must contain a single page number
pagePattern: /page=1/g, // a regular expression that matches the page=1 part above
}
Request: config options required for paging mode
{
pageMode: true,
pageIndex: 1, // page number to request; replaces the numeric part of the pagePattern match above
}
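To make the substitution concrete, here is a minimal sketch of how a page URL could be derived from the store's url, its pagePattern, and a request's pageIndex. `buildPageUrl` is a hypothetical helper written for this example, assuming the digits inside the matched text are what gets replaced; the package's internal logic may differ.

```javascript
// Hypothetical helper (not part of the package): find the part of the
// store URL that pagePattern matches and swap its digits for pageIndex.
function buildPageUrl(storeUrl, pagePattern, pageIndex) {
  return storeUrl.replace(pagePattern, function (match) {
    // Only the numeric part of the match changes, e.g. "page=1" -> "page=3".
    return match.replace(/\d+/, String(pageIndex));
  });
}

var pageUrl = buildPageUrl('http://www.xxx.com/page=1', /page=1/g, 3);
console.log(pageUrl); // http://www.xxx.com/page=3
```

With this approach the request's own url field plays no part, which matches the rule that it is ignored in paging mode.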
var Spider = require('node-spider');
See the config section for how to write the config object
var s1 = new Spider(config);
var request = new Request(requestConfig);
var p = s1.queue(request); // returns the request's Promise instance
s1.on('request',function(request){
// request is the Request instance
})
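The listener pattern above can be tried out without the package by using a small stand-in built on Node's `events` module. `MiniStore` is a hypothetical class, and the assumption that a 'request' event fires with the request instance at the moment it is queued is ours; the real Spider class may emit it at a different point.

```javascript
var EventEmitter = require('events');

// Hypothetical stand-in for the store; not the package's Spider class.
class MiniStore extends EventEmitter {
  queue(request) {
    // Assumption: 'request' is emitted with the request instance.
    this.emit('request', request);
    return Promise.resolve(request);
  }
}

var seen = [];
var miniStore = new MiniStore();
miniStore.on('request', function (request) {
  seen.push(request.url);
});
miniStore.queue({ url: 'http://www.xxx.com/' });
console.log(seen); // [ 'http://www.xxx.com/' ]
```

Because `emit` calls listeners synchronously, the listener has already run by the time `queue` returns its Promise.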
FAQs
node.js crawler
We found that node-sr-crawler does not show a healthy release cadence or project activity, because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.