cheerio-httpcli
cheerio-httpcli is an HTTP client module for web scraping with Node.js. It takes care of the character-encoding conversion that scraping requires and lets you manipulate the HTML parsed by cheerio in a jQuery-like way.
- The link target of an $('a') element can be downloaded as a file
- The image of an $('img') element can be downloaded as a file (LazyLoad supported)
- The URLs of $('a,img,script,link') elements can be obtained as absolute URLs
Note: this module works on static HTML, so it does not support web pages, such as SPAs, whose content is fetched or modified by client-side JavaScript.
var client = require('cheerio-httpcli');
// Search Google for "node.js".
var word = 'node.js';
client.fetch('http://www.google.com/search', { q: word }, function (err, $, res, body) {
  // Inspect the response headers
  console.log(res.headers);
  // Print the HTML title
  console.log($('title').text());
  // Print the list of links
  $('a').each(function (idx) {
    console.log($(this).attr('href'));
  });
});
The bundled example/google.js is a sample that fetches a list of Google search results; use it as a reference.
npm install cheerio-httpcli
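fetch() retrieves the web page specified by url with the GET method, converts its character encoding, parses the HTML, and passes the result to the callback function.
The callback function receives the following four arguments:
- the Error object
- the object produced by parsing the HTML content with cheerio.load() (custom extended version)
- the response object (custom extended version)
- the HTML content converted to UTF-8
To add parameters to the GET request (?foo=bar&hoge=fuga), pass them as an associative array in the second argument, get-param.
If you already know the encoding of the target web page, setting encode to a string such as sjis or euc-jp prevents misdetection by the automatic encoding detection (which rarely happens).
get-param, encode, and in some cases callback can be omitted.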
// get-param and encode omitted => no GET parameters & automatic encoding detection
client.fetch('http://hogehoge.com/fuga.html', function (err, $, res, body) {
  ...
});
// get-param omitted => no GET parameters & explicit encoding (sjis)
client.fetch('http://hogehoge.com/fuga.html', 'sjis', function (err, $, res, body) {
  ...
});
// encode omitted => GET parameters (?foo=bar) & automatic encoding detection
client.fetch('http://hogehoge.com/fuga.html', { foo: 'bar' }, function (err, $, res, body) {
  ...
});
// everything except url omitted => no GET parameters & automatic encoding detection & promise style (described later)
client.fetch('http://hogehoge.com/fuga.html')
.then(function (result) {
  ...
});
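The optional arguments shown above can be told apart by their types. The following is a minimal sketch of that idea; normalizeFetchArgs is a hypothetical helper written for illustration and is not how cheerio-httpcli itself is implemented.

```javascript
// Hypothetical helper (not part of cheerio-httpcli): sorts the optional
// arguments of a fetch-style call by type.
function normalizeFetchArgs(url, getParam, encode, callback) {
  const out = { url: url, param: {}, encode: null, callback: null };
  for (const arg of [getParam, encode, callback]) {
    if (typeof arg === 'function') {
      out.callback = arg;   // callback style
    } else if (typeof arg === 'string') {
      out.encode = arg;     // e.g. 'sjis' or 'euc-jp'
    } else if (arg !== null && typeof arg === 'object') {
      out.param = arg;      // GET parameters
    }
  }
  return out;
}

// get-param omitted: the second argument is treated as the encoding
const a = normalizeFetchArgs('http://hogehoge.com/fuga.html', 'sjis', function () {});
// encode and callback omitted: only GET parameters are given
const b = normalizeFetchArgs('http://hogehoge.com/fuga.html', { foo: 'bar' });
console.log(a.encode, typeof a.callback);
console.log(b.param, b.encode, b.callback);
```

A call with only url leaves callback as null, which corresponds to the promise-style case.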
If you omit the callback function, the fourth argument of fetch(), a Promise object is returned instead. Called in promise style, the earlier sample looks like this:
var client = require('cheerio-httpcli');
// Search Google for "node.js".
var word = 'node.js';
// No callback was given, so a Promise object is returned
var p = client.fetch('http://www.google.com/search', { q: word });
p.then(function (result) {
  // Inspect the response headers
  console.log(result.response.headers);
  // Print the HTML title
  console.log(result.$('title').text());
  // Print the list of links
  result.$('a').each(function (idx) {
    console.log(result.$(this).attr('href'));
  });
});
p.catch(function (err) {
  console.log(err);
});
p.finally(function () {
  console.log('done');
});
The variable p receives the return value of fetch() called without a callback function, and then (on success) and catch (on error) are handled through p. You can also use finally, which always runs last whether the request succeeded or failed.
The values handed to then are the same as those passed to the callback function in callback style, but note that they arrive bundled together in a single object passed as the first argument:
- error: the Error object
- $: the object produced by parsing the HTML content with cheerio.load() (custom extended version)
- response: the request module's response object (custom extended version)
- body: the HTML content converted to UTF-8
.then(function (result) {
  console.log(result);
  // => {
  //   error: ...,
  //   $: ...,
  //   response: ...,
  //   body: ...
  // };
});
When you want to drill into a site step by step, for example by accessing its top page, moving to a page linked from it, and so on, you can write the steps as a method chain like this:
var client = require('cheerio-httpcli');
client.fetch(<URL of the top page>)
.then(function (result) {
  // do something
  return client.fetch(<URL of page A>); // return a Promise object
})
.then(function (result) {
  // do something
  return client.fetch(<URL of page A-1>); // return a Promise object
})
.then(function (result) {
  // do something
  return client.fetch(<URL of page A-2>); // return a Promise object
})
.catch(function (err) {
  // an error occurred somewhere
  console.log(err);
})
.finally(function () {
  // runs after the top page => page A => page A-1 => page A-2 have been accessed in that order
  // if an error occurred, it still runs after the catch handler
  console.log('done');
});
The underlying object is an rsvp Promise, so see the rsvp documentation for details.
If you specify the callback function as the fourth argument of fetch(), no Promise object is returned. You therefore cannot call fetch() in callback style and work with a Promise object at the same time.
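The either-or behavior described above can be pictured with a small stand-in. This is only a sketch under stated assumptions: fetchLike is a hypothetical function that performs no network access, and it uses the built-in Promise rather than rsvp.

```javascript
// Hypothetical stand-in for an asynchronous request (no network access).
function fetchLike(url, callback) {
  const work = new Promise(function (resolve) {
    setImmediate(function () {
      resolve({ error: null, body: 'content of ' + url });
    });
  });
  if (typeof callback === 'function') {
    // Callback style: deliver the result through the callback, return nothing.
    work.then(
      function (result) { callback(null, result); },
      function (err) { callback(err); }
    );
    return undefined;
  }
  // Promise style: hand the promise back to the caller.
  return work;
}

const promiseStyle = fetchLike('http://foo.bar.baz/');
const callbackStyle = fetchLike('http://foo.bar.baz/', function (err, result) {
  console.log('callback received:', result.body);
});
console.log(promiseStyle instanceof Promise); // true
console.log(callbackStyle === undefined);     // true
```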
You can also write the processing above with async and await, as follows:
const client = require('cheerio-httpcli');
(async () => {
  try {
    const result1 = await client.fetch(<URL of the top page>);
    // do something
    const result2 = await client.fetch(<URL of page A>);
    // do something
    const result3 = await client.fetch(<URL of page A-1>);
    // do something
    const result4 = await client.fetch(<URL of page A-2>);
  } catch (err) {
    console.log(err);
  }
  console.log('done');
})();
fetchSync() is the synchronous version of the asynchronous fetch(): execution does not move on to the next line until the request has completed. The relationship is much like that of fs.readFileSync() to fs.readFile().
The arguments are the same as for the promise style of fetch(), and the return value has the same shape as the object passed to then.
var client = require('cheerio-httpcli');
var result1 = client.fetchSync('http://foo.bar.baz/');
console.log(result1);
// => {
//   error: ...,
//   $: ...,
//   response: ...,
//   body: ...
// }
console.log(result1.$('title').text()); // => prints the title of http://foo.bar.baz/
var result2 = client.fetchSync('http://hoge.fuga.piyo/');
console.log(result2.$('title').text()); // => prints the title of http://hoge.fuga.piyo/
- Synchronous requests are implemented by running an external script with spawnSync() and waiting for it to finish, so performance is very poor (they take roughly 10 times as long as asynchronous requests). Having implemented them, the recommendation is nevertheless to do your processing with asynchronous requests as a rule and to use a synchronous request only where you really need one.
- The response inside the return value of a synchronous request has been passed through response.toJSON(), so its content differs slightly from the asynchronous version. The main properties such as statusCode, headers, and request are available in both, so this should rarely be a problem, but take care if you use the response in unusual ways.
set() sets the property specified by name to value.
Objects are merged. When the third argument, nomerge, is true, no merge takes place and the property is overwritten with the object given as value.
var client = require('cheerio-httpcli');
client.set('timeout', 10000); // Change the timeout from 30 seconds to 10 seconds
client.set('headers', { // Change only the referer request header
  referer: 'http://hoge.fuga/piyo.html'
});
client.set('headers', {}, true); // Empty the request headers
For objects, setting the value of a key to null deletes that key.
// [before]
// client.headers => {
// lang: 'ja-JP',
// referer: 'http://hoge.fuga/piyo.html',
// 'user-agent': 'my custom user-agent'
// }
client.set('headers', {
referer: null
});
// [after]
// client.headers => {
// lang: 'ja-JP',
// 'user-agent': 'my custom user-agent'
// }
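The merge rules above (merge objects, delete keys set to null, overwrite when nomerge is true) can be sketched on a plain object. setProperty and settings below are hypothetical names for illustration only; this is not the library's own code.

```javascript
// Hypothetical illustration of the merge rules (not cheerio-httpcli's code).
function setProperty(store, name, value, nomerge) {
  const isPlainObject = function (v) {
    return v !== null && typeof v === 'object' && !Array.isArray(v);
  };
  if (nomerge || !isPlainObject(value) || !isPlainObject(store[name])) {
    store[name] = value;              // scalar value or nomerge: overwrite
    return store;
  }
  for (const key of Object.keys(value)) {
    if (value[key] === null) {
      delete store[name][key];        // null removes the key
    } else {
      store[name][key] = value[key];  // otherwise merge key by key
    }
  }
  return store;
}

const settings = {
  timeout: 30000,
  headers: { lang: 'ja-JP', referer: 'http://hoge.fuga/piyo.html' }
};
setProperty(settings, 'timeout', 10000);
setProperty(settings, 'headers', { 'user-agent': 'my custom user-agent' });
setProperty(settings, 'headers', { referer: null });
const merged = Object.assign({}, settings.headers); // snapshot after merging
setProperty(settings, 'headers', {}, true);         // nomerge: replaced by {}
console.log(merged);
console.log(settings.headers);
```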
For the properties that exist, see the description of the properties.
At present this is a method for updating properties from TypeScript, but assigning values directly to properties is planned to be removed in the future.
setBrowser() sets the User-Agent of a given browser in one step.
var client = require('cheerio-httpcli');
client.setBrowser('chrome'); // Change to the Google Chrome User-Agent
client.setBrowser('android'); // Change to the Android User-Agent
client.setBrowser('googlebot'); // Change to the Googlebot User-Agent
Earlier versions returned true when the User-Agent was changed to that of the specified browser, and false, leaving the User-Agent unchanged, when an unsupported browser was specified. Since 0.7.0 the method no longer returns a value.
The supported browsers are as follows.
default
Note that you cannot specify a detailed browser version. If you need that, set the User-Agent manually as follows.
// Set the IE6 User-Agent manually
client.set('headers', {
'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
});
setBrowser() is scheduled for removal in the future. Use the set() method instead, as follows.
client.set('browser', 'firefox');
cheerio-httpcli checks which iconv-family modules are installed at run time and automatically decides, in priority order, which one to use.
iconv-lite is installed together with cheerio-httpcli as a dependency, but if the native module iconv is installed, it is loaded in preference because of its processing speed and the larger number of character encodings it supports.
Up to version 0.3.1 the top priority was iconv-jp, but it was removed from the default conversion engine candidates because it has not been maintained for a long time and can no longer be compiled on Node.js 0.12.
You can still select iconv-jp explicitly with setIconvEngine(), but this is not recommended.
setIconvEngine() discards the automatically loaded iconv-family module and lets you specify manually which one to use. It is a switching method intended for module tests, so it has little practical use.
For iconv-module-name, specify one of the iconv-family module names as a string: iconv, iconv-lite, or iconv-jp.
var client = require('cheerio-httpcli');
// Deliberately use iconv-lite
client.setIconvEngine('iconv-lite');
client.fetch( ...
setIconvEngine() is scheduled for removal in the future. Use the set() method instead, as follows.
client.set('iconv', 'iconv-lite');
cheerio-httpcli keeps its settings and cookies for as long as the process is running.
Calling reset() initializes all settings and cookies and returns the module to the state it was in when the process started.
cheerio-httpcli basically runs as a single instance, so it used to be impossible to log in to the same site with several accounts at once (the cookie information was overwritten by that of the account that logged in later).
With the fork() method implemented in 0.8.0, you can create child instances and log in to the same site with a different account in each instance.
// client is the main instance
var client = require('cheerio-httpcli');
// Create a child instance
var child1 = client.fork();
// Log in to the same site with a different account in each instance
var url = 'https://need.login.web.service/login';
client.fetch(url, function (err, $, res, body) {
  $('#login').submit({
    username: 'foo',
    password: 'password_for_foo'
  }, function (err, $, res, body) {
    console.log($('.user_name').text()); // => foo
  });
});
child1.fetch(url, function (err, $, res, body) {
  $('#login').submit({
    username: 'bar',
    password: 'password_for_bar'
  }, function (err, $, res, body) {
    console.log($('.user_name').text()); // => bar
  });
});
// Any number of child instances can be created
var child2 = client.fork();
var child3 = client.fork();
Only the main instance can call fork(). A child instance cannot call the fork() method.
var child = client.fork();
var grandChild = child.fork(); // => error
Immediately after fork(), a child instance inherits the settings and cookie information of the main instance, but from then on each instance can be changed independently.
client.set('browser', 'ie'); // Set the User-Agent to IE
var child = client.fork(); // At this point the child instance's User-Agent is also IE
child.set('browser', 'firefox'); // Change only the child instance's User-Agent to Firefox
Child instances do not have a download manager. When the $(...).download() method is called in a child instance, the download is registered with the main instance's download manager.
// Download manager of the main instance
client.download.on('ready', function (stream) {
  ...
});
var child = client.fork();
child.fetch('http://...', function (err, $, res, body) {
  $('img').download(); // sent to the main instance's download manager
});
exportCookies() returns the current cookie information in a form that can be converted to JSON.
var fs = require('fs');
var client = require('cheerio-httpcli');
// Log in to a site that requires login
client.fetch('https://need.login.web.service/login', function (err, $, res, body) {
  $('#login').submit({
    username: 'foo',
    password: 'password_for_foo'
  }, function (err, $, res, body) {
    // Write the cookie information after login to a JSON file
    fs.writeFileSync('cookie.json', JSON.stringify(client.exportCookies()), 'utf-8');
  });
});
Cookie information may contain data that is highly sensitive from a security standpoint, such as a website's session ID, so handle it with great care (if a session ID leaks, the account may be hijacked).
importCookies() loads cookie information obtained with exportCookies().
var fs = require('fs');
var client = require('cheerio-httpcli');
// Load the cookie information written out after login
client.importCookies(JSON.parse(fs.readFileSync('cookie.json', 'utf-8')));
client.fetch('https://need.login.web.service/mypage', function (err, $, res, body) {
  // The post-login page can be accessed right away
  console.log($('.user_name').text()); // => foo
});
This only restores the cookie information, so whether you can access a post-login page right away as shown above depends on how each website works.
Properties are read and updated as follows.
// Print the version
console.log(client.version);
// Change the timeout
client.set('timeout', 5000);
// [Deprecated] Updating a property directly also works, but will stop working in the future
client.timeout = 3000;
readonly
The version of cheerio-httpcli.
The browser name set with set() or setBrowser() (ie, chrome, and so on). It is null when nothing has been set, and custom when the User-Agent was set manually.
Note: if it is unset, chrome is set at the first fetch().
The name of the iconv-family module in use (one of iconv, iconv-lite, or iconv-jp).
An associative array of the request headers used by the request module. Nothing is specified by default, but if the User-Agent is empty when fetch() runs, the Google Chrome User-Agent is filled in automatically.
The timeout passed to the request module, in milliseconds. The default is 30000 (30 seconds).
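A boolean that specifies whether to use gzip transfer when communicating with the server. The default is true (use gzip transfer).
Specifies whether the referrer is set automatically. When true, the URL of the page fetched by the previous fetch() is set automatically as the Referer request header. The default is true.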
When a META tag such as <meta http-equiv="refresh" content="0;URL=..."> is found in the HTML, cheerio-httpcli automatically redirects to that URL. It does not redirect, however, when the tag is inside an IE conditional comment such as <!--[if IE]> ... <![endif]-->. The default is false.
When searching Google, set followMetaRefresh to false. The HTML of Google search results always contains a META refresh tag (with a slightly different URL every time), so the redirects loop and eventually end in an error.
The maximum amount of data, as a number of bytes, that fetch() and similar methods will receive. An error occurs as soon as more than this amount has been received. It is worth setting when, for example, you analyze URLs entered by users and might otherwise load very large data by accident and saturate the connection.
It is not applied to image downloads.
The default is null (no limit).
var client = require('cheerio-httpcli');
// Limit the amount of received data to 1MB
client.set('maxDataSize', 1024 * 1024);
// Fetch an HTML document of 1MB or more
client.fetch('http://big.large.huge/data.html', function (err, $, res, body) {
  console.log(err.message); // => 'data size limit over'
});
When maxDataSize is exceeded, the data received up to that point is discarded.
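As a rough picture of this limit, the sketch below accumulates chunks and gives up once the total passes the cap. receiveWithLimit is a hypothetical function used only for illustration; the error message is the one shown in the example above, and the library's real receive logic may differ.

```javascript
// Hypothetical sketch of a receive loop with a size cap (not the library's code).
function receiveWithLimit(chunks, maxDataSize) {
  let received = [];
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
    if (maxDataSize !== null && total > maxDataSize) {
      received = []; // discard the data received so far
      throw new Error('data size limit over');
    }
    received.push(chunk);
  }
  return Buffer.concat(received);
}

// Two 600-byte chunks: 1200 bytes in total
const chunks = [Buffer.alloc(600), Buffer.alloc(600)];
const unlimited = receiveWithLimit(chunks, null); // null means no limit
let limitError = null;
try {
  receiveWithLimit(chunks, 1024);
} catch (err) {
  limitError = err;
}
console.log(unlimited.length);
console.log(limitError.message);
```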
When cheerio-httpcli determines that the fetched page is XML, it automatically enables cheerio's XML mode to parse the content (the decision is based on the Content-Type and the extension of the URL).
Setting this to true disables the automatic detection so that content is always parsed in HTML mode. The default is false (detect automatically).
An option mainly used for security-related settings such as SSL connections. It is passed as-is to the request module that cheerio-httpcli uses internally. The default is an empty associative array.
You normally do not need to set anything, but if an https page cannot be accessed, setting this option may solve the problem. See the request module's documentation for details such as how to configure it.
// Force the connection to use TLS 1.2
client.set('agentOptions', {
secureProtocol: 'TLSv1_2_method'
});
When set to true, debug information is written to stderr on every request. The default is false.
var client = require('cheerio-httpcli');
// Turn debug output on
client.set('debug', true);
client.fetch( ...
readonly
The file download manager object. Settings related to file downloads are made through this object (see $(image-element).download() for details).
cheerio-httpcli extends the prototype of the cheerio object to implement several convenience methods.
$.documentInfo() returns information about the fetched web page (url, encoding, isXml).
client.fetch('http://hogehoge/', function (err, $, res, body) {
var docInfo = $.documentInfo();
console.log(docInfo.url); // http://hogehoge/
console.log(docInfo.encoding); // 'utf-8'
console.log(docInfo.isXml); // true if the page was parsed in XML mode
});
If the URL given to fetch() was redirected, url holds the URL of the redirect destination. The same goes for encoding, which holds the encoding of the page that was finally reached.
click() can be used on a elements and on submit-button elements, but it behaves differently for each.
a element
It combines the URL in the href attribute with the URL of the fetched page to build the destination URL, then runs fetch().
client.fetch('http://hogehoge/')
.then(function (result) {
  // Click the link inside the element with id="login" (promise style)
  return result.$('#login a').click();
})
.then(function (result) {
  // Processing after the clicked URL has been fetched
});
Note that this click() method does not support dynamic behavior such as javascript links or onclick="...". It is only a convenient way to access the URL in href.
For submit buttons, input[type=submit], button[type=submit], and input[type=image] elements are supported.
It automatically builds the submission parameters from the form controls (input, checkbox, and so on) placed in the form that the pressed submit button belongs to, and submits the form to the URL in the action attribute using the method in the method attribute.
client.fetch('http://hogehoge/')
.then(function (result) {
  var form = result.$('form[name=login]');
  // Set the user name and password (field() is described later)
  form.field({
    user: 'guest',
    pass: '12345678'
  });
  // Press the submit button to send the form (callback style)
  // Note: everything other than user and pass set above keeps its default value
  form.find('input[type=submit]').click(function (err, $, res, body) {
    // Processing after the page reached by submitting the form has been fetched
  });
});
cheerio-httpcli also keeps cookies internally, so after logging in through this form submission you can crawl pages that require login.
As with links, dynamic behavior such as onsubmit="xxx" or onclick="..." on the submit button is not supported.
If $(...).click() targets several elements, only the first one is processed. As with fetch(), the presence or absence of the callback argument switches between callback style and promise style.
clickSync() is the synchronous version of the asynchronous click().
The return value has the same shape as the object passed to then.
a element
var client = require('cheerio-httpcli');
// fetch() runs asynchronously and a synchronous request is made inside it
client.fetch('http://foo.bar.baz/', function (err, $, res, body) {
  var result = $('a#login').clickSync();
  console.log(result);
  // => {
  //   error: ...,
  //   $: ...,
  //   response: ...,
  //   body: ...
  // }
});
var client = require('cheerio-httpcli');
// Synchronous request to a page that contains a form
var result1 = client.fetchSync('http://foo.bar.baz/');
var form = result1.$('form[name=login]');
form.field({
  user: 'guest',
  pass: '12345678'
});
// Submit the form with a synchronous request as well
var result2 = form.find('input[type=submit]').clickSync();
// Processing after the page reached by submitting the form has been fetched
.
.
.
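submit() can be used only on form elements.
It automatically builds the submission parameters from the form controls (input, checkbox, and so on) placed in the specified form, and submits the form to the URL in the action attribute using the method in the method attribute. As with fetch(), the presence or absence of the callback argument switches between callback style and promise style.
The submission parameters can be overridden with the contents of the associative array given as the param argument, so the caller only needs to specify the items it wants to change.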
client.fetch('http://hogehoge/')
.then(function (result) {
  // Fill in only the user name and password; everything else is sent with the form's default values
  var loginInfo = {
    user: 'guest',
    pass: '12345678'
  };
  // Submit the form with name="login" (callback style)
  result.$('form[name=login]').submit(loginInfo, function (err, $, res, body) {
    // Processing after the page reached by submitting the form has been fetched
  });
});
The rest of the behavior is the same as $(submit-element).click().
- onsubmit="xxx" is not supported.
- If $(...) matches several form elements, only the first one is submitted.
- $(submit-element).click() sends the parameter of the pressed button to the server, whereas $(form-element).submit() sends the form with the parameters of all submit buttons excluded.
<form>
<input type="text" name="user" value="guest">
<input type="submit" name="edit" value="edit">
<input type="submit" name="delete" value="delete">
</form>
The form above has several submit buttons in a single form. The parameters sent for this form by each method are as follows.
// With $(submit-element).click()
$('[name=edit]').click(); // => '?user=guest&edit=edit'
// With $(form-element).submit()
$('form').submit(); // => '?user=guest'
When a single form has several submit buttons like this, the server may branch on the parameter of the button that was pressed, so $('form').submit() may not give the expected result.
$(submit-element).click() is the closer match to what happens when a form is submitted by hand from a browser.
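The difference between the two methods can be reproduced with a small model of the form above. This is a sketch only: controls and buildQuery are hypothetical names, and the code models the parameter rules described here rather than the library's internals.

```javascript
// Hypothetical model of the form above: each control is a plain object.
const controls = [
  { type: 'text', name: 'user', value: 'guest' },
  { type: 'submit', name: 'edit', value: 'edit' },
  { type: 'submit', name: 'delete', value: 'delete' }
];

// Build the query string. `pressed` is the name of the clicked submit button,
// or null when the form itself is submitted.
function buildQuery(formControls, pressed) {
  const params = new URLSearchParams();
  for (const c of formControls) {
    if (c.type === 'submit' && c.name !== pressed) {
      continue; // buttons that were not pressed are never sent
    }
    params.append(c.name, c.value);
  }
  return '?' + params.toString();
}

const clickQuery = buildQuery(controls, 'edit'); // like $('[name=edit]').click()
const submitQuery = buildQuery(controls, null);  // like $('form').submit()
console.log(clickQuery);  // '?user=guest&edit=edit'
console.log(submitQuery); // '?user=guest'
```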
Since 0.8.0, files can be uploaded by setting a local file path on an input[type=file] element.
<form action="/upload.php" enctype="multipart/form-data" method="post">
<input type="text" name="title">
<input type="file" name="upload_file">
<input type="submit">
</form>
To upload a file with a form like the one above, specify it as follows.
$('form').submit({
  title: 'Treasured image',
  upload_file: '/path/to/secret/yabai.jpg'
});
// or
$('form').find('input[name=title]').val('Treasured image');
$('form').find('input[name=upload_file]').val('/path/to/secret/yabai.jpg');
$('form').submit();
If the target input[type=file] element supports multiple files (multiple), the paths can be given as an array.
$('form').submit({
  title: 'Treasured image',
  upload_file: [
    '/path/to/secret/yabai.jpg',
    '/path/to/secret/sugee.jpg'
  ]
});
// or
$('form').find('input[name=title]').val('Treasured image');
$('form').find('input[name=upload_file]').val([
  '/path/to/secret/yabai.jpg',
  '/path/to/secret/sugee.jpg'
]);
$('form').submit();
An error occurs if a specified file path does not exist, or if several files are given for an input[type=file] element that does not support multiple files.
The bundled example/upload.js is a file upload sample; use it as a reference.
submitSync() is the synchronous version of the asynchronous submit(). The arguments are the same as for the promise style of submit(), and the return value has the same shape as the object passed to then in promise style.
var client = require('cheerio-httpcli');
// Access the top page (this could also be a synchronous request)
client.fetch('http://foo.bar.baz/', function (err, $, res, body) {
  // Move to the login page with a synchronous request
  var result1 = $('a#login').clickSync();
  // Submit the login form with a synchronous request
  var result2 = result1.$('form[name=login]').submitSync({
    account: 'guest',
    password: 'guest'
  });
  // Check the login result
  console.log(result2.response.statusCode);
});
field() gets or sets the values of form controls in the same spirit as $(...).css() and $(...).attr(). Its behavior depends on the arguments it is called with. It can be used on form elements.
name: string, value: none
Gets the current value of the control called name inside form-element.
// Get the value of user
$('form[name=login]').field('user'); // => 'guest'
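name: string, value: string or array
Changes the value of the control called name inside form-element to value. For several checkboxes that share the same name, or for a multiple-selection select, the selected values can be given together as an array.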
// Set the value of pass
$('form[name=login]').field('pass', 'admin');
// For a control that allows multiple selection
$('form[name=login]').field('multi-select', [ 'hoge', 'fuga', 'piyo' ]);
name: associative array, value: none
Applies all the name: value pairs of the given associative array to the controls inside form-element at once.
// Set several values at once
$('form[name=login]').field({
user: 'foo',
pass: 'bar'
});
With no arguments, it gets the name and value of every control inside form-element as an associative array.
// Get all values at once
$('form[name=login]').field();
// => {
// user: 'foo',
// pass: 'bar',
// remember: 1
// }
The third argument, onNotFound, is an option consulted when setting a value on a control. It specifies, as one of the following strings, what to do when no control with the given name exists in the form.
- throw: an exception is thrown
- append: a new control with that name is created and added to the form (hidden for a string, checkbox for an array)
If onNotFound is not specified, no exception is thrown and no control is added (nothing happens).
// Behavior when the login form has no control named abc
$('form[name=login]').field('abc', 'hello', 'throw');
// => exception: Element named 'abc' could not be found in this form
$('form[name=login]').field('abc', 'hello', 'append');
// => adds <input type="hidden" name="abc" value="hello">
$('form[name=login]').field('abc', [ 'hello', 'world' ], 'append');
// => adds <input type="checkbox" name="abc" value="hello" checked>
//    and <input type="checkbox" name="abc" value="world" checked>
$('form[name=login]').field('abc', 'hello');
// => nothing happens
tick() puts the specified checkbox or radio button elements into the selected state. Nothing changes if the target element is already selected.
If there are several target elements, all of them are selected. Radio buttons are the exception: several buttons in the same group cannot be selected at once, so the first matching element is selected.
$('input[name=check_foo]').tick(); // => check_foo becomes selected
$('input[type=checkbox]').tick(); // => all checkboxes become selected
$('input[name=radio_bar][value=2]').tick(); // => the radio_bar radio button whose value is 2 becomes selected
$('input[type=radio]').tick(); // => the first button of each radio button group becomes selected
untick() puts the specified checkbox or radio button elements into the unselected state. Nothing changes if the target element is already unselected.
If there are several target elements, all of them are unselected.
$('input[name=check_foo]').untick(); // => check_foo becomes unselected
$('input[type=checkbox]').untick(); // => all checkboxes become unselected
$('input[name=radio_bar][value=2]').untick(); // => the radio_bar radio button whose value is 2 becomes unselected
$('input[type=radio]').untick(); // => all radio buttons become unselected
url() returns the href of an a element, the src of an img element, the src of a script element, or the href of a link element as a complete (absolute) URL. Links that are already complete URLs (external links, for example) and links that are not URLs, such as javascript:void(0), are returned unchanged.
<a id="top" href="../index.html">ãããããŒãž</a>
If a link like the one above appears in the page http://foo.bar.baz/hoge/, $(...).attr('href') and $(...).url() return the following values.
console.log($('a#top').attr('href')); // => '../index.html'
console.log($('a#top').url()); // => 'http://foo.bar.baz/index.html'
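The resolution shown above follows ordinary URL resolution rules, which can be checked with Node.js's built-in URL class. This is a minimal sketch; toAbsolute is a hypothetical helper and not the function cheerio-httpcli uses internally.

```javascript
// Hypothetical helper: resolve an href against the URL of the page it was found on.
function toAbsolute(href, pageUrl) {
  try {
    return new URL(href, pageUrl).href;
  } catch (err) {
    return href; // not parseable as a URL: return it unchanged
  }
}

const pageUrl = 'http://foo.bar.baz/hoge/';
const resolved = toAbsolute('../index.html', pageUrl);
const external = toAbsolute('https://www.google.com/', pageUrl);
console.log(resolved); // 'http://foo.bar.baz/index.html'
console.log(external); // 'https://www.google.com/'
```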
If there are several target elements, the absolute URL of each element is returned in an array.
console.log($('a').url());
// => [
// 'http://foo.bar.baz/index.html',
// 'http://foo.bar.baz/xxx.html',
// 'https://www.google.com/'
// ]
The first argument, filter, is an option that classifies the href or src URL of each target element into one of three kinds and decides whether to exclude it from the result.
- relative: relative URLs (links within the site)
- absolute: absolute URLs (links starting with http(s), mainly links to other sites)
- invalid: anything that is not a URL (JavaScript and so on)
Some pages specify links within the site as absolute URLs, so an absolute URL is not necessarily a link to another site.
Setting a filter to true includes that kind and false excludes it. All filters default to true.
<a href="./page2.html">
<a href="./#foo">
<a href="javascript:hogehoge();">
<a href="http://www.yahoo.com/">
For HTML like this, $('a').url() returns the following values depending on the filter option.
// No filter specified
console.log($('a').url());
// => [
//   'http://foo.bar.baz/page2.html',
//   'http://foo.bar.baz/#foo',
//   'javascript:hogehoge();',
//   'http://www.yahoo.com/'
// ]
// Relative links only
console.log($('a').url({
  relative: true,
  absolute: false,
  invalid: false
}));
// => [
//   'http://foo.bar.baz/page2.html',
//   'http://foo.bar.baz/#foo'
// ]
// Only entries that are valid URLs (specifying false for just the kinds to exclude is enough)
console.log($('a').url({ invalid: false }));
// => [
//   'http://foo.bar.baz/page2.html',
//   'http://foo.bar.baz/#foo',
//   'http://www.yahoo.com/'
// ]
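The three-way classification can be sketched as follows. The rules in classify are an assumption made for illustration (http(s) is absolute, any other scheme is invalid, everything else is relative); the library's actual rules may differ, and classify and filterUrls are hypothetical names.

```javascript
// Assumed classification rules, for illustration only.
function classify(href) {
  if (/^https?:\/\//i.test(href)) return 'absolute';
  if (/^[a-z][a-z0-9+.-]*:/i.test(href)) return 'invalid'; // e.g. javascript:
  return 'relative';
}

// Keep only the kinds enabled in `filter`, resolving relative links against `base`.
function filterUrls(hrefs, base, filter) {
  const opts = Object.assign({ relative: true, absolute: true, invalid: true }, filter);
  return hrefs
    .filter(function (href) { return opts[classify(href)]; })
    .map(function (href) {
      return classify(href) === 'relative' ? new URL(href, base).href : href;
    });
}

const hrefs = ['./page2.html', './#foo', 'javascript:hogehoge();', 'http://www.yahoo.com/'];
const base = 'http://foo.bar.baz/';
const all = filterUrls(hrefs, base);
const relativeOnly = filterUrls(hrefs, base, { absolute: false, invalid: false });
const validOnly = filterUrls(hrefs, base, { invalid: false });
console.log(all);
console.log(relativeOnly);
console.log(validOnly);
```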
When there is only one target element, the return value is the absolute URL as a string rather than an array. In that case the filter option behaves as follows.
<a id="top" href="index.html">Ajax</a>
The link above is a relative link, so it is classified as relative. Calling url() with an option that excludes relative returns undefined.
console.log($('#top').url({ relative: false })); // => undefined
The second argument, src-attr, is an option that specifies which attribute of an img element to read the image URL from (a string or an array).
When the target web page uses a LazyLoad-style jQuery plugin or similar, the src attribute may hold the URL of a dummy image. Specify this option to take the URL from an attribute other than src for such img elements.
<img src="blank.gif" data-original-src="http://this.is/real-image.png">
With HTML like this, to get http://this.is/real-image.png from data-original-src instead of blank.gif from src, specify the option as follows.
// The filter option can be omitted
$('img').url('data-original-src');
If data-original-src does not exist on the element, the URL in the src attribute is returned.
By default the priority order is data-original > data-lazy-src > data-src > src. To discard the default priority and take the image from the src attribute first, specify an empty array:
$('img').url({ invalid: false }, []);
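The attribute priority can be pictured as a simple lookup with src as the fallback. This is a sketch under the assumption that the priority works as described above; pickImageUrl is a hypothetical helper and element attributes are modeled as plain objects.

```javascript
// Default priority taken from the description above; src is the final fallback.
const DEFAULT_ATTRS = ['data-original', 'data-lazy-src', 'data-src'];

// Hypothetical helper: `attrs` is a plain object of an img element's attributes.
function pickImageUrl(attrs, srcAttr) {
  const candidates = srcAttr === undefined ? DEFAULT_ATTRS : [].concat(srcAttr);
  for (const name of candidates) {
    if (attrs[name]) return attrs[name];
  }
  return attrs.src;
}

const lazyImg = { src: 'blank.gif', 'data-original-src': 'http://this.is/real-image.png' };
const dataSrcImg = { src: 'dummy.gif', 'data-src': 'real.png' };
const custom = pickImageUrl(lazyImg, 'data-original-src'); // explicit attribute
const byDefault = pickImageUrl(dataSrcImg);                // default priority
const srcFirst = pickImageUrl(dataSrcImg, []);             // empty array: src only
console.log(custom, byDefault, srcFirst);
```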
download() registers downloads with the download manager from an extended cheerio object. Embedded images such as <img src="data:image/png;base64,/9j/4AAQSkZJRgABA ..."> can also be converted to binary and downloaded.
Calling download() on anything other than a and img elements throws an exception.
For an a element the URL the link points to is downloaded; for an img element the image itself is downloaded.
Before calling download(), the download manager must be set up as in the following example.
var fs = require('fs');
var client = require('cheerio-httpcli');
// ① Set up the download manager (all download events are handled here)
client.download
.on('ready', function (stream) {
  // Create the stream for the destination file
  var write = fs.createWriteStream('/path/to/image.png');
  write
  .on('finish', function () {
    console.log(stream.url.href + ' was downloaded');
  })
  .on('error', console.error);
  // Read data from the download stream and write it to the file stream
  stream
  .on('data', function (chunk) {
    write.write(chunk);
  })
  .on('end', function () {
    write.end();
  });
})
.on('error', function (err) {
  console.error(err.url + ' could not be downloaded: ' + err.message);
})
.on('end', function () {
  console.log('All downloads have finished');
});
// ④ Set the limit on parallel downloads
client.download.parallel = 4;
// ② Start scraping
client.fetch('http://foo.bar.baz/', function (err, $, res, body) {
  // ③ Download all images with class="thumbnail"
  $('img.thumbnail').download();
  console.log('OK!');
});
client.download in ① is the download manager built into cheerio-httpcli.
When a file download requested by the $(...).download() method during scraping starts, the ready event of client.download fires (or the error event if an error occurs).
The end event fires when no URLs are left waiting to be downloaded.
① sets up the processing shared by every $(...).download() call, wherever it is made from. In this example, the stream of the file being downloaded, passed as the argument, is saved to /path/to/image.png.
Once the event handlers of client.download are set up, scraping begins at ②.
② fetches the web page, and ③ inside it calls the $(...).download() method.
If $('img.thumbnail') matches 10 image elements at this point, all 10 are registered with the download manager together (URLs that are already registered are skipped).
Going back to ④, a limit on the number of parallel downloads is set. It is 4 in this example, so 4 of the 10 registered image elements start downloading immediately.
The remaining 6 elements go into the download queue. When one of the first 4 downloads finishes and a slot becomes free, the next image URL takes that slot and its download starts, and so on.
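The flow of ③ and ④ can be modeled without any network access. The class below is only a sketch of the queueing idea (a parallel limit plus a cache that skips duplicate URLs); DownloadQueue and its methods are hypothetical names, not the download manager's real interface.

```javascript
// Hypothetical model of a download queue with a parallel limit (no real downloads).
class DownloadQueue {
  constructor(parallel) {
    this.parallel = parallel;
    this.seen = new Set();   // URL cache used to skip duplicates
    this.waiting = [];       // URLs waiting for a free slot
    this.active = new Set(); // URLs currently "downloading"
    this.complete = 0;
  }
  add(urls) {
    for (const url of urls) {
      if (this.seen.has(url)) continue; // already registered
      this.seen.add(url);
      this.waiting.push(url);
    }
    this.fill();
  }
  fill() {
    // Start waiting URLs while a slot is free
    while (this.active.size < this.parallel && this.waiting.length > 0) {
      this.active.add(this.waiting.shift());
    }
  }
  finish(url) {
    if (this.active.delete(url)) {
      this.complete += 1;
      this.fill(); // the freed slot is taken by the next URL
    }
  }
}

const queue = new DownloadQueue(4);
const urls = Array.from({ length: 10 }, function (_, i) {
  return 'http://foo.bar.baz/img' + i + '.png';
});
queue.add(urls);
const afterAdd = { active: queue.active.size, waiting: queue.waiting.length };
queue.finish(urls[0]);
const afterFinish = { active: queue.active.size, waiting: queue.waiting.length, complete: queue.complete };
queue.add([urls[1]]); // duplicate URL: ignored
console.log(afterAdd);    // { active: 4, waiting: 6 }
console.log(afterFinish); // { active: 4, waiting: 5, complete: 1 }
```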
Image downloads run asynchronously from the main line of scraping in ② and ③.
In the example above, the main scraping ends once ③ has run and "OK!" is printed, but the image downloads are still in progress, and the script itself does not exit until every image registered with the download manager has been downloaded.
Even if control does not return to the console for a while after "OK!" is printed, do not press Ctrl+C; wait for the downloads to finish.
When saving the stream of a downloaded file to a file, you could also write stream.pipe(fs.createWriteStream(<file>)), but that approach can run into the problem described at https://qiita.com/himox_x/items/a06d3fb111d67c0c8e8d, so controlling the stream with on('data') and on('end') is safer (the cause is unknown at present).
cheerio-httpcli itself can also save the stream of a downloaded file to a file, and using that feature is recommended (for details, see stream.saveAs() among the properties and methods implemented on stream).
As with url(), the first argument, the src-attr option, specifies which attribute of an img element to read the image URL from (a string or an array).
This option is ignored when the download() method is called on an a element.
<img src="blank.gif" data-original-src="http://this.is/real-image.png">
With HTML like the above, to download http://this.is/real-image.png from data-original-src instead of blank.gif from src, specify the option as follows.
$('img').download('data-original-src');
For the rest of the behavior, see the src-attr description under url().
client.download holds the settings shared by all downloads of URLs registered with $(...).download().
parallel specifies the number of downloads run in parallel. Specify a value between 1 and 5 (the default is 3). If the value is changed after downloads have started, the change takes effect once all downloads currently in progress have finished.
state shows the current processing status of the download manager. It is read-only, so changing its numbers does not affect the downloads.
state holds the following three items:
- queue: number of downloads waiting
- complete: number of downloads completed
- error: number of errors
console.log(client.download.state); // => { queue: 10, complete: 3, error: 0 }
Inside a download event handler, it is also available as this.state.
client.download
.on('ready', function (stream) {
console.log(this.state); // => { queue: 2, complete: 5, error: 1 }
...
The download manager keeps an internal URL cache so that it can skip duplicate URLs. Use the cache-clearing method when you need to clear that cache for some reason.
Once the cache has been cleared, the same URL can be registered with the download manager again.
The events that can be set with download.on are as follows.
add: fired when the URL of an element is registered with the download manager by the $(...).download() method. The argument is the registered URL.
ready: fired when a download starts, once per URL. The stream argument is the stream of the file being downloaded. The following properties and methods are implemented on stream.
- url: the URL object of the file being downloaded. The URL string is available as stream.url.href. For a Base64 embedded image it is the string base64 instead of a URL object.
- type: the Content-Type. It is undefined if the response headers returned by the server have no Content-Type.
- length: the Content-Length. It is -1 if the response headers returned by the server have no Content-Length.
- toBuffer(callback): converts the stream to a Buffer and passes it to the callback function (err, buffer). The whole content of the downloaded file is read into memory, so a huge file consumes a correspondingly large amount of memory.
- saveAs(filepath, callback): saves the stream to the file specified by filepath.
- end(): stops reading the stream. Always call it when you leave the ready event handler without reading the stream. Otherwise the queue may clog and the next download may be unable to start; in addition, if the stream is left unread until the timeout period elapses, the error event fires and the download is forcibly treated as an error.
Notes on toBuffer() and saveAs():
If callback is omitted, they run in promise style.
// Callback style
stream.toBuffer(function (err, buffer) {
  ...
});
stream.saveAs('/path/to/save.file', function (err) {
  ...
});
// Promise style
stream.toBuffer()
  .then(function (buffer) { ... })
  .catch(function (err) { ... })
  .finally(function () { ... });
stream.saveAs('/path/to/save.file')
  .then(function () { ... })
  .catch(function (err) { ... })
  .finally(function () { ... });
Calling toBuffer() or saveAs() after you have started reading the stream yourself, for example with stream.on('data'), results in an error.
stream
  .on('data', function (chunk) { ... })
  .on('end', function () { ... });
stream.saveAs('/path/to/save.file', function (err) {
  // An error occurs
});
error ... Fires when an error occurs during a download. The err object passed as the argument has a url property (the URL being downloaded).
end ... Fires when the queue of waiting downloads becomes empty. No arguments are passed.
var fs = require('fs');
client.download
  .on('add', function (url) {
    console.log('added: ' + url);
  })
  .on('ready', function (stream) {
    // Skip everything except GIF images
    if (! /\.gif$/i.test(stream.url.pathname)) {
      return stream.end();
    }
    // Show information about the download
    console.log(stream.url.href); // => 'http://hogehoge.com/foobar.gif'
    console.log(stream.type); // => 'image/gif'
    console.log(stream.length); // => 10240
    // Convert to a Buffer and save it to a file
    stream.toBuffer(function (err, buffer) {
      if (err) {
        return console.error(err.message);
      }
      fs.writeFileSync('foobar.gif', buffer);
    });
  })
  .on('error', function (err) {
    console.error(err.url + ': ' + err.message);
  })
  .on('end', function () {
    console.log('queue is empty');
  });
The bundled example/irasutoya.js and example/yubin.js are samples that download images and linked files. Use them for reference.
Returns a string in which all of the HTML of the target element has been converted to HTML entities. It probably has little practical use.
// <h1>こんにちは</h1>
console.log($('h1').html()); // => 'こんにちは'
console.log($('h1').entityHtml()); // => '&#x3053;&#x3093;&#x306B;&#x3061;&#x306F;'
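To make the output format above concrete, here is a minimal stand-alone sketch that turns non-ASCII characters into hexadecimal character references. The function toEntities is hypothetical and is not how entityHtml() is implemented; it assumes ASCII characters are left as they are.

```javascript
// Hypothetical helper, not part of cheerio-httpcli.
// Converts every non-ASCII code point to a hexadecimal character reference.
function toEntities(text) {
  return Array.from(text, function (ch) {
    var code = ch.codePointAt(0);
    return code < 0x80 ? ch : '&#x' + code.toString(16).toUpperCase() + ';';
  }).join('');
}

console.log(toEntities('こんにちは')); // => '&#x3053;&#x3093;&#x306B;&#x3061;&#x306F;'
```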
The response object obtained from fetch(), cheerio.click(), cheerio.submit(), and so on is the one returned by the request module, with a cookies property added as a custom extension.
client.fetch('http://hogehoge/')
  .then(function (result) {
    // Submit the login form in promise style
    return result.$('form[name=login]').submit({ user: 'hoge', pass: 'fuga' });
  })
  .then(function (result) {
    // Check the cookies after logging in
    console.log(result.response.cookies);
  });
The cookies property holds the keys and values of the cookies sent by the server of the page just fetched, as an associative array. It may be useful for checking a session ID or login state.
Note that changing the values in cookies has no effect on request processing. The property is for inspecting cookies only.
Pages that require Basic authentication can be accessed in either of the following two ways.
var client = require('cheerio-httpcli');
var user = 'hoge';
var password = 'foobarbaz';
client.set('headers', {
  Authorization: 'Basic ' + Buffer.from(user + ':' + password).toString('base64')
});
client.fetch('http://securet.example.com', function (err, $, res, body) {
  .
  .
  .
  // Remove the header once it is no longer needed (if you forget, the credentials are also sent when you access other pages afterwards)
  delete(client.headers['Authorization']);
});
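The Authorization value used in the first method can be built with a small helper. This is a minimal sketch; basicAuthHeader is a hypothetical name, and it uses Buffer.from() because the new Buffer() constructor is deprecated in current Node.js.

```javascript
// Hypothetical helper, not part of cheerio-httpcli.
function basicAuthHeader(user, password) {
  // Base64-encode "user:password" as required by HTTP Basic authentication
  return 'Basic ' + Buffer.from(user + ':' + password, 'utf8').toString('base64');
}

var header = basicAuthHeader('hoge', 'foobarbaz');
console.log(header); // => 'Basic aG9nZTpmb29iYXJiYXo='
```

The returned string could then be passed as the Authorization value in client.set('headers', { ... }).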
var client = require('cheerio-httpcli');
var user = 'hoge';
var password = 'foobarbaz';
client.fetch('http://' + user + ':' + password + '@securet.example.com', function (err, $, res, body) {
See here for details.
Setting the environment variable HTTP_PROXY to http://<proxy server address>:<port>/ makes the client fetch web pages through the proxy server.
process.env.HTTP_PROXY = 'http://proxy.hoge.com:18080/'; // Specify the proxy server
var client = require('cheerio-httpcli');
client.fetch('http://foo.bar.baz/', ...
Since 0.7.2, the behavior of https connections has changed slightly, so pages that could be reached before may no longer be reachable. To get the same behavior as 0.7.1 and earlier, configure the client as follows.
var client = require('cheerio-httpcli');
var constants = require('constants'); // <- install the constants module separately
client.set('agentOptions', {
  secureOptions: constants.SSL_OP_NO_TLSv1_2
});
This can occur when the SSL certificate configuration of the server accessed by cheerio-httpcli is faulty.
Normally the connection cannot succeed unless the server side fixes the problem, but adding the following setting near the top of the script forces processing to continue even when SSL certificate verification fails.
process.env.NODE_TLS_REJECT_UNAUTHORIZED = '0'; // Ignore SSL certificate verification errors
var client = require('cheerio-httpcli');
However, this is a dangerous setting that allows access to insecure sites, so use it only in an emergency and at your own risk.
<dc:title>Title</dc:title>
An XML tag like this cannot be retrieved with:
$('dc:title').text();
Instead, escape the colon with "\" as follows:
$('dc\\:title').text();
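The double backslash in the JavaScript source produces a single backslash in the selector string. A small sketch makes this visible; escapeColons is a hypothetical helper and not part of cheerio-httpcli.

```javascript
// Hypothetical helper, not part of cheerio-httpcli:
// escapes colons so a namespaced tag name can be used as a selector.
function escapeColons(tagName) {
  return tagName.replace(/:/g, '\\:');
}

var selector = escapeColons('dc:title');
console.log(selector);        // prints dc\:title
console.log(selector.length); // => 9
```

The result could then be passed to $(), for example $(escapeColons('dc:title')).text().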
The cookie JSON read and written by exportCookies() and importCookies(), both implemented in 0.8.0, is compatible with the format used by Puppeteer.
This means that cookie JSON written by cheerio-httpcli's exportCookies() can be loaded from Puppeteer, and cookie JSON written by Puppeteer can be loaded with cheerio-httpcli's importCookies().
// Log in with cheerio-httpcli to a site that requires login
client.fetch('https://need.login.web.service/login', function (err, $, res, body) {
  $('#login').submit({
    username: 'foo',
    password: 'password_for_foo',
  }, function (err, $, res, body) {
    // Write the cookies obtained after login to a JSON file
    fs.writeFileSync('cookie.json', JSON.stringify(client.exportCookies()), 'utf-8');
  });
});
// Load the cookies exported by cheerio-httpcli after login into Puppeteer
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
const cookies = JSON.parse(fs.readFileSync('cookie.json', 'utf-8'));
for (const cookie of cookies) {
  await page.setCookie(cookie);
}
// The page behind the login should probably be accessible now
await page.goto('https://need.login.web.service/mypage');
// Log in with Puppeteer to a site that requires login
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('https://need.login.web.service/login', { waitUntil: 'domcontentloaded' });
await page.type('input[name="username"]', 'foo');
await page.type('input[name="password"]', 'password_for_foo');
// Start waiting for the navigation before clicking so that it is not missed
await Promise.all([
  page.waitForNavigation({ timeout: 60000, waitUntil: 'domcontentloaded' }),
  page.click('input[type="submit"]'),
]);
const cookies = await page.cookies();
fs.writeFileSync('cookie.json', JSON.stringify(cookies), 'utf-8');
// Load the cookies exported by Puppeteer after login into cheerio-httpcli
client.importCookies(JSON.parse(fs.readFileSync('cookie.json', 'utf-8')));
client.fetch('https://need.login.web.service/mypage', function (err, $, res, body) {
  // The page behind the login should probably be accessible now
});
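Cookies saved to a file may have expired by the time they are loaded again. The sketch below filters such entries out before they are passed to importCookies() or page.setCookie(). The helper dropExpiredCookies is hypothetical, and the sketch assumes Puppeteer-style cookie objects in which expires is a Unix time in seconds and -1 marks a session cookie.

```javascript
// Hypothetical helper, not part of cheerio-httpcli or Puppeteer.
// Assumes Puppeteer-style cookie objects: expires is a Unix time in seconds,
// and -1 (or a missing value) marks a session cookie.
function dropExpiredCookies(cookies, nowSeconds) {
  return cookies.filter(function (cookie) {
    var expires = cookie.expires;
    if (expires === undefined || expires === -1) {
      return true; // session cookie: keep it
    }
    return expires > nowSeconds;
  });
}

var sample = [
  { name: 'sid', value: 'abc', domain: 'need.login.web.service', path: '/', expires: -1 },
  { name: 'old', value: 'xyz', domain: 'need.login.web.service', path: '/', expires: 1000 }
];
var usable = dropExpiredCookies(sample, Math.floor(Date.now() / 1000));
console.log(usable.map(function (c) { return c.name; })); // => [ 'sid' ]
```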
cheerio-httpcli basically cannot log in to sites that use two-step verification, but by combining the cookie import/export feature above with the author's browser extension "クッキーJSONファイル出力 for Puppeteer" (cookie JSON file export for Puppeteer), you may be able to access pages behind the login.
As mentioned in the description of importCookies(), whether restoring cookies alone is enough to access pages within a site depends on how the web site works, so this method is not guaranteed to give access to pages on a site that requires two-step verification.
Think of it as something worth trying when there is nothing to lose.
With some effort the module can be bundled with Webpack, but even then the Sync methods do not work properly (they cannot be used). Webpack also emits a large number of warnings during bundling, so it is unclear whether the other features work properly. See here for details.
Use it at your own risk.
In addition, for malfunctions caused by the Electron environment, problems that can be solved with a small fix will be addressed, but those that would require major changes to the current design may not be. Thank you for your understanding.
When bundled with Webpack for Node.js, some behavior is restricted. A warning message is displayed when an affected operation occurs.
The Accept-Language request header is not added automatically. If it is needed, set it manually as follows.
var client = require('cheerio-httpcli');
client.set('headers', { 'Accept-Language': 'ja,en-US' });
The iconv module cannot be switched dynamically; iconv-lite is always used.
Synchronous methods such as fetchSync() and clickSync() cannot be used (calling them results in an error).
Type definition files are bundled with cheerio-httpcli itself rather than provided through @types.
import * as client from 'cheerio-httpcli';
client.fetch('http://foo.bar.baz/', (err, $, res, body) => {
...
It can be used from TypeScript in this way.
For character encoding detection, the result from jschardet is used when it detects the encoding with high confidence; otherwise the charset information in the <head> tag is used. In the latter case, if the encoding specified by charset differs from the actual encoding of the web page, conversion errors or garbled text occur.
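The fallback order described above can be sketched as follows. This is not the library's actual code: chooseEncoding and the threshold parameter are hypothetical, the detected argument is assumed to have the { encoding, confidence } shape returned by jschardet.detect(), and the regular expression is a simplification that scans the whole markup rather than only the <head> tag.

```javascript
// Hypothetical sketch, not the library's actual code.
// detected:  { encoding, confidence } as returned by jschardet.detect()
// threshold: assumed cut-off for "high confidence"
function chooseEncoding(detected, html, threshold) {
  if (detected && detected.encoding && detected.confidence >= threshold) {
    return detected.encoding.toLowerCase();
  }
  // Fall back to the charset declared in the markup, e.g.
  // <meta charset="euc-jp"> or content="text/html; charset=euc-jp"
  var match = /charset\s*=\s*["']?([\w-]+)/i.exec(html);
  return match ? match[1].toLowerCase() : null;
}

var html = '<head><meta charset="euc-jp"></head>';
console.log(chooseEncoding({ encoding: 'SHIFT_JIS', confidence: 0.99 }, html, 0.9)); // => 'shift_jis'
console.log(chooseEncoding({ encoding: 'windows-1252', confidence: 0.3 }, html, 0.9)); // => 'euc-jp'
```

If the declared charset is wrong, the second branch returns that wrong value, which is the situation in which conversion errors or garbled text appear.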
Distributed under the MIT license.
© 2013-2020 ktty1220