Crawler Html 3T
A library for extracting data from HTML.
Installation
npm install crawler-html-3t --save
Usage
Class ModelMongoose
- mod_sources
- name_index
- SourcesNews
- Articles
- mod_baogom
- name_index
- mod_acticles
- mod_links
- mod_categories
Class HtmlParser
- GetHtmlDoc
GetHtmlDoc(url, function (error, body, $) { /* ... */ });
- getTitle
var title = getTitle($);
- getDesc
var description = getDesc($);
- getImage
var url_image = getImage($);
- getListFeed
getListFeed(url_rss, function (error, list_feed) { /* ... */ });
- getListFeedByBodyXml
getListFeedByBodyXml(bodyXml, function (error, list_feed) { /* ... */ });
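Example
The HtmlParser calls above can be combined roughly as follows. This is a minimal sketch, not the library's documented usage. It assumes that all HtmlParser methods are available on a single object, passed in here as `parser`, and that `list_feed` is an array. The helper names `summarizePage` and `countFeedItems` are hypothetical, and how `parser` is obtained from the package is not shown.

```javascript
// Sketch only: `parser` is assumed to expose the HtmlParser methods listed
// above (GetHtmlDoc, getTitle, getDesc, getImage, getListFeed).

// Hypothetical helper: fetch a page and collect its title, description and image.
function summarizePage(parser, url, callback) {
  parser.GetHtmlDoc(url, function (error, body, $) {
    if (error) {
      // Forward the error instead of reading from an undefined document.
      return callback(error);
    }
    callback(null, {
      url: url,
      title: parser.getTitle($),
      description: parser.getDesc($),
      image: parser.getImage($)
    });
  });
}

// Hypothetical helper: count the entries of an RSS feed.
// Assumes list_feed is an array; the shape of its items is not documented here.
function countFeedItems(parser, url_rss, callback) {
  parser.getListFeed(url_rss, function (error, list_feed) {
    if (error) {
      return callback(error);
    }
    callback(null, Array.isArray(list_feed) ? list_feed.length : 0);
  });
}
```

Passing `parser` as an argument keeps both helpers independent of how the module is loaded, so they can be tried against a stub object before pointing them at real URLs.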