Security News
tea.xyz Spam Plagues npm and RubyGems Package Registries
Tea.xyz, a crypto project aimed at rewarding open source contributions, is once again facing backlash due to an influx of spam packages flooding public package registries.
pillage
Readme
Pillage is a super awesome Node.js library for parsing webpages. It uses a baller algorithm✝ to identify the content region of a webpage with accuracy that's really, really, really, really ... fun. Once we have the content region we can parse out text, images, videos and other media. We also threw in a lot of the easy stuff like OG tags for your convenience.
✝ It basically searches for every text node, then recursively climbs the parent tree, assigning a weighted "score" based on text length to each parent. The value rapidly drops off as we move up the tree. This is done for all text nodes, so the weights accumulate to identify the most probable shared parent. Once we have that wrapper we can make assumptions and easily parse out body content.
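The scoring idea in the footnote can be sketched roughly as follows. This is a minimal illustration and not pillage's actual source: the plain-object tree shape, the helper names (`linkParents`, `collectTextNodes`, `findContentNode`), and the `DECAY` factor of 0.5 are all assumptions made for the example, since the README gives no concrete values.

```javascript
// Hypothetical node shape: { tag, children } for elements, { text } for text nodes.
// None of these names come from pillage itself.

// Attach a parent pointer to every node so we can climb the tree later.
function linkParents(node, parent = null) {
  node.parent = parent;
  (node.children || []).forEach((child) => linkParents(child, node));
  return node;
}

// Gather every text node in document order.
function collectTextNodes(node, out = []) {
  if (typeof node.text === 'string') out.push(node);
  (node.children || []).forEach((child) => collectTextNodes(child, out));
  return out;
}

// Assumed drop-off factor: each step up the tree halves the contribution.
const DECAY = 0.5;

function findContentNode(root) {
  linkParents(root);
  const scores = new Map();
  for (const textNode of collectTextNodes(root)) {
    // The starting weight is the text length; it shrinks at every ancestor.
    let weight = textNode.text.trim().length;
    for (let p = textNode.parent; p !== null; p = p.parent) {
      scores.set(p, (scores.get(p) || 0) + weight);
      weight *= DECAY;
    }
  }
  // The element with the highest accumulated score is the likely content wrapper.
  let best = null;
  let bestScore = -Infinity;
  for (const [node, score] of scores) {
    if (score > bestScore) {
      best = node;
      bestScore = score;
    }
  }
  return best;
}

const page = {
  tag: 'body',
  children: [
    { tag: 'nav', children: [{ text: 'Home' }, { text: 'About' }] },
    {
      tag: 'article',
      children: [
        { tag: 'p', children: [{ text: 'A long first paragraph of body text.' }] },
        { tag: 'p', children: [{ text: 'A second paragraph with more body text.' }] },
        { tag: 'p', children: [{ text: 'A third paragraph that closes the article.' }] },
      ],
    },
  ],
};

console.log(findContentNode(page).tag);
```

With this sample tree the three paragraphs each feed half of their length into the shared `article` parent, so it outscores any single paragraph as well as the more distant `body`. A gentler or steeper decay would shift that balance, which is why the real library's weighting may behave differently.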
npm install pillage
var pillage = require('pillage');
// Fetch a URL and process
pillage(url, function(err, result) {
  console.log(result);
});
// or, process HTML directly
var result = pillage(html);
console.log(result);
// Here's the object structure that it will return
return {
  title: extractTitle(html),
  description: extractDescription(html),
  text: extractText(html),
  images: extractImages(html),
  videos: extractVideos(html),
  twitterTags: extractTwitterTags(html),
  openGraphTags: extractOpenGraphTags(html),
  articleTags: extractArticleTags(html),
  oEmbed: extractOEmbed(html),
};
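As an illustration of the "easy stuff" in that structure, here is a rough sketch of what an Open Graph extraction step might look like. The README names `extractOpenGraphTags` but does not show its implementation, so the regex-based approach below is an assumption, not pillage's code; a real HTML parser would be more robust than these patterns.

```javascript
// Hypothetical stand-in for the extractOpenGraphTags step; not pillage's code.
function extractOpenGraphTags(html) {
  const tags = {};
  // Find each <meta ...> tag, then inspect its attributes individually
  // so that attribute order does not matter.
  const metaPattern = /<meta\b[^>]*>/gi;
  for (const [metaTag] of html.matchAll(metaPattern)) {
    const property = /\bproperty\s*=\s*["']og:([^"']+)["']/i.exec(metaTag);
    const content = /\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(metaTag);
    if (property && content) {
      // content[1] holds a double-quoted value, content[2] a single-quoted one.
      tags[property[1]] = content[1] !== undefined ? content[1] : content[2];
    }
  }
  return tags;
}

const sampleHtml = `
<head>
  <meta property="og:title" content="Example Page">
  <meta content='https://example.com/cover.png' property="og:image">
  <meta name="description" content="Not an Open Graph tag">
</head>`;

console.log(extractOpenGraphTags(sampleHtml));
```

The sketch keys each result by the part of the property name after `og:`, and skips `meta` tags that lack either an `og:` property or a `content` attribute.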
MIT
Mike Holly
FAQs
Extracts content from a web page.
The npm package pillage receives a total of 0 weekly downloads. As such, pillage is classified as not popular.
We found that pillage has an unhealthy release cadence and level of project activity, because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.