web-auto-extractor
Automatically extracts semantically structured information from any HTML webpage.
Supported formats: Microdata, RDFa-Lite, JSON-LD, and meta tags.
Demo it on tonicdev
Parse any semantically structured HTML and query the result.
import WAE from 'web-auto-extractor'
import request from 'request'

const pageUrl = 'http://southernafricatravel.com/'

request(pageUrl, function (error, response, body) {
  // Stop early if the request itself failed
  if (error) {
    console.error(error)
    return
  }

  const wae = WAE.init(body)
  // console.log(wae.parse())

  // If the page uses microdata
  const waeMicrodata = wae.parseMicrodata()
  // See API for more options
  // console.log(waeMicrodata.data())

  // You can query the parsed result to look for properties marked up by the page
  const telephones = waeMicrodata.find('telephone')
  // console.log(telephones)
})
If you are using CommonJS instead of ES modules:
var WAE = require('web-auto-extractor').default
To install:
npm install web-auto-extractor
First, load the HTML to get a WAEObject:
const wae = WAE.init('<div itemtype="Product">...</div>')
Each WAEObject comes with the following set of methods
NOTE: The results of these functions are cached, so calling them multiple times should not affect performance.
Finds all supported semantically structured information in the HTML and returns it in normalized format.
Finds all Microdata information on the page and returns it as a WAEParserObject.
Finds all RDFa-Lite information on the page and returns it as a WAEParserObject.
Finds all JSON-LD information on the page and returns it as a WAEParserObject.
Finds all meta tag information on the page and returns it as a WAEParserObject.
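The caching mentioned in the note above can be pictured as simple memoization. The following is only a minimal sketch of that idea, not the library's actual implementation: `memoizeParser` and the sample return value are hypothetical names and data invented for illustration.

```javascript
// Hypothetical sketch of result caching; not web-auto-extractor's source code.
// Wraps a parse function so its work runs once and later calls reuse the result.
function memoizeParser(parseFn) {
  let cached;          // holds the result after the first call
  let hasRun = false;  // tracks whether parseFn has already been executed
  return function () {
    if (!hasRun) {
      cached = parseFn();
      hasRun = true;
    }
    return cached;
  };
}

let runs = 0;
const parseMicrodata = memoizeParser(() => {
  runs += 1; // stands in for an expensive walk over the HTML
  return { Product: [{ name: 'Example' }] }; // made-up sample result
});

const first = parseMicrodata();
const second = parseMicrodata();
console.log(first === second, runs); // true 1
```

Under this assumption, a second call returns the very same object without repeating the parsing work, which is why repeated calls would be cheap.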
NOTE: The results of these functions are cached, so calling them multiple times should not affect performance.
Gets the normalized result of the parsed format.
Gets the unnormalized flattened result of the parsed format which includes meta information relating to the parsed properties.
Returns a list of elements from .data() that correspond to the property with the name [propName].
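To make the idea of looking up a property by name concrete, here is a rough sketch of such a search over nested parsed data. It is an illustration only: `findProperty` is a hypothetical helper, the shape of `sampleData` is an assumption rather than the library's documented output, and the library's own lookup may behave differently.

```javascript
// Hypothetical stand-in for a property lookup; not the library's implementation.
// Recursively collects every value stored under a key named propName.
function findProperty(node, propName, found = []) {
  if (Array.isArray(node)) {
    node.forEach((item) => findProperty(item, propName, found));
  } else if (node !== null && typeof node === 'object') {
    for (const [key, value] of Object.entries(node)) {
      if (key === propName) found.push(value);
      findProperty(value, propName, found); // keep searching nested items
    }
  }
  return found;
}

// Made-up sample data; the real parsed structure may differ.
const sampleData = {
  TravelAgency: [
    {
      name: 'Example Travel',
      telephone: '+27 00 000 0000',
      address: { streetAddress: '1 Example Road', telephone: '+27 11 111 1111' }
    }
  ]
};

const telephones = findProperty(sampleData, 'telephone');
console.log(telephones); // [ '+27 00 000 0000', '+27 11 111 1111' ]
```

The sketch returns matches in document order, including those found on nested items.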
See test cases for more examples.