cheerio-to-text
Explained by an example:
import fs from "fs"
import * as cheerio from "cheerio" // namespace import; recent cheerio releases have no default export
import { render } from "cheerio-to-text"
const html = fs.readFileSync("page.html", "utf-8")
console.log(html)
//
// <!doctype html>
// <body>
// <div id="main">
// <p>Para<strong>graph</strong>.</p>
// <ul><li>Foo</li><li>Bar</li></ul><h3>Heading</h3>
// </div>
// </body>
//
const $ = cheerio.load(html)
console.log($("div#main").text())
//
// Paragraph.
// FooBarHeading
//
console.log(render($("div#main")))
//
// Paragraph.
// Foo
// Bar
// Heading
//
This package grew largely out of GitHub Docs, which scrapes every page with got and cheerio and then needs to convert the result into a plain-text string suitable for searching with Elasticsearch. Using myCheerioObject.text() isn't good enough, because it lumps together the text of block-level HTML elements when there is no whitespace between one tag's > and the next tag's <.
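To make the difference concrete, here is a minimal, dependency-free sketch. It is not how cheerio or cheerio-to-text is implemented: the mini-DOM shape, the BLOCK_TAGS subset, and the helper names (el, naiveText, blockAwareText, tidy) are all assumptions made up for illustration. It only shows why plain concatenation glues block elements together, and how inserting line breaks around block-level tags avoids that.

```javascript
// Hypothetical mini-DOM: a text node is a string, an element is { tag, children }.
const el = (tag, ...children) => ({ tag, children })

// Assumed subset of block-level tags, enough for this example.
const BLOCK_TAGS = new Set(["div", "p", "ul", "li", "h3"])

// Roughly what a plain .text() call does: concatenate all text, ignore tags.
function naiveText(node) {
  if (typeof node === "string") return node
  return node.children.map(naiveText).join("")
}

// Block-aware variant: surround the text of block elements with newlines.
function blockAwareText(node) {
  if (typeof node === "string") return node
  const inner = node.children.map(blockAwareText).join("")
  return BLOCK_TAGS.has(node.tag) ? `\n${inner}\n` : inner
}

// Collapse runs of newlines and trim the ends.
function tidy(text) {
  return text.replace(/\n{2,}/g, "\n").trim()
}

// Same structure as div#main in the example above, with no whitespace between tags.
const main = el(
  "div",
  el("p", "Para", el("strong", "graph"), "."),
  el("ul", el("li", "Foo"), el("li", "Bar")),
  el("h3", "Heading")
)

const lumped = naiveText(main)
const separated = tidy(blockAwareText(main))

console.log(lumped)
// Paragraph.FooBarHeading
console.log(separated)
// Paragraph.
// Foo
// Bar
// Heading
```

The inline strong element stays attached to its paragraph, while each block element ends up on its own line, which is the property a search index needs.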
License: MIT
Run npm run build:watch in one terminal, then look at example.mjs (which you run with node example.mjs).
Run the tests with:
jest
Or, in watch mode, limited to tests whose name matches a pattern:
jest --watch -t "some test text"
Turn a Cheerio object into plain text