A simple browser/client-side web scraper. Try it out in a REPL: http://www.getgetsy.com
TODOS:
- Support for websites with infinite scroll.
- Support for websites with click pagination.
npm install --save getsy
or yarn add getsy
This library exposes a single function:
getsy(url: string, optionsObject?: options): Promise<Getsy>
parameters:
- url: The URL of the website you wish to scrape.
- optionsObject (optional):
  - corsProxy (optional string): The endpoint of the corsProxy you wish to use. (See corsProxy below for more info.)
  - resolveURLs (optional boolean): Whether getsy should resolve all relative URLs in the resource to absolute URLs so they don't fail when they load in another page. (defaults to true)
  - iframe: A boolean, or an object with width and height properties, indicating whether getsy should start in iframe mode. Iframe mode waits for the resource to be mounted in a hidden iframe so you can extract more data through pagination or infinite scrolling. (defaults to false)
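To illustrate what the resolveURLs option is for, here is a minimal sketch of resolving relative URLs against the page they came from. It is not getsy's actual implementation: the helper name resolveRelativeUrls is hypothetical, and the regular expression only handles double-quoted href and src attributes.

```javascript
// Hypothetical helper, not part of getsy's API: rewrite relative href/src
// values in an HTML string so they point at the original site.
function resolveRelativeUrls(html, baseUrl) {
  // Only double-quoted href="..." and src="..." attributes are handled here.
  return html.replace(/\b(href|src)="([^"]*)"/g, (match, attr, value) => {
    try {
      // new URL(value, base) leaves already-absolute URLs unchanged.
      return `${attr}="${new URL(value, baseUrl).href}"`
    } catch (err) {
      // Leave values that cannot be parsed as URLs untouched.
      return match
    }
  })
}

const page = '<a href="/wiki/Main_Page">Home</a> <img src="logo.png">'
console.log(resolveRelativeUrls(page, 'https://en.wikipedia.org/wiki/Foo'))
```

Without this step, a link such as /wiki/Main_Page would be resolved against the page doing the scraping rather than the page that was scraped.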
The function returns a promise that resolves to a Getsy object on success and rejects if it was unable to load the requested page.
Getsy objects have a getMe method for scraping the resource's contents. This method is just a wrapper over the jQuery function, so you can chain other jQuery methods on it. If you need the raw data, you can access its content property. (More on Getsy below)
// Using promises
import getsy from 'getsy'
getsy('https://en.wikipedia.org/wiki/"Hello,_World!"_program').then(myGetsy => {
  console.log(myGetsy.getMe('#firstHeading').text())
})
// Using async/await
import getsy from 'getsy'
async function testing() {
  const myGetsy = await getsy('https://en.wikipedia.org/wiki/"Hello,_World!"_program')
  console.log(myGetsy.getMe('#firstHeading').text())
}
testing()
// Scraping a page with infinite scroll (iframe mode)
import getsy from 'getsy'
async function infiniteScrape() {
  // Declared with const so the assignment also works in strict mode / ES modules
  const myGetsy = await getsy('http://scrollmagic.io/examples/advanced/infinite_scrolling.html', { iframe: true })
  console.log(`${myGetsy.getMe('.box1').length} boxes.`)
  // succesfulTimes is spelled as in the library's result object
  const { succesfulTimes, totalRetries } = await myGetsy.scroll(10)
  console.log(`New content loaded ${succesfulTimes} times with ${totalRetries} total retries.`)
  console.log(`${myGetsy.getMe('.box1').length} boxes.`) // More content!
}
infiniteScrape()
The Getsy object has the following properties and methods:
- corsProxy: The same one passed in the options object, or the default value.
- content: The original string data received from the request.
- iframe: A reference to its iframe element if in iframe mode.
- iframeDoc: A reference to its iframe's document if in iframe mode.
- getMe(selector: string): JQuery
  Query the resource's DOM, or the iframe if in iframe mode, with a jQuery selector. Returns a JQuery object.
- scroll(numberOfTimes: number, element?: HTMLElement, interval?: number, retries?: number): Promise<scrollResolve>
  Scroll to the bottom of an element (defaults to body) to load new data a specified numberOfTimes. The interval (defaults to 2000) is the time in milliseconds that Getsy waits before checking whether new content has loaded. If no new content has loaded, it retries as many times as specified by retries (defaults to 5). If no new content has loaded and scroll is out of retries, it resolves the Promise early to avoid waiting for the remaining numberOfTimes. Note: retries reset to 0 on every successful content load. Returns a Promise that resolves to an object with the number of times new content was loaded (.succesfulTimes) and the total number of retries (.totalRetries).
- hideFrame(): void
  Hides the iframe if applicable.
- showFrame(): void
  Shows the iframe if applicable.
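The retry behaviour described for scroll can be sketched as a small loop. This is an illustration of the documented rules only, not getsy's source: scrollSketch and its scrollToBottom and measure callbacks are hypothetical stand-ins (in a browser they might be window.scrollTo(0, document.body.scrollHeight) and a function returning document.body.scrollHeight), and how getsy actually detects new content is an assumption.

```javascript
// Hypothetical sketch of the documented scroll/retry rules.
// scrollToBottom: triggers loading; measure: returns a number that grows
// when new content arrives. Both are assumptions, not getsy internals.
async function scrollSketch(numberOfTimes, { scrollToBottom, measure, interval = 2000, retries = 5 }) {
  const wait = ms => new Promise(resolve => setTimeout(resolve, ms))
  let succesfulTimes = 0 // spelled as in getsy's result object
  let totalRetries = 0
  let retriesLeft = retries

  while (succesfulTimes < numberOfTimes) {
    const before = measure()
    scrollToBottom()
    await wait(interval) // give the page time to load more content
    if (measure() > before) {
      succesfulTimes += 1
      retriesLeft = retries // retries reset after every successful load
    } else if (retriesLeft > 0) {
      retriesLeft -= 1
      totalRetries += 1
    } else {
      break // out of retries: resolve early
    }
  }
  return { succesfulTimes, totalRetries }
}
```

Because the retry budget resets after each successful load, a slow page is only abandoned after several consecutive intervals with no new content.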
This library uses a corsProxy to work around the browser's cross-origin (CORS) restrictions.
If you don't provide one, it defaults to https://crossorigin.me/.
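As a rough sketch of how such a proxy is typically used, the target URL is appended to the proxy endpoint and the request goes to the combined address. The helper buildProxiedUrl below is hypothetical, and the assumption that getsy composes the URL this way is not confirmed by this README; a different proxy may expect another format.

```javascript
// Default endpoint named in this README.
const DEFAULT_CORS_PROXY = 'https://crossorigin.me/'

// Hypothetical helper: assumes the proxy expects the target URL appended
// to its endpoint, e.g. https://proxy.example/https://target.example/page
function buildProxiedUrl(url, corsProxy = DEFAULT_CORS_PROXY) {
  const endpoint = corsProxy.endsWith('/') ? corsProxy : `${corsProxy}/`
  return endpoint + url
}

console.log(buildProxiedUrl('https://example.com/page'))
// https://crossorigin.me/https://example.com/page
```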
Some node CorsProxy servers: