solidscraper

This package lets your script scrape web sites. JQuery-Like API.

PyPI package (pip), version 0.7.7, 1 maintainer

Solid Scraper

Easy-to-use jQuery-like API for web scraping and crawling. It also supports cookies and custom user agents. Solidscraper is compatible with Python 2 and 3.

  • “Hello World” Examples

Getting the URLs of all links:

.. code:: python

    import solidscraper as ss

    doc = ss.load("https://www.example.com/the/path")

    # print the list of urls from all <a> elements
    print(doc.select("a").getAttribute("href"))
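The ``href`` values printed above are whatever the page author wrote, so they may well be relative. As a minimal sketch, assuming the result behaves like a plain list of strings (possibly with missing values), the standard library can resolve them against the page URL. The ``absolutize`` helper and the sample ``hrefs`` list are hypothetical and not part of solidscraper:

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2


def absolutize(base_url, hrefs):
    """Resolve each href against base_url, skipping empty or missing values."""
    return [urljoin(base_url, href) for href in hrefs if href]


# Hypothetical sample standing in for doc.select("a").getAttribute("href")
hrefs = ["/about", "contact.html", "https://other.example.org/", None]

links = absolutize("https://www.example.com/the/path", hrefs)
print(links)
```

Absolute URLs pass through unchanged, so the helper is safe to apply to a mixed list.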

Getting the URLs of all links inside ``<div>`` elements whose id is ‘links’:

.. code:: python

    import solidscraper as ss

    doc = ss.load("https://www.example.com/the/path")

    # print the list of urls from all <a> elements inside <div id="links">
    print(doc.select("div #links").then("a").getAttribute("href"))

Getting the text of all ``<span>`` elements inside ``<p>`` elements whose class is ‘info’:

.. code:: python

    import solidscraper as ss

    doc = ss.load("https://www.example.com/the/path")

    # print the text of all <span> elements inside <p class="info">
    print(doc.select("p .info").then("span").text())
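Text scraped from HTML often carries stray newlines and runs of spaces. This README does not state whether ``text()`` returns a single string or a list of strings, so the sketch below accepts either. The ``normalize_text`` helper and the sample data are hypothetical and not part of solidscraper:

```python
def normalize_text(value):
    """Collapse runs of whitespace; accepts one string or a list/tuple of strings."""
    if isinstance(value, (list, tuple)):
        return [normalize_text(item) for item in value]
    # str.split() with no arguments splits on any whitespace and drops empty pieces
    return " ".join(value.split())


# Hypothetical sample standing in for the result of .text()
sample = ["  Price:\n   42 ", "In\tstock"]

cleaned = normalize_text(sample)
print(cleaned)
```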

Note: these examples use the Python 3 ``print()`` function. To run them with Python 2, either replace ``print()`` with the Python 2 print statement or add the following import as the first statement of your code: ``from __future__ import print_function``.

  • “Real World” Examples

The examples `folder above`_ contains two fully functional examples: one that downloads tweets by hashtag and another that downloads complete user timelines (tweets and images). Both scripts were built entirely with solidscraper.

.. _folder above: https://github.com/sergioburdisso/solidscraper/tree/master/examples/

Keywords

scrape
