redfin-scraper

A Python library used to scrape data from Redfin.com using the unofficial Stingray API.

  • 0.2.2
  • PyPI

Redfin Scraper

Description

A scalable Python library that leverages Redfin's unofficial Stingray API to quickly scrape real estate data.

One-Time Setup

Zip Code Database

A database of zip codes is required to search for City, State values

It is strongly recommended to download this free version in .csv format
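
To illustrate why the zip code database is needed, here is a minimal sketch of how a "City, State" value could be resolved to zip codes from a .csv file. The column names (zip, primary_city, state) and the sample rows are assumptions made for this example; they are not taken from the library, and the real file may be laid out differently.

```python
import csv
import io


def zips_for_city_state(csv_file, city_state):
    """Return the zip codes whose city and state match a "City, State" string."""
    city, state = [part.strip() for part in city_state.split(",", 1)]
    matches = []
    for row in csv.DictReader(csv_file):
        # Column names are assumptions about the CSV layout, not the library's schema.
        same_city = row["primary_city"].lower() == city.lower()
        same_state = row["state"].upper() == state.upper()
        if same_city and same_state:
            matches.append(row["zip"])
    return matches


# Made-up sample rows standing in for zip_code_database.csv.
SAMPLE_CSV = """zip,primary_city,state
78701,Austin,TX
78702,Austin,TX
75201,Dallas,TX
"""

austin_zips = zips_for_city_state(io.StringIO(SAMPLE_CSV), "Austin, TX")
print(austin_zips)
```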

The Config

Parameters for the RedfinScraper class can be controlled using an optional config.json file

Sample Config

Getting Started

Installation

pip3 install -U redfin-scraper

Import Module

from redfin_scraper import RedfinScraper

Initialize Module

scraper = RedfinScraper()

Using The Scraper

Required Setup

scraper.setup(zip_database_path:str, multiprocessing:bool=False)

zip_database_path: File path to zip_code_database.csv

multiprocessing: Allow for multiprocessing

Activating The Scraper

scraper.scrape(city_states:list[str]=None, zip_codes:list[str]=None, sold:bool=False, sale_period:str=None, lat_tuner:float=1.5, lon_tuner:float=1.5)

city_states: List of strings representing US cities formatted as "City, State"

zip_codes: List of strings representing US zip codes

sold: Select whether to scrape for-sale data (default) or sold data

sale_period: Required whenever sold is True; one of 1mo, 3mo, 6mo, 1yr, 3yr, 5yr

lat_tuner: Number of standard deviations from the local latitude average within which a zip code must fall to be included

lon_tuner: Number of standard deviations from the local longitude average within which a zip code must fall to be included
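
The tuner parameters can be pictured as a standard-deviation filter on coordinates. The sketch below is one possible reading of that description, not the library's actual implementation; the function names, the use of the population standard deviation, and the coordinates are all hypothetical.

```python
from statistics import mean, pstdev


def within_tuner(values, candidate, tuner=1.5):
    """True if candidate lies within `tuner` standard deviations of the mean."""
    center = mean(values)
    spread = pstdev(values)  # population standard deviation (an assumption)
    return abs(candidate - center) <= tuner * spread


def filter_zip_codes(coords, lat_tuner=1.5, lon_tuner=1.5):
    """Keep zip codes whose latitude and longitude both pass the tuner check."""
    lats = [lat for lat, _ in coords.values()]
    lons = [lon for _, lon in coords.values()]
    return [
        zip_code
        for zip_code, (lat, lon) in coords.items()
        if within_tuner(lats, lat, lat_tuner) and within_tuner(lons, lon, lon_tuner)
    ]


# Hypothetical (latitude, longitude) pairs; zip_e sits far north of the rest.
COORDS = {
    "zip_a": (30.0, -97.0),
    "zip_b": (30.1, -97.1),
    "zip_c": (29.9, -96.9),
    "zip_d": (30.0, -97.1),
    "zip_e": (35.0, -96.9),
}

kept = filter_zip_codes(COORDS)
print(kept)
```

With the default tuners of 1.5, the outlying zip_e is dropped because of its latitude; raising both tuners to 3.0 keeps all five entries.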

Accessing Prior Scrapes

scraper.get_data(id:str)

id: IDs start at 1 and follow the format "D00#" (for example, "D001" for the first scrape)
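
A small helper can build these IDs from a scrape number. This is a sketch that assumes "D00#" means the number is zero-padded to three digits; how the library formats IDs past the ninth scrape is not documented here, so treat that part as a guess. The name scrape_id is hypothetical.

```python
def scrape_id(n: int) -> str:
    """Format the n-th scrape (1-based) as an ID such as "D001"."""
    if n < 1:
        raise ValueError("scrape IDs start at 1")
    # Zero-padding to three digits is an assumption based on the "D00#" pattern.
    return f"D{n:03d}"


first_id = scrape_id(1)
print(first_id)
```

The result could then be passed as scraper.get_data(first_id).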

Appendix

Warnings

Multiprocessing can result in the consumption of all available CPU resources for an extended period of time

Unethical use of this library can result in Redfin taking disciplinary action against your IP address

Recommendations

Requests for large amounts of data (# of zip codes > 2,000) should be split into separate requests
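
One way to follow this recommendation is to slice the zip code list before calling scrape. The helpers below are a sketch and not part of the library: batched and scrape_in_batches are hypothetical names, and the scraper argument is assumed to be an already set-up RedfinScraper (or any object with a compatible scrape method).

```python
def batched(items, size=2000):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def scrape_in_batches(scraper, zip_codes, batch_size=2000, **scrape_kwargs):
    """Call scraper.scrape once per batch and return the number of requests made."""
    requests_made = 0
    for batch in batched(zip_codes, batch_size):
        # Extra keyword arguments (sold, sale_period, ...) pass through unchanged.
        scraper.scrape(zip_codes=batch, **scrape_kwargs)
        requests_made += 1
    return requests_made
```

For example, scrape_in_batches(scraper, zip_codes, sold=True, sale_period="1yr") would issue three requests for 4,500 zip codes. Whether each request is stored under its own ID is not stated in this document.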

The package.log file can be used to investigate unexpected results
