
seriesheatmap

Scraper and heatmap plotter for episode ratings of series on IMDB


IMDB Series Rating Scraper

Introduction

This tool scrapes https://www.imdb.com for the ratings of individual episodes of a series. The ratings are cached in a CSV file. Using matplotlib, the tool then generates a heatmap of all episodes in the series. Because this tool relies on scraping the HTML tree of the IMDB page, it may break at any time. If the scraper stops working, feel free to message me or create a pull request with adjusted XPaths.
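The caching step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `load_or_scrape`, the `scrape` callable, and the one-CSV-per-series layout with the columns season, episode, name, rating are assumptions.

```python
import csv
from pathlib import Path

# Assumed column layout of the cache file.
FIELDS = ["season", "episode", "name", "rating"]


def load_or_scrape(name, data_dir, scrape):
    """Return the episode rows of a series, using a CSV file as a cache.

    `scrape` is a hypothetical callable that takes the series name and
    returns a list of dicts with the keys listed in FIELDS.
    """
    cache_file = Path(data_dir) / f"{name}.csv"

    if not cache_file.exists():
        # Cache miss: scrape once and store the result for later runs.
        rows = scrape(name)
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        with cache_file.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(rows)

    # Always read from the file, so both code paths return the same shape
    # (every value is a string, as stored in the CSV).
    with cache_file.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

With this shape, the website is only contacted the first time a series is requested; deleting the CSV file forces a fresh scrape.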

Examples

Data output

The following table shows data that is generated by the scraper for the first season of Breaking Bad. For the full data output see examples/data/Breaking Bad.csv.

season  episode  name                           rating
1       1        Pilot                          9.0
1       2        Cat's in the Bag...            8.6
1       3        ...And the Bag's in the River  8.7
1       4        Cancer Man                     8.2
1       5        Gray Matter                    8.3
1       6        Crazy Handful of Nothin'       9.3
1       7        A No-Rough-Stuff-Type Deal     8.8
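To turn rows like these into something a heatmap can use, they can be arranged into a season-by-episode grid. The sketch below uses only the standard library. It assumes the CSV has a header row with the column names season, episode, name, rating and is comma-separated; the actual file layout may differ. The helper `ratings_grid` is hypothetical.

```python
import csv
import io

# The first-season rows listed above, written as CSV (assumed layout).
SAMPLE = """season,episode,name,rating
1,1,Pilot,9.0
1,2,Cat's in the Bag...,8.6
1,3,...And the Bag's in the River,8.7
1,4,Cancer Man,8.2
1,5,Gray Matter,8.3
1,6,Crazy Handful of Nothin',9.3
1,7,A No-Rough-Stuff-Type Deal,8.8
"""


def ratings_grid(csv_text):
    """Build a seasons x episodes grid of ratings; missing cells are None."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    n_seasons = max(int(r["season"]) for r in rows)
    n_episodes = max(int(r["episode"]) for r in rows)
    grid = [[None] * n_episodes for _ in range(n_seasons)]
    for r in rows:
        # Season and episode numbers are 1-based in the data.
        grid[int(r["season"]) - 1][int(r["episode"]) - 1] = float(r["rating"])
    return grid


grid = ratings_grid(SAMPLE)
print(grid[0])
```

Seasons with fewer episodes than the longest season keep `None` in the unused cells, which a plotting step can render as blanks.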

Heatmap output

The following image shows an example of the heatmap that can be generated. Heatmaps of some example series can be found under examples/img/.
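For readers who want a feel for how such a plot can be produced, here is a rough matplotlib sketch. It is not the project's heatmap.py: the function `plot_heatmap`, the grid orientation (one row per season), the colormap, and the fixed color range are all assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Season 1 ratings of Breaking Bad from the data output example:
# one row per season, one column per episode (orientation is an assumption).
ratings = [[9.0, 8.6, 8.7, 8.2, 8.3, 9.3, 8.8]]


def plot_heatmap(grid, title, dark=False):
    """Draw a seasons x episodes grid of ratings as an annotated heatmap."""
    style = "dark_background" if dark else "default"
    with plt.style.context(style):
        fig, ax = plt.subplots(figsize=(8, 2))
        # Fixed color range (an assumption) keeps colors comparable across series.
        image = ax.imshow(grid, cmap="RdYlGn", vmin=1, vmax=10, aspect="auto")
        ax.set_xticks(range(len(grid[0])))
        ax.set_xticklabels([str(e + 1) for e in range(len(grid[0]))])
        ax.set_yticks(range(len(grid)))
        ax.set_yticklabels([str(s + 1) for s in range(len(grid))])
        ax.set_xlabel("episode")
        ax.set_ylabel("season")
        ax.set_title(title)
        # Write each rating into its cell.
        for s, row in enumerate(grid):
            for e, rating in enumerate(row):
                ax.text(e, s, f"{rating:.1f}", ha="center", va="center", color="black")
        fig.colorbar(image, ax=ax, label="rating")
    return fig


fig = plot_heatmap(ratings, "Breaking Bad")
```

Calling `fig.savefig("Breaking Bad.png")` would write the figure to disk (the file name here is only an example).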

Quickstart

Dependencies

  • Python version: Python 3.9.13
  • Python packages: see requirements.txt

Setup

  1. Clone this repository

     HTTPS: $ git clone https://github.com/trflorian/imdb-scraper-heatmap.git
     SSH:   $ git clone git@github.com:trflorian/imdb-scraper-heatmap.git

  2. (Optional) Create a virtual environment for this project
  3. Install the required Python packages in your Python environment.

     $ python -m pip install -r requirements.txt

  4. Run $ python scraper.py to scrape the IMDB website for a specific series.
  5. Run $ python heatmap.py to create a plot for the scraped series.

Usage

$ python .\examples\heatmap.py --help

usage: heatmap.py [-h] [-s] [-d] [-o] [-n NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s, --show            show the heatmap plot instead of saving it
  -d, --dark            use dark mode for the plot style
  -o, --override        override existing plots, only used if show flag is not set
  -n NAME, --name NAME  name of the series, if not set the whole data directory will be scanned
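The flags shown in the help output can be reproduced with a small argparse sketch. This mirrors the usage text above and is not taken from the project's source; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with the same flags as the help output above."""
    parser = argparse.ArgumentParser(prog="heatmap.py")
    parser.add_argument("-s", "--show", action="store_true",
                        help="show the heatmap plot instead of saving it")
    parser.add_argument("-d", "--dark", action="store_true",
                        help="use dark mode for the plot style")
    parser.add_argument("-o", "--override", action="store_true",
                        help="override existing plots, only used if show flag is not set")
    parser.add_argument("-n", "--name",
                        help="name of the series, if not set the whole data directory will be scanned")
    return parser


# Example: dark-mode plot for a single series, saved rather than shown.
args = build_parser().parse_args(["--dark", "-n", "Breaking Bad"])
print(args.show, args.dark, args.override, args.name)
# prints: False True False Breaking Bad
```

When `--name` is omitted, `args.name` is `None`, which matches the documented behavior of scanning the whole data directory.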

Development

Upload to PyPI

python -m build

python -m twine upload --skip-existing dist/*
