This tool scrapes https://www.imdb.com for the ratings of individual episodes of a series. The ratings are cached in a CSV file. Using matplotlib, the tool then generates a heatmap of all episodes in the series. Because the tool relies on scraping the HTML tree of the IMDb page, it may break at any time. If the scraper stops working, feel free to message me or create a pull request with adjusted XPaths.
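The caching step could look roughly like the sketch below. It is not taken from the project's code: `load_or_scrape`, `fake_scrape`, and the column names are hypothetical, and the real scraper may structure its cache differently. The idea is only that the CSV file is read when it exists and written after a scrape when it does not.

```python
import csv
import os
import tempfile

# Column names assumed from the example table in this README.
FIELDS = ["season", "episode", "name", "rating"]


def load_or_scrape(cache_path, scrape):
    """Return episode rows from cache_path if it exists, else call scrape() and cache the result."""
    if os.path.exists(cache_path):
        with open(cache_path, newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    rows = scrape()
    with open(cache_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


calls = []


def fake_scrape():
    # Hypothetical stand-in for the real scraper; it makes no network request.
    calls.append(1)
    return [{"season": "1", "episode": "1", "name": "Pilot", "rating": "9.0"}]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Breaking Bad.csv")
    first = load_or_scrape(path, fake_scrape)   # cache miss: scrapes and writes the CSV
    second = load_or_scrape(path, fake_scrape)  # cache hit: reads the CSV, no scrape
```

With this layout, a broken scraper still leaves previously cached series usable for plotting.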
The following table shows data that is generated by the scraper for the first season of Breaking Bad.
For the full data output see examples/data/Breaking Bad.csv.
season | episode | name | rating |
---|---|---|---|
1 | 1 | Pilot | 9.0 |
1 | 2 | Cat's in the Bag... | 8.6 |
1 | 3 | ...And the Bag's in the River | 8.7 |
1 | 4 | Cancer Man | 8.2 |
1 | 5 | Gray Matter | 8.3 |
1 | 6 | Crazy Handful of Nothin' | 9.3 |
1 | 7 | A No-Rough-Stuff-Type Deal | 8.8 |
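As an illustration of how this data might be consumed, here is a minimal sketch that reads rows in this shape with Python's csv module. It assumes the CSV header uses the same column names as the table above; `load_ratings` is a hypothetical helper, not part of the package.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the table above; the header line is an assumption.
SAMPLE_CSV = """season,episode,name,rating
1,1,Pilot,9.0
1,2,Cat's in the Bag...,8.6
1,3,...And the Bag's in the River,8.7
1,4,Cancer Man,8.2
1,5,Gray Matter,8.3
1,6,Crazy Handful of Nothin',9.3
1,7,A No-Rough-Stuff-Type Deal,8.8
"""


def load_ratings(csv_file):
    """Return {season: {episode: rating}} from an open CSV file."""
    ratings = defaultdict(dict)
    for row in csv.DictReader(csv_file):
        ratings[int(row["season"])][int(row["episode"])] = float(row["rating"])
    return dict(ratings)


ratings = load_ratings(io.StringIO(SAMPLE_CSV))
# Episode number with the highest rating in season 1.
best_episode = max(ratings[1], key=ratings[1].get)
print(best_episode, ratings[1][best_episode])  # prints "6 9.3"
```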
The following image shows an example of the heatmap that can be generated.
Heatmaps of some example series can be found under examples/img/.
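A heatmap like this can be drawn with matplotlib's imshow on a season-by-episode grid. The sketch below is one possible approach, not the project's actual plotting code: `to_matrix` and `plot_heatmap` are hypothetical names, and the colormap and value range are arbitrary choices. Seasons with fewer episodes are padded with NaN so the grid stays rectangular.

```python
import math


def to_matrix(ratings):
    """Pad {season: {episode: rating}} into rectangular rows; NaN marks a missing episode."""
    seasons = sorted(ratings)
    max_eps = max(max(episodes) for episodes in ratings.values())
    return [
        [ratings[s].get(e, math.nan) for e in range(1, max_eps + 1)]
        for s in seasons
    ]


def plot_heatmap(ratings, title, out_path):
    """Save a season x episode heatmap of the ratings to out_path."""
    # Imported here so to_matrix stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    matrix = to_matrix(ratings)
    seasons = sorted(ratings)
    fig, ax = plt.subplots()
    image = ax.imshow(matrix, cmap="RdYlGn", vmin=0, vmax=10)
    ax.set_xticks(range(len(matrix[0])))
    ax.set_xticklabels([str(e) for e in range(1, len(matrix[0]) + 1)])
    ax.set_yticks(range(len(seasons)))
    ax.set_yticklabels([str(s) for s in seasons])
    ax.set_xlabel("episode")
    ax.set_ylabel("season")
    ax.set_title(title)
    fig.colorbar(image, ax=ax, label="rating")
    fig.savefig(out_path)
    plt.close(fig)


# Made-up ratings for illustration only, with seasons of different length.
example = {1: {1: 9.0, 2: 8.6}, 2: {1: 8.5, 2: 8.0, 3: 9.1}}
matrix = to_matrix(example)
```

Calling plot_heatmap(example, "Example", "heatmap.png") would then write the image to disk.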
Python 3.9.13
requirements.txt
HTTPS | $ git clone https://github.com/trflorian/imdb-scraper-heatmap.git |
---|---|
SSH | $ git clone git@github.com:trflorian/imdb-scraper-heatmap.git |
$ python -m pip install -r requirements.txt
$ python scraper.py
to scrape the IMDB website for a specific series.
$ python heatmap.py
to create a plot for the scraped series.
$ python .\examples\heatmap.py --help
usage: heatmap.py [-h] [-s] [-d] [-o] [-n NAME]
optional arguments:
-h, --help show this help message and exit
-s, --show show the heatmap plot instead of saving it
-d, --dark use dark mode for the plot style
-o, --override override existing plots, only used if show flag is not set
-n NAME, --name NAME name of the series, if not set the whole data directory will be scanned
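For readers who want to see how such options are typically declared, the following argparse sketch defines the same flags as the help text above. It is a reconstruction from that help output, not the project's source; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser with the flags listed in the help output above."""
    parser = argparse.ArgumentParser(prog="heatmap.py")
    parser.add_argument("-s", "--show", action="store_true",
                        help="show the heatmap plot instead of saving it")
    parser.add_argument("-d", "--dark", action="store_true",
                        help="use dark mode for the plot style")
    parser.add_argument("-o", "--override", action="store_true",
                        help="override existing plots, only used if show flag is not set")
    parser.add_argument("-n", "--name",
                        help="name of the series, if not set the whole data directory will be scanned")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--dark", "--name", "Breaking Bad"])
print(args.show, args.dark, args.override, args.name)  # prints "False True False Breaking Bad"
```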
$ python -m build
$ python -m twine upload --skip-existing dist/*
Scraper and heatmap plotter for episode ratings of series on IMDB