chickenstats
About
chickenstats is a Python package for scraping & analyzing sports data. With just a few lines of code:
- Scrape & manipulate data from various NHL endpoints, leveraging chickenstats.chicken_nhl, which includes a proprietary xG model for shot quality metrics
- Augment play-by-play data & generate custom aggregations from raw CSV files downloaded from Evolving-Hockey (subscription required) with chickenstats.evolving_hockey
For more in-depth explanations, tutorials, & detailed reference materials, consult the Documentation.
Compatibility
chickenstats requires Python 3.10 or greater & runs on the latest stable versions of Linux, macOS, & Windows.
Installation
Very simple - install from PyPI using pip. Best practice is to develop in an isolated virtual environment (conda or otherwise),
but who's a chicken to judge?
pip install chickenstats
To confirm installation & check that you have the latest version (1.8.0):
pip show chickenstats
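If you'd rather check from inside Python (say, in a notebook), the standard library can report the same information. This is a minimal sketch using importlib.metadata; the helper name installed_version is hypothetical and not part of chickenstats:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical helper applied to this package; prints a version string
# if chickenstats is installed in the current environment
found = installed_version("chickenstats")
print(found if found is not None else "chickenstats is not installed")
```

Returning None instead of raising keeps the check friendly to setup scripts that want to decide for themselves what to do when the package is missing.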
Usage
chickenstats is structured as two underlying modules, each used with different data sources:
- chickenstats.chicken_nhl
- chickenstats.evolving_hockey
The package is under active development - features will be added or modified over time.
chicken_nhl
chickenstats.chicken_nhl allows you to scrape play-by-play data and aggregate individual, line, and team statistics.
After importing the module, scrape the schedule for game IDs, then play-by-play data for your team of choice:
from chickenstats.chicken_nhl import Season, Scraper
season = Season(2024)
schedule = season.schedule("NSH")
game_ids = schedule.loc[schedule.game_state == "OFF"].game_id.tolist()
scraper = Scraper(game_ids)
play_by_play = scraper.play_by_play
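The game_state filter in the snippet above is plain pandas, so it can be tried on a toy dataframe without scraping anything. This sketch assumes the schedule is a pandas DataFrame with game_id and game_state columns, as the snippet implies; the game IDs and the "LIVE" and "FUT" labels are made up for illustration, and reading "OFF" as "finished" is my assumption:

```python
import pandas as pd

# Toy stand-in for the schedule dataframe; the IDs and the "LIVE" / "FUT"
# state labels are illustrative assumptions, not real chickenstats output
schedule = pd.DataFrame(
    {
        "game_id": [2024020001, 2024020002, 2024020003, 2024020004],
        "game_state": ["OFF", "OFF", "LIVE", "FUT"],
    }
)

# Boolean mask: True only for rows whose state is "OFF"
is_off = schedule["game_state"] == "OFF"

# Keep the matching rows, then pull the IDs out as a plain Python list
game_ids = schedule.loc[is_off, "game_id"].tolist()

print(game_ids)
```

Filtering first means the scraper is only asked for games that match the chosen state.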
You can then aggregate the play-by-play data for individual and on-ice statistics with one line of code:
stats = scraper.stats
It's very easy to introduce additional detail to the aggregations, including for teammates on-ice:
scraper.prep_stats(teammates=True)
stats = scraper.stats
There is similar functionality for line and team stats:
scraper.prep_lines(position="f")
forward_lines = scraper.lines
team_stats = scraper.team_stats
For additional information on usage and functionality, consult the relevant user guide.
evolving_hockey
The chickenstats.evolving_hockey module manipulates raw CSV files downloaded from
Evolving-Hockey. Using their original shifts & play-by-play data, users can add
information & aggregate for individual & on-ice statistics,
including high-danger shooting events, xG & adjusted xG, faceoffs, & changes.
First, prep a play-by-play dataframe using raw play-by-play and shifts CSV files from the
Evolving-Hockey website:
import pandas as pd
from chickenstats.evolving_hockey import prep_pbp, prep_stats, prep_lines
raw_shifts = pd.read_csv('./raw_shifts.csv')
raw_pbp = pd.read_csv('./raw_pbp.csv')
play_by_play = prep_pbp(raw_pbp, raw_shifts)
You can use the play_by_play dataframe in various aggregations. This will return individual game statistics,
including on-ice (e.g., GF, xGF) & usage (i.e., zone starts), accounting for teammates & opposition on-ice:
individual_game = prep_stats(play_by_play, level='game', teammates=True, opposition=True)
This will return game statistics for forward-line combinations, accounting for opponents on-ice:
forward_lines = prep_lines(play_by_play, level='game', position='f', opposition=True)
For additional information on usage and functionality, consult the relevant user guide.
Help
If you need help with any aspect of chickenstats, from installation to usage, please don't hesitate to reach out!
You can find me on Bluesky at @chickenandstats.com or email me at chicken@chickenandstats.com.
Please report any bugs or issues via the chickenstats issues page, where you can also post feature requests.
Before doing so, please check the roadmap; there might already be plans to include your request.
Acknowledgements
chickenstats wouldn't be possible without the support & efforts of countless others. I am obviously
extremely grateful, even if there are too many of you to thank individually. However, this chicken will do his best.
First & foremost is my wife - the lovely Mrs. Chicken has been patient, understanding, & supportive throughout the countless
hours of development, sometimes to her detriment.
Sincere apologies to the friends & family who have put up with me since my entry into Python, programming, & data
analysis in January 2021. Thank you for being excited for me & with me throughout all of this, especially when you've
had to fake it...
Thank you to the hockey analytics community on (the artist formerly known as) Twitter. You're producing
& reacting to cutting-edge statistical analyses, while providing a supportive, welcoming environment for newcomers.
Thank y'all for everything that you do. This is by no means exhaustive, but there are a few people worth
calling out specifically:
I'm also grateful to the thriving community of Python educators & open-source contributors on Twitter. Thank y'all
for your knowledge & practical advice. Matt Harrison (@mharrison)
deserves a special mention for his books on Pandas and XGBoost, both of which are available at his online
store. Again, not exhaustive, but others worth thanking individually:
Finally, this library depends on a host of other open-source packages. chickenstats is possible because of the efforts
of thousands of individuals, represented below: