chickenstats
About
chickenstats is a Python package for scraping & analyzing sports data. With just a few lines of code:
- Scrape & manipulate data from various NHL endpoints, leveraging chickenstats.chicken_nhl, which includes a proprietary xG model for shot quality metrics
- Augment play-by-play data & generate custom aggregations from raw CSV files downloaded from Evolving-Hockey (subscription required) with chickenstats.evolving_hockey
For more in-depth explanations, tutorials, & detailed reference materials, consult the Documentation.
Compatibility
chickenstats requires Python 3.10 or greater & runs on the latest stable versions of Linux, macOS, & Windows
operating systems.
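The version requirement above can be checked from the interpreter before installing. This is a minimal sketch using only the standard library; the helper name meets_minimum is hypothetical and not part of chickenstats.

```python
import sys

# Minimum version taken from the compatibility note above
MINIMUM_PYTHON = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


status = "OK" if meets_minimum() else "too old"
print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Comparing tuples keeps the check correct for versions like 3.9 versus 3.10, where a string comparison would get the order wrong.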
Installation
Very simple - install from PyPI using pip. Best practice is to develop in an isolated virtual environment (conda or otherwise),
but who's a chicken to judge?
pip install chickenstats
To confirm the installation & check that you have the latest version (1.7.8):
pip show chickenstats
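The same check can be made from within Python. Here is a minimal sketch using the standard library's importlib.metadata, which reads the version recorded at install time; the helper name installed_version is hypothetical and not part of chickenstats.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("chickenstats")
print(found if found is not None else "chickenstats is not installed")
```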
Usage
chickenstats is structured as two underlying modules, each used with different data sources:
- chickenstats.chicken_nhl
- chickenstats.evolving_hockey
The package is under active development - features will be added or modified over time.
chicken_nhl
The chickenstats.chicken_nhl module scrapes & manipulates data directly from various NHL endpoints,
with outputs including schedule & game results, rosters, & play-by-play data.
The example below scrapes the schedule for the Nashville Predators, extracts the game IDs, then
scrapes play-by-play data for the first ten regular season games.
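from chickenstats.chicken_nhl import Season, Scraper
# Schedule for the 2023 season, limited to Nashville
season = Season(2023)
nsh_schedule = season.schedule('NSH')
# Keep only games whose game_state is "OFF", then take the first ten game IDs
nsh_schedule_reg = nsh_schedule.loc[nsh_schedule.game_state == "OFF"].reset_index(drop=True)
game_ids = nsh_schedule_reg.game_id.tolist()[:10]
# Scrape play-by-play data for those games
scraper = Scraper(game_ids)
play_by_play = scraper.play_by_play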
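To see what the filtering step does without touching the network, here is a minimal sketch on a hand-made DataFrame. Only the column names game_id and game_state come from the example above; the rows, the game IDs, & the "FUT" state value are made-up placeholders, and the real schedule likely has more columns.

```python
import pandas as pd

# Toy stand-in for a schedule; the values are placeholders, not real games
toy_schedule = pd.DataFrame(
    {
        "game_id": [1, 2, 3, 4],
        "game_state": ["OFF", "OFF", "FUT", "OFF"],
    }
)

# Same pattern as above: keep rows whose state is "OFF", renumber the index
off_games = toy_schedule.loc[toy_schedule.game_state == "OFF"].reset_index(drop=True)

# Slicing past the end of a list is safe, so [:10] returns at most ten IDs
game_ids = off_games.game_id.tolist()[:10]
print(game_ids)
```

Because the slice never raises on short lists, the same line works early in a season when fewer than ten games match the filter.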
evolving_hockey
The chickenstats.evolving_hockey module manipulates raw CSV files downloaded from
Evolving-Hockey. Using their original shifts & play-by-play data, users can add
information & aggregate individual & on-ice statistics,
including high-danger shooting events, xG & adjusted xG, faceoffs, & changes.
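import pandas as pd
from chickenstats.evolving_hockey import prep_pbp, prep_stats, prep_lines
# Load the raw shifts & play-by-play files downloaded from Evolving-Hockey
raw_shifts = pd.read_csv('./raw_shifts.csv')
raw_pbp = pd.read_csv('./raw_pbp.csv')
# Combine the two raw files into a single play-by-play dataframe
play_by_play = prep_pbp(raw_pbp, raw_shifts)
# Aggregate individual statistics & forward line statistics at the game level
individual_game = prep_stats(play_by_play, level='game', teammates=True, opposition=True)
forward_lines = prep_lines(play_by_play, level='game', position='f', opposition=True)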
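pd.read_csv raises FileNotFoundError on the first missing file, so it can help to check both downloads up front. This is a minimal sketch using only the standard library; missing_files is a hypothetical helper, not part of chickenstats, & the file names simply mirror the example above.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [Path(p) for p in paths if not Path(p).is_file()]


required = ["./raw_shifts.csv", "./raw_pbp.csv"]
missing = missing_files(required)
if missing:
    print("Missing:", ", ".join(str(p) for p in missing))
else:
    print("Both files found")
```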
Acknowledgements
chickenstats wouldn't be possible without the support & efforts of countless others. I am obviously
extremely grateful, even if there are too many of you to thank individually. However, this chicken will do his best.
First & foremost is my wife - the lovely Mrs. Chicken has been patient, understanding, & supportive throughout the countless
hours of development, sometimes to her detriment.
Sincere apologies to the friends & family that have put up with me since my entry into Python, programming, & data
analysis in January 2021. Thank you for being excited for me & with me throughout all of this, especially when you've
had to fake it...
Thank you to the hockey analytics community on (the artist formerly known as) Twitter. You're producing
& reacting to cutting-edge statistical analyses, while providing a supportive, welcoming environment for newcomers.
Thank y'all for everything that you do. This is by no means exhaustive, but there are a few people worth
calling out specifically:
I'm also grateful to the thriving community of Python educators & open-source contributors on Twitter. Thank y'all
for your knowledge & practical advice. Matt Harrison (@mharrison)
deserves a special mention for his books on Pandas and XGBoost, both of which are available at his online
store. Again, not exhaustive, but others worth thanking individually:
Finally, this library depends on a host of other open-source packages. chickenstats is possible because of the efforts
of thousands of individuals, represented below: