Herbie is a Python package that downloads recent and archived numerical weather prediction (NWP) model output from different cloud archive sources. NWP data is distributed in GRIB2 format, which Herbie reads using xarray+cfgrib. Herbie also provides some extra features to help visualize and extract data.
Herbie helps you discover, download, and read data from:
Much of this data is made available through the NOAA Open Data Dissemination (NODD) program (formerly the Big Data Program), which has made weather data more accessible than ever before.
The easiest way to install Herbie and its dependencies is with Conda from conda-forge.
conda install -c conda-forge herbie-data
You may also create the provided Conda environment, environment.yml.
# Download environment file
wget https://github.com/blaylockbk/Herbie/raw/main/environment.yml
# Modify that file if you wish.
# Create the environment
conda env create -f environment.yml
# Activate the environment
conda activate herbie
Alternatively, Herbie is published on PyPI and you can install it with pip, but it requires some dependencies that you will have to install yourself:
Once those are installed in your environment, you can install Herbie with pip.
# Latest published version
pip install herbie-data
# ~~ or ~~
# Most recent changes
pip install git+https://github.com/blaylockbk/Herbie.git
# Dependencies for extra features
pip install "herbie-data[extra]"
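To confirm what ended up in the active environment, a quick check with the Python standard library can help. This is a minimal sketch and not part of Herbie: the helper name `installed_version` is hypothetical, and the loop checks only `herbie-data` plus the `xarray` and `cfgrib` readers mentioned above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report on Herbie and the two readers it relies on for GRIB2 data.
for name in ("herbie-data", "xarray", "cfgrib"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```

Any line that reports "not installed" points at a package that still needs to be added with Conda or pip.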
graph TD;
d1[(HRRR)] -..-> H
d2[(RAP)] -.-> H
d3[(GFS)] -..-> H
d33[(GEFS)] -.-> H
d4[(IFS)] -..-> H
d44[(AIFS)] -..-> H
d5[(NBM)] -.-> H
d6[(RRFS)] -..-> H
d7[(RTMA)] -.-> H
d8[(URMA)] -..-> H
H((Herbie))
H --- .inventory
H --- .download
H --- .xarray
style H fill:#d8c89d,stroke:#0c3576,stroke-width:4px,color:#000000
from herbie import Herbie
# Herbie object for the HRRR model 6-hr surface forecast product
H = Herbie(
    '2021-01-01 12:00',
    model='hrrr',
    product='sfc',
    fxx=6
)
# Look at file contents
H.inventory()
# Download the full GRIB2 file
H.download()
# Download a subset, like all fields at 500 mb
H.download(":500 mb")
# Read subset with xarray, like 2-m temperature.
H.xarray("TMP:2 m")
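The search strings passed to `H.download()` and `H.xarray()` above can be thought of as patterns matched against the lines of the file inventory. The sketch below illustrates that idea on a few made-up, wgrib2-style index lines. It is an illustration under stated assumptions, not Herbie's actual implementation: the sample index text, the byte offsets, and the helper names `parse_index` and `select` are all hypothetical.

```python
import re

# Made-up index lines for illustration only (wgrib2 style):
# message_number:start_byte:reference_date:variable:level:forecast_time:
SAMPLE_INDEX = """\
1:0:d=2021010112:REFC:entire atmosphere:6 hour fcst:
2:1000:d=2021010112:HGT:500 mb:6 hour fcst:
3:2500:d=2021010112:TMP:500 mb:6 hour fcst:
4:4000:d=2021010112:TMP:2 m above ground:6 hour fcst:
"""


def parse_index(text):
    """Turn index text into records with a message number and byte range."""
    records = []
    for line in text.strip().splitlines():
        number, start, rest = line.split(":", 2)
        records.append(
            {
                "message": int(number),
                "start_byte": int(start),
                # Keep the leading colon so patterns like ":500 mb" can match.
                "search_this": ":" + rest,
            }
        )
    # Each message ends one byte before the next one starts.
    for current, following in zip(records, records[1:]):
        current["end_byte"] = following["start_byte"] - 1
    if records:
        # The last message runs to the end of the file (length unknown here).
        records[-1]["end_byte"] = None
    return records


def select(records, pattern):
    """Return the records whose description matches the regular expression."""
    regex = re.compile(pattern)
    return [r for r in records if regex.search(r["search_this"])]


records = parse_index(SAMPLE_INDEX)
for record in select(records, ":500 mb"):
    print(record["message"], record["start_byte"], record["end_byte"])
```

With the sample lines, `":500 mb"` selects messages 2 and 3, while `"TMP:2 m"` selects only message 4. Knowing the byte range of each matching message is what makes a partial download plausible, since only those ranges would need to be requested.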
Herbie downloads model data from the following sources, but can be extended to include others:
Having trouble using Herbie or have a question? ❔ GitHub Discussions/Ask For Help
Just want to talk about Herbie or have an idea? 💬 GitHub Discussions
See something that might be wrong? 🚑 GitHub Issues
Want to contribute? Great! I'd love your help.
If Herbie played an important role in your work, please tell me about it! Also, consider including a citation or acknowledgement in your article or product.
Suggested Citation
Blaylock, B. K. (YEAR). Herbie: Retrieve Numerical Weather Prediction Model Data (Version 20xx.x.x) [Computer software]. https://doi.org/10.5281/zenodo.4567540
Suggested Acknowledgment
A portion of this work used code generously provided by Brian Blaylock's Herbie Python package (Version 20xx.x.x) (https://doi.org/10.5281/zenodo.4567540)
During my PhD at the University of Utah, I created what was, at the time, the only publicly accessible archive of HRRR data. Over 1,000 research scientists and professionals used that archive.
Blaylock B., J. Horel and S. Liston, 2017: Cloud Archiving and Data Mining of High Resolution Rapid Refresh Model Output. Computers and Geosciences. 109, 43-50. https://doi.org/10.1016/j.cageo.2017.08.005.
Herbie was then developed to access HRRR data from that archive and was first used on the Open Science Grid.
Blaylock, B. K., J. D. Horel, and C. Galli, 2018: High-Resolution Rapid Refresh Model Data Analytics Derived on the Open Science Grid to Assist Wildland Fire Weather Assessment. J. Atmos. Oceanic Technol., 35, 2213–2227, https://doi.org/10.1175/JTECH-D-18-0073.1.
In the latter half of 2020, the HRRR dataset from 2014 to present was made available through the NOAA Open Data Dissemination (NODD) Program (formerly NOAA's Big Data Program). The latest version of Herbie organizes and expands my original download scripts into a more coherent package with the extended ability to download data for other models from many different archive sources, and it continues to evolve.
I originally released this package under the name “HRRR-B” because it only worked with the HRRR dataset; the “B” was for Brian. Since then, I have added the ability to download many more models including RAP, GFS, ECMWF, GEFS, and RRFS, with the potential to add more models in the future. Thus, I renamed the package Herbie, after one of my favorite childhood movies.
The University of Utah MesoWest group now manages a HRRR archive in Zarr format. Maybe someday, Herbie will be able to take advantage of that archive.
Thanks for using Herbie, and happy racing!
🏁 Brian
P.S. If you like Herbie, check out my other repos:
Note: Alternative Download Tools
As an alternative to Herbie, you can use rclone to download files from AWS or GCP. I love rclone. Here is a short rclone tutorial
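Files in these cloud archives can also be fetched directly over HTTPS. The sketch below builds the URL for a HRRR file on AWS using only the Python standard library. The bucket name `noaa-hrrr-bdp-pds` and the path layout are assumptions about how the public HRRR archive is organized, and `hrrr_aws_url` is a hypothetical helper, so check the archive listing before relying on it.

```python
from datetime import datetime

# Assumed bucket and path layout for the public HRRR archive on AWS.
AWS_HRRR_BASE = "https://noaa-hrrr-bdp-pds.s3.amazonaws.com"


def hrrr_aws_url(run, product="sfc", fxx=0):
    """Build the URL of a CONUS HRRR GRIB2 file (hypothetical helper)."""
    return (
        f"{AWS_HRRR_BASE}/hrrr.{run:%Y%m%d}/conus/"
        f"hrrr.t{run:%H}z.wrf{product}f{fxx:02d}.grib2"
    )


# Same request as the Herbie example: 12:00 UTC run, surface product, 6-hr forecast.
url = hrrr_aws_url(datetime(2021, 1, 1, 12), product="sfc", fxx=6)
print(url)

# To download, uncomment the next two lines (the full file is large):
# from urllib.request import urlretrieve
# urlretrieve(url, "hrrr.t12z.wrfsfcf06.grib2")
```

Unlike Herbie, this approach does no discovery across sources and no subsetting; it simply names one file in one archive.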