NGIAB Data Preprocess
This repository contains tools for preparing data to run a NextGen simulation using NGIAB. The tools allow you to select a catchment of interest on an interactive map, choose a date range, and prepare the data with just a few clicks!
Table of Contents
- What does this tool do?
- Requirements
- Installation and Running
- Development Installation
- Usage
- CLI Documentation
What does this tool do?
This tool prepares data to run a NextGen simulation by creating a run package that can be used with NGIAB.
It uses geometry and model attributes from the v2.2 hydrofabric. More information on all data sources is available here.
The raw forcing data is the NWM retrospective v3 forcing data.
- Subsets (delineates) everything upstream of your point of interest (catchment, gage, flowpath, etc.) and outputs the result as a geopackage.
- Calculates forcings as a weighted mean of the gridded AORC forcings. Weights are calculated using exactextract, and the weighted mean is computed with numpy.
- Creates the configuration files needed to run NextGen:
  - realization.json: ngen model configuration
  - troute.yaml: routing configuration
  - per-catchment model configuration
- Optionally runs a non-interactive NextGen In A Box (NGIAB).
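The weighted-mean forcing step listed above can be sketched with numpy. This is a minimal illustration under stated assumptions, not the tool's implementation: the function name `weighted_catchment_mean`, the array shapes, and the sample values are hypothetical, and the coverage fractions stand in for the weights that exactextract would produce.

```python
import numpy as np

def weighted_catchment_mean(grid_values, cell_indices, coverage):
    """Return the coverage-weighted mean of a gridded variable per time step.

    grid_values:  array of shape (time, n_cells), the flattened forcing grid
    cell_indices: indices of the grid cells that intersect the catchment
    coverage:     fraction of each of those cells covered by the catchment
    """
    values = np.asarray(grid_values, dtype=float)[:, cell_indices]
    weights = np.asarray(coverage, dtype=float)
    # np.average normalises by the sum of the weights along the cell axis.
    return np.average(values, axis=1, weights=weights)

# Hypothetical data: 2 time steps over a 4-cell grid.
grid = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 2.0, 2.0, 2.0]])
means = weighted_catchment_mean(grid, cell_indices=[1, 2, 3], coverage=[0.5, 1.0, 0.5])
print(means)
```

Cells that are only partly inside the catchment contribute in proportion to their covered fraction, so boundary cells do not bias the catchment average.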
What does it not do?
Evaluation
For automatic evaluation using TEEHR, please run NGIAB interactively using the guide.sh script.
Visualisation
For automatic interactive visualisation, please run NGIAB interactively using the guide.sh script.
Requirements
- This tool is officially supported on macOS or Ubuntu (tested on 22.04 & 24.04). To use it on Windows, please install WSL.
Installation and Running
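If a conda environment is active (for example, the (notebook) environment on a JupyterHub), deactivate it first:
(notebook) jovyan@jupyter-user:~$ conda deactivate
jovyan@jupyter-user:~$
Then create a virtual environment, install the package, and run the map app: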
python3 -m venv .venv
source .venv/bin/activate
pip install 'ngiab_data_preprocess'
python -m map_app
The first time you run this command, it will download the hydrofabric from Lynker Spatial. If you already have it, place conus_nextgen.gpkg into ~/.ngiab/hydrofabric/v2.2/.
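To see in advance whether the first run will need to download the hydrofabric, you can check for the file yourself. The sketch below uses only the path and file name given above; the helper name `hydrofabric_present` is hypothetical and is not part of the package.

```python
from pathlib import Path

# Location and file name taken from the instructions above.
HYDROFABRIC_PATH = Path.home() / ".ngiab" / "hydrofabric" / "v2.2" / "conus_nextgen.gpkg"

def hydrofabric_present(path=HYDROFABRIC_PATH):
    """Return True if a non-empty hydrofabric file already exists at `path`."""
    return path.is_file() and path.stat().st_size > 0

if hydrofabric_present():
    print(f"Found existing hydrofabric at {HYDROFABRIC_PATH}")
else:
    print("No local hydrofabric found; the first run is expected to download it.")
```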
Development Installation
To install and run the tool, follow these steps:
- Clone the repository:
git clone https://github.com/CIROH-UA/NGIAB_data_preprocess
cd NGIAB_data_preprocess
- Create a virtual environment and activate it:
python3 -m venv env
source env/bin/activate
- Install the tool:
pip install -e .
- Run the map app:
python -m map_app
Usage
Running the command python -m map_app will open the app in a new browser tab.
To use the tool:
- Select the catchment you're interested in on the map.
- Pick the time period you want to simulate.
- Click the following buttons in order:
- Create subset gpkg
- Create Forcing from Zarrs
- Create Realization
Once all the steps are finished, you can run NGIAB on the folder shown underneath the subset button.
Note: By default, output is stored in the ~/ngiab_preprocess_output/<your-input-feature>/ folder. There is no overwrite protection on the folders, so re-running with the same input feature can overwrite existing files.
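Because the output folders are not protected, it can help to check for existing content before starting a new run. The following is a minimal sketch: the helper names `default_output_dir` and `warn_if_exists` are hypothetical, and the folder layout is assumed to match the default path described in the note above.

```python
from pathlib import Path

def default_output_dir(feature, root=None):
    """Build the default output folder for a feature such as 'cat-5173'."""
    root = Path(root) if root is not None else Path.home() / "ngiab_preprocess_output"
    return root / feature

def warn_if_exists(feature, root=None):
    """Return the output folder and whether it already holds any files."""
    out = default_output_dir(feature, root)
    has_content = out.is_dir() and any(out.iterdir())
    if has_content:
        print(f"Warning: {out} already exists and may be overwritten.")
    return out, has_content

out_dir, occupied = warn_if_exists("cat-5173")
```

If `occupied` is true, consider renaming the folder or choosing a different output name with -o before running the tool again.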
CLI Documentation
Arguments
- -h, --help: Show the help message and exit.
- -i INPUT_FEATURE, --input_feature INPUT_FEATURE: ID of feature to subset. Providing a prefix will automatically convert to catid, e.g., cat-5173 or gage-01646500 or wb-1234.
- -l, --latlon: Use latitude and longitude instead of catid. Expects comma-separated values via the CLI, e.g., python -m ngiab_data_cli -i 54.33,-69.4 -l -s.
- -g, --gage: Use gage ID instead of catid. Expects a single gage ID via the CLI, e.g., python -m ngiab_data_cli -i 01646500 -g -s.
- -s, --subset: Subset the hydrofabric to the given feature.
- -f, --forcings: Generate forcings for the given feature.
- -r, --realization: Create a realization for the given feature.
- --start_date START_DATE, --start START_DATE: Start date for forcings/realization (format YYYY-MM-DD).
- --end_date END_DATE, --end END_DATE: End date for forcings/realization (format YYYY-MM-DD).
- -o OUTPUT_NAME, --output_name OUTPUT_NAME: Name of the output folder.
- -D, --debug: Enable debug logging.
- --run: Automatically run Next Gen against the output folder.
- --validate: Run every missing step required to run ngiab.
- -a, --all: Run all operations: subset, forcings, realization, run Next Gen.
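The input forms described above (prefixed IDs, bare gage IDs with -g, and lat/lon pairs with -l) can be told apart with a few string checks. This sketch is illustrative only: `classify_feature` is a hypothetical helper, not the tool's own parsing code, and it does not perform the conversion to a catid.

```python
def classify_feature(feature, latlon=False, gage=False):
    """Guess how an input feature string would be interpreted (illustrative only)."""
    if latlon:
        # With -l, the input is expected as "latitude,longitude".
        lat, lon = (float(part) for part in feature.split(","))
        return ("latlon", (lat, lon))
    if gage or feature.startswith("gage-"):
        # A gage- prefix makes the -g flag unnecessary.
        return ("gage", feature.removeprefix("gage-"))
    for prefix in ("cat-", "wb-"):
        if feature.startswith(prefix):
            return (prefix.rstrip("-"), feature.removeprefix(prefix))
    raise ValueError(f"Unrecognised feature ID: {feature!r}")

print(classify_feature("gage-01646500"))
print(classify_feature("54.33,-69.4", latlon=True))
```

Gage IDs are kept as strings so that leading zeros, as in 01646500, are preserved.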
Usage Notes
- If your input has a prefix of gage-, you do not need to pass -g.
- The -l, -g, -s, -f, -r flags can be combined like normal CLI flags. For example, to subset, generate forcings, and create a realization, you can use -sfr or -s -f -r.
- When using the --all flag, it automatically sets subset, forcings, realization, and run to True.
- Using the --run flag automatically sets the --validate flag.
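The flag behaviour in the notes above can be mirrored with a small argparse sketch. It is not the tool's actual parser: it covers only the flags relevant to these notes, `parse_flags` is a hypothetical name, and the assumption that --all also ends up setting --validate (because it sets run) is an inference from the notes rather than documented behaviour.

```python
import argparse

def parse_flags(argv):
    """Parse a subset of the documented flags and apply the implied settings."""
    parser = argparse.ArgumentParser(description="Illustrative mirror of documented flags")
    parser.add_argument("-s", "--subset", action="store_true")
    parser.add_argument("-f", "--forcings", action="store_true")
    parser.add_argument("-r", "--realization", action="store_true")
    parser.add_argument("--run", action="store_true")
    parser.add_argument("--validate", action="store_true")
    parser.add_argument("-a", "--all", action="store_true")
    args = parser.parse_args(argv)
    if args.all:
        # --all switches on every operation, including the run.
        args.subset = args.forcings = args.realization = args.run = True
    if args.run:
        # --run implies --validate (assumed to apply when run is set via --all too).
        args.validate = True
    return args

combined = parse_flags(["-sfr"])
separate = parse_flags(["-s", "-f", "-r"])
print(vars(combined) == vars(separate))
```

Short boolean flags declared with `store_true` can be grouped, which is why -sfr and -s -f -r parse to the same result.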
Examples
- Prepare everything for a nextgen run at a given gage:
python -m ngiab_data_cli -i gage-10154200 -sfr --start 2022-01-01 --end 2022-02-28
- Subset hydrofabric using catchment ID:
python -m ngiab_data_cli -i cat-7080 -s
- Generate forcings using a single catchment ID:
python -m ngiab_data_cli -i cat-5173 -f --start 2022-01-01 --end 2022-02-28
- Create realization using a lat/lon pair and output to a named folder:
python -m ngiab_data_cli -i 54.33,-69.4 -l -r --start 2022-01-01 --end 2022-02-28 -o custom_output
- Perform all operations using a lat/lon pair:
python -m ngiab_data_cli -i 54.33,-69.4 -l -s -f -r --start 2022-01-01 --end 2022-02-28
- Subset hydrofabric using gage ID:
python -m ngiab_data_cli -i 10154200 -g -s
python -m ngiab_data_cli -i gage-10154200 -s
- Generate forcings using a single gage ID:
python -m ngiab_data_cli -i 01646500 -g -f --start 2022-01-01 --end 2022-02-28
- Run all operations, including the Next Gen run:
python -m ngiab_data_cli -i cat-5173 -a --start 2022-01-01 --end 2022-02-28
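To repeat the same preparation for several features, the commands above can be assembled in a script. The sketch below only builds and prints the commands, using flags documented in this section; `build_cli_command` is a hypothetical helper, and the date check is an added convenience rather than something the CLI is known to do.

```python
import shlex
import sys
from datetime import date

def build_cli_command(feature, start, end, flags="-sfr", output_name=None):
    """Assemble an ngiab_data_cli invocation as an argument list."""
    # Validate the YYYY-MM-DD format and the ordering before building the command.
    start_d, end_d = date.fromisoformat(start), date.fromisoformat(end)
    if end_d < start_d:
        raise ValueError("end date precedes start date")
    cmd = [sys.executable, "-m", "ngiab_data_cli", "-i", feature, flags,
           "--start", start, "--end", end]
    if output_name:
        cmd += ["-o", output_name]
    return cmd

for gage in ["gage-10154200", "gage-01646500"]:
    cmd = build_cli_command(gage, "2022-01-01", "2022-02-28")
    print(shlex.join(cmd))
    # To execute, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list avoids shell quoting problems, and using `sys.executable` keeps the call inside the currently active virtual environment.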