ngiab-data-preprocess

Graphical Tools for creating Next Gen Water model input data.



NGIAB Data Preprocess

This repository contains tools for preparing data to run a NextGen simulation using NGIAB. The tools let you select a catchment of interest on an interactive map, choose a date range, and prepare the data in just a few clicks.

map screenshot

Table of Contents

  1. What does this tool do?
  2. Requirements
  3. Installation and Running
  4. Development Installation
  5. Usage
  6. CLI Documentation

What does this tool do?

This tool prepares data to run a NextGen simulation by creating a run package that can be used with NGIAB. It uses default data sources: the v20.1 hydrofabric and NWM Retrospective v3 forcing data.

Requirements

  • This tool is officially supported on macOS and Ubuntu (tested on 22.04 and 24.04). To use it on Windows, please install WSL.
  • GDAL needs to be installed.
  • The ogr2ogr command needs to work in your terminal. On Ubuntu or WSL, sudo apt install gdal-bin installs both GDAL and ogr2ogr.
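
One way to confirm the ogr2ogr requirement before launching the tool is a small PATH check. This is a minimal sketch, not part of the package; the function name missing_gdal_tools is hypothetical, and it only verifies that the executable can be found, not that it runs correctly.

```python
import shutil


def missing_gdal_tools(tools=("ogr2ogr",)):
    """Return the names of required command-line tools not found on PATH."""
    # shutil.which returns None when the executable cannot be located.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_gdal_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("ogr2ogr found on PATH")
```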

Installation and Running

# optional but encouraged: create a virtual environment
python3 -m venv env
source env/bin/activate
# installing and running the tool
pip install ngiab_data_preprocess
python -m map_app

The first time you run this command, it will download the hydrofabric and model parameter files from Lynker Spatial. If you already have them, place conus.gpkg and model_attributes.parquet into modules/data_sources/.

Development Installation

To install and run the tool, follow these steps:

  1. Clone the repository:
    git clone https://github.com/CIROH-UA/NGIAB_data_preprocess
    cd NGIAB_data_preprocess
    
  2. Create a virtual environment and activate it:
    python3 -m venv env
    source env/bin/activate
    
  3. Install the tool:
    pip install -e .
    
  4. Run the map app:
    python -m map_app
    

Usage

Running the command python -m map_app will open the app in a new browser tab. Alternatively, you can manually open it by going to http://localhost:5000 with the app running.

To use the tool:

  1. Select the catchment you're interested in on the map.
  2. Pick the time period you want to simulate.
  3. Click the following buttons in order:
    1. Create subset gpkg
    2. Create Forcing from Zarrs
    3. Create Realization

Once all the steps are finished, you can run NGIAB on the folder shown underneath the subset button.

Note: The tool stores its output in the ./output/<your-first-catchment>/ folder. There is no overwrite protection on these folders, so back up any existing output you want to keep before rerunning.
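
Because the output folders are not protected against overwriting, it can help to check whether a folder is already populated before starting a new run. The following is a rough sketch under the assumption that output lives in ./output/<name>/ relative to the current working directory, as described in the note above; output_folder_in_use is a hypothetical helper, not something the tool provides.

```python
from pathlib import Path


def output_folder_in_use(name, base="output"):
    """Return True if <base>/<name>/ exists and already contains files."""
    folder = Path(base) / name
    # An absent or empty folder is safe to write into.
    return folder.is_dir() and any(folder.iterdir())


if __name__ == "__main__":
    # "cat-5173" is only an example folder name.
    if output_folder_in_use("cat-5173"):
        print("output/cat-5173 already has content; move it aside first")
    else:
        print("output/cat-5173 is free to use")
```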

CLI Documentation

Arguments

  • -h, --help: Show the help message and exit.
  • -i INPUT_FILE, --input_file INPUT_FILE: Path to a CSV or TXT file containing a list of catchment IDs, lat/lon pairs, or gage IDs; or a single catchment ID (e.g., cat-5173), a single lat/lon pair, or a single gage ID.
  • -l, --latlon: Use latitude and longitude instead of catchment IDs. When used with -i, the file should contain lat/lon pairs.
  • -g, --gage: Use gage IDs instead of catchment IDs. When used with -i, the file should contain gage IDs.
  • -s, --subset: Subset the hydrofabric to the given catchment IDs, locations, or gage IDs.
  • -f, --forcings: Generate forcings for the given catchment IDs, locations, or gage IDs.
  • -r, --realization: Create a realization for the given catchment IDs, locations, or gage IDs.
  • --start_date START_DATE: Start date for forcings/realization (format YYYY-MM-DD).
  • --end_date END_DATE: End date for forcings/realization (format YYYY-MM-DD).
  • -o OUTPUT_NAME, --output_name OUTPUT_NAME: Name of the subset to be created (default is the first catchment ID in the input file).
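
Since --start_date and --end_date expect the YYYY-MM-DD format, scripts that generate these values may want to validate them first. This sketch is an illustration only: parse_date_range is a hypothetical helper, the start-before-end check is an assumption about what a sensible range looks like, and the CLI's own validation may differ.

```python
from datetime import datetime


def parse_date_range(start_date, end_date):
    """Parse two YYYY-MM-DD strings and check that start is not after end."""
    # strptime raises ValueError if a string does not match the pattern.
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()
    if start > end:
        raise ValueError(f"start_date {start} is after end_date {end}")
    return start, end


if __name__ == "__main__":
    start, end = parse_date_range("2023-01-01", "2023-12-31")
    print(start.isoformat(), end.isoformat())
```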

Examples

-l, -g, -s, -f, -r can be combined like normal CLI flags. For example, to subset, generate forcings, and create a realization, you can use -sfr or -s -f -r.

  1. Subset hydrofabric using catchment IDs:

    python -m ngiab_data_cli -i catchment_ids.txt -s
    
  2. Generate forcings using a single catchment ID:

    python -m ngiab_data_cli -i cat-5173 -f --start_date 2023-01-01 --end_date 2023-12-31
    
  3. Create realization using lat/lon pairs from a CSV file:

    python -m ngiab_data_cli -i locations.csv -l -r --start_date 2023-01-01 --end_date 2023-12-31 -o custom_output
    
  4. Perform all operations using a single lat/lon pair:

    python -m ngiab_data_cli -i 54.33,-69.4 -l -s -f -r --start_date 2023-01-01 --end_date 2023-12-31
    
  5. Subset hydrofabric using gage IDs from a CSV file:

    python -m ngiab_data_cli -i gage_ids.csv -g -s
    
  6. Generate forcings using a single gage ID:

    python -m ngiab_data_cli -i 01646500 -g -f --start_date 2023-01-01 --end_date 2023-12-31
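
When the same operations need to run for many inputs, the commands above can be assembled programmatically. The sketch below only builds the argument list from the flags documented in this section; build_cli_command is a hypothetical helper, and the flag order it produces is an arbitrary choice. It prints the command rather than running it.

```python
import shlex
import sys


def build_cli_command(input_value, *, latlon=False, gage=False, subset=False,
                      forcings=False, realization=False,
                      start_date=None, end_date=None, output_name=None):
    """Build an argument list for `python -m ngiab_data_cli`."""
    cmd = [sys.executable, "-m", "ngiab_data_cli", "-i", input_value]
    # Boolean switches, using the short forms listed under Arguments.
    for enabled, flag in ((latlon, "-l"), (gage, "-g"), (subset, "-s"),
                          (forcings, "-f"), (realization, "-r")):
        if enabled:
            cmd.append(flag)
    if start_date is not None:
        cmd += ["--start_date", start_date]
    if end_date is not None:
        cmd += ["--end_date", end_date]
    if output_name is not None:
        cmd += ["-o", output_name]
    return cmd


if __name__ == "__main__":
    cmd = build_cli_command("cat-5173", forcings=True,
                            start_date="2023-01-01", end_date="2023-12-31")
    # Print a shell-quoted version; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids shell quoting problems with inputs such as a lat/lon pair that contains a comma and a leading minus sign.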
    

File Formats

1. Catchment ID input:

  • CSV file: A single column of catchment IDs, or a column named 'cat_id', 'catchment_id', or 'divide_id'.
  • TXT file: One catchment ID per line.

Example CSV (catchment_ids.csv):

cat_id,soil_type
cat-5173,some
cat-5174,data
cat-5175,here

Or:

cat-5173
cat-5174
cat-5175

2. Lat/Lon input:

  • CSV file: Two columns named 'lat' and 'lon', or two unnamed columns in that order.
  • Single pair: Comma-separated values passed directly to the -i argument.

Example CSV (locations.csv):

lat,lon
54.33,-69.4
55.12,-68.9
53.98,-70.1

Or:

54.33,-69.4
55.12,-68.9
53.98,-70.1

3. Gage ID input:

  • CSV file: A single column of gage IDs, or a column named 'gage' or 'gage_id'.
  • TXT file: One gage ID per line.
  • Single gage ID: Passed directly to the -i argument.

Example CSV (gage_ids.csv):

gage_id,station_name
01646500,Potomac River
01638500,Shenandoah River
01578310,Susquehanna River

Or:

01646500
01638500
01578310
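
Gage IDs such as 01646500 start with a zero, so any script that prepares or inspects these files should treat them as text rather than numbers. The following sketch reads both layouts shown above with Python's csv module. It is an assumption-laden illustration, not the tool's own parser: read_gage_ids is a hypothetical helper, and it only recognizes the 'gage' and 'gage_id' column names mentioned in this section.

```python
import csv
import io

GAGE_COLUMNS = ("gage", "gage_id")


def read_gage_ids(text):
    """Return gage IDs as strings from CSV text, keeping leading zeros."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return []
    header = [cell.strip().lower() for cell in rows[0]]
    for name in GAGE_COLUMNS:
        if name in header:
            col = header.index(name)
            # Skip the header row and read the named column.
            return [row[col].strip() for row in rows[1:]]
    # No recognized header: treat the file as a single unnamed column.
    return [row[0].strip() for row in rows]


if __name__ == "__main__":
    sample = "gage_id,station_name\n01646500,Potomac River\n01638500,Shenandoah River\n"
    print(read_gage_ids(sample))
```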

Output

The script creates an output folder named after the provided output name if one is given. Otherwise, the name is taken from the first catchment ID in the input file, or derived from the first lat/lon pair or gage ID. This folder will contain the results of the subsetting, forcings generation, and realization creation operations.
