cdo-api-py

Python interface to the NOAA Climate Data Online (CDO) API, which is described in full detail on the CDO documentation site. Built to allow quick and easy querying of weather data into pandas DataFrame objects.

Installation

pip install cdo-api-py

or for python3

pip3 install cdo-api-py

Example Use

To start, you'll need an API token, which you can request from the NOAA CDO web services site.

Import a few libraries and instantiate a client. default_units and default_limit are optional keyword arguments.

from cdo_api_py import Client
import pandas as pd
from datetime import datetime
from pprint import pprint
token = "my_token_here"     # be sure not to share your token publicly
my_client = Client(token, default_units='metric', default_limit=1000)
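Because the token should stay private, one option is to read it from the environment rather than hardcoding it in a script. The sketch below assumes a hypothetical environment variable named CDO_TOKEN and a hypothetical helper load_token; neither is part of cdo-api-py.

```python
import os


def load_token(env=None, var_name="CDO_TOKEN"):
    """Return the API token stored in an environment variable.

    CDO_TOKEN is a hypothetical variable name chosen for this sketch;
    cdo-api-py does not define or read it on its own.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your CDO token"
        )
    return token
```

The returned string can then be passed to the client in place of the literal, for example Client(load_token(), default_units='metric', default_limit=1000).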

Once a client has been initialized, we can define a few variables that describe the data we want. Because this package is only a Python client for the CDO API, it also accepts keyword arguments that are passed directly to the API. Those arguments aren't detailed here, so you may need to browse the options available for your dataset of choice.

The example below uses the very common GHCN-Daily (GHCND) weather dataset. First, we define the north, south, east, and west lat/lon coordinates of a bounding box around the general Washington DC area. Next, we define the dates we're interested in (optional) and the dataset id. Finally, because we only want specific values from the dataset, we save those in a list as datatypeid (optional).

extent = {
    "north": 39.14,
    "south": 38.68,
    "east": -76.65,
    "west": -77.35,
}

startdate = datetime(2016, 12, 1)
enddate = datetime(2016, 12, 31)

datasetid='GHCND'
datatypeid=['TMIN', 'TMAX', 'PRCP']
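Swapping north with south or east with west is an easy mistake, so it can help to sanity-check the bounding box before sending a request. The helpers check_extent and contains below are hypothetical and not part of cdo-api-py. The sketch also assumes the box does not cross the antimeridian.

```python
def check_extent(extent):
    """Return the extent unchanged, or raise ValueError if it is malformed."""
    missing = {"north", "south", "east", "west"} - set(extent)
    if missing:
        raise ValueError(f"extent is missing keys: {sorted(missing)}")
    if not -90 <= extent["south"] < extent["north"] <= 90:
        raise ValueError("need -90 <= south < north <= 90")
    if not -180 <= extent["west"] < extent["east"] <= 180:
        raise ValueError("need -180 <= west < east <= 180")
    return extent


def contains(extent, lat, lon):
    """Return True if the point (lat, lon) lies inside the bounding box."""
    return (
        extent["south"] <= lat <= extent["north"]
        and extent["west"] <= lon <= extent["east"]
    )


# Same bounding box as the Washington DC example above.
dc_extent = check_extent(
    {"north": 39.14, "south": 38.68, "east": -76.65, "west": -77.35}
)
print(contains(dc_extent, 38.90, -77.04))  # a point in central Washington DC
```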

Now we pass all of these to a single method call on our client, my_client, to find stations of interest. Setting return_dataframe=True automatically assembles the results into a dataframe.

stations = my_client.find_stations(
    datasetid=datasetid,
    extent=extent,
    startdate=startdate,
    enddate=enddate,
    datatypeid=datatypeid,
    return_dataframe=True)
pprint(stations)

Now that we have a list of stations with data useful to us, we can iterate over them and pass each station's id as the stationid argument to the get_data_by_station method.

for rowid, station in stations.iterrows():  # remember this is a pandas dataframe!
    station_data = my_client.get_data_by_station(
        datasetid=datasetid,
        stationid=station['id'],
        startdate=startdate,
        enddate=enddate,
        return_dataframe=True,
        include_station_meta=True   # flatten station metadata with ghcnd readings
    )
    pprint(station_data)

We can modify this slightly to concatenate all the small dataframes into one big dataframe and save it as a CSV.

frames = []  # collect each station's dataframe, then concatenate once
for rowid, station in stations.iterrows():  # remember this is a pandas dataframe!
    station_data = my_client.get_data_by_station(
        datasetid=datasetid,
        stationid=station['id'],
        startdate=startdate,
        enddate=enddate,
        return_dataframe=True,
        include_station_meta=True   # flatten station metadata with ghcnd readings
    )
    pprint(station_data)
    frames.append(station_data)

big_df = pd.concat(frames)  # one concat avoids re-copying big_df on every pass

print(big_df)
big_df = big_df.sort_values(by='date').reset_index()
big_df.to_csv('dc_ghcnd_example_output.csv')

See the full example code: DC weather data example

It may take a bit of manual searching to familiarize yourself with the NOAA CDO offerings, but once you know which arguments you'd like to use, this client should make it quite easy to automate weather data retrievals. The server imposes many requirements and limits on the requests it will accept. This client automatically determines whether a request must be split into smaller pieces, then creates them, sends them, and pieces the results back together into a single coherent response without any additional effort on your part.
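The client's own splitting logic is not shown in this README. Purely as an illustration of the idea, the sketch below breaks a long date range into consecutive, non-overlapping chunks. The function name split_date_range and the 365-day default are assumptions made for this example; the real limits depend on the dataset and are set by the CDO server.

```python
from datetime import datetime, timedelta


def split_date_range(start, end, max_days=365):
    """Split [start, end] (inclusive) into chunks of at most max_days days.

    max_days=365 is an assumed limit for illustration only.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        # Each chunk covers max_days calendar days, clipped to the end date.
        chunk_end = min(chunk_start + timedelta(days=max_days - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return chunks


for chunk_start, chunk_end in split_date_range(
    datetime(2015, 1, 1), datetime(2016, 12, 31)
):
    print(chunk_start.date(), "to", chunk_end.date())
```

Each pair could then serve as the startdate and enddate of one smaller request, with the per-chunk results concatenated afterwards. The December 2016 range used in the example above fits in a single chunk.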


You can explore the available endpoints either at the CDO documentation site or quickly with:

pprint(my_client.list_endpoints())

# returned at time of writing
{'data': 'A datum is an observed value along with any ancillary attributes at '
         'a specific place and time.',
 'datacategories': 'A data category is a general type of data used to group '
                   'similar data types.',
 'datasets': 'A dataset is the primary grouping for data at NCDC',
 'datatypes': 'A data type is a specific type of data that is often unique to '
              'a dataset.',
 'locationcategories': 'A location category is a grouping of similar '
                       'locations.',
 'locations': 'A location is a geopolitical entity.',
 'stations': 'A station is a any weather observing platform where data is '
             'recorded.'}

At the time of writing, there are 11 available datasets: ['GHCND', 'GSOM', 'GSOY', 'NEXRAD2', 'NEXRAD3', 'NORMAL_ANN', 'NORMAL_DLY', 'NORMAL_HLY', 'NORMAL_MLY', 'PRECIP_15', 'PRECIP_HLY']. View the full details with:

pprint(my_client.list_datasets())

There are more than 1000 datatypes, but you can list them all with:

pprint(my_client.list_datatypes())
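With that many datatypes, a keyword filter can be easier to work with than reading the full listing. The sketch below assumes the listing can be treated as an iterable of dict records with 'id' and 'name' keys. That mirrors the raw CDO API response but is not confirmed for this client's return value, so adapt it to what list_datatypes actually returns. The helper filter_datatypes and the sample records are hypothetical.

```python
def filter_datatypes(records, keyword):
    """Return records whose 'id' or 'name' contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        rec
        for rec in records
        if needle in str(rec.get("id", "")).lower()
        or needle in str(rec.get("name", "")).lower()
    ]


# Illustrative stand-in records, not real output from the client.
sample = [
    {"id": "TMIN", "name": "Minimum temperature"},
    {"id": "TMAX", "name": "Maximum temperature"},
    {"id": "PRCP", "name": "Precipitation"},
]

matches = filter_datatypes(sample, "temperature")
print([rec["id"] for rec in matches])
```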

TODO:

  • Another example or two for non-GHCND datasets
  • Build a gh-pages branch with sphinx
