ncar-jobqueue
ncar-jobqueue provides utilities for configuring dask-jobqueue with appropriate default settings for NCAR's clusters.
The following compute servers are supported:
- Cheyenne (cheyenne.ucar.edu)
- Casper (DAV) (casper.ucar.edu)
- Hobart (hobart.cgd.ucar.edu)
- Izumi (izumi.unified.ucar.edu)
Installation
NCAR-jobqueue can be installed from PyPI with pip:
python -m pip install ncar-jobqueue
NCAR-jobqueue is also available from conda-forge for conda installations:
conda install -c conda-forge ncar-jobqueue
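A quick way to verify the installation is to import the cluster class (on a non-NCAR machine this import emits a warning, as described under "Non-NCAR machines" below):
>>> # A successful import means ncar-jobqueue and its dependencies are installed.
>>> from ncar_jobqueue import NCARCluster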
Configuration
ncar-jobqueue provides a custom configuration file with appropriate default settings for different clusters. This configuration file resides in ~/.config/dask/ncar-jobqueue.yaml:
ncar-jobqueue.yaml
cheyenne:
  pbs:
    name: dask-worker-cheyenne
    cores: 18
    memory: '109GB'
    processes: 18
    interface: ib0
    queue: regular
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=36:mem=109GB
    log-directory: '/glade/scratch/${USER}/dask/cheyenne/logs'
    local-directory: '/glade/scratch/${USER}/dask/cheyenne/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60

casper-dav:
  pbs:
    name: dask-worker-casper-dav
    cores: 2
    memory: '25GB'
    processes: 1
    interface: ib0
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=1:mem=25GB
    queue: casper
    log-directory: '/glade/scratch/${USER}/dask/casper-dav/logs'
    local-directory: '/glade/scratch/${USER}/dask/casper-dav/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60

hobart:
  pbs:
    name: dask-worker-hobart
    cores: 10
    memory: '96GB'
    processes: 10
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/hobart/logs'
    local-directory: '/scratch/cluster/${USER}/dask/hobart/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60

izumi:
  pbs:
    name: dask-worker-izumi
    cores: 10
    memory: '96GB'
    processes: 10
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/izumi/logs'
    local-directory: '/scratch/cluster/${USER}/dask/izumi/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60
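Because this file lives in Dask's configuration directory, Dask picks it up automatically when imported. A minimal sketch for checking that the settings above are visible; the cheyenne entry is used here purely as an example:
>>> import dask
>>> # Read back values from ~/.config/dask/ncar-jobqueue.yaml (assumes the file above is in place).
>>> dask.config.get('cheyenne.pbs.cores')
18
>>> dask.config.get('cheyenne.pbs.queue')
'regular'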
Note:
- To configure a default project account that is used by dask-jobqueue when submitting batch jobs, uncomment the project key/line in ~/.config/dask/ncar-jobqueue.yaml and set it to an appropriate value (the account can also be passed per cluster, as sketched below).
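If you prefer not to edit the configuration file, the project account can also be passed directly when creating a cluster, exactly as in the usage examples below; the account code shown is a placeholder:
>>> from ncar_jobqueue import NCARCluster
>>> # 'XXXXXXXX' is a placeholder; replace it with your own project/account code.
>>> cluster = NCARCluster(project='XXXXXXXX')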
Usage
Note:
⚠️ Online documentation for dask-jobqueue is available at https://jobqueue.dask.org/. ⚠️
Casper
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
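Other defaults from ~/.config/dask/ncar-jobqueue.yaml can be overridden in the same way, assuming NCARCluster accepts the usual dask-jobqueue keyword arguments; the values below are illustrative only:
>>> from ncar_jobqueue import NCARCluster
>>> # Illustrative override of the configured walltime for a single cluster.
>>> cluster = NCARCluster(project='XXXXXXXX', walltime='02:00:00')
>>> cluster.scale(jobs=2)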
Cheyenne
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
Hobart
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
Izumi
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
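Once workers have been requested on any of the machines above, the client behaves like any other Dask client. A minimal sketch of running a computation and releasing the batch jobs afterwards; the array size is arbitrary:
>>> import dask.array as da
>>> # A small computation that executes on the workers started by cluster.scale().
>>> x = da.random.random((10000, 10000), chunks=(1000, 1000))
>>> result = x.mean().compute()
>>> # Release the batch jobs and shut down the client when finished.
>>> client.close()
>>> cluster.close()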
Non-NCAR machines
On non-NCAR machines, ncar-jobqueue will warn the user and use distributed.LocalCluster instead:
>>> from ncar_jobqueue import NCARCluster
.../ncar_jobqueue/cluster.py:17: UserWarning: Unable to determine which NCAR cluster you are running on... Returning a `distributed.LocalCluster` class.
warn(message)
>>> from dask.distributed import Client
>>> cluster = NCARCluster()
>>> cluster
LocalCluster(3a7dd0f6, 'tcp://127.0.0.1:64184', workers=4, threads=8, memory=17.18 GB)
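Because the fallback is an ordinary distributed.LocalCluster, code written against NCARCluster runs unchanged on a laptop; a minimal sketch:
>>> client = Client(cluster)
>>> # Work is executed by local worker processes instead of batch jobs.
>>> client.submit(sum, [1, 2, 3]).result()
6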