pysqa
High-performance computing (HPC) does not have to be hard. In this context the aim of the Python Simple Queuing System
Adapter (pysqa) is to make the submission of tasks from Python to HPC clusters as easy as starting another subprocess
locally. This is based on the assumption that even though modern HPC queuing systems offer a wide range of
configuration options, most users submit the majority of their jobs with very similar parameters. Therefore, in pysqa
users define submission script templates once and reuse them to submit many different tasks and workflows afterwards.
These templates are defined in the jinja2 template language, so existing submission scripts can be easily converted to
templates. In addition to the submission of new tasks to HPC queuing systems, pysqa also allows users to track the
progress of their tasks, delete them or enable reservations using the built-in functionality of the queuing system.
Finally, pysqa enables remote connections to HPC clusters using SSH, including support for two-factor authentication
via pyauthenticator. This allows users to submit tasks from a Python process on their local workstation to remote HPC
clusters.
All this functionality is available from both the Python interface and the command line interface.
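To illustrate the template idea, the following is a minimal sketch of a SLURM submission script converted to a jinja2 template and rendered from Python. The placeholder names (job_name, working_directory, cores, run_time_max, command) and the helper render_submission_script() are assumptions chosen for this example; the templates shipped with or expected by pysqa may use different names.

```python
from jinja2 import Template

# Hypothetical SLURM submission script written as a jinja2 template.
# All placeholder names are assumptions made for this sketch.
SUBMISSION_TEMPLATE = """\
#!/bin/bash
#SBATCH --job-name={{ job_name }}
#SBATCH --chdir={{ working_directory }}
#SBATCH --cpus-per-task={{ cores }}
{%- if run_time_max %}
#SBATCH --time={{ [1, run_time_max // 60] | max }}
{%- endif %}

{{ command }}
"""


def render_submission_script(**parameters):
    """Fill the template with the parameters of a single task."""
    return Template(SUBMISSION_TEMPLATE).render(**parameters)


# run_time_max is given in seconds and converted to whole minutes,
# with a lower bound of one minute.
script = render_submission_script(
    job_name="test",
    working_directory=".",
    cores=4,
    run_time_max=3600,
    command="python run.py",
)
print(script)
```

The same template can then be rendered again with different parameters for every new task, which is the reuse described above.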
Features
The core feature of pysqa is the communication with HPC queuing systems, including Flux, LSF, MOAB, SGE, SLURM and
TORQUE. This includes:
- QueueAdapter().submit_job() - Submission of new tasks to the queuing system.
- QueueAdapter().get_queue_status() - List of calculations currently waiting or running on the queuing system.
- QueueAdapter().delete_job() - Deleting calculations which are currently waiting or running on the queuing system.
- QueueAdapter().queue_list - List of available queue templates created by the user.
- QueueAdapter().config - Templates can be restricted to a specific number of cores, run time or other computing
resources, with integrated checks whether a submitted task follows these restrictions.
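The calls listed above could be combined roughly as follows. This is a sketch only: the function run_and_cancel() is hypothetical, and the keyword arguments passed to submit_job() and delete_job() (queue, job_name, working_directory, cores, command, process_id) are assumptions that should be checked against the documentation of the installed pysqa version.

```python
def run_and_cancel(adapter, command, queue=None):
    """Submit a command, query the queue and delete the task again.

    `adapter` is expected to behave like pysqa's QueueAdapter. The keyword
    arguments used below are assumptions made for this sketch.
    """
    # Fall back to the first queue template defined by the user.
    if queue is None:
        queue = adapter.queue_list[0]

    # Submit the task and keep the identifier returned by the adapter.
    job_id = adapter.submit_job(
        queue=queue,
        job_name="example",
        working_directory=".",
        cores=1,
        command=command,
    )

    # List the tasks currently waiting or running.
    status = adapter.get_queue_status()

    # Remove the task from the queue again.
    adapter.delete_job(process_id=job_id)
    return job_id, status
```

With pysqa installed and a configuration directory in place, the adapter argument would be a QueueAdapter instance created for that directory.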
In addition to these core features, pysqa is continuously extended to support more use cases for a larger group of
users. These new features include the support for remote queuing systems:
- Remote connection via the secure shell protocol (SSH) to access remote HPC clusters.
- Transfer of files to and from remote HPC clusters, based on a predefined mapping of the remote file system into the
local file system.
- Support for both individual connections as well as continuous connections depending on the network availability.
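The idea of mapping the remote file system into the local file system can be pictured with a small sketch. This is not pysqa's implementation; the functions to_remote_path() and to_local_path() and the example directories are hypothetical and only show how a fixed pair of root directories determines where a file ends up on either side.

```python
from pathlib import PurePosixPath


def to_remote_path(local_path, local_root, remote_root):
    """Translate a path below local_root into the matching path below remote_root.

    Raises ValueError if local_path is not located below local_root.
    """
    relative = PurePosixPath(local_path).relative_to(local_root)
    return str(PurePosixPath(remote_root) / relative)


def to_local_path(remote_path, local_root, remote_root):
    """Translate a path below remote_root back into the local file system."""
    relative = PurePosixPath(remote_path).relative_to(remote_root)
    return str(PurePosixPath(local_root) / relative)


# Hypothetical roots: a project directory on the workstation and a
# scratch directory on the cluster.
LOCAL_ROOT = "/home/alice/projects"
REMOTE_ROOT = "/scratch/alice"

remote = to_remote_path("/home/alice/projects/run_1/input.txt", LOCAL_ROOT, REMOTE_ROOT)
local = to_local_path(remote, LOCAL_ROOT, REMOTE_ROOT)
print(remote)
print(local)
```

Because the mapping is defined once, input files can be copied to the cluster before submission and results copied back afterwards without specifying paths per task.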
Finally, work is in progress to support a combination of multiple local and remote queuing systems from within pysqa,
which are presented to the user as a single resource.
Documentation
License
pysqa is released under the BSD license. It is a spin-off of the pyiron project. Therefore, if you use pysqa for
calculations which result in a scientific publication, please cite:
@article{pyiron-paper,
title = {pyiron: An integrated development environment for computational materials science},
journal = {Computational Materials Science},
volume = {163},
pages = {24 - 36},
year = {2019},
issn = {0927-0256},
doi = {https://doi.org/10.1016/j.commatsci.2018.07.043},
url = {http://www.sciencedirect.com/science/article/pii/S0927025618304786},
author = {Jan Janssen and Sudarsan Surendralal and Yury Lysogorskiy and Mira Todorova and Tilmann Hickel and Ralf Drautz and Jörg Neugebauer},
keywords = {Modelling workflow, Integrated development environment, Complex simulation protocols},
}