
opensafely-jobrunner
A job runner is a service that executes the actions defined in a project's
project.yaml configuration file when they are requested via a jobs queue.
End users will find more information in the OpenSAFELY documentation.
In production, this software runs as a loop on a secure server within the infrastructure of the primary data provider. It polls an OpenSAFELY job server, looking for requests to run jobs.
Jobs belong to a workspace. This describes the git repo containing the
OpenSAFELY-compliant project under execution, the git branch, and the kind of
database to use. The workspace also acts as a kind of namespace for
partitioning outputs of its jobs.
An OpenSAFELY-compliant repo must provide a project.yaml file which
describes how a requested job should be converted into a command (& arguments)
that can be run in a subprocess on the secure server. It incorporates the idea
of dependencies, so an action that generates a chart might depend on an action
that extracts data from the database for that chart. See the
Actions reference for more information.
An action can define outputs; these are persisted on disk and made available
to subsequent actions in the workspace, and to users who have permission to log
into the server and view the raw files.
The runner takes care of executing dependencies in order. By default, it skips re-running a dependency whose previous run produced output that still exists in the production environment. The runner also reports status back to the job server, redacting possibly-sensitive information.
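The ordering and skipping behaviour described above can be sketched as a depth-first walk over each action's dependencies. This is a minimal illustration, not the runner's actual implementation: the function name execution_order and the already_built set (standing in for "previous output still exists") are hypothetical, and the needs key mirrors the project file format shown later in this document.

```python
from typing import Dict, List, Set


def execution_order(
    actions: Dict[str, dict], target: str, already_built: Set[str]
) -> List[str]:
    """Return the actions to run, dependencies first, in order to run `target`.

    `already_built` is a hypothetical stand-in for "a previous run's output
    still exists"; such dependencies (and their own dependencies) are skipped.
    The requested target itself is always run.
    """
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in order:
            return
        if name != target and name in already_built:
            return  # skip: output from an earlier run is still available
        if name in visiting:
            raise ValueError(f"circular dependency involving {name!r}")
        visiting.add(name)
        for dependency in actions[name].get("needs", []):
            visit(dependency)
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order


actions = {
    "generate_study_population": {},
    "run_model": {"needs": ["generate_study_population"]},
}
print(execution_order(actions, "run_model", already_built=set()))
print(execution_order(actions, "run_model", already_built={"generate_study_population"}))
```

With nothing built, both actions are returned with the data extraction first; once the extraction's output exists, only run_model remains.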
The runner is bundled as part of the opensafely-cli tool so users can test their actions locally.
The job server serves jobs as JSON in the following format. First, a job must belong to a workspace:
{
    "workspace": {
        "name": "my workspace",
        "repo": "https://github.com/opensafely/job-integration-tests",
        "branch": "master",
        "db": "full"
    }
}
Possible values for "db" are "full", "slice", and "dummy".
A workspace is a way of associating jobs related to a given combination of branch, repository and database. To enqueue a job, a client POSTs JSON like this:
{
    "backend": "tpp",
    "action_id": "do_thing",
    "workspace_id": 1
}
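As a rough sketch, the two JSON documents above could be assembled and checked in Python before being sent. The helper names workspace_payload and job_payload are hypothetical, and the example stops short of the HTTP request because the job server's endpoint is not given here.

```python
import json

# The values for "db" listed above.
VALID_DBS = ("full", "slice", "dummy")


def workspace_payload(name: str, repo: str, branch: str, db: str) -> dict:
    """Build the workspace document, rejecting an unknown "db" value."""
    if db not in VALID_DBS:
        raise ValueError(f"db must be one of {VALID_DBS}, got {db!r}")
    return {"workspace": {"name": name, "repo": repo, "branch": branch, "db": db}}


def job_payload(backend: str, action_id: str, workspace_id: int) -> dict:
    """Build the body a client would POST to enqueue a job."""
    return {"backend": backend, "action_id": action_id, "workspace_id": workspace_id}


workspace = workspace_payload(
    "my workspace",
    "https://github.com/opensafely/job-integration-tests",
    "master",
    "full",
)
body = json.dumps(job_payload("tpp", "do_thing", 1))
print(body)
```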
A job runner is a service installed on a machine that has access to a given
backend. It receives jobs from the server and consumes those whose backend
value matches the value of the current BACKEND environment variable.
It must also define three environment variables, each holding an RFC 1738
connection URL; these correspond to the db requested in the job's workspace
definition, and as such are named FULL_DATABASE_URL, SLICE_DATABASE_URL, and
DUMMY_DATABASE_URL.
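A minimal sketch of how these environment variables might be consulted follows. The function names should_consume and database_url are hypothetical, and the connection URL in the example is a placeholder rather than a real server.

```python
import os
from typing import Mapping

# Maps the workspace's "db" value to the environment variable named above.
DB_ENV_VARS = {
    "full": "FULL_DATABASE_URL",
    "slice": "SLICE_DATABASE_URL",
    "dummy": "DUMMY_DATABASE_URL",
}


def should_consume(job: Mapping[str, object], environ: Mapping[str, str] = os.environ) -> bool:
    """A runner only takes jobs whose backend matches its own BACKEND."""
    return job.get("backend") == environ.get("BACKEND")


def database_url(db: str, environ: Mapping[str, str] = os.environ) -> str:
    """Look up the connection URL for the requested kind of database."""
    if db not in DB_ENV_VARS:
        raise ValueError(f"unknown db {db!r}")
    variable = DB_ENV_VARS[db]
    url = environ.get(variable)
    if not url:
        raise RuntimeError(f"{variable} is not set")
    return url


# Placeholder environment for illustration only.
example_env = {
    "BACKEND": "tpp",
    "DUMMY_DATABASE_URL": "mssql://user:password@localhost:1433/dummy",
}
print(should_consume({"backend": "tpp"}, example_env))
print(database_url("dummy", example_env))
```

Passing the environment as a parameter keeps the lookup easy to exercise without touching the real process environment.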
When a job is found, the following happens:
The job's repo is fetched, using the PRIVATE_REPO_ACCESS_TOKEN supplied in the environment.
Its project.yaml is parsed, and the actions are extracted from this file.
docker run is executed to run the action.
Every action defines a list of outputs
which are persisted to a permanent storage location. The project author must
categorise these outputs as either highly_sensitive or moderately_sensitive.
Any pseudonymised data which may be highly disclosive (e.g. without low number
redaction) should be classed as highly_sensitive; data which the author
believes could be released following review should be classed as
moderately_sensitive. This design allows tiered levels of permissions for
collaborators to review data outputs. For example, the study author would
usually have access to highly_sensitive material for debugging, but other
collaborators could have access to moderately_sensitive data to prepare it for
release (for which it is planned to add a minimally_sensitive category).
Outputs are therefore persisted to filesystem paths according to the following environment variables:
# A location where cohort CSVs (one row per patient) should be
# stored. This folder must exist.
HIGH_PRIVACY_STORAGE_BASE=/home/opensafely/high_security
# A location where script outputs (some for publication) should be
# stored
MEDIUM_PRIVACY_STORAGE_BASE=/tmp/outputs/medium_security
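The mapping from an output's sensitivity level to its storage location could look roughly like this. It is a sketch under stated assumptions: the storage_path helper is hypothetical, and the per-workspace subfolder is a guess based on the workspace acting as a namespace for outputs; the real directory layout may differ.

```python
import os
from pathlib import PurePosixPath
from typing import Mapping

# Sensitivity level -> environment variable holding the base folder.
PRIVACY_ENV_VARS = {
    "highly_sensitive": "HIGH_PRIVACY_STORAGE_BASE",
    "moderately_sensitive": "MEDIUM_PRIVACY_STORAGE_BASE",
}


def storage_path(
    privacy_level: str,
    workspace_name: str,
    output_path: str,
    environ: Mapping[str, str] = os.environ,
) -> PurePosixPath:
    """Return where an output would be persisted (assumed layout)."""
    if privacy_level not in PRIVACY_ENV_VARS:
        raise ValueError(f"unknown privacy level {privacy_level!r}")
    base = environ[PRIVACY_ENV_VARS[privacy_level]]
    # Assumption: outputs are partitioned by workspace name under the base.
    return PurePosixPath(base) / workspace_name / output_path


example_env = {
    "HIGH_PRIVACY_STORAGE_BASE": "/home/opensafely/high_security",
    "MEDIUM_PRIVACY_STORAGE_BASE": "/tmp/outputs/medium_security",
}
print(storage_path("highly_sensitive", "my_workspace", "output/input.csv", example_env))
```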
A valid project file looks like this:
version: "3.0"
expectations:
  population_size: 1000
actions:
  generate_study_population:
    run: cohortextractor:latest generate_cohort --study-definition study_definition
    outputs:
      highly_sensitive:
        cohort: output/input.csv
  run_model:
    run: stata-mp:latest analysis/model.do
    needs: [generate_study_population]
    outputs:
      moderately_sensitive:
        model: models/cox-model.txt
        figure: figures/survival-plot.png
See the project pipeline documentation for a detailed description of the project.yaml setup.
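To illustrate how an action's run line might become a command for a subprocess, the sketch below splits it into an image name and arguments and prefixes docker run. The docker_command helper is hypothetical; the real runner is likely to add further options (volumes, environment, image registry) that are not described here.

```python
import shlex
from typing import List


def docker_command(run: str) -> List[str]:
    """Turn a project.yaml `run` value into a docker argument list.

    The first token is treated as the image (name:tag) and the rest as the
    arguments passed to the container.
    """
    tokens = shlex.split(run)
    if not tokens:
        raise ValueError("run line is empty")
    image, *arguments = tokens
    # --rm removes the container once the command has finished.
    return ["docker", "run", "--rm", image, *arguments]


command = docker_command(
    "cohortextractor:latest generate_cohort --study-definition study_definition"
)
print(command)
```

Returning a list rather than a single string means the command can be handed to subprocess.run without going through a shell.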
The cohortextractor command-line tool imports this library, and implements the
action-parsing-and-running functionality as a series of synchronous docker
commands, rather than asynchronously via the job queue.
Please see the additional information.