Security News
Research
Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
rapids-dependency-file-generator
Advanced tools
rapids-dependency-file-generator
is a Python CLI tool that generates conda environment.yaml
files and requirements.txt
files from a single YAML file, typically named dependencies.yaml
.
When installed, it makes the rapids-dependency-file-generator
CLI command available which is responsible for parsing a dependencies.yaml
configuration file and generating the appropriate conda environment.yaml
and requirements.txt
dependency files.
dependencies.yaml
Format
rapids-dependency-file-generator
is available on PyPI. To install, run:
pip install rapids-dependency-file-generator
When rapids-dependency-file-generator
is invoked, it will read a dependencies.yaml
file from the current directory and generate children dependency files.
The dependencies.yaml
file has the following characteristics:
dependencies.yaml
FormatThe Examples section below has instructions on where example
dependency.yaml
files and their corresponding output can be viewed.
The dependencies.yaml
file has three relevant top-level keys: files
, channels
, and dependencies
. These keys are described in detail below.
files
KeyThe top-level files
key is responsible for determining the following:
environment.yaml
files and/or requirements.txt
files)dependencies.yaml
file)dependencies
key should be included in the generated filesHere is an example of what the files
key might look like:
files:
all: # used as the prefix for the generated dependency file names for conda or requirements files (has no effect on pyproject.toml files)
output: [conda, requirements] # which dependency file types to generate. required, can be "conda", "requirements", "pyproject", "none" or a list of non-"none" values
conda_dir: conda/environments # where to put conda environment.yaml files. optional, defaults to "conda/environments"
requirements_dir: python/cudf # where to put requirements.txt files. optional, but recommended. defaults to "python"
pyproject_dir: python/cudf # where to put pyproject.toml files. optional, but recommended. defaults to "python"
matrix: # (optional) contains an arbitrary set of key/value pairs to determine which dependency files that should be generated. These values are included in the output filename.
cuda: ["11.5", "11.6"] # which CUDA version variant files to generate.
arch: [x86_64] # which architecture version variant files to generate. This value should be the result of running the `arch` command on a given machine.
includes: # a list of keys from the `dependencies` section which should be included in the generated files
- build
- test
- runtime
build: # multiple `files` children keys can be specified
output: requirements
conda_dir: conda/environments
requirements_dir: python/cudf
matrix:
cuda: ["11.5"]
arch: [x86_64]
py: ["3.8"]
includes:
- build
The result of the above configuration is that the following dependency files would be generated:
conda/environments/all_cuda-115_arch-x86_64.yaml
conda/environments/all_cuda-116_arch-x86_64.yaml
python/cudf/requirements_all_cuda-115_arch-x86_64.txt
python/cudf/requirements_all_cuda-116_arch-x86_64.txt
python/cudf/requirements_build_cuda-115_arch-x86_64_py-38.txt
The all*.yaml
and requirements_all*.txt
files would include the contents of the build
, test
, and runtime
dependency lists from the top-level dependency
key. The requirements_build*.txt
file would only include the contents of the build
dependency list from the top-level dependency
key.
The value of output
can also be none
as shown below.
files:
test:
output: none
includes:
- test
When output: none
is used, the conda_dir
, requirements_dir
and matrix
keys can be omitted. The use case for output: none
is described in the Additional CLI Notes section below.
extras
A given file may include an extras
entry that may be used to provide inputs specific to a particular file type
Here is an example:
files:
build:
output: pyproject
includes: # a list of keys from the `dependencies` section which should be included in the generated files
- build
extras:
table: table_name
key: key_name
Currently the supported extras by file type are:
table
. This may only be provided for the "project.optional-dependencies" table since the key name is fixed for "build-system" ("requires") and "project" ("dependencies"). Note that this implicitly prohibits including optional dependencies via an inline table under the "project" table.channels
KeyThe top-level channels
key specifies the channels that should be included in any generated conda environment.yaml
files.
It might look like this:
channels:
- rapidsai
- conda-forge
In the absence of a channels
key, some sensible defaults for RAPIDS will be used (see constants.py).
dependencies
KeyThe top-level dependencies
key is where the bifurcated dependency lists should be specified.
Underneath the dependencies
key are sets of key-value pairs. For each pair, the key can be arbitarily named, but should match an item from the includes
list of any files
entry.
The value of each key-value pair can have the following children keys:
common
- contains dependency lists that are the same across all matrix variationsspecific
- contains dependency lists that are specific to a particular matrix combinationThe values of each of these keys are described in detail below.
common
KeyThe common
key contains a list of objects with the following keys:
output_types
- a list of output types (e.g. "conda" for environment.yaml
files or "requirements" for requirements.txt
files) for the packages in the packages
keypackages
- a list of packages to be included in the generated output filespecific
KeyThe specific
key contains a list of objects with the following keys:
output_types
- same as output_types
for the common
key abovematrices
- a list of objects (described below) which define packages that are specific to a particular matrix combinationmatrices
KeyEach list item under the matrices
key contains a matrix
key and a packages
key.
The matrix
key is used to define which matrix combinations from files.[*].matrix
will use the associated packages.
The packages
key is a list of packages to be included in the generated output file for a matching matrix.
This is elaborated on in How Dependency Lists Are Merged.
An example of the above structure is exemplified below:
dependencies:
build: # dependency list name
common: # dependencies common among all matrix variations
- output_types: [conda, requirements] # the output types this list item should apply to
packages:
- common_build_dep
- output_types: conda
packages:
- cupy
- pip: # supports `pip` key for conda environment.yaml files
- some_random_dep
specific: # dependencies specific to a particular matrix combination
- output_types: conda # dependencies specific to conda environment.yaml files
matrices:
- matrix:
cuda: "11.5"
packages:
- cudatoolkit=11.5
- matrix:
cuda: "11.6"
packages:
- cudatoolkit=11.6
- matrix: # an empty matrix entry serves as a fallback if there are no other matrix matches
packages:
- cudatoolkit
- output_types: [conda, requirements]
matrices:
- matrix: # dependencies specific to x86_64 and 11.5
cuda: "11.5"
arch: x86_64
packages:
- a_random_x86_115_specific_dep
- matrix: # an empty matrix/package entry to prevent error from being thrown for non 11.5 and x86_64 matches
packages:
- output_types: requirements # dependencies specific to requirements.txt files
matrices:
- matrix:
cuda: "11.5"
packages:
- another_random_dep=11.5.0
- matrix:
cuda: "11.6"
packages:
- another_random_dep=11.6.0
test:
common:
- output_types: [conda, requirements]
packages:
- pytest
The information from the top-level files
and dependencies
keys are used to determine which dependencies should be included in the final output of the generated dependency files.
Consider the following top-level files
key configuration:
files:
all:
output: conda
conda_dir: conda/environments
requirements_dir: python/cudf
matrix:
cuda: ["11.5", "11.6"]
arch: [x86_64]
includes:
- build
- test
In this example, rapids-dependency-file-generator
will generate two conda environment files: conda/environments/all_cuda-115_arch-x86_64.yaml
and conda/environments/all_cuda-116_arch-x86_64.yaml
.
Since the output
value is conda
, rapids-dependency-file-generator
will iterate through any dependencies.build.common
and dependencies.test.common
list entries and use the packages
of any entry whose output_types
key is conda
or [conda, ...]
.
Further, for the 11.5
and x86_64
matrix combination, any build.specific
and test.specific
list items whose output includes conda
and whose matrices
list items matches any of the definitions below would also be merged:
specific:
- output_types: conda
matrices:
- matrix:
cuda: "11.5"
packages:
- some_dep1
- some_dep2
# or
specific:
- output_types: conda
matrices:
- matrix:
cuda: "11.5"
arch: "x86_64"
packages:
- some_dep1
- some_dep2
# or
specific:
- output_types: conda
matrices:
- matrix:
arch: "x86_64"
packages:
- some_dep1
- some_dep2
Every matrices
list must have a match for a given input matrix (only the first matching matrix in the list of matrices
will be used).
If no matches are found for a particular matrix combination, an error will be thrown.
In instances where an error should not be thrown, an empty matrix
and packages
list item can be used:
- output_types: conda
matrices:
- matrix:
cuda: "11.5"
arch: x86_64
py: "3.8"
packages:
- a_very_specific_115_x86_38_dep
- matrix: # an empty matrix entry serves as a fallback if there are no other matrix matches
packages:
Merged dependency lists are sorted and deduped.
Invoking rapids-dependency-file-generator
without any arguments is meant to be the default behavior for RAPIDS developers. It will generate all of the necessary dependency files as specified in the top-level files
configuration.
However, there are CLI arguments that can augment the files
configuration values before the files are generated.
Consider the example when output: none
is used:
files:
test:
output: none
includes:
- test
The test
file generated by the configuration above is useful for CI, but it might not make sense to necessarily commit those files to a repository. In such a scenario, the following CLI arguments can be used:
ENV_NAME="cudf_test"
rapids-dependency-file-generator \
--file-key "test" \
--output "conda" \
--matrix "cuda=12.5;arch=$(arch)" > env.yaml
mamba env create --file env.yaml
mamba activate "$ENV_NAME"
# install cudf packages built in CI and test them in newly created environment...
The --file-key
argument is passed the test
key name from the files
configuration. Additional flags are used to generate a single dependency file. When the CLI is used in this fashion, it will print to stdout
instead of writing the resulting contents to the filesystem.
The --file-key
, --output
, and --matrix
flags must be used together. --matrix
may be an empty string if the file that should be generated does not depend on any specific matrix variations.
Where multiple values for the same key are passed to --matrix
, e.g. cuda_suffixed=true;cuda_suffixed=false
, only the last value will be used.
Where --file-key
is supplied multiple times in the same invocation, the output printed to stdout
will contain a union (without duplicates) of all of the corresponding dependencies. For example:
rapids-dependency-file-generator \
--file-key "test" \
--file-key "test_notebooks" \
--output "conda" \
--matrix "cuda=12.5;arch=$(arch)" > env.yaml
The --prepend-channel
argument accepts additional channels to use, like rapids-dependency-file-generator --prepend-channel my_channel --prepend-channel my_other_channel
.
If both --output
and --prepend-channel
are provided, the output format must be conda.
Prepending channels can be useful for adding local channels with packages to be tested in CI workflows.
Running rapids-dependency-file-generator -h
will show the most up-to-date CLI arguments.
The tests/examples directory has example dependencies.yaml
files along with their corresponding output files.
To create new example
tests do the following:
dependencies.yaml
file in tests/examplesoutput
directories (e.g. conda_dir
, requirements_dir
, etc.) are set to write to output/actual
rapids-dependency-file-generator --config tests/examples/<new_folder_name>/dependencies.yaml
to generate the initial output filesoutput/actual
to output/expected
, so it will be committed to the repository and used as a baseline for future changesFAQs
Tool for generating RAPIDS environment files
We found that rapids-dependency-file-generator demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Research
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Research
Security News
Attackers used a malicious npm package typosquatting a popular ESLint plugin to steal sensitive data, execute commands, and exploit developer systems.
Security News
The Ultralytics' PyPI Package was compromised four times in one weekend through GitHub Actions cache poisoning and failure to rotate previously compromised API tokens.