Yet Another Python Experiment Configuration System (yapecs)
yapecs is a Python library for experiment configuration. It is an alternative to using JSON or YAML files, or more complex solutions such as hydra. With yapecs,
- Configuration files are written in Python. You do not need to learn new syntax, and your configurations can be as expressive as desired, using, e.g., classes, functions, or built-in types.
- Configuration parameters are bound to the user's module. This reduces code bloat by eliminating the need to pass a configuration dictionary or many individual values through functions.
- Integration is simple, requiring only four or five lines of code (including imports).
Usage
Configuration
Say we are creating a weather module to predict tomorrow's temperature given two features: 1) today's temperature and 2) the average temperature during previous years. Our default configuration file (e.g., weather/config/defaults.py) might look like the following.
BATCH_SIZE = 64
LEARNING_RATE = 1e-4
TODAYS_TEMP_FEATURE = True
AVERAGE_TEMP_FEATURE = True
Say we want to run an experiment without using today's temperature as a feature. We can create a new configuration file (e.g., config.py) with just the module name and the changed parameters.
MODULE = 'weather'
TODAYS_TEMP_FEATURE = False
Using yapecs, we pass our new file using the --config parameter. For example, if our weather module has a training entrypoint train, we can use the following.
python -m weather.train --config config.py
You can also pass a list of configuration files. This will apply all configuration files with a matching MODULE name, in order.
python -m weather.train --config config-00.py config-01.py ...
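The "matching MODULE name, in order" rule can be illustrated with a small standalone sketch. This is not the yapecs implementation: the apply_configs helper, the use of plain dictionaries in place of config files, and the 'other' module name are all assumptions made for illustration.

```python
from types import SimpleNamespace


def apply_configs(module_name, defaults, configs):
    """Apply each config whose MODULE matches, in list order (illustrative only)"""
    for config in configs:
        # Skip configs that target a different module
        if config.get('MODULE') != module_name:
            continue
        for key, value in config.items():
            if key != 'MODULE':
                setattr(defaults, key, value)
    return defaults


# Stand-in for weather/config/defaults.py
defaults = SimpleNamespace(
    BATCH_SIZE=64,
    LEARNING_RATE=1e-4,
    TODAYS_TEMP_FEATURE=True,
    AVERAGE_TEMP_FEATURE=True)

# Stand-ins for the config files given on the command line, in order
configs = [
    {'MODULE': 'weather', 'TODAYS_TEMP_FEATURE': False},
    {'MODULE': 'other', 'BATCH_SIZE': 1},
    {'MODULE': 'weather', 'BATCH_SIZE': 128}]

apply_configs('weather', defaults, configs)
```

In this sketch, the second config is ignored because its MODULE does not match, and parameters that no config mentions keep their default values.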
Within the weather module, we make two changes. First, we add the following to the module root initialization file weather/__init__.py.
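# Default configuration parameters to be modified
from .config import defaults
# Apply any configuration files passed via --config
import yapecs
yapecs.configure('weather', defaults)
# Import the (possibly updated) configuration parameters
del defaults
from .config.defaults import *
# Placeholder for the rest of the module's imports
pass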
This assumes that default configuration values are saved in weather/config/defaults.py. You can also define configuration values that depend on other configuration values, and control the import order relative to configuration. Using our weather module example, we may want to keep track of the total number of features (e.g., to initialize a machine learning model). To do this, we create a file weather/config/static.py containing the following.
import weather
NUM_FEATURES = (
    int(weather.TODAYS_TEMP_FEATURE) +
    int(weather.AVERAGE_TEMP_FEATURE))
We update the module root initialization as follows.
...
from .config.defaults import *
from .config.static import *
...
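The import order matters because derived values should be computed from the configuration after any overrides have been applied. The standalone sketch below shows the idea with stand-in objects; the num_features function and the two SimpleNamespace configurations are hypothetical and only mirror the NUM_FEATURES expression above.

```python
from types import SimpleNamespace


def num_features(config):
    """Count the enabled input features, mirroring NUM_FEATURES"""
    return (
        int(config.TODAYS_TEMP_FEATURE) +
        int(config.AVERAGE_TEMP_FEATURE))


# Stand-in for the default configuration
default_config = SimpleNamespace(
    TODAYS_TEMP_FEATURE=True,
    AVERAGE_TEMP_FEATURE=True)

# Stand-in for the configuration after config.py has been applied
ablation_config = SimpleNamespace(
    TODAYS_TEMP_FEATURE=False,
    AVERAGE_TEMP_FEATURE=True)

default_count = num_features(default_config)
ablation_count = num_features(ablation_config)
```

Computing the derived value only after the overrides are in place is what lets the ablation configuration report one feature instead of two.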
The second change we make is to add --config as a command-line option. We created a lightweight replacement for argparse.ArgumentParser, called yapecs.ArgumentParser, which does this.
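To show what accepting --config with one or more files looks like on the command line, here is a sketch that uses only the standard library. It is not how yapecs.ArgumentParser is implemented; the parse_with_config helper and the --epochs option are hypothetical.

```python
import argparse


def parse_with_config(argv):
    """Parse entrypoint arguments plus one or more --config files"""
    parser = argparse.ArgumentParser(description='Train the weather model')

    # Hypothetical entrypoint-specific option
    parser.add_argument('--epochs', type=int, default=10)

    # Accept one or more configuration files
    parser.add_argument('--config', nargs='+', default=[])

    return parser.parse_args(argv)


args = parse_with_config(
    ['--epochs', '5', '--config', 'config-00.py', 'config-01.py'])
```

With yapecs.ArgumentParser, the entrypoint does not need to register --config itself, since the wrapper defines and manages that parameter.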
Composing configured modules
When working with multiple configurations of the same module, you can load the module multiple times with different configs by using yapecs.compose.
import yapecs
import weather
weather_compose = yapecs.compose(weather, ['config.py'])
assert weather.TODAYS_TEMP_FEATURE and not weather_compose.TODAYS_TEMP_FEATURE
Hyperparameter search
To perform a hyperparameter grid search, write a config file containing the lists of values to search. Below is an example. Note that we check whether weather has the defaults attribute as a lock on whether or not it is currently being configured. This prevents the progress file from being updated multiple times erroneously.
MODULE = 'weather'
import yapecs
from pathlib import Path
import weather
if hasattr(weather, 'defaults'):
    progress_file = Path(__file__).parent / 'grid_search.progress'
    learning_rate = [1e-5, 1e-4, 1e-3]
    batch_size = [64, 128, 256]
    average_temp_feature = [True, False]
    LEARNING_RATE, BATCH_SIZE, AVERAGE_TEMP_FEATURE = yapecs.grid_search(
        progress_file,
        learning_rate,
        batch_size,
        average_temp_feature)
TODAYS_TEMP_FEATURE = False
You can perform the search by running, e.g.,
while python -m weather --config causal_transformer_search.py; do :; done
This runs training repeatedly, incrementing the progress index and choosing the appropriate config values each time until the search is complete. Running a hyperparameter search in parallel is not (yet) supported.
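The progress-file mechanism can be sketched with the standard library as follows. This is a sketch under stated assumptions, not the yapecs implementation: the grid_search_sketch name, the plain-integer progress file format, and the use of RuntimeError to signal completion are all assumptions. An uncaught exception makes the Python process exit with a non-zero status, which is what would end a shell while loop like the one above.

```python
import itertools
import tempfile
from pathlib import Path


def grid_search_sketch(progress_file, *value_lists):
    """Return the next combination and advance the index stored on disk"""
    progress_file = Path(progress_file)

    # Assumed file format: a single integer index into the grid
    index = int(progress_file.read_text()) if progress_file.exists() else 0

    # Every combination of the searched values, last list varying fastest
    combinations = list(itertools.product(*value_lists))
    if index >= len(combinations):
        raise RuntimeError('Grid search complete')

    # Record that this combination has been handed out
    progress_file.write_text(str(index + 1))
    return combinations[index]


with tempfile.TemporaryDirectory() as directory:
    progress = Path(directory) / 'grid_search.progress'

    # Four simulated runs cover the 2 x 2 grid
    runs = [
        grid_search_sketch(progress, [1e-5, 1e-4], [64, 128])
        for _ in range(4)]

    # A fifth run finds the grid exhausted
    try:
        grid_search_sketch(progress, [1e-5, 1e-4], [64, 128])
        finished = False
    except RuntimeError:
        finished = True
```

Because each call reads and then rewrites a single shared index, two processes running at once could receive the same combination, which is consistent with parallel search not being supported.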
Application programming interface (API)
yapecs.configure
def configure(
    module_name: str,
    config_module: ModuleType,
    config: Optional[Path] = None
) -> None:
    """Update the configuration values

    Arguments
        module_name
            The name of the module to configure
        config_module
            The submodule containing configuration values
        config
            The Python file containing the updated configuration values.
            If not provided and the ``--config`` parameter is a command-line
            argument, the corresponding argument is used as the configuration
    """
yapecs.compose
def compose(
    module: ModuleType,
    config_paths: List[Union[str, Path]]
) -> ModuleType:
    """Compose a configured module from a base module and list of configs

    Arguments
        module
            The base module to configure
        config_paths
            A list of paths to yapecs config files

    Returns
        composed
            A new module made from the base module and configurations
    """
yapecs.grid_search
def grid_search(progress_file: Union[str, os.PathLike], *args: Tuple) -> Tuple:
    """Perform a grid search over configuration arguments

    Arguments
        progress_file
            File to store current search progress
        args
            Lists of argument values to perform grid search over

    Returns
        current_args
            The arguments that should be used by the current process
    """
yapecs.ArgumentParser
This is a lightweight wrapper around argparse.ArgumentParser that defines and manages a --config parameter.
class ArgumentParser(argparse.ArgumentParser):
    def parse_args(
        self,
        args: Optional[List[str]] = None,
        namespace: Optional[argparse.Namespace] = None
    ) -> argparse.Namespace:
        """Parse arguments while allowing unregistered config argument

        Arguments
            args
                Arguments to parse. Default is taken from sys.argv.
            namespace
                Object to hold the attributes. Default is an empty Namespace.

        Returns
            Namespace containing program arguments
        """
The following are code repositories that utilize yapecs for configuration. If you would like to see your repo included, please open a pull request.
- emphases - Crowdsourced and automatic speech prominence estimation
- penn - Pitch-estimating neural networks
- ppgs - High-fidelity neural phonetic posteriorgrams
- pyfoal - Python forced alignment