significance-analysis

Significance Analysis for HPO-algorithms performing on multiple benchmarks

0.1.11

Significance Analysis

This package analyses datasets that record how different HPO algorithms perform on multiple benchmarks.

Note

As indicated by the v0.x.x version number, Significance Analysis is early-stage code and its APIs might change in the future.

Documentation

Please have a look at our example. The dataset should have the following format:

system_id (algorithm name)	input_id (benchmark name)	metric (mean/estimate)	bin_id (optional: budget/training round)
Algorithm1	Benchmark1	x.xxx	1
Algorithm1	Benchmark1	x.xxx	2
Algorithm1	Benchmark2	x.xxx	1
...	...	...	...
Algorithm2	Benchmark2	x.xxx	2

In this dataset, there are two different algorithms, each run on two benchmarks for two iterations. The variable names (system_id, input_id, ...) can be customized, but they have to be consistent throughout the dataset, i.e. not "mean" for one benchmark and "estimate" for another. The conduct_analysis function is then called with the dataset and the variable names as parameters. Optionally, the dataset can be binned according to a fourth variable (bin_id), in which case the analysis is conducted on each bin separately, as shown in our example. To do this, provide the name of the bin_id variable and, if desired, the exact bins and bin labels. Otherwise, a bin is created for each unique value.
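
To make the binning idea concrete, here is a minimal sketch that uses pandas only. It does not call the package and does not reflect how conduct_analysis bins data internally. The scores, the bin edges and the labels "early" and "late" are made-up values for illustration.

```python
import pandas as pd

# Hypothetical scores; column names follow the format described above.
data = pd.DataFrame(
    {
        "system_id": ["Algorithm1"] * 4 + ["Algorithm2"] * 4,
        "input_id": ["Benchmark1", "Benchmark1", "Benchmark2", "Benchmark2"] * 2,
        "metric": [0.61, 0.70, 0.55, 0.64, 0.58, 0.73, 0.52, 0.66],
        "bin_id": [1, 2, 1, 2, 1, 2, 1, 2],
    }
)

# Default case from the text: one bin per unique bin_id value.
bins_by_value = {value: group for value, group in data.groupby("bin_id")}

# Explicit bins and labels (assumed edges): (0, 1] -> "early", (1, 2] -> "late".
data["bin_label"] = pd.cut(data["bin_id"], bins=[0, 1, 2], labels=["early", "late"])
bins_by_label = {
    label: group for label, group in data.groupby("bin_label", observed=True)
}

# Each bin is a separate sub-dataset that could be analysed on its own.
for label, group in bins_by_label.items():
    print(label, group.groupby("system_id")["metric"].mean().to_dict())
```

With only two distinct bin_id values, both approaches produce the same split. Explicit bins become useful when bin_id takes many values, such as a budget that grows over dozens of training rounds.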

Installation

Using R (>= 4.0.0), install the packages Matrix, emmeans, lmerTest and lme4.

Using pip

pip install significance-analysis

Usage

1. Generate data from HPO algorithms on benchmarks and save it according to our format.
2. Call the conduct_analysis function on the dataset, specifying the variable names.

In code, the usage pattern can look like this:

import pandas as pd
from significance_analysis import conduct_analysis

# 1. Generate/import dataset
data = pd.read_csv("./significance_analysis_example/exampleDataset.csv")

# 2. Analyse dataset
conduct_analysis(data, "mean", "acquisition", "benchmark")
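
Before running the analysis, it can help to confirm that the dataset matches the expected format. The sketch below is one possible pre-flight check. The helper check_dataset is hypothetical and not part of the package, and the rule that every algorithm must appear on every benchmark is an assumption, not a documented requirement. The column names are taken from the usage snippet above.

```python
import pandas as pd


def check_dataset(data, metric="mean", system_id="acquisition", input_id="benchmark"):
    """Hypothetical helper: return a list of problems; empty means none were found."""
    required = [metric, system_id, input_id]
    missing = [col for col in required if col not in data.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    if data[required].isna().any().any():
        problems.append("missing values in required columns")
    if not pd.api.types.is_numeric_dtype(data[metric]):
        problems.append(f"metric column '{metric}' is not numeric")

    # Assumption: every algorithm should have results on every benchmark.
    counts = data.groupby([system_id, input_id]).size().unstack(fill_value=0)
    if (counts == 0).any().any():
        problems.append("some algorithm/benchmark combinations are absent")
    return problems


# Made-up example values in the column layout used by the usage snippet.
example = pd.DataFrame(
    {
        "acquisition": ["Algorithm1", "Algorithm1", "Algorithm2", "Algorithm2"],
        "benchmark": ["Benchmark1", "Benchmark2", "Benchmark1", "Benchmark2"],
        "mean": [0.61, 0.55, 0.58, 0.52],
    }
)
print(check_dataset(example))
```

Returning a list of messages instead of raising on the first problem lets all issues in a dataset be reported in one pass.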

For more details and features please have a look at our example.
