Basic framework for training models in PyTorch. It's quite tailored to projects
I've been working on lately, so it's meant for personal use. Its sole purpose is
to do away with boilerplate code, and having it here makes it easier to
share it across projects.
```
pip install boilr
```
There's a usage example that can be useful as a template: a basic VAE for MNIST, quickly hacked together. The example files are:

- `example.py`
- `example_evaluate.py`
- `experiments/mnist_experiment/data.py`
- `experiments/mnist_experiment/experiment_manager.py`
- `models/mnist_vae.py`
Install requirements and run the example:

```
pip install -r requirements.txt
CUDA_VISIBLE_DEVICES=0 python example.py
```

For evaluation:

```
CUDA_VISIBLE_DEVICES=0 python example_evaluate.py --ll --ll-samples 100 --load $RUN_NAME
```

where `$RUN_NAME` is the name of the folder in `output/` generated by running the example.
The following functionality is available out of the box:

- Progress bars via `tqdm`. Can be switched off.
- Helper modules `boilr.nn` and `boilr.utils` (most of them for internal use). In particular, `boilr.nn.modules` and `boilr.utils.viz` might be more generally useful.
- Wherever a `DataLoader` is necessary, the batch size is easily accessible as `args.batch_size`; and when creating the optimizer, the learning rate is `args.lr`.
- `boilr.options` for package-wide options. Usually it's not necessary to change them, but they give some more flexibility.

There are built-in command-line arguments with default values. These defaults can be easily overridden programmatically when making the experiment class that subclasses `boilr`'s.
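As an illustration, a project could pin its own defaults in the experiment subclass. This is a hedged sketch: the hook name `_define_args_defaults` comes from the docs below, but returning a plain dict of defaults is an assumption made for illustration, not boilr's confirmed signature.

```python
# Sketch: overriding built-in argument defaults in an experiment subclass.
# Assumption: _define_args_defaults returns a dict of default values.
try:
    from boilr.experiments import VAEExperimentManager as _Base
except ImportError:  # stub base so the sketch runs without boilr installed
    class _Base:
        pass

class MyExperiment(_Base):
    def _define_args_defaults(self):
        # Project-specific defaults; keys mirror the built-in CLI
        # arguments listed below (dashes become underscores).
        return {"batch_size": 64, "lr": 3e-4, "max_grad_norm": 100.0}

exp = MyExperiment.__new__(MyExperiment)  # skip __init__ for the demo
print(exp._define_args_defaults()["lr"])  # 0.0003
```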
The built-in arguments are the following:

- `batch-size`: training batch size (default: None)
- `test-batch-size`: test batch size (default: None)
- `lr`: learning rate (default: None)
- `max-grad-norm`: maximum global norm of the gradient; the gradient is clipped if its norm is larger. If None, no clipping is performed. (default: None)
- `seed`: random seed (default: 54321)
- `tr-log-every`: log training metrics every this number of training steps (default: 1000)
- `ts-log-every`: log test metrics every this number of training steps. It must be a multiple of `--tr-log-every`. (default: 1000)
- `ts-img-every`: save test images every this number of training steps. It must be a multiple of `--ts-log-every`. (default: same as `--ts-log-every`)
- `checkpoint-every`: save a model checkpoint every this number of training steps (default: 1000)
- `keep-checkpoint-max`: keep at most this number of most recent model checkpoints (default: 3)
- `max-steps`: maximum number of training steps (default: 1e10)
- `max-epochs`: maximum number of training epochs (default: 1e7)
- `nocuda`: do not use CUDA (default: False)
- `descr`: additional description for the experiment name
- `dry-run`: do not save anything to disk (default: False)
- `resume`: load the run with this name and resume training

Additionally, for `VAEExperimentManager`, the following arguments are available:

- `ll-every`: evaluate the log likelihood (with the importance-weighted bound) every this number of training steps (default: 50000)
- `ll-samples`: number of importance-weighted samples used to evaluate the log likelihood (default: 100)

A minimal main file creates the experiment, passes it to a `boilr.Trainer`, and runs the trainer; see below for more details.
The class `boilr.data.BaseDatasetManager` must be subclassed. The subclass must implement the method `_make_datasets`, which should return a tuple `(train, test)` with the training and test sets as PyTorch `Dataset`s.

A basic implementation of `_make_dataloaders` is already provided, but it can be overridden to make custom data loaders.
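As a sketch, assuming only what is stated above (`_make_datasets` returns a `(train, test)` tuple of PyTorch `Dataset`s), a toy dataset manager could look like this; the random MNIST-shaped tensors are a stand-in for a real dataset:

```python
import torch
from torch.utils.data import TensorDataset

try:
    from boilr.data import BaseDatasetManager
except ImportError:  # stub base so the sketch runs without boilr installed
    class BaseDatasetManager:
        pass

class ToyDatasetManager(BaseDatasetManager):
    def _make_datasets(self):
        # Must return a (train, test) tuple of PyTorch Datasets.
        # Random tensors shaped like MNIST stand in for real data.
        train = TensorDataset(torch.randn(100, 1, 28, 28))
        test = TensorDataset(torch.randn(20, 1, 28, 28))
        return train, test

train_set, test_set = ToyDatasetManager.__new__(ToyDatasetManager)._make_datasets()
print(len(train_set), len(test_set))  # 100 20
```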
One of the model classes must be subclassed to inherit the core methods in the base implementation `boilr.models.BaseModel`. These models automatically subclass `torch.nn.Module`, so the subclass must implement `forward`. In addition, `boilr.models.BaseGenerativeModel` (which subclasses `BaseModel`) defines a method `sample_prior` that must be implemented by subclasses.
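A minimal sketch of a generative model subclass, assuming only the requirements stated above (models are `torch.nn.Module`s that must implement `forward`, and `BaseGenerativeModel` subclasses must implement `sample_prior`); the linear encoder/decoder architecture is a made-up placeholder:

```python
import torch
from torch import nn

try:
    from boilr.models import BaseGenerativeModel
except ImportError:  # stub base so the sketch runs without boilr installed
    class BaseGenerativeModel(nn.Module):
        pass

class TinyModel(BaseGenerativeModel):
    def __init__(self, data_dim=784, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(data_dim, latent_dim)
        self.dec = nn.Linear(latent_dim, data_dim)
        self.latent_dim = latent_dim

    def forward(self, x):
        # Required: base models subclass torch.nn.Module.
        return self.dec(self.enc(x))

    def sample_prior(self, n_samples):
        # Required by BaseGenerativeModel: decode samples drawn from a
        # standard normal prior.
        z = torch.randn(n_samples, self.latent_dim)
        return self.dec(z)

model = TinyModel()
print(model.sample_prior(4).shape)  # torch.Size([4, 784])
```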
One of the base experiment classes in `boilr.experiments` must be subclassed. The subclass must implement:

- `_make_datamanager` to create the dataset manager, which should subclass `boilr.data.BaseDatasetManager`;
- `_make_model` to create the model, which should subclass `boilr.models.BaseModel`;
- `_make_optimizer` to create the optimizer, which should subclass `torch.optim.optimizer.Optimizer`;
- `forward_pass` to perform a simple single-pass model evaluation and return losses and metrics;
- `test_procedure` to evaluate the model on the test set (usually heavily based on the `forward_pass` method).

The following should typically be overridden:

- `_define_args_defaults`, `_add_args`, and `_check_args` (or a subset of these) to manage parsing of command-line arguments;
- `_make_run_description`, which returns a string description of the run, used for output folders;
- `save_images` to save output images (e.g. reconstructions and samples in VAEs).

The following may be overridden for additional control:

- `post_backward_callback`, which is called by the `Trainer` after the backward pass but before the optimization step;
- `get_metrics_dict`, which translates a dictionary of results into a dictionary of metrics to be logged (by default this simply copies over the keys);
- `train_log_str` and `test_log_str`, which return log strings for training and test metrics.

Note: the class `VAEExperimentManager` implements default `test_procedure` and `save_images` methods for variational inference with VAEs.
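Putting the pieces together, a bare skeleton of an experiment subclass might look like the following. Only the method names and their roles come from the list above; the `self.model` and `self.args` attribute names mentioned in the optimizer comment are assumptions for illustration:

```python
try:
    from boilr.experiments import VAEExperimentManager as _Base
except ImportError:  # stub base so the sketch runs without boilr installed
    class _Base:
        pass

class MyExperimentClass(_Base):
    def _make_datamanager(self):
        ...  # return a boilr.data.BaseDatasetManager subclass instance

    def _make_model(self):
        ...  # return a boilr.models.BaseModel subclass instance

    def _make_optimizer(self):
        # e.g. torch.optim.Adam(self.model.parameters(), lr=self.args.lr)
        # (`self.model` / `self.args` attribute names are assumptions;
        # the docs only say the learning rate is available as args.lr)
        ...

    def forward_pass(self, x):
        ...  # single-pass model evaluation; return losses and metrics

    def test_procedure(self):
        ...  # evaluate on the test set, typically via forward_pass
```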
```python
from boilr import Trainer
from my_experiment import MyExperimentClass

if __name__ == "__main__":
    experiment = MyExperimentClass()
    trainer = Trainer(experiment)
    trainer.run()
```
If offline evaluation is necessary, `boilr.eval.BaseOfflineEvaluator` can be subclassed by implementing:

- `run` to run the evaluation;
- `_define_args_defaults`, `_add_args`, and `_check_args` (or a subset of these) to manage parsing of command-line arguments.

The method `run` can be executed by simply calling the evaluator object. See `example_evaluate.py`.
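A minimal evaluator sketch, assuming only what is stated above: `run` must be implemented, and calling the object executes it. The returned metric dict is a placeholder, and the stub fallback simply mimics the documented call-to-run behavior:

```python
try:
    from boilr.eval import BaseOfflineEvaluator
except ImportError:  # stub base so the sketch runs without boilr installed
    class BaseOfflineEvaluator:
        def __call__(self, *args, **kwargs):
            # Mimic the documented behavior: calling the object runs `run`.
            return self.run(*args, **kwargs)

class MyEvaluator(BaseOfflineEvaluator):
    def run(self):
        # Load a trained run and compute offline metrics here
        # (placeholder result for illustration).
        return {"status": "done"}

evaluator = MyEvaluator()
print(evaluator())  # {'status': 'done'}
```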
Note: in dry-run mode the run still uses tensorboard, but it won't save tensorboard logs.