Clean deep reinforcement learning code based on Web MVC architecture with complete unit tests
Deep reinforcement learning implementations easily turn into messy code, because the interaction loop between an environment and an agent creates many dependencies among classes. Even plain deep learning takes special skill to implement cleanly.
To think outside the box: Web engineers have spent years refining the MVC (model-view-controller) architecture to keep the code that handles interaction between the Web and users tidy. I found that MVC is a very useful insight for deep reinforcement learning implementations as well. It points toward an architecture with fewer dependencies, which makes unit testing easier.
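As a rough illustration of the idea above, the following minimal sketch separates an agent loop into a model, a view, and a controller. Every class and function name here is hypothetical and chosen only for illustration; this is not the mvc-drl API, and a random policy stands in for a neural network.

```python
import random


class Model:
    """Decision logic and stored experience (a random policy stands in for a network)."""

    def __init__(self, num_actions, seed=0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.transitions = []

    def act(self, observation):
        return self.rng.randrange(self.num_actions)

    def store(self, observation, action, reward, done):
        self.transitions.append((observation, action, reward, done))


class View:
    """Presentation layer; here it only records what it is asked to log."""

    def __init__(self):
        self.records = []

    def log(self, name, value, step):
        self.records.append((name, value, step))


class Controller:
    """Mediates between the interaction loop, the model, and the view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view
        self.step_count = 0

    def step(self, observation, reward, done):
        action = self.model.act(observation)
        self.model.store(observation, action, reward, done)
        self.view.log('reward', reward, self.step_count)
        self.step_count += 1
        return action


class ToyEnv:
    """Hypothetical environment: reward 1.0 per step, ends after `length` steps."""

    def __init__(self, length=5):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.length


def run_episode(env, controller):
    """The interaction loop only ever talks to the controller."""
    obs, reward, done = env.reset(), 0.0, False
    total = 0.0
    while not done:
        action = controller.step(obs, reward, done)
        obs, reward, done = env.step(action)
        total += reward
    return total
```

Because the loop depends only on the controller, the model and view can each be replaced or tested on their own.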
You can use Docker to set up and run experiments.
$ ./scripts/build.sh
Once you have built the container image, you can start a container with the nvidia runtime via ./scripts/up.sh.
$ ./scripts/up.sh
root@a84ab59aa668:/home/app# ls
Dockerfile README.md example.confing.json graphs mvc scripts tests
LICENSE examples logs requirements.txt test.sh tools
root@a84ab59aa668:/home/app#
You need to install the packages listed in requirements.txt, as well as TensorFlow.
$ pip install -r requirements.txt
$ pip install tensorflow-gpu tensorflow-probability-gpu
# if you run example scripts
$ pip install pybullet roboschool
If you have trouble installing TensorFlow Probability, check your TensorFlow version.
This repository is also available on PyPI. You can implement additional algorithms on top of mvc-drl.
$ pip install mvc
:warning: This repository is under development, so interfaces may change frequently.
For academic use, we provide baseline implementations that you may need for comparison.
Each point represents the average evaluation reward over 10 episodes. Performance is roughly the same as that reported in the Soft Actor-Critic paper.
$ python -m examples.ppo --env Ant-v2
$ python -m examples.ddpg --env Ant-v2
$ python -m examples.sac --env Ant-v2 --reward-scale 5
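The evaluation metric described above (an average reward over 10 episodes per plotted point) can be sketched as follows. The function name and the callable it takes are assumptions made for illustration, not part of the example scripts.

```python
def average_evaluation_reward(run_episode_fn, num_episodes=10):
    """Run `num_episodes` evaluation episodes and return the mean total reward.

    `run_episode_fn` is a hypothetical zero-argument callable that plays one
    evaluation episode and returns its total reward.
    """
    if num_episodes <= 0:
        raise ValueError('num_episodes must be positive')
    rewards = [run_episode_fn() for _ in range(num_episodes)]
    return sum(rewards) / len(rewards)
```

Each call would produce one point on the learning curve.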
All logging data is saved under the logs directory as csv files and visualization tool data.
Use the --log-adapter option in the example scripts to switch between tensorboard and visdom for visualization (default: tensorboard).
$ tensorboard --logdir logs
To use visdom, you need to fill in the host information of your visdom server.
$ mv example.config.json config.json
$ vim config.json # fill visdom section
Before running experiments, start the visdom server.
$ visdom
You can plot results with tools/plot_csv.py by pointing it directly at csv files.
$ python tools/plot_csv.py <path to csv> <path to csv> ...
By default, legends are set to the file paths.
If you want to set them manually, use the --label option.
$ python tools/plot_csv.py --label=experiment1 --label=experiment2 <path to csv> <path to csv>
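The legend behaviour described above (paths by default, one --label per file when given) can be sketched with argparse. This is an assumption about how such a command line might be parsed, not the actual code of tools/plot_csv.py; the function name is hypothetical.

```python
import argparse


def parse_plot_args(argv):
    """Return (paths, labels); labels fall back to the csv paths when omitted."""
    parser = argparse.ArgumentParser()
    parser.add_argument('paths', nargs='+', help='csv files to plot')
    # action='append' collects repeated --label flags into a list
    parser.add_argument('--label', action='append', default=None)
    args = parser.parse_args(argv)
    labels = args.label if args.label else list(args.paths)
    if len(labels) != len(args.paths):
        parser.error('the number of --label options must match the number of csv files')
    return args.paths, labels
```

Checking that the label count matches the file count avoids silently mislabeled curves.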
To guarantee code quality, all functions and classes, including neural networks, must have unit tests.
The following command runs all unit tests under the tests directory.
$ ./test.sh
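To show why a low-dependency design helps here, the sketch below unit-tests a controller in isolation by replacing its collaborators with mocks. The EvalController class is hypothetical and exists only for this example; it is not taken from the repository's tests.

```python
import unittest
from unittest.mock import MagicMock


class EvalController:
    """Hypothetical controller: reports the reward to a view, asks a model for an action."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def step(self, observation, reward):
        self.view.log('reward', reward)
        return self.model.act(observation)


class EvalControllerTest(unittest.TestCase):
    def test_step_delegates_to_model_and_view(self):
        # Mocks stand in for the real model and view, so no network is built.
        model = MagicMock()
        model.act.return_value = 3
        view = MagicMock()
        controller = EvalController(model, view)

        action = controller.step(observation=[0.1, 0.2], reward=1.5)

        self.assertEqual(action, 3)
        model.act.assert_called_once_with([0.1, 0.2])
        view.log.assert_called_once_with('reward', 1.5)
```

Since the controller receives its model and view as constructor arguments, the test needs neither an environment nor TensorFlow.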