MLGO is a framework for systematically integrating ML techniques into LLVM. It replaces human-crafted optimization heuristics in LLVM with machine-learned models. The MLGO framework currently supports two optimizations: inlining-for-size and register-allocation-for-performance.
The compiler components are both available in the main LLVM repository. This repository contains the training infrastructure and related tools for MLGO.
We currently use two different ML algorithms to train policies: Policy Gradient and Evolution Strategies. This repository currently supports only Policy Gradient training. The release of Evolution Strategies training is on our roadmap.
Check out this demo for an end-to-end demonstration of how to train your own inlining-for-size policy from scratch with Policy Gradient, or check out this demo for a demonstration of how to train your own regalloc-for-performance policy.
For more details about MLGO, please refer to our paper MLGO: a Machine Learning Guided Compiler Optimizations Framework.
For more details about how to contribute to the project, please refer to contributions.
We occasionally release pretrained models that may be used as-is with LLVM. Models are published as GitHub releases and are named [task]-[major-version].[minor-version]. The versions are semantic: the major version corresponds to breaking changes on the LLVM/compiler side, and the minor version corresponds to model updates that are independent of the compiler.
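The naming scheme above can be illustrated with a minimal Python sketch. The helper names (`parse_release_name`, `newest_compatible`) and the sample release names are hypothetical, not part of this repository, and accepting an optional `v` before the version number is an assumption made for leniency.

```python
import re
from typing import Iterable, NamedTuple, Optional


class ModelRelease(NamedTuple):
    task: str
    major: int  # changes with breaking changes on the LLVM/compiler side
    minor: int  # changes with compiler-independent model updates


# [task]-[major-version].[minor-version]; the optional "v" is an assumption.
_RELEASE_NAME = re.compile(r"^(?P<task>.+?)-v?(?P<major>\d+)\.(?P<minor>\d+)$")


def parse_release_name(name: str) -> ModelRelease:
    """Split a release name into its task, major and minor components."""
    match = _RELEASE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a [task]-[major].[minor] release name: {name!r}")
    return ModelRelease(match["task"], int(match["major"]), int(match["minor"]))


def newest_compatible(
    names: Iterable[str], task: str, compiler_major: int
) -> Optional[ModelRelease]:
    """Pick the highest minor version whose major version matches the compiler."""
    candidates = [
        release
        for release in map(parse_release_name, names)
        if release.task == task and release.major == compiler_major
    ]
    return max(candidates, key=lambda release: release.minor, default=None)
```

Because the major version tracks compiler-side breaking changes, the sketch filters on an exact major match and only then compares minor versions.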
When building LLVM, there is a flag -DLLVM_INLINER_MODEL_PATH which you may set to the path to your inlining model. If the path is set to download, then cmake will download the most recent (compatible) model from GitHub to use. Possible values for the flag include:
# Model is in /tmp/model, i.e. there is a file /tmp/model/saved_model.pb along
# with the rest of the tensorflow saved_model files produced from training.
-DLLVM_INLINER_MODEL_PATH=/tmp/model
# Download the most recent compatible model
-DLLVM_INLINER_MODEL_PATH=download
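If the cmake invocation is assembled from a script, a small check can catch a wrong model directory before the build starts. The following is a sketch only: `inliner_model_flag` is a hypothetical helper, and the only facts it relies on are the flag name, the special value download, and the presence of saved_model.pb in a model directory, as described above.

```python
from pathlib import Path


def inliner_model_flag(model: str) -> str:
    """Return the cmake argument for an inlining model.

    `model` is either the literal string "download" or a directory that
    contains a TensorFlow saved model (saved_model.pb plus its other files).
    """
    if model != "download":
        saved_model = Path(model) / "saved_model.pb"
        if not saved_model.is_file():
            raise FileNotFoundError(f"expected a saved model at {saved_model}")
    return f"-DLLVM_INLINER_MODEL_PATH={model}"
```

The returned string can be appended to the rest of the cmake arguments; the function only validates the path and does not run cmake itself.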
Currently, the assumptions for the system are:
Training assumes a clang build with ML 'development-mode'. Please refer to:
The prerequisites specific to model training are:
Pipenv:
pip3 install pipenv
The actual dependencies:
pipenv sync --system
Note that the above command will only work from the root of the repository since it needs to have Pipfile.lock in the working directory at the time of execution.
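One way to avoid running the command from the wrong directory is to locate Pipfile.lock first and use its directory as the working directory. This is a sketch under that assumption: `find_pipfile_lock_dir` and `sync_dependencies` are hypothetical helpers, not scripts shipped with this repository.

```python
import subprocess
from pathlib import Path
from typing import Optional


def find_pipfile_lock_dir(start: Path) -> Optional[Path]:
    """Walk upward from `start` to the nearest directory holding Pipfile.lock."""
    start = start.resolve()
    for directory in (start, *start.parents):
        if (directory / "Pipfile.lock").is_file():
            return directory
    return None


def sync_dependencies(start: Path) -> None:
    """Run `pipenv sync --system` from the directory that has Pipfile.lock."""
    root = find_pipfile_lock_dir(start)
    if root is None:
        raise FileNotFoundError("Pipfile.lock not found; run from the repository")
    subprocess.run(["pipenv", "sync", "--system"], cwd=root, check=True)
```

Calling `sync_dependencies(Path.cwd())` from anywhere inside a checkout would then behave like running the command from the repository root.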
If you plan on doing development work, make sure you grab the development and CI categories of packages as well:
pipenv sync --system --categories "dev-packages ci"
Optionally, to run tests (run_tests.sh), you also need:
sudo apt-get install virtualenv
Note that the same tensorflow package is also needed for building the 'release' mode for LLVM.
An end-to-end demo using Fuchsia as a codebase from which we extract a corpus and train a model.